What’s New in the AWS Neuron SDK#
Explore detailed posts about the latest features, updates, and upcoming changes to the AWS Neuron SDK.
AWS Neuron SDK 2.27.0: Trainium3 Support, Enhanced NKI, and Unified Profiling with Neuron Explorer#
Posted on: December 19, 2025
Today we are releasing AWS Neuron SDK 2.27.0. This release adds support for Trainium3 (Trn3) instances. The enhanced NKI, with the new NKI Compiler, introduces the nki.* namespace with updated APIs and language constructs. The NKI Library provides pre-optimized kernels for common model operations, including attention, MLP, and normalization. Neuron Explorer delivers a unified profiling suite with AI-driven optimization recommendations. vLLM V1 integration is now available through the vLLM-Neuron Plugin. Deep Learning Containers and AMIs are updated with vLLM V1, PyTorch 2.9, JAX 0.7, Ubuntu 24.04, and Python 3.12.
In addition to this release, we are introducing new capabilities and features in private beta access (see Private Beta Access section). We are also announcing our transition to PyTorch native support starting with PyTorch 2.10 in Neuron 2.28, plans to simplify NxDI in upcoming releases, and other important updates. See the End of Support and Migration Notices section for more details.
Private Beta Access#
We are also opening access to the following private betas:
Native PyTorch (TorchNeuron)
Enhanced Neuron Kernel Interface (NKI) with open source NKI Compiler
vLLM support for Trn3
Neuron DRA for Kubernetes
To request access, visit the Neuron Private Beta signup form.
Neuron Kernel Interface (NKI)#
NKI Compiler - The new nki.* namespace replaces the legacy neuronxcc.nki.* namespace. Top-level kernel functions now require the @nki.jit annotation. Neuron 2.27 supports both namespaces side by side; the legacy namespace will be removed in Neuron 2.28. A kernel migration guide is available in the documentation.
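As a rough sketch of the migration, a kernel moves from the `neuronxcc.nki.*` imports to the new top-level namespace and keeps the `@nki.jit` annotation. The module paths and language helpers below are illustrative assumptions; consult the kernel migration guide in the documentation for the authoritative API.

```python
# Illustrative namespace-migration sketch; exact module paths and
# language helpers may differ from the shipped 2.27 API.

# Legacy style (removed in Neuron 2.28):
#   from neuronxcc import nki
#   import neuronxcc.nki.language as nl

# New style (Neuron 2.27+):
import nki
import nki.language as nl

@nki.jit  # top-level kernel functions require the @nki.jit annotation
def double_kernel(in_tensor):
    # Load a tile from device memory, scale it, and write the result back.
    tile = nl.load(in_tensor)
    out = nl.ndarray(in_tensor.shape, dtype=in_tensor.dtype, buffer=nl.shared_hbm)
    nl.store(out, nl.multiply(tile, 2.0))
    return out
```

Because Neuron 2.27 supports both namespaces side by side, kernels can be ported one at a time before the legacy namespace is removed in 2.28.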
For more details, see AWS Neuron SDK 2.27.0: Neuron Kernel Interface (NKI) release notes.
NKI Library#
The NKI Library provides pre-optimized kernels: Attention CTE, Attention TKG, MLP, Output Projection CTE, Output Projection TKG, QKV, and RMSNorm-Quant. Kernels are accessible via the nkilib.* namespace in neuronx-cc or from the GitHub repository.
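Usage is roughly as follows; the module path, kernel entry point, and signature in this sketch are assumptions for illustration, not taken from the library documentation.

```python
# Hypothetical usage sketch: import a pre-optimized kernel from the
# nkilib.* namespace shipped with neuronx-cc and call it like any other
# NKI kernel. The rmsnorm_quant path and arguments are assumptions.
from nkilib.rmsnorm_quant import rmsnorm_quant_kernel

out = rmsnorm_quant_kernel(hidden_states, weight, eps=1e-6)
```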
For more details, see AWS Neuron SDK 2.27.0: NKI Library release notes.
Developer Tools#
Neuron Explorer - A suite of tools designed to support ML engineers throughout their development journey on AWS Trainium. This release features improved performance and user experience for device profiling, with four core viewers that provide insight into model performance:
Hierarchy Viewer: Visualizes model structure and component interactions
AI Recommendation Viewer: Delivers AI-driven optimization recommendations
Source Code Viewer: Links profiling data directly to source code
Summary Viewer: Displays high-level performance metrics
Neuron Explorer is available through UI, CLI, and VSCode IDE integration. Existing NTFF files are compatible but require reprocessing for new features.
New tutorials cover profiling NKI kernels, multi-node training jobs, and vLLM inference workloads. The nccom-test tool now includes fine-grained collective communication support.
For more details, see AWS Neuron SDK 2.27.0: Developer Tools Release Notes.
Inference Updates#
vLLM V1 - The vLLM-Neuron Plugin enables vLLM V1 integration for inference workloads. vLLM V0 support ends in Neuron 2.28.
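With the plugin installed, a vLLM V1 workload looks like standard vLLM code. In the sketch below, the model name and sampling settings are illustrative, and it is assumed that vLLM's plugin mechanism discovers the Neuron platform automatically once the plugin package is present.

```python
# Sketch of a vLLM V1 inference run on Neuron. Assumes the vLLM-Neuron
# plugin is installed and auto-detected; no Neuron-specific code is
# needed in the workload itself.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen3-235B-A22B", tensor_parallel_size=32)
params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain AWS Trainium in one sentence."], params)
print(outputs[0].outputs[0].text)
```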
NxD Inference - Model support expands with beta releases of Qwen3 MoE (Qwen3-235B-A22B) for multilingual text and Pixtral (Pixtral-Large-Instruct-2411) for image understanding. Both models use HuggingFace checkpoints and are supported on Trn2 and Trn3 instances.
For more details, see AWS Neuron SDK 2.27.0: NxD Inference release notes.
Neuron Graph Compiler#
Default accuracy settings are now optimized for precision. The --auto-cast flag defaults to none (previously matmul), and --enable-mixed-precision-accumulation is enabled by default. FP32 models may see performance impacts; restore previous behavior with --auto-cast=matmul and --disable-mixed-precision-accumulation. Python 3.10 or higher is now required.
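If an FP32 workload regresses under the new defaults, the pre-2.27 behavior can be restored by forwarding the old flags to the compiler. One common path for this is `torch_neuronx.trace`, sketched below with a placeholder model and input.

```python
# Sketch: restore the pre-2.27 accuracy defaults when compiling a model
# through torch_neuronx. The model and example input are placeholders.
import torch
import torch_neuronx

model = torch.nn.Linear(16, 16).eval()  # placeholder model
example = torch.rand(1, 16)             # placeholder input

traced = torch_neuronx.trace(
    model,
    example,
    compiler_args=[
        "--auto-cast=matmul",                      # old default: cast matmuls
        "--disable-mixed-precision-accumulation",  # old default: accumulation off
    ],
)
```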
For more details, see AWS Neuron SDK 2.27.0: Neuron Compiler release notes.
Runtime Improvements#
Neuron Runtime Library 2.29 adds support for Trainium3 (Trn3) instances and delivers performance improvements for Collectives Engine overhead, NeuronCore branch overhead, NEFF program startup, and all-gather latency.
For more details, see AWS Neuron SDK 2.27.0: Neuron Runtime release notes.
Deep Learning AMIs and Containers#
Platform Updates - All DLCs are updated to Ubuntu 24.04 and Python 3.12. DLAMIs add Ubuntu 24.04 support for base, single framework, and multi-framework configurations.
Framework Updates:
vLLM V1 single framework DLAMI and multi-framework virtual environments
PyTorch 2.9 single framework DLAMIs and multi-framework virtual environments (Amazon Linux 2023, Ubuntu 22.04, Ubuntu 24.04)
JAX 0.7 single framework DLAMI and multi-framework virtual environments
New Container - The pytorch-inference-vllm-neuronx 0.11.0 DLC provides a complete vLLM inference environment with PyTorch 2.8 and all dependencies.
For more details, see AWS Neuron SDK 2.27.0: Neuron Deep Learning AWS Machine Images release notes and AWS Neuron SDK 2.27.0: Neuron Deep Learning Containers release notes.
End of Support and Migration Notices#
Effective this release:
Python 3.9 is no longer supported
PyTorch 2.6 is no longer supported
Inf1 virtual environments and AMIs are no longer supported
The parallel_model_trace API is no longer supported
End of support announced for the TensorBoard Plugin for Neuron Profiler
Effective Neuron 2.28:
End of support for the neuronxcc.nki namespace
NKI Library kernels migrate to the new nki.* namespace
End of support for vLLM V0
Effective with PyTorch 2.10 support:
Neuron transitions to native PyTorch support (TorchNeuron), starting with PyTorch 2.10 in Neuron 2.28
Future Releases:
Plans to simplify NxD Inference (NxDI) in upcoming releases
Detailed Release Notes#
Read the Neuron 2.27.0 component release notes for specific Neuron component improvements and details.
AWS Neuron Expands with Trainium3, Native PyTorch, Faster NKI, and Open Source at re:Invent 2025#
Posted on: December 2, 2025
At re:Invent 2025, AWS Neuron introduces support for Trainium3 UltraServer with expanded open source components and enhanced developer experience. These updates enable standard frameworks to run unchanged on Trainium, removing barriers for researchers to experiment and innovate. For developers requiring deeper control, the enhanced Neuron Kernel Interface (NKI) provides direct access to hardware-level optimizations, enabling customers to scale AI workloads with improved performance.
Expanded capabilities and enhancements include:
Trainium3 UltraServer support: Enabling customers to scale AI workloads with improved performance
Native PyTorch support: Standard PyTorch runs unchanged on Trainium without platform-specific modifications
Enhanced Neuron Kernel Interface (NKI) with open source NKI Compiler: Improved programming capabilities with direct access to Trainium hardware instructions and fine-grained optimization control, built on an MLIR-based compiler
NKI Library: Open source collection of optimized, ready-to-use kernels for common ML operations
Neuron Explorer: Tools suite to support developers and performance engineers in their performance optimization journey from framework operations to hardware instructions
Neuron DRA for Kubernetes: Kubernetes-native resource management eliminating custom scheduler extensions
Expanded open source components: Additional components, including the NKI Compiler, Native PyTorch (TorchNeuron), and the NKI Library, released under Apache 2.0
AI development requires rapid experimentation, hardware optimization, and production-scale workloads. These updates enable researchers to experiment with novel architectures using familiar workflows, ML developers to build AI applications using standard frameworks, and performance engineers to optimize workloads using low-level hardware optimization.
Looking to try out our Beta features?
Submit your beta access request through this form and the Neuron Product team will get back to you.
Native PyTorch Support#
Private Preview
AWS Neuron now natively supports PyTorch through TorchNeuron, an open source native PyTorch backend for Trainium. TorchNeuron integrates with PyTorch through the PrivateUse1 device backend mechanism, registering Trainium as a native device alongside other backends and allowing researchers and ML developers to run their code without modifications.
TorchNeuron provides eager mode execution for interactive development and debugging, native distributed APIs including FSDP and DTensor for distributed training, and torch.compile support for optimization. With minimal code changes, TorchNeuron is compatible with ecosystem tools such as TorchTitan and HuggingFace Transformers.
Use TorchNeuron to run your PyTorch research and training workloads on Trainium without platform-specific code changes.
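In practice, that means ordinary device-placement code. In the sketch below, the import name and the "neuron" device string are assumptions based on the PrivateUse1 registration described above; check the TorchNeuron documentation for the exact names.

```python
# Sketch of eager-mode execution with TorchNeuron. Assumes the import
# below registers Trainium under the device name "neuron" via the
# PrivateUse1 mechanism; both names are assumptions for illustration.
import torch
import torchneuron  # hypothetical import that registers the backend

device = torch.device("neuron")
model = torch.nn.Linear(128, 64).to(device)
x = torch.randn(8, 128, device=device)
y = model(x)  # runs eagerly on Trainium, no tracing or compilation step
print(y.shape)
```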
Learn more: TorchNeuron documentation and TorchNeuron GitHub repository.
Access: Contact your AWS account team for access.
Enhanced NKI#
Public Preview
The enhanced Neuron Kernel Interface (NKI) provides developers with complete hardware control through advanced APIs for fine-grained scheduling and allocation. The enhanced NKI enables instruction-level programming, memory allocation control, and execution scheduling with direct access to the Trainium ISA.
We are also releasing the NKI Compiler as open source under Apache 2.0, built on MLIR to enable transparency and collaboration with the broader compiler community. NKI integrates with PyTorch and JAX, enabling developers to use custom kernels within their training workflows.
Use Enhanced NKI to innovate and build optimized kernels on Trainium. Explore the NKI Compiler source code to inspect and contribute to the MLIR-based compilation pipeline.
Note
The NKI Compiler source code is currently in Private Preview, while the NKI programming interface is in Public Preview.
Learn more: NKI home page and NKI Language Guide.
NKI Library#
Public Preview
The NKI Library provides an open source collection of optimized, ready-to-use kernels for common ML operations. The library includes kernels for dense transformer operations, MoE-specific operations, and attention mechanisms, all with complete source code, documentation, and benchmarks.
Use NKI Library kernels directly in your models to improve performance, or explore the implementations as reference examples of performance-optimization best practices on Trainium.
Learn more: GitHub repository and API documentation.
Neuron Explorer#
Public Preview
Neuron Explorer is a tools suite that supports developers and performance engineers in their performance optimization journey. It provides capabilities to inspect and optimize code from framework operations down to hardware instructions with hierarchical profiling, source code linking, IDE integration, and AI-powered recommendations for optimization insights.
Use Neuron Explorer to understand and optimize your model performance on Trainium, from high-level framework operations to low-level hardware execution.
Learn more: Neuron Explorer documentation.
Kubernetes-Native Resource Management with Neuron DRA#
Private Preview
Neuron Dynamic Resource Allocation (DRA) provides Kubernetes-native resource management for Trainium, eliminating custom scheduler extensions. DRA enables topology-aware scheduling using the default Kubernetes scheduler, atomic UltraServer allocation, and flexible per-workload configuration.
Neuron DRA supports EKS, SageMaker HyperPod, and UltraServer configurations. The driver is open source with container images in AWS ECR public gallery.
Use Neuron DRA to simplify Kubernetes resource management for your Trainium workloads with native scheduling and topology-aware allocation.
Learn more: Neuron DRA documentation.
Access: Contact your AWS account team to participate in the Private Preview.
Resources and Additional Information#
For more information visit the AWS Trainium official page, the AWS Neuron Documentation, and the AWS Neuron GitHub repositories.