This document is relevant for: Inf1
, Inf2
, Trn1
, Trn2
What’s New#
Neuron 2.22.0 (04/03/2025)#
What’s New#
The Neuron 2.22 release includes performance optimizations, enhancements and new capabilities across the Neuron software stack.
For inference workloads, the NxD Inference library now supports Llama-3.2-11B model and supports multi-LoRA serving, allowing customers to load and serve multiple LoRA adapters. Flexible quantization features have been added, enabling users to specify which model layers or NxDI modules to quantize. Asynchronous inference mode has also been introduced, improving performance by overlapping Input preparation with model execution.
For training, we added LoRA supervised fine-tuning to NxD Training to enable additional model customization and adaptation.
Neuron Kernel Interface (NKI): This release adds new APIs in nki.isa, nki.language, and nki.profile. These enhancements provide customers with greater flexibility and control.
The updated Neuron Runtime includes optimizations for reduced latency and improved device memory footprint. On the tooling side, the Neuron Profiler 2.0 (beta) has added UI enhancements and new event type support.
Neuron DLCs: this release reduces DLC image size by up to 50% and enables faster build times with updated Dockerfiles structure. On the Neuron DLAMI side, new PyTorch 2.5 single framework DLAMIs have been added for Ubuntu 22.04 and Amazon Linux 2023, along with several new virtual environments within the Neuron Multi Framework DLAMIs.
More release content can be found in the table below and each component release notes.
What’s New |
Details |
Instances |
---|---|---|
NxD Core (neuronx-distributed) |
Trn1/Trn1n,Trn2 |
|
NxD Inference (neuronx-distributed-inference) |
Inf2, Trn1/Trn1n,Trn2 |
|
NxD Training (neuronx-distributed-training) |
Trn1/Trn1n,Trn2 |
|
PyTorch NeuronX (torch-neuronx) |
Trn1/Trn1n,Inf2,Trn2 |
|
NeuronX Nemo Megatron for Training |
Trn1/Trn1n,Inf2 |
|
Neuron Compiler (neuronx-cc) |
Trn1/Trn1n,Inf2,Trn2 |
|
Neuron Kernel Interface (NKI) |
Trn1/Trn1n,Inf2 |
|
Neuron Tools |
Inf1,Inf2,Trn1/Trn1n,Trn2 |
|
Neuron Runtime |
Inf1,Inf2,Trn1/Trn1n,Trn2 |
|
Transformers NeuronX (transformers-neuronx) for Inference |
Inf2, Trn1/Trn1n |
|
Neuron Deep Learning AMIs (DLAMIs) |
Inf1,Inf2,Trn1/Trn1n |
|
Neuron Deep Learning Containers (DLCs) |
Inf1,Inf2,Trn1/Trn1n |
|
Release Annoucements |
|
Inf1, Inf2, Trn1/Trn1n |
For detailed release artificats, see Release Artifacts.
Previous Releases#
This document is relevant for: Inf1
, Inf2
, Trn1
, Trn2