This document is relevant for: Inf1, Inf2, Trn1, Trn2

What’s New

Neuron 2.23.0 (05/20/2025)

What’s New

With the Neuron 2.23 release, the NxD Inference (NxDI) library moves out of beta and is now recommended for all multi-chip inference use cases. In addition, Neuron adds new training capabilities, including Context Parallelism and ORPO; NKI improvements (new operators and ISA features); and new Neuron Profiler debugging and performance analysis improvements. Finally, Neuron now supports PyTorch 2.6 and JAX 0.5.3.

Inference: NxD Inference (NxDI) moves from beta to GA. NxDI now supports Persistent Cache to reduce compilation times, and optimizes model loading with improved weight sharding performance.

Training: NxD Training (NxDT) adds Context Parallelism support (beta) for Llama models, enabling sequence lengths up to 32K. NxDT now supports ORPO model alignment using DPO-style datasets. NxDT has also upgraded its support for third-party libraries, specifically PyTorch Lightning 2.5, Transformers 4.48, and NeMo 2.1.
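For readers unfamiliar with ORPO, the following is a minimal sketch of the per-example objective as it is described in the ORPO paper: a supervised fine-tuning term on the chosen response plus a weighted odds-ratio penalty. It is not NxDT's implementation. The function names, the `lam` default, and the `prompt`/`chosen`/`rejected` field names of the DPO-style record are assumptions made for illustration.

```python
import math

# A DPO-style preference record: one prompt with a preferred and a
# dispreferred response. Field names are a common convention, assumed here.
example = {
    "prompt": "Summarize the release.",
    "chosen": "NxDI is now GA.",
    "rejected": "Nothing changed.",
}

def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp), where avg_logp < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))

def orpo_loss(chosen_logp: float, rejected_logp: float, lam: float = 0.1) -> float:
    """ORPO objective for a single preference pair.

    chosen_logp and rejected_logp are the length-normalized (per-token
    average) log-likelihoods of the chosen and rejected responses under
    the model being trained. lam weights the odds-ratio term (hypothetical
    default).
    """
    sft_term = -chosen_logp  # negative log-likelihood of the chosen response
    log_odds_ratio = log_odds(chosen_logp) - log_odds(rejected_logp)
    # -log(sigmoid(x)) written as log(1 + exp(-x))
    or_term = math.log1p(math.exp(-log_odds_ratio))
    return sft_term + lam * or_term
```

Because the objective needs only the chosen and rejected log-likelihoods under the model being trained, the same prompt/chosen/rejected records used for DPO can be reused without a separate reference model.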

Neuron Kernel Interface (NKI): Adds support for 32-bit integer nki.language.add and nki.language.multiply on the GPSIMD Engine. NKI.ISA improvements include range_select for Trainium2, fine-grained engine control, and enhanced tensor operations. A new performance tuning API, no_reorder, lets users schedule instructions themselves; combined with allocation, this enables software pipelining. Language consistency has been improved for the arithmetic assignment operators (+=, -=, /=, *=) across loop types, PSUM, and SBUF.

Neuron Profiler: Profiling performance has improved, allowing users to view profile results 5x faster on average. New features include timeline-based error tracking and JSON error event reporting, supporting execution and OOB error detection. Additionally, this release improves multiprocess visualization with Perfetto.

Neuron Monitoring: Added Kubernetes context information (pod_name, namespace, and container_name) to the neuron-monitor Prometheus output, enabling resource utilization tracking by pod, namespace, and container.
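As an illustration of what the new labels make possible, the sketch below parses a few lines of Prometheus text exposition output and averages a metric per (namespace, pod, container). Only the label names pod_name, namespace, and container_name come from this release; the metric name `example_neuroncore_utilization`, the `neuroncore` label, and the sample values are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical scrape output; only the three Kubernetes labels are from the release notes.
SAMPLE = """\
# TYPE example_neuroncore_utilization gauge
example_neuroncore_utilization{neuroncore="0",pod_name="llama-0",namespace="serving",container_name="server"} 0.75
example_neuroncore_utilization{neuroncore="1",pod_name="llama-0",namespace="serving",container_name="server"} 0.25
example_neuroncore_utilization{neuroncore="2",pod_name="embed-0",namespace="batch",container_name="worker"} 0.5
"""

LINE_RE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

def mean_by_container(text: str, metric: str) -> dict:
    """Average `metric` over all samples sharing (namespace, pod_name, container_name)."""
    samples = defaultdict(list)
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if not match or match["name"] != metric:
            continue
        labels = dict(LABEL_RE.findall(match["labels"]))
        key = (labels.get("namespace"), labels.get("pod_name"), labels.get("container_name"))
        samples[key].append(float(match["value"]))
    return {key: sum(vals) / len(vals) for key, vals in samples.items()}

usage = mean_by_container(SAMPLE, "example_neuroncore_utilization")
```

In a real deployment the same grouping would more likely be expressed as a PromQL aggregation over these labels; the Python version is only meant to show the shape of the data.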

Neuron DLCs: This release updates containers with PyTorch 2.6 support for inference and training. For JAX DLC, this release adds JAX 0.5.0 training support.

Neuron DLAMIs: This release updates the Multi Framework AMIs to include PyTorch 2.6, JAX 0.5, and TensorFlow 2.10, and updates the Single Framework AMIs for PyTorch 2.6 and JAX 0.5.

| What’s New | Instances |
| --- | --- |
| NxD Core (neuronx-distributed) | Trn1/Trn1n, Trn2 |
| NxD Inference (neuronx-distributed-inference) | Inf2, Trn1/Trn1n, Trn2 |
| NxD Training (neuronx-distributed-training) | Trn1/Trn1n, Trn2 |
| PyTorch NeuronX (torch-neuronx) | Trn1/Trn1n, Inf2, Trn2 |
| Neuron Compiler (neuronx-cc) | Trn1/Trn1n, Inf2, Trn2 |
| Neuron Kernel Interface (NKI) | Trn1/Trn1n, Inf2 |
| Neuron Tools | Inf1, Inf2, Trn1/Trn1n, Trn2 |
| Neuron Runtime | Inf1, Inf2, Trn1/Trn1n, Trn2 |
| Transformers NeuronX (transformers-neuronx) for Inference | Inf2, Trn1/Trn1n |
| Neuron Deep Learning AMIs (DLAMIs) | Inf1, Inf2, Trn1/Trn1n |
| Neuron Deep Learning Containers (DLCs) | Inf1, Inf2, Trn1/Trn1n |
| Release Announcements | Inf1, Inf2, Trn1/Trn1n |

For detailed release artifacts, see Release Artifacts.

Previous Releases
