Neuron no longer supports NeMo Megatron starting this release

Starting with Neuron release 2.23, Neuron no longer supports NeMo Megatron.

Users of the AWS Neuron Reference for NeMo Megatron should migrate their training workloads to NxD Training. For guidance, see the Neuron NeMo Megatron to NeuronX Distributed Training Migration Guide.

By AWS

© Copyright 2025, Amazon.com.