Neuron Documentation Release Notes
This document is relevant for: Inf1, Inf2, Trn1, Trn1n
Neuron 2.18.0
Date: 04/01/2024
Updated PyTorch NeuronX developer guide with Snapshotting support. See Snapshotting With Torch-Neuronx 2.1.
Updated API Reference Guide (neuronx-distributed) and Developer guide for Pipeline Parallelism (neuronx-distributed) with support for auto_partition API.
Updated API Reference Guide (neuronx-distributed) with enhanced checkpointing support with load API and async_save API.
Updated documentation for PyTorch Lightning to train models using pipeline parallelism. See API guide and Developer Guide.
Updated NeuronX Distributed developer guide with support for Autobucketing.
Added PyTorch NeuronX developer guide for Autobucketing.
Updated API Reference Guide (neuronx-distributed) and Training Llama-2-13B/70B with Tensor Parallelism and Pipeline Parallelism (neuronx-distributed) with support for asynchronous checkpointing.
Updated Transformers NeuronX Developer guide with support for streamer and stopping criteria APIs. See Developer Guide.
Updated Transformers NeuronX Developer guide with instructions for Repeating N-Gram Filtering. See Developer Guide.
Updated Transformers NeuronX developer guide with Top-K on-device sampling support [Beta]. See Developer Guide.
Updated Transformers NeuronX developer guide with Checkpointing support and automatic model selection. See Developer Guide.
Updated Transformers NeuronX Developer guide with support for speculative sampling [Beta]. See Developer Guide.
Added sample for training CodeGen2.5 7B with Tensor Parallelism and ZeRO-1 Optimizer with neuronx-distributed. See Training CodeGen2.5 7B with Tensor Parallelism and ZeRO-1 Optimizer (neuronx-distributed).
Added Tutorial for codellama/CodeLlama-13b-hf model inference with 16K seq length using Transformers Neuronx. See sample.
Added Mixtral-8x7B Inference Sample/Notebook using TNx. See sample.
Added Mistral-7B-Instruct-v0.2 inference sample using TNx. See sample.
Added announcement for Maintenance mode of TensorFlow 1.x. See Tensorflow-Neuron 1.x enters maintenance mode.
Updated PyTorch 2.1 documentation to reflect stable (out of beta) support. See Introducing PyTorch 2.1 Support.
Updated PyTorch NeuronX environment variables to reflect stable (out of beta) support. See PyTorch NeuronX Environment Variables.
Updated Release Artifacts with supported HuggingFace Transformers versions.
Added user guide instructions for Neuron DLAMI. See Neuron DLAMI User Guide.
Updated PyTorch Neuron for Trainium Hugging Face BERT MRPC task finetuning using Hugging Face Trainer API tutorial with latest Hugging Face Trainer API.
Updated Neuron Runtime API guide with support for nr_tensor_allocate. See Developer’s Guide - NeuronX Runtime.
Updated Neuron Sysfs User Guide with support for serial_number unique identifier.
Updated Custom Operators API Reference Guide [Beta] limitations and fixed nested sublists. See Neuron Custom C++ Operators Developer Guide [Beta].
Fixed issue in ZeRO-1 Tutorial.
Fixed potential hang during synchronization step in nccom-test. See NCCOM-TEST User Guide.
Updated troubleshooting guide with additional hardware error messaging. See Neuron Runtime Troubleshooting on Inf1, Inf2 and Trn1.
Updated DLC documentation. See Customize Neuron DLC and Deploy Neuron Container on EC2.
Neuron 2.16.0
Date: 12/21/2023
Added setup guide instructions for AL2023 OS. See Setup Guide
Added announcement for name change of Neuron Components. See Announcing Name Change for Neuron Components
Added announcement for End of Support for PyTorch 1.10. See Announcing End of Support for PyTorch Neuron version 1.10
Added announcement for End of Support for PyTorch 2.0 Beta. See Announcing End of Support for PyTorch NeuronX version 2.0 (beta)
Added announcement for moving NeuronX Distributed sample model implementations. See Announcing deprecation for NeuronX Distributed Training Samples in Neuron Samples Repository
Updated Transformers NeuronX developer guide with support for Grouped Query Attention (GQA). See developer guide
Added sample for Llama-2-70b model inference. See tutorial
Added documentation for PyTorch Lightning to train models using tensor parallelism and data parallelism. See api guide, developer guide and tutorial
Added documentation for Model and Optimizer Wrapper training API that handles the parallelization. See api guide and Developer guide for model and optimizer wrapper (neuronx-distributed)
Added documentation for new save_checkpoint and load_checkpoint APIs to save/load checkpoints during distributed training. See Developer guide for save/load checkpoint (neuronx-distributed)
Added documentation for a new Query-Key-Value (QKV) module in NeuronX Distributed for Training. See api guide and tutorial
Added new developer guide for Inference using NeuronX Distributed. See developer guide
Added Llama-2-7B model inference script ([html] [notebook])
Added App note on Support for PyTorch 2.1 (Beta). See Introducing PyTorch 2.1 Support
Added developer guide for replace_weights API to replace the separated weights. See PyTorch Neuron (torch-neuronx) Weight Replacement API for Inference
Added [Beta] script for training stabilityai/stable-diffusion-2-1-base and runwayml/stable-diffusion-v1-5 models. See script
Added [Beta] script for training facebook/bart-large model. See script
Added [Beta] script for stabilityai/stable-diffusion-2-inpainting model inference. See script
Added documentation for new Neuron Distributed Event Tracing (NDET) tool to help visualize execution trace logs and diagnose errors in multi-node workloads. See Neuron Distributed Event Tracing (NDET) User Guide
Updated Neuron Profile User guide with support for multi-worker jobs. See Neuron Profile User Guide
Minor updates to Custom Ops API reference guide. See Custom Operators API Reference Guide [Beta]
Neuron 2.15.0
Date: 10/26/2023
New Introducing PyTorch 2.0 Support (End of Support) application note with torch-neuronx
New llama2_70b_tp_pp_tutorial and (sample script) using neuronx-distributed
New Model Samples and Tutorials documentation for a consolidated list of code samples and tutorials published by AWS Neuron.
New Neuron Software Classification documentation for alpha, beta, and stable Neuron SDK definitions and updated documentation references.
New Pipeline Parallelism Overview and Developer guide for Pipeline Parallelism (neuronx-distributed) documentation in neuronx-distributed
Updated Neuron Distributed API Guide regarding pipeline-parallelism support and checkpointing
New Activation Memory Reduction application note and Developer guide for Activation Memory reduction (neuronx-distributed) in neuronx-distributed
New Weight Sharing (Deduplication) notebook script
Added Finetuning script for google/electra-small-discriminator with torch-neuronx
Added ResNet50 training (Beta) tutorial and scripts with torch-neuronx
Added Vision Perceiver training sample with torch-neuronx
Added flan-t5-xl model inference tutorial using neuronx-distributed
Added HuggingFace Stable Diffusion 4X Upscaler model Inference on Trn1 / Inf2 sample script with torch-neuronx
Updated GPT-NeoX 6.9B and 20B model scripts to include selective checkpointing.
Added serialization support and removed -O1 flag constraint to Llama-2-13B model inference script tutorial with transformers-neuronx
Updated BERT script and Llama-2-7B script with PyTorch 2.0 support
Added option-argument llm-training to the existing --distribution_strategy compiler option to make specific optimizations related to training distributed models in Neuron Compiler CLI Reference Guide (neuronx-cc)
Updated Neuron Sysfs User Guide to include mem_ecc_uncorrected and sram_ecc_uncorrected hardware statistics.
Updated PyTorch NeuronX Tracing API for Inference to include io alias documentation
Updated Transformers NeuronX (transformers-neuronx) Developer Guide with serialization support.
Upgraded numpy version to 1.22.2 for various scripts
Updated LanguagePerceiver fine-tuning script to stable
Announcing End of Support for OPT example in transformers-neuronx
Announcing End of Support for “nemo” option-argument
Known Issues and Limitations
The following tutorials are currently not working. These tutorials will be updated once there is a fix.
Neuron 2.14.0
Date: 09/15/2023
Neuron Calculator now supports multiple model configurations for Tensor Parallel Degree computation. See Neuron Calculator
Announcement to deprecate --model-type=transformer-inference flag. See Announcing deprecation for --model-type=transformer-inference compiler flag
Updated HF ViT benchmarking script to use --model-type=transformer flag. See [script]
Updated torch_neuronx.analyze API documentation. See PyTorch NeuronX Analyze API for Inference
Updated Performance benchmarking numbers for models on Inf1, Inf2 and Trn1 instances with 2.14 release bits. See _benchmark
New tutorial for Training Llama2 7B with Tensor Parallelism and ZeRO-1 Optimizer using neuronx-distributed. See Training Llama2 7B with Tensor Parallelism and ZeRO-1 Optimizer (neuronx-distributed)
New tutorial for T5-3B model inference using neuronx-distributed (tutorial)
Updated Neuron Persistent Cache documentation to clarify the flags parsed by the neuron_cc_wrapper tool, which is a wrapper over Neuron Compiler CLI. See Neuron Persistent Cache
Added tokenizers_parallelism=true in various notebook scripts to suppress tokenizer warnings, making errors easier to detect
Updated Neuron device plugin and scheduler YAMLs to point to latest images. See yaml configs
Added notebook script to fine-tune deepmind/language-perceiver model using torch-neuronx. See sample script
Added notebook script to fine-tune clip-large model using torch-neuronx. See sample script
Added SD XL Base+Refiner inference sample script using torch-neuronx. See sample script
Upgraded default diffusers library from 0.14.0 to latest 0.20.2 in Stable Diffusion 1.5 and Stable Diffusion 2.1 inference scripts. See sample scripts
Added Llama-2-13B model training script using neuronx-nemo-megatron (tutorial)
Neuron 2.13.0
Date: 08/28/2023
Added tutorials for GPT-NEOX 6.9B and 20B models training using neuronx-distributed. See more at Tutorials for NeuronX Distributed (neuronx-distributed)
Added TensorFlow 2.x (tensorflow-neuronx) analyze_model API section. See more at TensorFlow 2.x (tensorflow-neuron) analyze_model API
Updated setup instructions to fix path of existing virtual environments in DLAMIs. See more at setup guide
Updated setup instructions to fix pinned versions in upgrade instructions of setup guide. See more at setup guide
Updated tensorflow-neuron HF distilbert tutorial to improve performance by removing HF pipeline. See more at [html] [notebook]
Updated training troubleshooting guide in torch-neuronx to describe network Connectivity Issue on trn1/trn1n 32xlarge with Ubuntu. See more at PyTorch Neuron (torch-neuronx) for Training Troubleshooting Guide
Added “Unsupported Hardware Operator Code” section to Neuron Runtime Troubleshooting page. See more at Neuron Runtime Troubleshooting on Inf1, Inf2 and Trn1
Removed ‘beta’ tag from neuronx-distributed section for training. neuronx-distributed Training is now considered stable and neuronx-distributed inference is considered as beta.
Added FLOP count (flop_count) and connected Neuron Device ids (connected_devices) to sysfs userguide. See Neuron Sysfs User Guide
Added tutorial for T5 model inference. See more at [notebook]
Updated neuronx-distributed api guide and inference tutorial. See more at API Reference Guide (neuronx-distributed) and Inference with Tensor Parallelism (neuronx-distributed) [Beta]
Announcing End of support for AWS Neuron reference for Megatron-LM starting Neuron 2.13. See more at AWS Neuron reference for Megatron-LM no longer supported
Announcing end of support for torch-neuron version 1.9 starting Neuron 2.14. See more at Announcing end of support for torch-neuron version 1.9
Upgraded numpy version to 1.21.6 in various training scripts for Text Classification
Added license for Nemo Megatron to SDK Maintenance Policy. See more at SDK Maintenance Policy
Updated bert-japanese training Script to use multilingual-sentiments dataset. See hf-bert-jp (https://github.com/aws-neuron/aws-neuron-samples/tree/master/torch-neuronx/training/hf_bert_jp)
Added sample script for LLaMA V2 13B model inference using transformers-neuronx. See neuron samples repo
Added samples for training GPT-NEOX 20B and 6.9B models using neuronx-distributed. See neuron samples repo
Added sample scripts for CLIP and Stable Diffusion XL inference using torch-neuronx. See neuron samples repo
Added sample scripts for vision and language Perceiver models inference using torch-neuronx. See neuron samples repo
Added camembert training/finetuning example for Trn1 under hf_text_classification in torch-neuronx. See neuron samples repo
Updated Fine-tuning Hugging Face BERT Japanese model sample in torch-neuronx. See neuron samples repo
See more neuron samples changes in neuron samples release notes
Added samples for pre-training GPT-3 23B, 46B and 175B models using neuronx-nemo-megatron library. See aws-neuron-parallelcluster-samples
Announced End of Support for GPT-3 training using aws-neuron-reference-for-megatron-lm library. See aws-neuron-parallelcluster-samples
Updated bert-fine-tuning SageMaker sample by replacing amazon_reviews_multi dataset with amazon_polarity dataset. See aws-neuron-sagemaker-samples
Neuron 2.12.0
Date: 07/19/2023
Added best practices user guide for benchmarking performance of Neuron Devices. See Benchmarking Guide and Helper scripts
Announcing end of support for Ubuntu 18. See more at Announcing end of support for Ubuntu 18
Improved sidebar navigation in Documentation.
Removed support for Distributed Data Parallel (DDP) Tutorial.
Neuron 2.11.0
Date: 06/14/2023
New Neuron Calculator Documentation section to help determine number of Neuron Cores needed for LLM Inference.
Added App Note Generative LLM inference with Neuron
New ML Libraries Documentation section to have NeuronX Distributed and Transformers NeuronX (transformers-neuronx)
Improved Installation and Setup Guides for the different platforms supported. See more at Setup Guide
Added Tutorial How to prepare trn1.32xlarge for multi-node execution