This document is relevant for: Inf1, Inf2, Trn1, Trn1n

Neuron Documentation Release Notes

Table of contents

Neuron 2.18.0
Neuron 2.16.0
Neuron 2.15.0
Neuron 2.14.0
Neuron 2.13.0
Neuron 2.12.0
Neuron 2.11.0

Neuron 2.18.0

Date: 04/01/2024

Updated PyTorch NeuronX developer guide with Snapshotting support. See Snapshotting With Torch-Neuronx 2.1.
Updated API Reference Guide (neuronx-distributed) and Developer guide for Pipeline Parallelism (neuronx-distributed) with support for the auto_partition API.
Updated API Reference Guide (neuronx-distributed) with enhanced checkpointing support via the load API and async_save API.
Updated documentation for PyTorch Lightning to train models using pipeline parallelism. See API guide and Developer Guide.
Updated NeuronX Distributed developer guide with support for Autobucketing.
Added PyTorch NeuronX developer guide for Autobucketing.
Updated API Reference Guide (neuronx-distributed) and Training Llama-2-13B/70B with Tensor Parallelism and Pipeline Parallelism (neuronx-distributed) with support for asynchronous checkpointing.
Updated Transformers NeuronX Developer guide with support for streamer and stopping criteria APIs. See Developer Guide.
Updated Transformers NeuronX Developer guide with instructions for Repeating N-Gram Filtering. See Developer Guide.
Updated Transformers NeuronX developer guide with Top-K on-device sampling support [Beta]. See Developer Guide.
Updated Transformers NeuronX developer guide with Checkpointing support and automatic model selection. See Developer Guide.
Updated Transformers NeuronX Developer guide with support for speculative sampling [Beta]. See Developer Guide.
Added sample for training CodeGen2.5 7B with Tensor Parallelism and ZeRO-1 Optimizer with neuronx-distributed. See Training CodeGen2.5 7B with Tensor Parallelism and ZeRO-1 Optimizer (neuronx-distributed).
Added tutorial for codellama/CodeLlama-13b-hf model inference with 16K sequence length using Transformers NeuronX. See sample.
Added Mixtral-8x7B inference sample/notebook using TNx. See sample.
Added Mistral-7B-Instruct-v0.2 inference sample using TNx. See sample.
Added announcement for Maintenance mode of TensorFlow 1.x. See Tensorflow-Neuron 1.x enters maintenance mode.
Updated PyTorch 2.1 documentation to reflect stable (out of beta) support. See Introducing PyTorch 2.1 Support.
Updated PyTorch NeuronX environment variables to reflect stable (out of beta) support. See PyTorch NeuronX Environment Variables.
Updated Release Artifacts with supported HuggingFace Transformers versions.
Added user guide instructions for Neuron DLAMI. See Neuron DLAMI User Guide.
Updated the PyTorch Neuron for Trainium Hugging Face BERT MRPC task finetuning using Hugging Face Trainer API tutorial to use the latest Hugging Face Trainer API.
Updated Neuron Runtime API guide with support for nr_tensor_allocate. See Developer’s Guide - NeuronX Runtime.
Updated Neuron Sysfs User Guide with support for serial_number unique identifier.
Updated Custom Operators API Reference Guide [Beta] limitations and fixed nested sublists. See Neuron Custom C++ Operators Developer Guide [Beta].
Fixed issue in ZeRO-1 Tutorial.
Fixed potential hang during synchronization step in nccom-test. See NCCOM-TEST User Guide.
Updated troubleshooting guide with additional hardware error messaging. See Neuron Runtime Troubleshooting on Inf1, Inf2 and Trn1.
Updated DLC documentation. See Customize Neuron DLC and Deploy Neuron Container on EC2.

Neuron 2.16.0

Date: 12/21/2023

Added setup guide instructions for AL2023 OS. See Setup Guide
Added announcement for name change of Neuron Components. See Announcing Name Change for Neuron Components
Added announcement for End of Support for PyTorch 1.10. See Announcing End of Support for PyTorch Neuron version 1.10
Added announcement for End of Support for PyTorch 2.0 Beta. See Announcing End of Support for PyTorch NeuronX version 2.0 (beta)
Added announcement for moving NeuronX Distributed sample model implementations. See Announcing deprecation for NeuronX Distributed Training Samples in Neuron Samples Repository
Updated Transformers NeuronX developer guide with support for Grouped Query Attention (GQA). See developer guide
Added sample for Llama-2-70b model inference. See tutorial
Added documentation for PyTorch Lightning to train models using tensor parallelism and data parallelism. See api guide, developer guide and tutorial
Added documentation for the Model and Optimizer Wrapper training API that handles the parallelization. See api guide and Developer guide for model and optimizer wrapper (neuronx-distributed)
Added documentation for the new save_checkpoint and load_checkpoint APIs to save/load checkpoints during distributed training. See Developer guide for save/load checkpoint (neuronx-distributed)
Added documentation for a new Query-Key-Value (QKV) module in NeuronX Distributed for Training. See api guide and tutorial
Added new developer guide for Inference using NeuronX Distributed. See developer guide
Added Llama-2-7B model inference script ([html] [notebook])
Added App note on Support for PyTorch 2.1 (Beta). See Introducing PyTorch 2.1 Support
Added developer guide for replace_weights API to replace the separated weights. See PyTorch Neuron (torch-neuronx) Weight Replacement API for Inference
Added [Beta] script for training stabilityai/stable-diffusion-2-1-base and runwayml/stable-diffusion-v1-5 models. See script
Added [Beta] script for training facebook/bart-large model. See script
Added [Beta] script for stabilityai/stable-diffusion-2-inpainting model inference. See script
Added documentation for new Neuron Distributed Event Tracing (NDET) tool to help visualize execution trace logs and diagnose errors in multi-node workloads. See Neuron Distributed Event Tracing (NDET) User Guide
Updated Neuron Profile User guide with support for multi-worker jobs. See Neuron Profile User Guide
Minor updates to Custom Ops API reference guide. See Custom Operators API Reference Guide [Beta]

Neuron 2.15.0

Date: 10/26/2023

New Introducing PyTorch 2.0 Support (End of Support) application note with torch-neuronx
New llama2_70b_tp_pp_tutorial and (sample script) using neuronx-distributed
New Model Samples and Tutorials documentation for a consolidated list of code samples and tutorials published by AWS Neuron.
New Neuron Software Classification documentation for alpha, beta, and stable Neuron SDK definitions and updated documentation references.
New Pipeline Parallelism Overview and Developer guide for Pipeline Parallelism (neuronx-distributed) documentation in neuronx-distributed
Updated Neuron Distributed API Guide regarding pipeline-parallelism support and checkpointing
New Activation Memory Reduction application note and Developer guide for Activation Memory reduction (neuronx-distributed) in neuronx-distributed
New Weight Sharing (Deduplication) notebook script
Added Finetuning script for google/electra-small-discriminator with torch-neuronx
Added ResNet50 training (Beta) tutorial and scripts with torch-neuronx
Added Vision Perceiver training sample with torch-neuronx
Added flan-t5-xl model inference tutorial using neuronx-distributed
Added HuggingFace Stable Diffusion 4X Upscaler model Inference on Trn1 / Inf2 sample script with torch-neuronx
Updated GPT-NeoX 6.9B and 20B model scripts to include selective checkpointing.
Added serialization support to, and removed the -O1 flag constraint from, the Llama-2-13B model inference script tutorial with transformers-neuronx
Updated BERT script and Llama-2-7B script with PyTorch 2.0 support
Added option-argument llm-training to the existing --distribution_strategy compiler option to enable optimizations specific to training distributed models. See Neuron Compiler CLI Reference Guide (neuronx-cc)
Updated Neuron Sysfs User Guide to include mem_ecc_uncorrected and sram_ecc_uncorrected hardware statistics.
Updated PyTorch NeuronX Tracing API for Inference to include io alias documentation
Updated Transformers NeuronX (transformers-neuronx) Developer Guide with serialization support.
Upgraded numpy version to 1.22.2 for various scripts
Updated LanguagePerceiver fine-tuning script to stable
Announcing End of Support for OPT example in transformers-neuronx
Announcing End of Support for “nemo” option-argument

Known Issues and Limitations

The following tutorials are currently not working. They will be updated once a fix is available.

Neuron 2.14.0

Date: 09/15/2023

Neuron Calculator now supports multiple model configurations for Tensor Parallel Degree computation. See Neuron Calculator
Announcement to deprecate --model-type=transformer-inference flag. See Announcing deprecation for --model-type=transformer-inference compiler flag
Updated HF ViT benchmarking script to use --model-type=transformer flag. See [script]
Updated torch_neuronx.analyze API documentation. See PyTorch NeuronX Analyze API for Inference
Updated performance benchmarking numbers for models on Inf1, Inf2, and Trn1 instances with 2.14 release bits. See _benchmark
New tutorial for training Llama2 7B with Tensor Parallelism and ZeRO-1 Optimizer using neuronx-distributed. See Training Llama2 7B with Tensor Parallelism and ZeRO-1 Optimizer (neuronx-distributed)
New tutorial for T5-3B model inference using neuronx-distributed (tutorial)
Updated Neuron Persistent Cache documentation to clarify which flags are parsed by the neuron_cc_wrapper tool, a wrapper over the Neuron Compiler CLI. See Neuron Persistent Cache
Added tokenizers_parallelism=true in various notebook scripts to suppress tokenizer warnings, making errors easier to detect
Updated Neuron device plugin and scheduler YAMLs to point to latest images. See yaml configs
Added notebook script to fine-tune deepmind/language-perceiver model using torch-neuronx. See sample script
Added notebook script to fine-tune clip-large model using torch-neuronx. See sample script
Added SD XL Base+Refiner inference sample script using torch-neuronx. See sample script
Upgraded default diffusers library from 0.14.0 to latest 0.20.2 in Stable Diffusion 1.5 and Stable Diffusion 2.1 inference scripts. See sample scripts
Added Llama-2-13B model training script using neuronx-nemo-megatron (tutorial)

Neuron 2.13.0

Date: 08/28/2023

Added tutorials for GPT-NEOX 6.9B and 20B models training using neuronx-distributed. See more at Tutorials for NeuronX Distributed (neuronx-distributed)
Added TensorFlow 2.x (tensorflow-neuronx) analyze_model API section. See more at TensorFlow 2.x (tensorflow-neuron) analyze_model API
Updated setup instructions to fix path of existing virtual environments in DLAMIs. See more at setup guide
Updated setup instructions to fix pinned versions in upgrade instructions of setup guide. See more at setup guide
Updated tensorflow-neuron HF distilbert tutorial to improve performance by removing HF pipeline. See more at [html] [notebook]
Updated training troubleshooting guide in torch-neuronx to describe a network connectivity issue on trn1/trn1n 32xlarge with Ubuntu. See more at PyTorch Neuron (torch-neuronx) for Training Troubleshooting Guide
Added “Unsupported Hardware Operator Code” section to Neuron Runtime Troubleshooting page. See more at Neuron Runtime Troubleshooting on Inf1, Inf2 and Trn1
Removed ‘beta’ tag from neuronx-distributed section for training. neuronx-distributed training is now considered stable and neuronx-distributed inference is considered beta.
Added FLOP count (flop_count) and connected Neuron Device ids (connected_devices) to sysfs user guide. See Neuron Sysfs User Guide
Added tutorial for T5 model inference. See more at [notebook]
Updated neuronx-distributed api guide and inference tutorial. See more at API Reference Guide (neuronx-distributed) and Inference with Tensor Parallelism (neuronx-distributed) [Beta]
Announcing end of support for AWS Neuron reference for Megatron-LM starting Neuron 2.13. See more at AWS Neuron reference for Megatron-LM no longer supported
Announcing end of support for torch-neuron version 1.9 starting Neuron 2.14. See more at Announcing end of support for torch-neuron version 1.9
Upgraded numpy version to 1.21.6 in various training scripts for Text Classification
Added license for Nemo Megatron to SDK Maintenance Policy. See more at SDK Maintenance Policy
Updated bert-japanese training script to use multilingual-sentiments dataset. See hf-bert-jp (https://github.com/aws-neuron/aws-neuron-samples/tree/master/torch-neuronx/training/hf_bert_jp)
Added sample script for LLaMA V2 13B model inference using transformers-neuronx. See neuron samples repo
Added samples for training GPT-NEOX 20B and 6.9B models using neuronx-distributed. See neuron samples repo
Added sample scripts for CLIP and Stable Diffusion XL inference using torch-neuronx. See neuron samples repo
Added sample scripts for vision and language Perceiver models inference using torch-neuronx. See neuron samples repo
Added camembert training/finetuning example for Trn1 under hf_text_classification in torch-neuronx. See neuron samples repo
Updated Fine-tuning Hugging Face BERT Japanese model sample in torch-neuronx. See neuron samples repo
See more neuron samples changes in neuron samples release notes
Added samples for pre-training GPT-3 23B, 46B and 175B models using neuronx-nemo-megatron library. See aws-neuron-parallelcluster-samples
Announced End of Support for GPT-3 training using aws-neuron-reference-for-megatron-lm library. See aws-neuron-parallelcluster-samples
Updated bert-fine-tuning SageMaker sample by replacing amazon_reviews_multi dataset with amazon_polarity dataset. See aws-neuron-sagemaker-samples

Neuron 2.12.0

Date: 07/19/2023

Added best practices user guide for benchmarking performance of Neuron Devices. See Benchmarking Guide and Helper scripts
Announcing end of support for Ubuntu 18. See more at Announcing end of support for Ubuntu 18
Improved sidebar navigation in Documentation.
Removed support for Distributed Data Parallel (DDP) Tutorial.

Neuron 2.11.0

Date: 06/14/2023

New Neuron Calculator Documentation section to help determine number of Neuron Cores needed for LLM Inference.
Added App Note Generative LLM inference with Neuron
New ML Libraries Documentation section to have NeuronX Distributed and Transformers NeuronX (transformers-neuronx)
Improved Installation and Setup Guides for the different platforms supported. See more at Setup Guide
Added Tutorial How to prepare trn1.32xlarge for multi-node execution
