
PyTorch Neuron release notes

This document lists the release notes for the PyTorch Neuron (torch-neuron) package.

Known Issues and Limitations - Updated 12/30/2021

The index outputs of the aten::argmin, aten::argmax, aten::min, and aten::max operator implementations are sensitive to precision. For models that contain these operators and have float32 inputs, we recommend using the --fp32-cast=matmult --fast-math no-fast-relayout compiler option to avoid numerical imprecision issues. Additionally, the aten::min and aten::max operator implementations do not currently support int64 inputs when dim=0. For more information on precision and performance-accuracy tuning, see Mixed precision and performance-accuracy tuning.
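The precision sensitivity of index outputs can be illustrated without any Neuron hardware. The following is a minimal sketch in plain Python, assuming the reduced precision behaves like a bfloat16-style truncation of float32 values. The helpers `to_bf16` and `argmax` are hypothetical names used only for this illustration and are not part of torch-neuron.

```python
import struct


def to_bf16(x: float) -> float:
    """Emulate a bfloat16 cast by keeping only the top 16 bits of a float32.

    Assumption: truncation (not rounding) is used; real hardware may differ.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


def argmax(values):
    """Return the first index holding the maximum value."""
    return max(range(len(values)), key=values.__getitem__)


scores = [1.0, 1.001, 1.002]           # distinct in float32
reduced = [to_bf16(v) for v in scores]  # all collapse to 1.0

print(argmax(scores))   # index of the true maximum: 2
print(argmax(reduced))  # ties after the cast, so the first index wins: 0
```

The values themselves barely change after the cast, but the returned index does, which is why these operators benefit from keeping float32 precision.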

The following are not torch-neuron limitations, but they may affect which models you can successfully trace with torch.neuron.trace:

  • If you attempt to import torch.neuron from Python 3.5, you will see the following error. Please use Python 3.6 or greater:

File "/tmp/install_test_env/lib/python3.5/site-packages/torch_neuron/__init__.py", line 29
   f'Invalid dependency version torch=={torch.__version__}. '
SyntaxError: invalid syntax
  • Torchvision has dropped support for Python 3.5

  • HuggingFace transformers has dropped support for Python 3.5

  • There is a dependency between versions of torchvision and the torch package that customers should be aware of when compiling torchvision models. These dependency rules can be managed through pip. At the time of writing, torchvision==0.6.1 matched the torch==1.5.1 release, and torchvision==0.8.2 matched the torch==1.7.1 release.
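The notes above can be turned into a small preflight check. This is a minimal sketch, not part of torch-neuron: the function names are hypothetical, and the version table contains only the two pairings quoted above. The code deliberately avoids f-strings so that it still parses on Python 3.5 and can report the problem.

```python
import sys

# Pairings quoted in the note above; other releases are intentionally omitted.
TORCHVISION_FOR_TORCH = {
    "1.5.1": "0.6.1",
    "1.7.1": "0.8.2",
}


def python_is_supported(version_info=sys.version_info):
    """Return True when the interpreter is Python 3.6 or greater."""
    return tuple(version_info[:2]) >= (3, 6)


def pip_requirements(torch_version):
    """Return matching pip requirement strings for torch and torchvision."""
    if torch_version not in TORCHVISION_FOR_TORCH:
        raise ValueError("no known torchvision pairing for torch=={}".format(torch_version))
    torchvision_version = TORCHVISION_FOR_TORCH[torch_version]
    return [
        "torch=={}".format(torch_version),
        "torchvision=={}".format(torchvision_version),
    ]


if not python_is_supported():
    print("Python 3.6 or greater is required to import torch.neuron")
print(" ".join(pip_requirements("1.7.1")))
```

The printed requirement strings can be passed to pip so that both packages are pinned together.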

PyTorch Neuron release []

Date: 04/29/2022

New in this release

Bug fixes

  • Improved aten::gelu accuracy

  • Updated aten::meshgrid to support the optional indexing argument introduced in torch 1.10 (see PyTorch issue 50276)

PyTorch Neuron release []

Date: 03/25/2022

New in this release

  • Added full support for aten::max_pool2d_with_indices. Previously, it was supported only when the indices were unused.

  • Added new torch-neuron packages compiled with -D_GLIBCXX_USE_CXX11_ABI=1. The new packages support PyTorch 1.8, PyTorch 1.9, and PyTorch 1.10. To install these additional packages, change the package repo index to https://pip.repos.neuron.amazonaws.com/cxx11/

PyTorch Neuron release []

Date: 01/20/2022

New in this release

  • Added PyTorch 1.10 support

  • Added new operator support, see PyTorch Supported operators

  • Updated aten::_convolution to support 2d group convolution

  • Updated neuron::forward operators to allocate less dynamic memory. This can increase performance on models with many input & output tensors.

  • Updated neuron::forward to better handle batch sizes when dynamic_batch_size=True. This can increase performance at inference time when the input batch size is exactly equal to the traced model batch size.
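As a conceptual illustration of dynamic batching, the sketch below splits an input batch into chunks that match the traced batch size and skips the split when the sizes are already equal. It is not the neuron::forward implementation: `split_batch` and `run_dynamic` are hypothetical names, plain lists stand in for tensors, and rejecting batch sizes that are not a multiple of the traced size is a simplifying assumption.

```python
def split_batch(inputs, traced_batch_size):
    """Split `inputs` into chunks of exactly `traced_batch_size` items."""
    if traced_batch_size <= 0:
        raise ValueError("traced_batch_size must be positive")
    if len(inputs) % traced_batch_size != 0:
        # Simplifying assumption for this sketch only.
        raise ValueError("batch size must be a multiple of the traced batch size")
    if len(inputs) == traced_batch_size:
        # Fast path: the input already matches, so no slicing or copying.
        return [inputs]
    return [
        inputs[start:start + traced_batch_size]
        for start in range(0, len(inputs), traced_batch_size)
    ]


def run_dynamic(model, inputs, traced_batch_size):
    """Run `model` on each chunk and concatenate the per-chunk outputs."""
    outputs = []
    for chunk in split_batch(inputs, traced_batch_size):
        outputs.extend(model(chunk))
    return outputs


def double(chunk):
    """Stand-in for a model traced with batch size 2."""
    return [x * 2 for x in chunk]


print(run_dynamic(double, [1, 2, 3, 4], traced_batch_size=2))
```

The fast path mirrors the case called out above: when the input batch size equals the traced batch size, there is no splitting work to do.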

Bug fixes

  • Added the ability to torch.jit.trace a torch.nn.Module where a submodule has already been traced with torch_neuron.trace on a CPU-type instance. Previously, if this had been executed on a CPU-type instance, an initialization exception would have been thrown.

  • Fixed aten::matmul behavior on 1-dimensional by n-dimensional multiplies. Previously, this would cause a validation error.

  • Fixed binary operator type promotion. Previously, in unusual situations, operators like aten::mul could produce incorrect results due to invalid casting.

  • Fixed aten::select when index was -1. Previously, this would cause a validation error.

  • Fixed aten::adaptive_avg_pool2d padding and striding behavior. Previously, this could generate incorrect results with specific configurations.

  • Fixed an issue where dictionary inputs could be incorrectly traced when the tensor values had gradients.

PyTorch Neuron release [2.0.536.0]

Date: 01/05/2022

New in this release

PyTorch Neuron release [2.0.468.0]

Date: 12/15/2021

New in this release

  • Added support for aten::cumsum operation.

  • Fixed aten::expand to correctly handle adding new dimensions.

PyTorch Neuron release [2.0.392.0]

Date: 11/05/2021

  • Updated Neuron Runtime (which is integrated within this package) to libnrt to fix an issue that prevented the use of containers when /dev/neuron0 was not present. For details, see Neuron Runtime 2.x Release Notes.

PyTorch Neuron release [2.0.318.0]

Date: 10/27/2021

New in this release

Resolved Issues

  • Fixed a performance issue when using both the dynamic_batch_size=True trace option and --neuron-core-pipeline compiler option. Dynamic batching now uses OpenMP to execute pipeline batches concurrently.

  • Fixed torch_neuron.trace issues:

    • Fixed a failure when the same submodule was traced with multiple inputs

    • Fixed a failure where some operations would fail to be called with the correct arguments

    • Fixed a failure where custom operators (torch plugins) would cause a trace failure

  • Fixed variants of aten::upsample_bilinear2d when scale_factor=1

  • Fixed variants of aten::expand using dim=-1

  • Fixed variants of aten::stack using multiple different input data types

  • Fixed variants of aten::max using indices outputs


Date: 08/12/2021


  • Minor updates.


Date: 07/02/2021



Date: 5/28/2021


  • Added support for PyTorch 1.8.1

    • Models compatibility

      • Models compiled with previous versions of Neuron PyTorch (<1.8.1) are compatible with Neuron PyTorch 1.8.1.

      • Models compiled with Neuron PyTorch 1.8.1 are not backward compatible with previous versions of Neuron PyTorch (<1.8.1).

    • Updated tutorials to use Hugging Face Transformers 4.6.0.

    • Added a new set of forward operators (forward_v2)

    • Host memory allocation when loading the same model on multiple NeuronCores is significantly reduced

    • Fixed an issue where models would not deallocate all memory within a python session after being garbage collected.

    • Fixed a TorchScript/C++ issue where loading the same model multiple times would not use multiple NeuronCores by default.

  • Fixed logging to no longer configure the root logger.

  • Removed informative messages that were emitted as warnings during compilation. The number of warnings is significantly reduced.

  • Convolution operator support has been extended to include ConvTranspose2d variants.

  • Reduced the amount of host memory used during inference.


Date: 4/30/2021



Date: 3/4/2021


  • Minor enhancements.


Date: 2/24/2021


  • Fix for CVE-2021-3177.


Date: 1/30/2021


  • Made changes to allow models with -inf scalar constants to correctly compile

  • Added new operator support. Please see PyTorch Supported operators for the complete list of operators.


Date: 12/23/2020


  • We are dropping support for Python 3.5 in this release

  • torch.neuron.trace behavior will now throw a RuntimeError in the case that no operators are compiled for neuron hardware

  • torch.neuron.trace will now display compilation progress indicators (dots) as default behavior (neuron-cc must be updated to the December release or later to see this feature)

  • Added new operator support. Please see PyTorch Supported operators for the complete list of operators.

  • Extended the BERT pretrained tutorial to demonstrate execution on multiple cores and batch modification, and updated the tutorial to accommodate changes in the Hugging Face Transformers code for version 4.0

  • Added a tutorial for torch-serve which extends the BERT tutorial

  • Added support for PyTorch 1.7


Date: 11/17/2020


  • Fixed bugs in comparison operators and added the remaining variants (eq, ne, gt, ge, lt, le)

  • Added support for prim::PythonOp. Note that this operator must run on CPU and not on Neuron. We recommend replacing this code with PyTorch operators if possible

  • Support for a series of new operators. Please see PyTorch Supported operators for the complete list of operators.

  • Performance improvements to the runtime library

  • Correction of a runtime library bug which caused models with large tensors to generate incorrect results in some cases


Date: 09/22/2020


  • Various minor improvements to the PyTorch autopartitioner feature

  • Support for the operators aten::constant_pad_nd, aten::meshgrid

  • Improved performance on various torchvision models. Of note are resnet50 and vgg16


Date: 08/08/2020


  • Various minor improvements to the PyTorch autopartitioner feature

  • Support for the aten::ones operator


Date: 08/05/2020


Various minor improvements.


Date: 07/16/2020


This release adds auto-partitioning, model analysis, and PyTorch 1.5.1 support, along with a number of new operators.

Major New Features

  • Support for PyTorch 1.5.1

  • Introduced an automated operator device placement mechanism in torch.neuron.trace that runs sub-graphs containing operators not supported by the Neuron compiler in native PyTorch. This mechanism is on by default and can be turned off by adding the argument fallback=False to the compiler arguments.

  • Model analysis to find supported and unsupported operators in a model
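The idea behind automated operator placement can be sketched in a few lines. This is a conceptual illustration only, not the torch.neuron.trace implementation: the `SUPPORTED` set, the example operator list, and the `partition` function are hypothetical, and the model is simplified to a linear sequence of operators rather than a full graph.

```python
from itertools import groupby

# Hypothetical set of compiler-supported operators, for illustration only.
SUPPORTED = {"aten::conv2d", "aten::relu", "aten::add"}


def partition(ops, supported=SUPPORTED):
    """Group consecutive operators into ("neuron" | "cpu", [ops]) segments."""
    return [
        ("neuron" if is_supported else "cpu", list(group))
        for is_supported, group in groupby(ops, key=lambda op: op in supported)
    ]


# Hypothetical operator sequence; "aten::nonzero" plays the unsupported operator.
ops = ["aten::conv2d", "aten::relu", "aten::nonzero", "aten::add"]

for device, segment in partition(ops):
    print(device, segment)
```

Each "neuron" segment corresponds to a sub-graph that would be compiled, while each "cpu" segment would run in native PyTorch. The same grouping also serves as a simple form of model analysis, since it lists which operators fall on each side.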

Resolved Issues


Date: 6/11/2020


Major New Features

Resolved Issues

Known Issues and Limitations


Date: 5/11/2020


Additional PyTorch operator support and improved support for model saving and reloading.

Major New Features

  • Added Neuron Compiler support for a number of previously unsupported PyTorch operators. Please see PyTorch Supported operators for the complete list of operators.

  • Added support for torch.neuron.trace on models which have previously been saved using torch.jit.save and then reloaded.

Resolved Issues

Known Issues and Limitations


Date: 3/26/2020


Major New Features

Resolved Issues

Known Issues and limitations


Date: 2/27/2020


Added Neuron Compiler support for a number of previously unsupported PyTorch operators. Please see PyTorch Supported operators for the complete list of operators.

Major new features

  • None

Resolved issues

  • None


Date: 1/27/2020


Major new features

Resolved issues

  • Python 3.5 and Python 3.7 are now supported.

Known issues and limitations

Other Notes


Date: 12/20/2019


This is the initial release of torch-neuron. It is not distributed on the DLAMI yet and needs to be installed from the neuron pip repository.

Note that we are currently using TensorFlow as an intermediate format to pass to our compiler. This does not affect any runtime execution from PyTorch to Neuron Runtime and Inferentia. This is why the neuron-cc installation must include [tensorflow] for PyTorch.

Major new features

Resolved issues

Known issues and limitations


The following models have successfully run on neuron-inferentia systems

  1. SqueezeNet

  2. ResNet50

  3. Wide ResNet50

PyTorch Serving

In this initial version there is no specific serving support. Inference works correctly through Python on Inf1 instances using the Neuron runtime. Future releases will include support for production deployment and serving of models.

Profiler support

Profiler support is not provided in this initial release and will be available in future releases.

Automated partitioning

Automatic partitioning of graphs into supported and non-supported operations is not currently supported. A tutorial is available to provide guidance on how to manually partition a model graph. Please see pytorch-manual-partitioning-jn-tutorial

PyTorch dependency

Currently, PyTorch support depends on a Neuron-specific version of PyTorch v1.3.1. Future revisions will add support for 1.4 and later releases.

Trace behavior

To trace a model, it must be in evaluation mode. For examples, please see ResNet50 model for Inferentia.

Six pip package is required

The Six package is required for the torch-neuron runtime, but it is not modeled in the package dependencies. This will be fixed in a future release.

Multiple NeuronCore support

If the num-neuroncores option is used, the number of cores must also be set manually in an environment variable of the calling shell for both compilation and inference.

For example, using the keyword argument compiler_args=['--num-neuroncores', '4'] in the trace call requires NEURONCORE_GROUP_SIZES=4 to be set in the environment at compile time and at runtime.
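One way to keep the two settings consistent is to derive both from a single value. The following is a minimal sketch: `neuroncore_settings` is a hypothetical helper, and the trace call is shown only as a comment because it needs torch-neuron and a model (`model` and `example_inputs` are placeholders).

```python
import os


def neuroncore_settings(num_cores):
    """Return matching (compiler_args, env) for the given NeuronCore count."""
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    compiler_args = ["--num-neuroncores", str(num_cores)]
    env = {"NEURONCORE_GROUP_SIZES": str(num_cores)}
    return compiler_args, env


compiler_args, env = neuroncore_settings(4)

# Set the variable before compiling and before running inference.
os.environ.update(env)

# Placeholder call, requires torch-neuron:
# model_neuron = torch.neuron.trace(model, example_inputs, compiler_args=compiler_args)
print(compiler_args, os.environ["NEURONCORE_GROUP_SIZES"])
```

Setting the variable inside the Python process only affects that process, so a separate inference process needs the same value exported in its own environment.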

CPU execution

At compilation time, a constant output is generated for the purposes of tracing. Running inference on a non-Neuron instance will therefore generate incorrect results, and such output must not be used. The following message is written to stderr:

Warning: Tensor output are ** NOT CALCULATED ** during CPU execution and only
indicate tensor shape

Other notes

  • Python version(s) supported:

    • 3.6

  • Linux distribution supported:

    • DLAMI Ubuntu 18 and Amazon Linux 2 (using Python 3.6 Conda environments)

    • Other AMIs based on Ubuntu 18

    • For Amazon Linux 2 please install Conda and use Python 3.6 Conda environment