What’s New¶
Neuron 1.16.2 (12/15/2021)¶
Neuron 1.16.2 is a patch release. This release includes performance enhancements and minor bug fixes in the Neuron Compiler and PyTorch Neuron.
Neuron 1.16.1 (11/05/2021)¶
Neuron 1.16.1 is a patch release. This release fixes a bug in Neuron Runtime that prevented users from launching a container that doesn’t use all of the Neuron Devices in the instance. If you are using Neuron within a container, please update to this new release by updating to the latest Neuron ML framework package, Neuron Tools, and/or TensorFlow Neuron Model Server.
To update to the latest PyTorch 1.9.1:
pip install --upgrade torch-neuron neuron-cc[tensorflow] torchvision
To update to the latest TensorFlow 2.5.1:
pip install --upgrade tensorflow-neuron[cc]
To update to the latest TensorFlow 1.15.5:
pip install --upgrade tensorflow-neuron==1.15.5.* neuron-cc
To update to the latest MXNet 1.8.0:
pip install --upgrade mx_neuron neuron-cc
For more details on how to update the framework packages, please check out our QuickStart guides.
Neuron 1.16.0 (10/27/2021)¶
Neuron 1.16.0 is a release that requires your attention. You must update to the latest Neuron Driver (aws-neuron-dkms version 2.1 or newer) for successful installation or upgrade.
This release introduces Neuron Runtime 2.x, upgrades PyTorch Neuron to PyTorch 1.9.1, adds support for new APIs (torch.neuron.DataParallel() and torch_neuron.is_available()), adds new features and capabilities (the compiler --fast-math option for better fine-tuning of accuracy/performance, and the MXNet FlexEG feature), improves tools, adds support for additional operators, improves performance (up to 20% additional throughput and up to 25% lower latency), and reduces model loading times. It also simplifies the Neuron installation steps and improves the user experience of container creation and deployment. In addition, it includes bug fixes, new application notes, updated tutorials, and announcements of software deprecation and maintenance.
Neuron Runtime 2.x
Introducing Neuron Runtime 2.x (libnrt.so) - In this release we are introducing Neuron Runtime 2.x. The new runtime is a shared library (libnrt.so) that replaces Neuron Runtime 1.x, which ran as a server daemon (neuron-rtd). Upgrading to libnrt.so is expected to improve throughput and latency, simplify the Neuron installation and upgrade process, introduce new capabilities for allocating NeuronCores to applications, streamline container creation, and deprecate tools that are no longer needed. The new library-based runtime (libnrt.so) is directly integrated into Neuron’s ML frameworks (with the exception of MXNet 1.5) and Neuron Tools packages. As a result, users no longer need to install or deploy the aws-neuron-runtime package.
Important
You must update to the latest Neuron Driver (aws-neuron-dkms version 2.1 or newer) for proper functionality of the new runtime library.
Read the Introducing Neuron Runtime 2.x (libnrt.so) application note, which describes why we are making this change and how it affects the Neuron SDK in detail.
Read Migrate your application to Neuron Runtime 2.x (libnrt.so) for detailed information on how to migrate your application.
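One of the new NeuronCore allocation capabilities is controlled through the NEURON_RT_VISIBLE_CORES environment variable. As a minimal sketch (the core range below is illustrative), an application pins itself to specific NeuronCores by setting the variable before the Neuron-enabled framework loads libnrt.so:

```python
import os

# Pin this process to NeuronCores 0-3 (illustrative range). This must be set
# before importing a Neuron-enabled framework, because libnrt.so reads the
# variable when the library is loaded.
os.environ["NEURON_RT_VISIBLE_CORES"] = "0-3"

# Only after the variable is set should the framework be imported, e.g.:
# import torch, torch.neuron
```

Because each process sees only the cores listed in its own variable, several independent applications can share one Inferentia instance without the NEURONCORE_GROUP_SIZES coordination that the daemon-based runtime required.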
Performance
Updated performance numbers - performance improved by up to 20% additional throughput and up to 25% lower latency.
Documentation resources
Improved Neuron Setup Guide.
New Introducing Neuron Runtime 2.x (libnrt.so) application note.
New Running inference on variable input shapes with bucketing application note.
New Mixed precision and performance-accuracy tuning application note.
New Data Parallel Inference on Torch Neuron application note.
New Flexible Execution Group (FlexEG) in Neuron-MXNet application note.
New Parallel Execution using NEURONCORE_GROUP_SIZES application note.
New Using NEURON_RT_VISIBLE_CORES with TensorFlow Serving tutorial.
Updated ResNet50 model for Inferentia tutorial to use torch.neuron.DataParallel().
PyTorch
PyTorch now supports Neuron Runtime 2.x only. Please visit Introducing Neuron Runtime 2.x (libnrt.so) for more information.
Introducing PyTorch 1.9.1 support.
Introducing new APIs: torch.neuron.DataParallel() (see the Data Parallel Inference on Torch Neuron application note for more details) and torch_neuron.is_available().
Introducing support for new operators.
For more information visit PyTorch Neuron.
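A hedged sketch of the two new APIs together (the model path and batch shape are illustrative; actually running inference requires the torch-neuron package on an Inf1 instance):

```python
try:
    import torch
    import torch.neuron   # registers the torch.neuron.* APIs
    import torch_neuron

    # True only when NeuronCores are present and usable from this process
    neuron_ready = torch_neuron.is_available()
    if neuron_ready:
        # Load a previously compiled (torch.neuron.trace'd) TorchScript model;
        # the file name is illustrative.
        model = torch.jit.load("resnet50_neuron.pt")
        # DataParallel replicates the model across the visible NeuronCores
        # and splits each input batch between the replicas.
        parallel_model = torch.neuron.DataParallel(model)
        output = parallel_model(torch.rand(8, 3, 224, 224))
except ImportError:
    neuron_ready = False  # torch (or torch-neuron) is not installed here
```

Checking torch_neuron.is_available() first lets the same script fall back to CPU execution on hosts without Inferentia hardware.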
TensorFlow 2.x
TensorFlow 2.x now supports Neuron Runtime 2.x only. Please visit Introducing Neuron Runtime 2.x (libnrt.so) for more information.
Updated TensorFlow 2.3.x from TensorFlow 2.3.3 to TensorFlow 2.3.4.
Updated TensorFlow 2.4.x from TensorFlow 2.4.2 to TensorFlow 2.4.3.
Updated TensorFlow 2.5.x from TensorFlow 2.5.0 to TensorFlow 2.5.1.
Introducing support for new operators.
For more information visit TensorFlow Neuron.
TensorFlow 1.x
TensorFlow 1.x now supports Neuron Runtime 2.x only. Please visit Introducing Neuron Runtime 2.x (libnrt.so) for more information.
Introducing support for new operators.
For more information visit TensorFlow Neuron.
MXNet 1.8
MXNet 1.8 now supports Neuron Runtime 2.x only. Please visit Introducing Neuron Runtime 2.x (libnrt.so) for more information.
Introducing Flexible Execution Groups (FlexEG) feature.
MXNet 1.5 enters maintenance mode. Please visit Neuron support for Apache MXNet 1.5 enters maintenance mode for more information.
For more information visit Neuron Apache MXNet (Incubating)
Neuron Compiler
Introducing the --fast-math option for better fine-tuning of accuracy/performance. See Mixed precision and performance-accuracy tuning.
Support added for the new ArgMax and ArgMin operators. See Neuron Compiler Release Notes.
For more information visit Neuron Compiler.
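From PyTorch Neuron, compiler options such as --fast-math can be forwarded at trace time through the compiler_args parameter. A hedged sketch (the toy model and the "none" setting are illustrative; see the Mixed precision and performance-accuracy tuning application note for the supported values):

```python
try:
    import torch
    import torch.neuron

    model = torch.nn.Linear(4, 2).eval()   # toy model for illustration
    example = torch.rand(1, 4)
    # Forward --fast-math to neuron-cc; "none" (illustrative value) biases
    # the compiler toward FP32 accuracy at the cost of some performance.
    neuron_model = torch.neuron.trace(
        model, example, compiler_args=["--fast-math", "none"]
    )
    compiled = True
except ImportError:
    compiled = False  # torch-neuron is not installed in this environment
```

The same flag can be passed directly on a neuron-cc command line; routing it through the framework simply keeps compilation and tracing in one step.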
Neuron Tools
Updates have been made to neuron-ls and neuron-top to improve the interface and the utility of the information provided.
neuron-monitor has been enhanced to include additional information when used to monitor the latest frameworks released with Neuron 1.16.0. See Neuron Tools 2.x Release Notes.
neuron-cli is entering maintenance mode, as its use is no longer relevant when using ML frameworks with an integrated Neuron Runtime (libnrt.so).
For more information visit Neuron Tools.
Neuron Containers
Starting with Neuron 1.16.0, installation of the Neuron ML frameworks includes an integrated Neuron Runtime library. As a result, it is no longer required to deploy neuron-rtd. Please visit Introducing Neuron Runtime 2.x (libnrt.so) for more information.
When using containers built with components from Neuron 1.16.0 or newer, please use aws-neuron-dkms version 2.1 or newer and the latest version of aws-neuron-runtime-base. Passing additional system capabilities is no longer required.
For more information visit Containers.
Neuron Driver
Support is added for Neuron Runtime 2.x (libnrt.so).
Memory improvements have been made to ensure all allocations are made with 4K alignment.
Software Deprecation
Software maintenance mode
Detailed release notes¶
Detailed release notes are available for the following components: General, PyTorch, TensorFlow 2.x, TensorFlow 1.x, Apache MXNet (Incubating), Compiler, Runtime, Containers, Tools, Software Deprecation, and Software Maintenance.