This document is relevant for: Inf1, Inf2, Trn1, Trn1n

NeuronX Runtime

The NeuronX runtime consists of a kernel driver and C/C++ libraries that provide APIs for accessing Inferentia and Trainium Neuron devices. The Neuron ML framework plugins for TensorFlow, PyTorch, and Apache MXNet use the Neuron runtime to load and run models on the NeuronCores. The Neuron runtime loads compiled deep learning models, also referred to as Neuron Executable File Format (NEFF) files, onto the Neuron devices and is optimized for high throughput and low latency.
