This document is relevant for: Inf2, Trn1, Trn2, Trn3

Neuron Runtime API Reference

This section is the API reference for the Neuron Runtime (NRT) and the Neuron Driver Library (NDL). These APIs give low-level access to AWS Neuron devices, with interfaces for model loading, execution, memory management, and collective operations.

Source code for these APIs can be found at: aws-neuron/aws-neuron-sdk.

Core Runtime APIs

NRT API	Main Neuron Runtime API for model loading, execution, and tensor management
NRT Status	Status codes and error handling for runtime operations
NRT Version	Version information and compatibility checking

Asynchronous Execution APIs

NRT Async	Asynchronous execution API for non-blocking operations
NRT Async Send/Recv	Asynchronous tensor send and receive operations

Profiling and Debugging APIs

NRT Profile	Profiling API for performance analysis and optimization
NRT System Trace	System trace capture and event fetching
Debug Stream	Debug event streaming from Logical Neuron Cores

Collective Operations API

NEC API	Neuron Elastic Collectives (NEC) for distributed operations

Neuron Driver Library (NDL) APIs

NDL API	Low-level Neuron Driver Library for device access and control
Neuron Driver Shared	Shared definitions between runtime and driver
Tensor Batch Operations	Batch operation structures for tensor transfers

Neuron Datastore API

Neuron Datastore	Neuron Datastore (NDS) for sharing metrics and model information

Experimental APIs

NRT Experimental	Experimental features and APIs (subject to change)