This document is relevant for: Inf1, Inf2, Trn1, Trn2, Trn3
Neuron Runtime API Reference
This section provides comprehensive API reference documentation for the Neuron Runtime (NRT) and Neuron Driver Library (NDL). These APIs enable low-level access to AWS Neuron devices and provide interfaces for model loading, execution, memory management, and collective operations.
Source code for these APIs can be found at: aws-neuron/aws-neuron-sdk.
Core Runtime APIs
Main Neuron Runtime API for model loading, execution, and tensor management
Status codes and error handling for runtime operations
Version information and compatibility checking
Asynchronous Execution APIs
Asynchronous execution API for non-blocking operations
Asynchronous tensor send and receive operations
Profiling and Debugging APIs
Profiling API for performance analysis and optimization
System trace capture and event fetching
Debug event streaming from Logical Neuron Cores
Collective Operations API
Neuron Elastic Collectives (NEC) for distributed operations
Neuron Driver Library (NDL) APIs
Low-level Neuron Driver Library for device access and control
Shared definitions between runtime and driver
Batch operation structures for tensor transfers
Neuron Datastore API
Neuron Datastore (NDS) for sharing metrics and model information
Experimental APIs
Experimental features and APIs (subject to change)