AWS Neuron Documentation

AWS Neuron is a software stack that enables high-performance deep learning and generative AI workloads on AWS Inferentia and AWS Trainium instances. It provides a complete machine learning development experience, including an optimizing compiler, an efficient runtime, and comprehensive tooling.

For more details, see "What is AWS Neuron?" and "What’s New in AWS Neuron?".

Neuron and Open Source

Neuron includes open-source components across the software stack. The NKI Compiler, Neuron Kernel Driver, NKI Library, NxD Inference, and Neuron Explorer are available under open-source licenses. The framework integrations for PyTorch, JAX, and vLLM have public repositories, so the community can inspect, modify, and contribute to their implementations. See the list of Neuron open-source GitHub repositories for more details.

Learn more about AWS Neuron

Read more about these features:

Native PyTorch

Learn about native PyTorch support in AWS Neuron.

vLLM on Neuron

High-performance inference serving for large language models with OpenAI-compatible APIs on Trainium and Inferentia.

Developer Tools

Profile and monitor your models as you develop, build, test, and deploy them with Neuron’s developer tools.

Neuron Kernel Interface

Low-level programming interface for custom kernel development on Trainium and Inferentia with direct hardware access.

Other Neuron features:

Orchestration and Deployment on AWS EC2 and EKS

Configure and run AWS Deep Learning AMIs (DLAMIs) and AWS Deep Learning Containers (DLCs) to test and deploy your models on Amazon EC2 and Amazon EKS.

AWS Neuron Open Source

Interested in contributing to Neuron source code and samples? This documentation describes our public GitHub repositories and how to contribute to the code and samples in them.

AWS Neuron-supported ML frameworks
NeuronX Distributed (NxD) libraries
Runtime & Collectives
Legacy Documentation and Samples

AWS and the AWS logo are trademarks of Amazon Web Services, Inc. or its affiliates. All rights reserved.