AWS Neuron Documentation

AWS Neuron is a software stack for running high-performance deep learning and generative AI workloads on Amazon EC2 instances powered by AWS Inferentia and AWS Trainium. Neuron covers the machine learning development workflow with an optimizing compiler, an efficient runtime, and developer tooling.

Join our Beta program

Get early access to new Neuron features and tools! Fill out this form and apply to join our Beta program.

Learn more about AWS Neuron

Read more about these features:

Native PyTorch

Learn about native PyTorch support in AWS Neuron.

vLLM on Neuron

High-performance inference serving for large language models with OpenAI-compatible APIs on Trainium and Inferentia.

Developer Tools

Profile and monitor your models as you develop, build, test, and deploy them with Neuron’s developer tools.

Neuron Kernel Interface

Low-level programming interface for custom kernel development on Trainium and Inferentia with direct hardware access.

Other Neuron features:

Orchestration and Deployment on Amazon EC2 and Amazon EKS

Configure and run AWS Deep Learning AMIs (DLAMIs) and Deep Learning Containers (DLCs) to test and deploy your models on Amazon EC2 and Amazon EKS.

AWS Neuron Open Source

Interested in contributing to Neuron source code and samples? This documentation describes our public GitHub repos and how to contribute to them.

AWS Neuron-supported ML frameworks
NeuronX Distributed (NxD) libraries
Legacy Documentation and Samples

AWS and the AWS logo are trademarks of Amazon Web Services, Inc. or its affiliates. All rights reserved.