Deploy on AWS#

Run your training and inference workloads on AWS Trainium and Inferentia instances. This section covers everything from launching your first instance to running production Kubernetes services. The workflow has three steps: choose a pre-configured environment, pick a compute service, and deploy.

New to Neuron deployment?

Read Choose your deployment path to compare deployment options side by side and find the right path for your workload.


Start with a pre-configured environment#

Pick a pre-configured environment based on how you’ll run the workload, not on which framework you use:

  • One EC2 instance, interactive work → use a Deep Learning AMI (DLAMI).

  • Orchestrated containers (EKS, ECS, Batch, SageMaker) → use a Deep Learning Container (DLC).

  • Need to control the OS image or the dependency set → use a custom Docker build, optionally based on a DLC.

DLAMIs and DLCs share the same Neuron SDK and frameworks; the difference is the unit of deployment.
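
The selection rule above can be sketched as a small Python helper. This is only an illustration, not part of the Neuron SDK: the `Workload` class, its field names, and `choose_environment` are hypothetical, and the precedence (a custom-image requirement wins over orchestration) is an assumption rather than something this page states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of how a workload will run."""

    # True when the workload runs under EKS, ECS, Batch, or SageMaker.
    orchestrated: bool = False
    # True when you must control the OS image or the dependency set.
    needs_custom_image: bool = False


def choose_environment(workload: Workload) -> str:
    """Return the pre-configured environment suggested by the list above."""
    # Assumption: a custom-image requirement takes priority over the rest.
    if workload.needs_custom_image:
        return "custom Docker build"
    if workload.orchestrated:
        return "Deep Learning Container"
    # Default case: one EC2 instance used interactively.
    return "Deep Learning AMI"


if __name__ == "__main__":
    print(choose_environment(Workload()))
    print(choose_environment(Workload(orchestrated=True)))
    print(choose_environment(Workload(needs_custom_image=True)))
```

The framework never appears as an input, which mirrors the point that the choice depends on the unit of deployment rather than on the framework.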

Deep Learning AMIs

An EC2 AMI with the Neuron SDK, frameworks, and virtual environments pre-installed. Available in multi-framework, single-framework, and base variants. The fastest path from zero to running code on a single instance.

Deep Learning Containers

Pre-built Docker images on Amazon ECR, each with the Neuron SDK and a specific framework. Images are available for PyTorch training, PyTorch inference, vLLM inference, and JAX training.

Custom Docker builds

Install Neuron drivers, configure Docker, and build containers from scratch, or extend a DLC with custom packages. Full control over every dependency.


Deploy on an AWS compute service#

Choose where to run your workload. EC2 pairs with a DLAMI; every other service consumes a DLC (or a custom container).

Amazon EC2

Direct access to Neuron hardware on individual instances. Launch a DLAMI, SSH in, and start training or serving models. Supports Inf1, Inf2, Trn1, and Trn2 instances.

Amazon EKS

Kubernetes orchestration with Neuron device plugins, topology-aware scheduling, and Dynamic Resource Allocation (DRA) for Trn2. Includes a Helm chart for one-command infrastructure setup and UltraServer support.

Amazon ECS

Task-based container orchestration without Kubernetes. Run Neuron DLCs as ECS tasks with node problem detection for automatic health monitoring and recovery.

AWS Batch

Submit training jobs and let Batch manage compute provisioning, scaling, and cleanup. Build a container, configure a compute environment, and submit jobs.

AWS ParallelCluster

HPC cluster management with Slurm for large-scale distributed training. Set up a head node and a Trn1 compute fleet with EFA networking and shared storage.

Amazon SageMaker

Fully managed ML platform. Use JumpStart for model fine-tuning, HyperPod for resilient distributed training, or SageMaker Training for on-demand compute.


Manage Neuron infrastructure#

Kubernetes plugins, monitoring, and operational tools for running Neuron workloads in production.

Infrastructure components

Neuron device plugin, scheduler extension, DRA driver, node problem detector, and monitoring. These components manage device discovery, topology-aware scheduling, and health monitoring in Kubernetes.

Container tutorials

Step-by-step guides for Docker environment setup, building Neuron containers, configuring OCI hooks, and running inference and training in containers.

Third-party solutions

Partner integrations including Ray for distributed orchestration and Domino for enterprise ML platforms.

FAQ and troubleshooting

Common questions about Neuron containers, device exposure, EFA networking, and Kubernetes scheduling.


Release notes#