Deploy on AWS

Run your training and inference workloads on AWS Trainium and Inferentia instances. This section covers the full path, from launching your first instance to running production Kubernetes services: choose a pre-configured environment, pick a compute service, and deploy.

New to Neuron deployment?

Read "Choose your deployment path" to compare deployment options side by side and find the right path for your workload.

Start with a pre-configured environment

Pick a pre-configured environment based on how you’ll run the workload, not which framework you use:

One EC2 instance, interactive work → use a Deep Learning AMI.
Orchestrated containers (EKS, ECS, Batch, SageMaker) → use a Deep Learning Container.
Need to control the OS image or the dependency set → use a custom Docker build, optionally based on a DLC.

DLAMIs and DLCs share the same Neuron SDK and frameworks; the difference is the unit of deployment.
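The selection rule above can be sketched as a small lookup. This is only an illustration of the decision, not part of the Neuron SDK; the function name `choose_environment`, the `target` labels, and the `needs_custom_image` flag are hypothetical names introduced here.

```python
from enum import Enum


class Environment(Enum):
    DLAMI = "Deep Learning AMI"
    DLC = "Deep Learning Container"
    CUSTOM = "Custom Docker build"


# Orchestrated container services named in the rule above.
ORCHESTRATED = {"eks", "ecs", "batch", "sagemaker"}


def choose_environment(target: str, needs_custom_image: bool = False) -> Environment:
    """Map how the workload runs to a pre-configured environment.

    `target` is a hypothetical label: "ec2" for one interactive instance,
    or one of the orchestrated services listed in ORCHESTRATED.
    """
    # Needing control over the OS image or dependencies overrides the rest.
    if needs_custom_image:
        return Environment.CUSTOM
    key = target.lower()
    if key == "ec2":
        return Environment.DLAMI
    if key in ORCHESTRATED:
        return Environment.DLC
    raise ValueError(f"unknown target: {target!r}")


print(choose_environment("ec2").value)   # Deep Learning AMI
print(choose_environment("EKS").value)   # Deep Learning Container
print(choose_environment("ecs", needs_custom_image=True).value)  # Custom Docker build
```

The framework never appears as an input, which mirrors the point that the choice depends on the unit of deployment rather than on the framework.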

Deep Learning AMIs

An EC2 AMI with the Neuron SDK, frameworks, and virtual environments pre-installed. Available in multi-framework, single-framework, and base variants. The fastest path from zero to running code on a single instance.

Use for: EC2 development on Inf1, Inf2, Trn1, or Trn2; prototyping; Jupyter notebooks; Slurm clusters

Deep Learning Containers

Pre-built Docker images on Amazon ECR, each with the Neuron SDK and a specific framework. Images are available for PyTorch training, PyTorch inference, vLLM inference, and JAX training.

Use for: EKS, ECS, AWS Batch, SageMaker, vLLM serving, multi-node training

Custom Docker builds

Install Neuron drivers, configure Docker, and build containers from scratch, or extend a DLC with custom packages. Either way, you keep full control over every dependency.

Use for: Custom dependencies, hardened base images, internal CI/CD pipelines

Deploy on an AWS compute service

Choose where to run your workload. EC2 pairs with a DLAMI; every other service consumes a DLC (or a custom container).

Amazon EC2

Direct access to Neuron hardware on individual instances. Launch a DLAMI, SSH in, and start training or serving models. Supports Inf1, Inf2, Trn1, and Trn2 instances.

Use for: Development, prototyping, single-node training and inference
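After you SSH in to an instance, a quick sanity check is to confirm that Neuron devices are visible before starting a job. The sketch below assumes the Neuron driver exposes device nodes as `/dev/neuron0`, `/dev/neuron1`, and so on; the helper name `list_neuron_devices` is hypothetical. The `neuron-ls` tool from the Neuron SDK reports the same information in more detail.

```python
import glob
import os


def list_neuron_devices(dev_dir: str = "/dev") -> list[str]:
    """Return sorted paths of entries named neuron* under dev_dir.

    Assumption: the Neuron driver creates one /dev/neuronN node per device.
    """
    return sorted(glob.glob(os.path.join(dev_dir, "neuron*")))


devices = list_neuron_devices()
if devices:
    print(f"Found {len(devices)} Neuron device node(s): {devices}")
else:
    print("No Neuron device nodes found; check the instance type and driver.")
```

On a machine without Neuron hardware the function returns an empty list, so the script is safe to run anywhere.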

Amazon EKS

Kubernetes orchestration with Neuron device plugins, topology-aware scheduling, and Dynamic Resource Allocation (DRA) for Trn2. Includes a Helm chart for one-command infrastructure setup and UltraServer support.

Use for: Production inference services, distributed training, auto-scaling
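To illustrate how a workload asks for Neuron hardware through the device plugin, the sketch below builds a minimal Pod manifest as a Python dictionary and prints it as JSON. It assumes the device plugin advertises the extended resource `aws.amazon.com/neuron`; the helper name, pod name, and image URI are placeholders, and the DRA path for Trn2 is not covered here.

```python
import json


def neuron_pod_manifest(name: str, image: str, devices: int = 1) -> dict:
    """Build a Pod manifest that requests `devices` Neuron devices.

    Assumption: the Neuron device plugin is installed and exposes the
    extended resource "aws.amazon.com/neuron" on the node.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Extended resources are requested under limits.
                    "resources": {"limits": {"aws.amazon.com/neuron": devices}},
                }
            ],
        },
    }


# Placeholder image URI; substitute a real DLC or custom image.
manifest = neuron_pod_manifest(
    "neuron-demo",
    "<account>.dkr.ecr.<region>.amazonaws.com/<neuron-image>:<tag>",
    devices=1,
)
print(json.dumps(manifest, indent=2))
```

The JSON output can be saved to a file and applied with `kubectl apply -f`, since Kubernetes accepts manifests in JSON as well as YAML.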

Amazon ECS

Task-based container orchestration without Kubernetes. Run Neuron DLCs as ECS tasks with node problem detection for automatic health monitoring and recovery.

Use for: Container workloads without Kubernetes, simpler orchestration

AWS Batch

Submit training jobs and let Batch manage compute provisioning, scaling, and cleanup. Build a container, configure a compute environment, and submit jobs.

Use for: Scheduled training, batch processing, variable compute demand

AWS ParallelCluster

HPC cluster management with Slurm for large-scale distributed training. Set up a head node and a Trn1 compute fleet with EFA networking and shared storage.

Use for: Multi-node distributed training, Slurm-based workflows

Amazon SageMaker

Fully managed ML platform. Use JumpStart for model fine-tuning, HyperPod for resilient distributed training, or SageMaker Training for on-demand compute.

Use for: Managed infrastructure, end-to-end ML workflows

Manage Neuron infrastructure

Kubernetes plugins, monitoring, and operational tools for running Neuron workloads in production.

Infrastructure components

Neuron device plugin, scheduler extension, DRA driver, node problem detector, and monitoring. These components manage device discovery, topology-aware scheduling, and health monitoring in Kubernetes.

Container tutorials

Step-by-step guides for Docker environment setup, building Neuron containers, configuring OCI hooks, and running inference and training in containers.

Third-party solutions

Partner integrations including Ray for distributed orchestration and Domino for enterprise ML platforms.

FAQ and troubleshooting

Common questions about Neuron containers, device exposure, EFA networking, and Kubernetes scheduling.
