Deploy on AWS
Run your training and inference workloads on AWS Trainium and Inferentia instances. This section covers everything from launching your first instance to running production Kubernetes services — choose a pre-configured environment, pick a compute service, and deploy.
New to Neuron deployment?
Read Choose your deployment path to compare deployment options side by side and find the right path for your workload.
Start with a pre-configured environment
Pick a pre-configured environment based on how you’ll run the workload, not which framework you use:
One EC2 instance, interactive work → use a Deep Learning AMI.
Orchestrated containers (EKS, ECS, Batch, SageMaker) → use a Deep Learning Container.
Need to control the OS image or the dependency set → use a custom Docker build, optionally based on a DLC.
DLAMIs and DLCs share the same Neuron SDK and frameworks; the difference is the unit of deployment.
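The selection rule above can be written as a small helper. This is only an illustrative sketch: the function name `choose_environment`, the `Environment` enum, and the lowercase service identifiers are hypothetical and not part of the Neuron SDK or any AWS API.

```python
from enum import Enum


class Environment(Enum):
    DLAMI = "Deep Learning AMI"
    DLC = "Deep Learning Container"
    CUSTOM = "custom Docker build"


# Services the list above names as container orchestrators (hypothetical identifiers).
ORCHESTRATED_SERVICES = {"eks", "ecs", "batch", "sagemaker"}


def choose_environment(service: str, needs_custom_image: bool = False) -> Environment:
    """Pick an environment from how the workload runs, not from the framework."""
    name = service.strip().lower()
    if name != "ec2" and name not in ORCHESTRATED_SERVICES:
        raise ValueError(f"unknown compute service: {service!r}")
    # Needing control over the OS image or dependency set overrides the default.
    if needs_custom_image:
        return Environment.CUSTOM
    # A single EC2 instance pairs with a DLAMI; orchestrated services consume a DLC.
    return Environment.DLAMI if name == "ec2" else Environment.DLC
```

For example, `choose_environment("EKS")` returns `Environment.DLC`, while `choose_environment("EKS", needs_custom_image=True)` returns `Environment.CUSTOM`.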
For a custom Docker build, install Neuron drivers, configure Docker, and build containers from scratch, or extend a DLC with custom packages. This path gives you full control over every dependency.
Deploy on an AWS compute service
Choose where to run your workload. EC2 pairs with a DLAMI; every other service consumes a DLC (or a custom container).
EKS provides Kubernetes orchestration with Neuron device plugins, topology-aware scheduling, and Dynamic Resource Allocation (DRA) for Trn2. It includes a Helm chart for one-command infrastructure setup and supports UltraServers.
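As a rough illustration of how a workload asks the device plugin for accelerators, the sketch below builds a Pod manifest that requests Neuron devices through an extended resource. It assumes the plugin advertises the resource name `aws.amazon.com/neuron`; confirm the name against your installed plugin version. The helper `neuron_pod_manifest` and the image URI are hypothetical placeholders, and the sketch covers the device-plugin path only, not DRA.

```python
import json


def neuron_pod_manifest(name: str, image: str, neuron_devices: int = 1) -> dict:
    """Build a Pod manifest that requests Neuron devices as an extended resource."""
    if neuron_devices < 1:
        raise ValueError("neuron_devices must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "resources": {
                        # Assumed resource name exposed by the Neuron device plugin.
                        "limits": {"aws.amazon.com/neuron": neuron_devices}
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder image URI; substitute a real Neuron DLC or custom image.
    manifest = neuron_pod_manifest("neuron-demo", "example.com/my-neuron-image:latest")
    print(json.dumps(manifest, indent=2))
```

Because `kubectl` accepts JSON manifests, the printed output can be piped to `kubectl apply -f -`.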
ECS provides task-based container orchestration without Kubernetes. Run Neuron DLCs as ECS tasks, with node problem detection for automatic health monitoring and recovery.
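A minimal sketch of what such a task might look like follows. It builds an ECS task definition that maps host Neuron device nodes into the container through `linuxParameters.devices`. It assumes the host exposes devices as `/dev/neuron0`, `/dev/neuron1`, and so on; the helper name `neuron_task_definition`, the image URI, and the memory value are hypothetical. A real task definition may need further settings, such as logging, IAM roles, or extra Linux capabilities.

```python
def neuron_task_definition(
    family: str, image: str, device_count: int = 1, memory_mib: int = 8192
) -> dict:
    """Build an EC2-launch-type ECS task definition that exposes Neuron devices."""
    if device_count < 1:
        raise ValueError("device_count must be at least 1")
    # Map each assumed host device node into the container with read/write access.
    devices = [
        {
            "hostPath": f"/dev/neuron{i}",
            "containerPath": f"/dev/neuron{i}",
            "permissions": ["read", "write"],
        }
        for i in range(device_count)
    ]
    return {
        "family": family,
        "requiresCompatibilities": ["EC2"],
        "containerDefinitions": [
            {
                "name": family,
                "image": image,
                "memory": memory_mib,
                "essential": True,
                "linuxParameters": {"devices": devices},
            }
        ],
    }
```

The returned dictionary uses the field names of the ECS `RegisterTaskDefinition` API, so it could be passed to `boto3.client("ecs").register_task_definition(**task_def)` once the placeholders are replaced with real values.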
Manage Neuron infrastructure
Kubernetes plugins, monitoring, and operational tools for running Neuron workloads in production.