This document is relevant for: Inf1, Inf2, Trn1, Trn2, Trn3
Install PyTorch via Deep Learning Container#
Deploy PyTorch with Neuron support using pre-configured Docker images from AWS ECR.
⏱️ Estimated time: 10 minutes
Note
For a non-containerized setup, consider the DLAMI-based installation or manual installation instead.
What are Neuron DLCs?#
AWS Neuron Deep Learning Containers (DLCs) are pre-configured Docker images with the Neuron SDK and ML frameworks pre-installed. They provide Docker-based isolation, reproducibility, and portability across deployment platforms including EC2, EKS, ECS, and SageMaker.
Available PyTorch Neuron DLC images:
| Container Type | Use Case | Links |
|---|---|---|
| PyTorch Inference (NeuronX) | Model serving on Inf2/Trn1/Trn2/Trn3 | |
| PyTorch Inference vLLM (NeuronX) | LLM serving with vLLM | |
| PyTorch Training (NeuronX) | Model training on Trn1/Trn2/Trn3 | |
| PyTorch Inference (Neuron) | Legacy inference on Inf1 | |
Prerequisites#
| Requirement | Details |
|---|---|
| Instance Type | Inf2, Trn1, Trn2, or Trn3 |
| Docker | Docker Engine installed and running |
| AWS CLI | Configured with ECR access permissions |
| Neuron Driver | Installed on the host instance |
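Before pulling an image, you can verify each prerequisite from the host. The following pre-flight check is a sketch, not an official tool: the commands (`docker info`, `ls /dev/neuron*`, `aws sts get-caller-identity`) are standard, but the check sequence itself is an example.

```shell
# Hypothetical pre-flight check; adjust for your instance size.

# 1. Is Docker Engine running?
if docker info > /dev/null 2>&1; then
  echo "Docker: OK"
else
  echo "Docker: not running"
fi

# 2. Is the Neuron driver loaded? (device nodes come from aws-neuronx-dkms)
if ls /dev/neuron* > /dev/null 2>&1; then
  echo "Neuron devices: $(ls /dev/neuron* | wc -l)"
else
  echo "Neuron devices: none found"
fi

# 3. Is the AWS CLI configured with credentials?
if aws sts get-caller-identity > /dev/null 2>&1; then
  echo "AWS CLI: OK"
else
  echo "AWS CLI: not configured"
fi
```

Each check prints a status line rather than exiting, so you can see all failures at once.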
Quick Start: vLLM Inference Container#
The fastest way to get started with LLM inference on Neuron:
```shell
# Authenticate with ECR
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin 763104351884.dkr.ecr.us-east-1.amazonaws.com

# Pull the inference container (example tag)
docker pull 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference-neuronx:2.1.2-neuronx-py310-sdk2.20.2-ubuntu20.04

# Run with Neuron device access
docker run -it --device=/dev/neuron0 \
  763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference-neuronx:2.1.2-neuronx-py310-sdk2.20.2-ubuntu20.04
```
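Once the image is pulled, you can confirm that a container actually sees the Neuron device before starting your workload. This sketch assumes `neuron-ls` (the standard Neuron device-listing tool) is present in the DLC; it requires a Neuron instance to produce output.

```shell
# Verify device visibility from inside the container by running
# neuron-ls instead of the default entrypoint (assumes the tool is
# bundled in the DLC, and that /dev/neuron0 exists on the host).
docker run --rm --device=/dev/neuron0 \
  763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference-neuronx:2.1.2-neuronx-py310-sdk2.20.2-ubuntu20.04 \
  neuron-ls
```

If the device table comes back empty, check that the Neuron driver is installed on the host and that the `--device` flag matches an existing `/dev/neuron*` node.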
For the latest image tags and a step-by-step walkthrough, see Quickstart: Configure and deploy a vLLM server using Neuron Deep Learning Container (DLC).
Quick Start: Training Container#
```shell
# Authenticate with ECR
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin 763104351884.dkr.ecr.us-east-1.amazonaws.com

# Pull the training container
docker pull 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training-neuronx:2.1.2-neuronx-py310-sdk2.20.2-ubuntu20.04

# Run with Neuron device access
docker run -it --device=/dev/neuron0 --device=/dev/neuron1 \
  763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-training-neuronx:2.1.2-neuronx-py310-sdk2.20.2-ubuntu20.04
```
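The command above passes two device nodes explicitly, but the number of `/dev/neuron*` nodes varies by instance size. One way to handle this, sketched below as a hypothetical helper, is to build the `--device` flags from whatever nodes exist on the host:

```shell
# Build a --device flag for every Neuron device node present on the host.
# (Hypothetical helper; on a host with no devices the flags stay empty.)
DEVICE_FLAGS=""
for dev in /dev/neuron*; do
  [ -e "$dev" ] && DEVICE_FLAGS="$DEVICE_FLAGS --device=$dev"
done

# Substitute your actual image URI for <image-uri>.
echo "docker run -it$DEVICE_FLAGS <image-uri>"
```

The `echo` prints the assembled command so you can inspect it before running; replace it with a direct `docker run` invocation once the flags look right.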
Note
The image tags above are examples. For the latest available images, see the Neuron DLC repository.
Customizing a DLC#
You can extend a Neuron DLC with additional packages by creating a custom Dockerfile:
```dockerfile
FROM 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference-neuronx:2.1.2-neuronx-py310-sdk2.20.2-ubuntu20.04

# Install additional packages
RUN pip install transformers datasets

# Copy your application code
COPY app/ /app/
```
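After writing the Dockerfile, build and run the custom image as usual. The image tag `my-neuron-app` below is an arbitrary example, not a required name:

```shell
# Build the custom image from the Dockerfile in the current directory
# (the tag "my-neuron-app" is an example).
docker build -t my-neuron-app:latest .

# Run it with Neuron device access, as with the base DLC
docker run -it --device=/dev/neuron0 my-neuron-app:latest
```

Because the custom image layers on top of the DLC, it needs the same `--device` flags at run time as the base image.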
For more details, see Customize Neuron DLC.
Deployment Platforms#
Neuron DLCs can be deployed across multiple AWS services, including Amazon EC2, Amazon ECS, Amazon EKS, and Amazon SageMaker.
Next Steps#
Quickstart: Configure and deploy a vLLM server using Neuron Deep Learning Container (DLC) - Full vLLM DLC deployment walkthrough
Neuron Deep Learning Containers - Find the right DLC image for your workload
Neuron Containers - Full containers documentation
Training (torch-neuronx) - Training tutorials
Inference with torch-neuronx (Inf2 & Trn1/Trn2) - Inference tutorials