This document is relevant for: Inf1, Inf2, Trn1, Trn2, Trn3
Update PyTorch in a Deep Learning Container#
Update your DLC-based PyTorch Neuron deployment to the latest release.
Update the container image#
DLC images are versioned and tagged with the Neuron SDK version. To update, pull the latest image tag from ECR:
# Training
docker pull public.ecr.aws/neuron/pytorch-training-neuronx:<new_image_tag>
# Inference
docker pull public.ecr.aws/neuron/pytorch-inference-neuronx:<new_image_tag>
# vLLM Inference
docker pull public.ecr.aws/neuron/pytorch-inference-vllm-neuronx:<new_image_tag>
Replace <new_image_tag> with the tag for the desired SDK version (e.g.,
2.9.0-neuronx-py312-sdk2.29.0-ubuntu24.04).
Check available tags at the ECR Public Gallery.
For the full list of available images and tags, see Neuron Deep Learning Containers.
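When scripting updates, it can help to confirm which SDK release a given tag targets before pulling it. The following is a minimal sketch, assuming tags keep the `-sdk<version>-` component seen in the example tag above; `sdk_version_from_tag` is a hypothetical helper name, not part of any Neuron tooling.

```shell
# Hypothetical helper: extract the SDK version from a DLC image tag.
# Assumes the tag contains a "-sdk<version>-" component, as in the example tag.
sdk_version_from_tag() {
    tag="$1"
    case "$tag" in
        *-sdk*-*) ;;
        *) echo "no sdk component in tag: $tag" >&2; return 1 ;;
    esac
    rest="${tag#*-sdk}"   # drop everything up to and including "-sdk"
    echo "${rest%%-*}"    # keep the text before the next "-"
}

# Example with the tag shown above
sdk_version_from_tag "2.9.0-neuronx-py312-sdk2.29.0-ubuntu24.04"
```

If a future tag scheme drops or renames the `sdk` component, the function returns a non-zero status instead of guessing.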
Update Neuron driver on the host#
The Neuron driver runs on the host, not inside the container. Update it separately when moving to a new Neuron SDK release.
sudo apt-get update
sudo apt-get install -y aws-neuronx-dkms
Important
Ubuntu 22.04 has reached end-of-support on Neuron. Neuron no longer provides Ubuntu 22.04 DLAMIs or container images. New deployments should use Ubuntu 24.04. See "Neuron no longer includes Ubuntu 22.04 DLAMIs and DLCs starting this release."
On hosts that use the dnf package manager (for example, Amazon Linux 2023), install the driver with:
sudo dnf install -y aws-neuronx-dkms
Verify the update#
Launch the new container and verify:
docker run -it \
--device=/dev/neuron0 \
--cap-add SYS_ADMIN \
--cap-add IPC_LOCK \
public.ecr.aws/neuron/pytorch-training-neuronx:<new_image_tag> \
bash
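The command above exposes only `/dev/neuron0`. On hosts with more than one Neuron device, each device needs its own `--device` flag. The sketch below builds those flags, assuming additional devices follow the same `/dev/neuronN` naming; `neuron_device_flags` is a hypothetical helper, and its optional directory argument exists only so the function can be tried against a scratch directory.

```shell
# Hypothetical helper: print one --device flag per Neuron device node.
# dev_dir defaults to /dev; pass another directory only for experimentation.
neuron_device_flags() {
    dev_dir="${1:-/dev}"
    flags=""
    for dev in "$dev_dir"/neuron[0-9]*; do
        [ -e "$dev" ] || continue   # the glob matched nothing
        flags="$flags --device=$dev"
    done
    echo "${flags# }"               # trim the leading space
}
```

One possible use is `docker run -it $(neuron_device_flags) --cap-add SYS_ADMIN --cap-add IPC_LOCK <image> bash`, leaving the substitution unquoted so that the shell splits it into separate flags.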
Inside the container:
python3 -c "import torch; import torch_neuronx; print(f'PyTorch {torch.__version__}, torch-neuronx {torch_neuronx.__version__}')"
neuron-ls
⚠️ Troubleshooting: Version mismatch between host driver and container
If you see runtime errors after updating the container image but not the host driver:
Check the host driver version on the host:
modinfo neuron
Update the host driver to match the SDK version in the container.
Reboot if the driver update requires it:
sudo reboot
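The driver check can also be scripted. This is a minimal sketch, assuming GNU `sort` (for `-V`) is available on the host; `version_at_least` is a hypothetical helper, and the minimum driver version for a given SDK release must come from the Neuron release notes, since it is not stated here.

```shell
# Hypothetical helper: succeed when version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort) to order dotted version strings.
version_at_least() {
    lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)"
    [ "$lowest" = "$2" ]
}

# Usage on the host (not run here). Replace the placeholder with the
# driver version required by your SDK release:
#   version_at_least "$(modinfo -F version neuron)" "<required_driver_version>" \
#       || echo "Host driver is older than required; update aws-neuronx-dkms"
```

`modinfo -F version neuron` prints only the version field of the module, which avoids parsing the full `modinfo` output.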