This document is relevant for: Inf1, Inf2, Trn1, Trn2, Trn3

Update PyTorch in a Deep Learning Container

Update your DLC-based PyTorch Neuron deployment to the latest release.

Update the container image

DLC images are versioned and tagged with the Neuron SDK version. To update, pull the image with the new tag from Amazon ECR Public:

# Training
docker pull public.ecr.aws/neuron/pytorch-training-neuronx:<new_image_tag>

# Inference
docker pull public.ecr.aws/neuron/pytorch-inference-neuronx:<new_image_tag>

# vLLM Inference
docker pull public.ecr.aws/neuron/pytorch-inference-vllm-neuronx:<new_image_tag>

Replace <new_image_tag> with the tag for the desired SDK version (e.g., 2.9.0-neuronx-py312-sdk2.29.0-ubuntu24.04).
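Because the tag encodes the SDK version, a script can read it back out of the tag, for example to record which SDK a deployment is running. The following is a minimal sketch that assumes tags follow the layout of the example above, with a "-sdk<version>-" component. The function name sdk_version_from_tag is hypothetical and is not part of any Neuron tooling.

```shell
# Hypothetical helper: extract the Neuron SDK version from a DLC image tag.
# Assumes the tag contains a "-sdk<version>-" component, as in the example tag.
sdk_version_from_tag() {
  # Print only the digits and dots that follow "-sdk"; print nothing if absent.
  printf '%s\n' "$1" | sed -n 's/.*-sdk\([0-9][0-9.]*\)-.*/\1/p'
}

sdk_version_from_tag "2.9.0-neuronx-py312-sdk2.29.0-ubuntu24.04"
```

For a tag that does not match the assumed layout, the function prints nothing, so callers can treat empty output as "unknown".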

Check the available tags in the ECR Public Gallery. For the full list of available images and tags, see Neuron Deep Learning Containers.

Update Neuron driver on the host

The Neuron driver runs on the host, not inside the container. Update it separately when moving to a new Neuron SDK release.

Run the commands that match the host operating system. Do not run the apt-get and dnf commands on the same host.

# Ubuntu 24.04
sudo apt-get update
sudo apt-get install -y aws-neuronx-dkms

Important

Ubuntu 22.04 has reached end-of-support on Neuron. Neuron no longer provides Ubuntu 22.04 DLAMIs or container images. New deployments should use Ubuntu 24.04. See "Neuron no longer includes Ubuntu 22.04 DLAMIs and DLCs starting this release".

# Ubuntu 22.04 (existing hosts only)
sudo apt-get update
sudo apt-get install -y aws-neuronx-dkms

# Amazon Linux 2023
sudo dnf install -y aws-neuronx-dkms
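When the same script has to run on hosts with different operating systems, the choice of command can be made from the ID field in /etc/os-release. The sketch below only prints the command and does not install anything. The helper name driver_update_cmd is hypothetical, and mapping the "amzn" ID to the dnf command is an assumption that holds for Amazon Linux 2023 but not for older Amazon Linux releases.

```shell
# Hypothetical helper: print the driver update command for a given OS ID
# (the ID field from /etc/os-release). It prints the command; it does not run it.
driver_update_cmd() {
  case "$1" in
    ubuntu) echo "sudo apt-get update && sudo apt-get install -y aws-neuronx-dkms" ;;
    # Assumption: "amzn" means Amazon Linux 2023, which uses dnf.
    amzn)   echo "sudo dnf install -y aws-neuronx-dkms" ;;
    *)      echo "unsupported OS ID: $1" >&2; return 1 ;;
  esac
}

# Example: look up the command for the current host.
if [ -r /etc/os-release ]; then
  os_id="$(. /etc/os-release && echo "$ID")"
  driver_update_cmd "$os_id" || true
fi
```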

Verify the update

Launch the new container and verify:

docker run -it \
  --device=/dev/neuron0 \
  --cap-add SYS_ADMIN \
  --cap-add IPC_LOCK \
  public.ecr.aws/neuron/pytorch-training-neuronx:<new_image_tag> \
  bash
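The command above exposes only /dev/neuron0. On hosts with more than one Neuron device, each device node needs its own --device flag. The following is a minimal sketch of generating those flags. The helper name neuron_device_flags is hypothetical, and it assumes device nodes are named neuron0, neuron1, and so on under /dev.

```shell
# Hypothetical helper: print one --device flag per Neuron device node found.
# The first argument is the directory to scan (defaults to /dev).
neuron_device_flags() {
  dev_dir="${1:-/dev}"
  for dev in "$dev_dir"/neuron[0-9]*; do
    # Skip the unexpanded glob pattern when no device nodes exist.
    [ -e "$dev" ] || continue
    printf '%s%s ' "--device=" "$dev"
  done
}

# Print the flags for the current host (an empty line if no devices are found).
neuron_device_flags
echo

# Possible usage, leaving the expansion unquoted so each flag is a separate word:
#   docker run -it $(neuron_device_flags) --cap-add SYS_ADMIN --cap-add IPC_LOCK <image> bash
```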

Inside the container:

python3 -c "import torch; import torch_neuronx; print(f'PyTorch {torch.__version__}, torch-neuronx {torch_neuronx.__version__}')"
neuron-ls

⚠️ Troubleshooting: Version mismatch between host driver and container

If you see runtime errors after updating the container image without also updating the host driver:

  1. Check the host driver version by running modinfo neuron on the host

  2. Update the host driver to match the SDK version in the container

  3. Reboot if the driver update requires it: sudo reboot
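The first step can be scripted as a comparison between the installed driver version and the version the new SDK release expects. The sketch below is illustrative only: it assumes the neuron module reports a version field through modinfo -F version, that GNU sort -V is available, and that the required version is supplied through a hypothetical REQUIRED_DRIVER variable. The real value has to come from the release notes for your SDK version.

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal to version $2.
version_ge() {
  # sort -V orders version strings; if $2 sorts first (or both are equal), $1 >= $2.
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Placeholder value; take the real minimum driver version from the release notes.
required_driver="${REQUIRED_DRIVER:-0.0.0}"

if command -v modinfo >/dev/null 2>&1 && installed="$(modinfo -F version neuron 2>/dev/null)"; then
  if version_ge "$installed" "$required_driver"; then
    echo "host driver $installed satisfies $required_driver"
  else
    echo "host driver $installed is older than $required_driver; update the host driver"
  fi
else
  echo "neuron kernel module not found on this host"
fi
```

Run this on the host, not inside the container, because the driver is a host kernel module.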
