This document is relevant for: Inf1, Inf2, Trn1, Trn2, Trn3

Install PyTorch via Deep Learning AMI#

Install PyTorch with Neuron support using pre-configured AWS Deep Learning AMIs.

⏱️ Estimated time: 5 minutes

Note

Want to read about Neuron’s Deep Learning machine images (DLAMIs) before diving in? Check out the Neuron DLAMI User Guide.


Prerequisites#

+---------------+----------------------------------------+
| Requirement   | Details                                |
+---------------+----------------------------------------+
| Instance Type | Inf2, Trn1, Trn2, or Trn3              |
| AWS Account   | With EC2 permissions                   |
| SSH Key Pair  | For instance access                    |
| AWS CLI       | Configured with credentials (optional) |
+---------------+----------------------------------------+

Installation Steps#

Step 1: Find the Latest AMI (Ubuntu 24.04)

Get the latest PyTorch DLAMI for Ubuntu 24.04 using the AWS CLI:

aws ec2 describe-images \
  --owners amazon \
  --filters "Name=name,Values=Deep Learning AMI Neuron PyTorch 2.9 (Ubuntu 24.04)*" \
  --query 'Images | sort_by(@, &CreationDate) | [-1].ImageId' \
  --output text

You can also use the AWS Systems Manager (SSM) Parameter Store to find the ID of a DLAMI; see https://docs.aws.amazon.com/dlami/latest/devguide/find-dlami-id.html for details. Record the AMI ID (image-id) for the next step.

Step 2: Launch Instance

Launch an Inf2, Trn1, Trn2, or Trn3 instance with the AMI using the AWS CLI:

aws ec2 run-instances \
  --image-id ami-xxxxxxxxxxxxxxxxx \
  --instance-type trn1.2xlarge \
  --key-name your-key-pair \
  --security-group-ids sg-xxxxxxxxx \
  --subnet-id subnet-xxxxxxxxx

Replace:

  • ami-xxxxxxxxxxxxxxxxx with AMI ID from Step 1

  • your-key-pair with your SSH key pair name

  • sg-xxxxxxxxx with your security group ID

  • subnet-xxxxxxxxx with your subnet ID

You can also launch your DLAMI through the AWS EC2 web console, which suggests security group and subnet IDs for you. For more details, see https://docs.aws.amazon.com/dlami/latest/devguide/launch.html.

Step 3: Connect to Instance

ssh -i your-key-pair.pem ubuntu@<instance-public-ip>

Step 4: Activate Environment

The DLAMI includes a pre-configured virtual environment:

source /opt/aws_neuronx_venv_pytorch_2_9/bin/activate

Step 5: Verify Installation

python3 -c "import torch; import torch_neuronx; print(f'PyTorch {torch.__version__}, torch-neuronx {torch_neuronx.__version__}')"
neuron-ls

You should see output similar to the following (your versions, device counts, and memory sizes may differ):

PyTorch 2.9.0+cpu, torch-neuronx 2.9.0.1.0

+--------+--------+--------+-----------+
| DEVICE | CORES  | MEMORY | CONNECTED |
+--------+--------+--------+-----------+
| 0      | 2      | 32 GB  | Yes       |
| 1      | 2      | 32 GB  | Yes       |
+--------+--------+--------+-----------+

⚠️ Troubleshooting: Module not found

If you see ModuleNotFoundError: No module named 'torch_neuronx':

  1. Verify virtual environment is activated:

    which python
    # Should show: /opt/aws_neuronx_venv_pytorch_2_9/bin/python
    
  2. Check Python version:

    python --version
    # Should be 3.10 or higher
    
  3. Reinstall torch-neuronx:

    pip install --force-reinstall torch-neuronx
    
⚠️ Troubleshooting: No Neuron devices found

If neuron-ls shows no devices:

  1. Verify instance type:

    curl http://169.254.169.254/latest/meta-data/instance-type
    # Should show trn1.*, trn2.*, trn3.*, or inf2.*
    
  2. Check Neuron driver:

    lsmod | grep neuron
    # Should show neuron driver loaded
    
  3. Restart Neuron runtime:

    sudo systemctl restart neuron-monitor
    neuron-ls
    

Step 1: Find the Latest AMI (Ubuntu 22.04)

Get the latest PyTorch DLAMI for Ubuntu 22.04:

aws ec2 describe-images \
  --owners amazon \
  --filters "Name=name,Values=Deep Learning AMI Neuron PyTorch 2.9 (Ubuntu 22.04)*" \
  --query 'Images | sort_by(@, &CreationDate) | [-1].ImageId' \
  --output text

Step 2: Launch Instance

aws ec2 run-instances \
  --image-id ami-xxxxxxxxxxxxxxxxx \
  --instance-type trn1.2xlarge \
  --key-name your-key-pair \
  --security-group-ids sg-xxxxxxxxx \
  --subnet-id subnet-xxxxxxxxx

Step 3: Connect to Instance

ssh -i your-key-pair.pem ubuntu@<instance-public-ip>

Step 4: Activate Environment

source /opt/aws_neuronx_venv_pytorch_2_9/bin/activate

Step 5: Verify Installation

python3 -c "import torch; import torch_neuronx; print(f'PyTorch {torch.__version__}, torch-neuronx {torch_neuronx.__version__}')"
neuron-ls

You should see output similar to the following (your versions, device counts, and memory sizes may differ):

PyTorch 2.9.0+cpu, torch-neuronx 2.9.0.1.0

+--------+--------+--------+-----------+
| DEVICE | CORES  | MEMORY | CONNECTED |
+--------+--------+--------+-----------+
| 0      | 2      | 32 GB  | Yes       |
| 1      | 2      | 32 GB  | Yes       |
+--------+--------+--------+-----------+

⚠️ Troubleshooting: Module not found

If you see ModuleNotFoundError: No module named 'torch_neuronx':

  1. Verify virtual environment is activated

  2. Check Python version: python --version (should be 3.10+)

  3. Reinstall: pip install --force-reinstall torch-neuronx

⚠️ Troubleshooting: No Neuron devices found

If neuron-ls shows no devices:

  1. Verify instance type

  2. Check Neuron driver: lsmod | grep neuron

  3. Restart runtime: sudo systemctl restart neuron-monitor

Step 1: Find the Latest AMI (Amazon Linux 2023)

Get the latest PyTorch DLAMI for Amazon Linux 2023:

aws ec2 describe-images \
  --owners amazon \
  --filters "Name=name,Values=Deep Learning AMI Neuron PyTorch 2.9 (Amazon Linux 2023)*" \
  --query 'Images | sort_by(@, &CreationDate) | [-1].ImageId' \
  --output text

Step 2: Launch Instance

aws ec2 run-instances \
  --image-id ami-xxxxxxxxxxxxxxxxx \
  --instance-type trn1.2xlarge \
  --key-name your-key-pair \
  --security-group-ids sg-xxxxxxxxx \
  --subnet-id subnet-xxxxxxxxx

Step 3: Connect to Instance

ssh -i your-key-pair.pem ec2-user@<instance-public-ip>

Note

Amazon Linux 2023 uses ec2-user instead of ubuntu.

Step 4: Activate Environment

source /opt/aws_neuronx_venv_pytorch_2_9/bin/activate

Step 5: Verify Installation

python3 -c "import torch; import torch_neuronx; print(f'PyTorch {torch.__version__}, torch-neuronx {torch_neuronx.__version__}')"
neuron-ls

You should see output similar to the following (your versions, device counts, and memory sizes may differ):

PyTorch 2.9.0+cpu, torch-neuronx 2.9.0.1.0

+--------+--------+--------+-----------+
| DEVICE | CORES  | MEMORY | CONNECTED |
+--------+--------+--------+-----------+
| 0      | 2      | 32 GB  | Yes       |
| 1      | 2      | 32 GB  | Yes       |
+--------+--------+--------+-----------+

⚠️ Troubleshooting: Module not found

If you see ModuleNotFoundError: No module named 'torch_neuronx':

  1. Verify virtual environment is activated

  2. Check Python version: python --version (should be 3.10+)

  3. Reinstall: pip install --force-reinstall torch-neuronx

⚠️ Troubleshooting: No Neuron devices found

If neuron-ls shows no devices:

  1. Verify instance type

  2. Check Neuron driver: lsmod | grep neuron

  3. Restart runtime: sudo systemctl restart neuron-monitor

Update an existing installation#

To update PyTorch versions or Neuron drivers on an existing DLAMI, see Update PyTorch on a Deep Learning AMI.

Tip

vLLM for LLM inference

Neuron provides a dedicated vLLM DLAMI with vLLM and the vLLM-Neuron Plugin pre-installed. Launch the Deep Learning AMI Neuron PyTorch Inference vLLM (Ubuntu 24.04) and activate the pre-configured environment:

source /opt/aws_neuronx_venv_pytorch_inference_vllm_0_16/bin/activate

vLLM provides an OpenAI-compatible API, continuous batching, and supports models like Llama 2/3.1/3.3/4, Qwen 2.5/3, and multimodal models with quantization support (INT8/FP8).

The vLLM environment is also available in the multi-framework DLAMI. For more details on available DLAMIs and SSM parameters, see Neuron DLAMI User Guide.

Next Steps#

Now that PyTorch is installed:

  1. Try a Quick Example:

    import torch
    import torch_neuronx
    
    # Example input and a small linear model
    x = torch.randn(3, 3)
    model = torch.nn.Linear(3, 3)
    
    # Compile the model for Neuron, then run it
    trace = torch_neuronx.trace(model, x)
    print(trace(x))
    
  2. Follow Tutorials

  3. Read Documentation

  4. Explore Tools

  5. Deploy LLM inference: Neuron DLAMI User Guide (vLLM on Neuron)
