This document is relevant for: Inf1, Inf2, Trn1, Trn2, Trn3
Installation Troubleshooting#
Common issues and solutions for Neuron SDK installation.
Module Import Errors#
ModuleNotFoundError: No module named 'torch_neuronx'#
Symptoms: Python cannot find the torch_neuronx module after installation.
Causes:
Virtual environment not activated
Wrong Python version
Installation failed silently
Multiple Python installations
Solutions:
Verify virtual environment:
which python # Should show virtual environment path, not system Python
Check Python version:
python --version # Should be 3.10, 3.11, or 3.12
Reinstall torch-neuronx:
pip install --force-reinstall torch-neuronx --extra-index-url=https://pip.repos.neuron.amazonaws.com
Verify installation:
pip list | grep neuron
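The checks above can also be run from inside the interpreter that fails to import the module, which rules out the case where `which python` and the running interpreter differ. The following is a minimal sketch using only the standard library; the function name `check_environment` is hypothetical and not part of the Neuron SDK.

```python
import importlib.util
import sys


def check_environment(module_name="torch_neuronx"):
    """Collect the facts the manual checks above look at."""
    return {
        # Path of the interpreter actually running this script.
        "executable": sys.executable,
        # Inside a venv, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != getattr(sys, "base_prefix", sys.prefix),
        "python_version": "{}.{}".format(*sys.version_info[:2]),
        # find_spec returns None when the module cannot be imported here.
        "module_found": importlib.util.find_spec(module_name) is not None,
    }


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `in_virtualenv` is False or `module_found` is False, the package was most likely installed into a different Python installation than the one running the script.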
ImportError: cannot import name 'neuron' from 'torch'#
Symptoms: Import error when trying to use Neuron features.
Cause: Using PyTorch/XLA syntax with the Native PyTorch backend.
Solution: Update code to use Native PyTorch syntax:
# Old (PyTorch/XLA)
import torch_xla.core.xla_model as xm
device = xm.xla_device()
# New (Native PyTorch)
import torch
device = torch.device('neuron')
See PyTorch Support on Neuron for the complete migration guide.
Device and Runtime Errors#
No Neuron devices found#
Symptoms: neuron-ls shows no devices or returns an error.
Causes:
Wrong instance type
Neuron driver not loaded
Runtime not started
Solutions:
Verify instance type:
curl http://169.254.169.254/latest/meta-data/instance-type # Should show inf2.*, trn1.*, trn2.*, trn3.*, or inf1.*
Check Neuron driver:
lsmod | grep neuron # Should show neuron driver loaded
Install/reload driver:
# Ubuntu/Debian
sudo apt-get install -y aws-neuronx-dkms
# Amazon Linux
sudo yum install -y aws-neuronx-dkms
Restart runtime:
sudo systemctl restart neuron-monitor
neuron-ls # neuron-ls is a command-line tool, not a service; run it to confirm devices are listed
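The instance-type check above uses a plain metadata request, which returns HTTP 401 on instances that enforce IMDSv2. The sketch below shows the same check with an IMDSv2 session token, and compares the result against the instance families named in this guide. The function names and the `NEURON_FAMILIES` tuple are hypothetical helpers, not part of any AWS tool.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"
# Instance families named in this guide.
NEURON_FAMILIES = ("inf1", "inf2", "trn1", "trn2", "trn3")


def is_neuron_instance(instance_type):
    """Return True if a type such as 'trn1.2xlarge' belongs to a Neuron family."""
    family = instance_type.strip().lower().split(".", 1)[0]
    # Prefix match, so a suffixed variant of a listed family also matches.
    return family.startswith(NEURON_FAMILIES)


def fetch_instance_type(timeout=2.0):
    """Read the instance type through IMDSv2: request a token, then the metadata."""
    token_request = urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_request, timeout=timeout) as response:
        token = response.read().decode()

    metadata_request = urllib.request.Request(
        f"{IMDS_BASE}/meta-data/instance-type",
        headers={"X-aws-ec2-metadata-token": token},
    )
    with urllib.request.urlopen(metadata_request, timeout=timeout) as response:
        return response.read().decode()
```

On an EC2 instance, `is_neuron_instance(fetch_instance_type())` should return True before any driver troubleshooting is worthwhile. Outside EC2 the metadata address is unreachable and `fetch_instance_type` raises an exception.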
RuntimeError: Neuron runtime initialization failed#
Symptoms: Runtime fails to initialize when running models.
Causes:
Insufficient permissions
Runtime version mismatch
Corrupted runtime state
Solutions:
Check runtime status:
sudo systemctl status neuron-monitor
Verify permissions:
ls -l /dev/neuron* # Should be accessible by current user
Reinstall runtime:
# Ubuntu/Debian
sudo apt-get install --reinstall aws-neuronx-runtime-lib
# Amazon Linux
sudo yum reinstall -y aws-neuronx-runtime-lib
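The permission check above can be automated so that it reports only the device nodes the current user cannot open. This sketch assumes that "accessible" means both read and write access, which this guide does not state explicitly; the helper name `inaccessible_devices` is hypothetical.

```python
import glob
import os


def inaccessible_devices(pattern="/dev/neuron*"):
    """Return device nodes matching `pattern` that this user cannot read and write."""
    return [
        path
        for path in sorted(glob.glob(pattern))
        # Assumption: the runtime needs both read and write access.
        if not os.access(path, os.R_OK | os.W_OK)
    ]


if __name__ == "__main__":
    devices = glob.glob("/dev/neuron*")
    blocked = inaccessible_devices()
    print(f"{len(devices)} device node(s) found, {len(blocked)} not accessible")
    for path in blocked:
        print("  no read/write access:", path)
```

An empty device list points back to the driver checks in the previous section, while a non-empty blocked list points to a permissions problem.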
Version Compatibility Issues#
Compiler version mismatch#
Symptoms: Error about incompatible compiler version.
Cause: neuronx-cc version incompatible with framework version.
Solution: Install compatible versions:
# For PyTorch 2.9
pip install neuronx-cc==2.15.* --extra-index-url=https://pip.repos.neuron.amazonaws.com
See AWS Neuron SDK Release Notes for the version compatibility matrix.
Package dependency conflicts#
Symptoms: pip reports conflicting dependencies.
Solution: Use a fresh virtual environment:
python3 -m venv ~/fresh_neuron_venv
source ~/fresh_neuron_venv/bin/activate
pip install -U pip
# Install packages in correct order
pip install torch==2.9.0
pip install torch-neuronx neuronx-cc --extra-index-url=https://pip.repos.neuron.amazonaws.com
Network and Repository Issues#
Cannot connect to Neuron repository#
Symptoms: apt-get or pip cannot reach Neuron repositories.
Solutions:
Verify network connectivity:
curl -I https://apt.repos.neuron.amazonaws.com
curl -I https://pip.repos.neuron.amazonaws.com
Check proxy settings (if behind corporate proxy):
export https_proxy=http://proxy.example.com:8080
export http_proxy=http://proxy.example.com:8080
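Use the Neuron repository as the primary index (--index-url replaces PyPI rather than adding to it, so dependencies hosted only on PyPI may fail to resolve):
pip install torch-neuronx --index-url=https://pip.repos.neuron.amazonaws.com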
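The connectivity check above can also be run from Python, which is useful when pip fails but curl is unavailable or uses different proxy settings. This is a minimal sketch using the standard library; the `probe` helper and the `REPOSITORIES` tuple are hypothetical names. `urllib` reads the `http_proxy` and `https_proxy` environment variables, so the result reflects the proxy configuration of the current shell.

```python
import urllib.error
import urllib.request

REPOSITORIES = (
    "https://apt.repos.neuron.amazonaws.com",
    "https://pip.repos.neuron.amazonaws.com",
)


def probe(url, timeout=5.0):
    """Send a HEAD request and return a short status string for `url`."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return f"HTTP {response.status}"
    except urllib.error.HTTPError as error:
        # The server answered, so the network path works even on a 4xx/5xx status.
        return f"HTTP {error.code}"
    except (urllib.error.URLError, OSError) as error:
        # DNS failure, refused connection, timeout, or proxy problem.
        return f"unreachable ({error})"
```

Calling `probe(url)` for each entry in `REPOSITORIES` separates the two failure modes: any `HTTP ...` result means the host was reached, while `unreachable (...)` points to DNS, firewall, or proxy configuration.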
GPG key expired#
Symptoms: “EXPKEYSIG” error during apt-get update.
Solution:
wget -qO - https://apt.repos.neuron.amazonaws.com/GPG-PUB-KEY-AMAZON-AWS-NEURON.PUB | sudo apt-key add -
sudo apt-get update -y
Getting Help#
If issues persist:
Check release notes: AWS Neuron SDK Release Notes
Review documentation: PyTorch Support on Neuron
GitHub Issues: aws-neuron/aws-neuron-sdk
AWS Support: Open a support case if you have an AWS Support plan
Diagnostic Information#
When reporting issues, include:
# System information
uname -a
cat /etc/os-release
# Instance type
curl http://169.254.169.254/latest/meta-data/instance-type
# Neuron devices
neuron-ls
# Package versions
pip list | grep -E "(torch|neuron)"
# Driver status
lsmod | grep neuron
sudo systemctl status neuron-monitor
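The package-version part of the report above can be gathered from inside the affected Python environment, which avoids reporting versions from a different pip. This is a sketch using only the standard library; `matching_packages` and `report` are hypothetical helper names, and the pattern mirrors the `grep -E "(torch|neuron)"` filter above.

```python
import platform
import re
from importlib import metadata

PACKAGE_PATTERN = re.compile(r"torch|neuron", re.IGNORECASE)


def matching_packages(pattern=PACKAGE_PATTERN):
    """Return {name: version} for installed distributions whose name matches."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name and pattern.search(name):
            found[name] = dist.version
    return dict(sorted(found.items()))


def report():
    """Build a plain-text summary suitable for pasting into an issue."""
    lines = [
        f"platform: {platform.platform()}",
        f"python: {platform.python_version()}",
    ]
    lines += [f"{name}=={version}" for name, version in matching_packages().items()]
    return "\n".join(lines)


if __name__ == "__main__":
    print(report())
```

The output complements the shell commands above; it does not replace `neuron-ls` or the driver status checks, which need to run on the host.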