NxD Inference
NxD Inference (where NxD stands for NeuronX Distributed) is an open-source PyTorch-based inference library that simplifies deep learning model deployment on AWS Inferentia and Trainium instances.
NxD Inference Overview and Setup
Get started with Inference
Tutorials
Tutorial: Scaling LLM Inference with Data Parallelism on Trn2
Tutorial: Multi-LoRA serving for Llama-3.1-8B on Trn2 instances
Tutorial: Evaluating Accuracy of Llama-3.1-70B on Neuron using open source datasets
Tutorial: Static 1P1D Disaggregated Inference on Trn2 [BETA]
Tutorial: Evaluating Performance of Llama-3.3-70B on Neuron using Performance CLI