Neuron Inference Model Support
This section describes model support in NeuronX Distributed Inference (NxDI) and explains how to choose appropriate configurations for both online and offline inference use cases.
Llama 3
Meta’s Llama 3 family includes large language models available in multiple sizes and versions. Select the model variant that matches your application requirements:
Llama 3.3 70B
Meta’s multilingual LLM, featuring 70B parameters and Grouped Query Attention.
Note: Instructions for additional models will be available soon. For a complete list of supported model architectures, refer to this developer guide.