This document is relevant for: Inf1, Inf2, Trn1, Trn2, Trn3

Neuron Containers

This section contains the technical documentation for using AWS Neuron Deep Learning Containers (DLCs) and containerized deployments on Inferentia and Trainium instances.

What are Neuron Deep Learning Containers?

AWS Neuron Deep Learning Containers (DLCs) are a set of pre-configured Docker images for training and serving models on AWS Trainium and Inferentia instances using the AWS Neuron SDK. Each DLC is optimized for specific ML frameworks and comes with all Neuron components pre-installed, enabling you to quickly deploy containerized workloads without manual setup.

With Neuron DLCs, developers can:

  • Deploy production-ready containers with pre-installed Neuron SDK and ML frameworks

  • Use containers across multiple deployment platforms including EC2, EKS, ECS, and SageMaker

  • Customize DLCs to fit specific project requirements

  • Leverage Neuron plugins for better observability and fault tolerance

  • Run distributed training and inference workloads with vLLM integration

  • Schedule MPI jobs on Trn2 UltraServers for improved performance

Neuron DLCs support popular ML frameworks including PyTorch, TensorFlow, and JAX, and are available for both training and inference workloads on Inf1, Inf2, Trn1, Trn1n, and Trn2 instances.
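As a rough illustration of what running a containerized workload on a Neuron instance involves, the sketch below assembles a `docker run` command that passes the host's Neuron devices into a container. It assumes the Neuron driver is installed on the host and exposes devices as `/dev/neuron*`. The image name `my-neuron-dlc:latest` and the helper functions are hypothetical placeholders, and running `neuron-ls` inside the container is an assumption about what the image provides. Substitute the actual image URI from the DLC image listing.

```python
import glob
import shlex


def discover_neuron_devices(pattern="/dev/neuron*"):
    """Return the Neuron device nodes visible on this host (may be empty)."""
    return sorted(glob.glob(pattern))


def build_docker_run_command(image, device_paths, command=None):
    """Assemble a `docker run` argument list that exposes Neuron devices.

    Each device node is passed with Docker's standard --device flag so the
    container can reach the accelerator.
    """
    args = ["docker", "run", "--rm"]
    for path in device_paths:
        args.append(f"--device={path}")
    args.append(image)
    if command:
        args.extend(command)
    return args


if __name__ == "__main__":
    # Fall back to a single device path when none are found, so the
    # printed command is still illustrative on a non-Neuron machine.
    devices = discover_neuron_devices() or ["/dev/neuron0"]
    # Hypothetical image name; replace with a real Neuron DLC image URI.
    cmd = build_docker_run_command("my-neuron-dlc:latest", devices, ["neuron-ls"])
    print(shlex.join(cmd))
```

The script only prints the command rather than executing it, so it can be reviewed before running it on an Inferentia or Trainium instance.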

Neuron DRA for Kubernetes

Neuron now supports Dynamic Resource Allocation (DRA) in Kubernetes. See the Neuron DRA for Kubernetes page for details.

Quickstarts

Quickstart: Deploy a DLC with vLLM

Get started by configuring and deploying a Deep Learning Container with vLLM for inference. Time to complete: ~30 minutes.

Quickstart: Build a Custom Neuron Container

Learn how to build a custom Neuron container using Docker for training or inference workloads.

Neuron Containers Documentation

Getting Started

Step-by-step guide for building Neuron containers using Docker, including driver installation and container setup.

Locate Neuron DLC Images

Find the right pre-configured Deep Learning Container image for your ML framework and instance type.

Customize Neuron DLC

Learn how to customize Neuron Deep Learning Containers to fit your specific project requirements.

Neuron Plugins

Explore Neuron plugins for containerized environments, providing better observability and fault tolerance.

Tutorials

Hands-on tutorials for deploying containers on EC2, EKS, ECS, and other platforms with various configurations.

How-To: Schedule MPI Jobs on UltraServers

Learn how to schedule MPI jobs to run on Neuron UltraServers in EKS for improved performance.

FAQ & Troubleshooting

Frequently asked questions and solutions for common issues with Neuron containers.

Neuron Containers Release Notes

Review the latest updates, new DLC images, and improvements in Neuron container releases.
