This document is relevant for: Inf1

AWS Neuron architecture guides#

Review and understand the hardware architecture of AWS Neuron instances, including Amazon Elastic Compute Cloud (EC2) Trn and Inf instance types, AWS Inferentia and Trainium chips, and NeuronCore processing units. The documentation covers system specifications, memory hierarchies, interconnect topologies, and architectural considerations for machine learning workloads.

About Neuron Hardware#

AWS Neuron hardware consists of custom-designed machine learning accelerators optimized for deep learning workloads. This section covers the architecture and capabilities of AWS Inferentia and Trainium chips, their NeuronCore processing units, and the EC2 instances that host them.

Trainium Architecture#

AWS Trainium3

Third-generation training accelerator chip

AWS Trainium2

Second-generation training accelerator chip

AWS Trainium

First-generation training accelerator chip

Inferentia Architecture#

AWS Inferentia2

Second-generation inference accelerator chip

AWS Inferentia

First-generation inference accelerator chip

NeuronCore Architecture#

NeuronCores are fully independent, heterogeneous compute units that power the Trainium, Trainium2, Trainium3, Inferentia, and Inferentia2 chips.

NeuronCore v4

Processing unit architecture for Trainium3

NeuronCore v3

Processing unit architecture for Trainium2

NeuronCore v2

Processing unit architecture for Inferentia2 and Trainium

NeuronCore v1

Processing unit architecture for Inferentia

Neuron AWS EC2 Platform Architecture#

Overviews of the AWS Inf and Trn instance architectures and the UltraServer architecture.

Inf1 Architecture

Inf1 instance architecture and specifications

Inf2 Architecture

Inf2 instance architecture and specifications

Trn1 Architecture

Trn1 instance architecture and specifications

Trn2 Architecture

Trn2 instance architecture and specifications

Trn3 Architecture

Trn3 instance architecture and specifications