This document is relevant for: Inf1, Trn1

Neuron Glossary#

Terms#

Neuron Devices (Accelerated Machine Learning chips)#

Term

Description

Inferentia#

AWS first-generation accelerated machine learning chip, supporting inference only

Trainium#

AWS second-generation accelerated machine learning chip, supporting training and inference

Neuron Device#

Accelerated machine learning chip (e.g. Inferentia or Trainium)

Neuron powered Instances#

Term

Description

Inf1#

Inferentia-powered accelerated compute EC2 instance

Trn1#

Trainium-powered accelerated compute EC2 instance

NeuronCore terms#

Term

Description

NeuronCore#

The machine learning compute cores within Inferentia/Trainium

NeuronCore-v1#

NeuronCore within Inferentia

NeuronCore-v2#

NeuronCore within Trainium

Tensor Engine#

2D systolic array (within the NeuronCore), used for matrix computations

Scalar Engine#

A scalar engine within each NeuronCore that accelerates element-wise operations (e.g. GELU, ReLU, reciprocal)

Vector Engine#

A vector engine within each NeuronCore that accelerates spatial operations (e.g. layerNorm, TopK, pooling)

GPSIMD Engine#

Embedded general-purpose SIMD cores within each NeuronCore that accelerate custom operators

Sync Engine#

The SP engine, integrated inside each NeuronCore and used for synchronization and DMA triggering

Collective Communication Engine#

Dedicated engine for collective communication that allows computation and communication to overlap

Interconnect between NeuronCores

Interconnect between NeuronCores in Inferentia device

Interconnect between NeuronCores in Trainium device
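The engine descriptions above group operations into three kinds: matrix computations, element-wise operations, and operations that look across several elements at once. The following is a minimal sketch of that distinction in plain Python. It runs on the CPU and does not use any Neuron API; the function names are hypothetical and chosen only for illustration, and the mapping in the comments reflects the glossary descriptions rather than a statement about how the compiler schedules work.

```python
import math


def matmul(a, b):
    """Matrix computation (the category the glossary assigns to the Tensor Engine)."""
    # Each output element is a dot product of a row of `a` and a column of `b`.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def relu(xs):
    """Element-wise operation (Scalar Engine category): each output depends on one input."""
    return [max(0.0, x) for x in xs]


def layer_norm(xs, eps=1e-5):
    """Cross-element operation (Vector Engine category): every output
    depends on the mean and variance of the whole input vector."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [(x - mean) / math.sqrt(var + eps) for x in xs]


def top_k(xs, k):
    """Another cross-element operation: selecting the k largest values."""
    return sorted(xs, reverse=True)[:k]
```

The practical difference is the data dependency: `relu` can process each value independently, while `layer_norm` and `top_k` need to see the full vector before producing any output.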

Abbreviations#

Abbreviation

Description

NC#

Neuron Core

NeuronCore#

Neuron Core

ND#

Neuron Device

NeuronDevice#

Neuron Device

TensEng#

Tensor Engine

ScalEng#

Scalar Engine

VecEng#

Vector Engine

SyncEng#

Sync Engine

CCE#

Collective Communication Engine

FP32#

Float32

TF32#

TensorFloat32

FP16#

Float16

BF16#

Bfloat16

cFP8#

Configurable Float8

RNE#

Round to Nearest Even

SR#

Stochastic Rounding

CustomOps#

Custom Operators

RT#

Neuron Runtime

DP#

Data Parallel

DPr#

Data Parallel degree

TP#

Tensor Parallel

TPr#

Tensor Parallel degree

PP#

Pipeline Parallel

PPr#

Pipeline Parallel degree
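The three parallelism degrees (DPr, TPr, PPr) are usually combined multiplicatively: the total number of workers is the product of the three. The sketch below illustrates that relationship, assuming this common convention applies. The function names and the rank layout (tensor-parallel index varying fastest, then data-parallel, then pipeline-parallel) are hypothetical choices for illustration; real frameworks may order the dimensions differently.

```python
def world_size(dp_degree, tp_degree, pp_degree):
    """Total number of workers implied by the three parallelism degrees."""
    return dp_degree * tp_degree * pp_degree


def rank_to_coords(rank, dp_degree, tp_degree, pp_degree):
    """Map a flat worker rank to (dp_rank, tp_rank, pp_rank).

    Assumed layout (hypothetical): TP index varies fastest, then DP, then PP.
    """
    total = world_size(dp_degree, tp_degree, pp_degree)
    if not 0 <= rank < total:
        raise ValueError(f"rank {rank} is outside 0..{total - 1}")
    tp_rank = rank % tp_degree
    dp_rank = (rank // tp_degree) % dp_degree
    pp_rank = rank // (tp_degree * dp_degree)
    return dp_rank, tp_rank, pp_rank
```

For example, under these assumptions a job with DPr = 4, TPr = 8, and PPr = 1 would use `world_size(4, 8, 1)` workers, and each worker belongs to exactly one data-parallel, one tensor-parallel, and one pipeline-parallel group.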
