This document is relevant for: Inf1

Inferentia Architecture

At the heart of the Inf1 instance are up to 16 Inferentia devices (each Inferentia device includes 4 NeuronCore-v1 cores), as depicted below:

../../../_images/inferentia-neurondevice.png

Each Inferentia device consists of:

  • Compute:
    • 4x NeuronCore-v1 cores, delivering 128 INT8 TOPS and 64 FP16/BF16 TFLOPS

  • Device Memory:
    • 8 GB of device DRAM (for storing parameters and intermediate state), with 50 GB/sec of bandwidth

  • NeuronLink:
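
The per-device figures above can be turned into rough per-core and per-instance numbers with simple arithmetic. The sketch below is illustrative only: the class and function names are hypothetical (they are not part of the Neuron SDK), it assumes peak throughput divides evenly across the 4 NeuronCores, and the totals are theoretical peaks for a 16-device instance rather than measured performance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InferentiaDevice:
    """Per-device figures as listed in this document (hypothetical helper class)."""
    neuron_cores: int = 4            # NeuronCore-v1 cores per device
    int8_tops: float = 128.0         # peak INT8 TOPS per device
    fp16_tflops: float = 64.0        # peak FP16/BF16 TFLOPS per device
    dram_gb: float = 8.0             # device DRAM in GB
    dram_bandwidth_gb_s: float = 50.0  # device DRAM bandwidth in GB/sec


def per_core(device: InferentiaDevice) -> dict:
    """Split peak throughput evenly across cores (an assumption, not a spec)."""
    return {
        "int8_tops": device.int8_tops / device.neuron_cores,
        "fp16_tflops": device.fp16_tflops / device.neuron_cores,
    }


def instance_totals(device: InferentiaDevice, num_devices: int = 16) -> dict:
    """Sum theoretical peaks over all devices in one instance."""
    if num_devices < 1:
        raise ValueError("num_devices must be at least 1")
    return {
        "neuron_cores": device.neuron_cores * num_devices,
        "int8_tops": device.int8_tops * num_devices,
        "fp16_tflops": device.fp16_tflops * num_devices,
        "dram_gb": device.dram_gb * num_devices,
    }


if __name__ == "__main__":
    dev = InferentiaDevice()
    print("Per core:", per_core(dev))
    print("Per 16-device instance:", instance_totals(dev, num_devices=16))
```

Passing a smaller `num_devices` gives the same estimate for an instance with fewer Inferentia devices. DRAM bandwidth is deliberately not summed, since each device's bandwidth applies to its own memory.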
