This document is relevant for: Inf1, Inf2, Trn1, Trn2
Logical NeuronCore configuration#
Logical NeuronCore configuration (LNC) is a set of compiler and runtime settings for instances powered by AWS Trainium2 that determines the number of NeuronCores exposed to your machine learning (ML) applications. LNC works by combining the compute and memory resources of multiple physical NeuronCores into a single logical NeuronCore. You can configure these settings to reduce the number of worker processes needed for training and deployment of large-scale models.
Logical NeuronCores#
A logical NeuronCore is a grouping of physical NeuronCores that the Neuron Compiler, Neuron Runtime, Neuron Tools, and Frameworks handle as a single unified NeuronCore. Every Trainium2 device contains eight physical NeuronCore-v3.
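The grouping arithmetic can be sketched in a few lines of Python. This is an illustrative sketch, not part of the Neuron SDK: the function names are hypothetical, and the count of 16 Trainium2 devices per Trn2.48xlarge instance is an assumption inferred from the NeuronCore totals quoted later in this document (64 at LNC=2, 128 at LNC=1).

```python
# Stated in this document: every Trainium2 device has eight physical NeuronCore-v3.
PHYSICAL_CORES_PER_DEVICE = 8

# Assumption: inferred from the per-instance totals quoted in this document.
DEVICES_PER_TRN2_48XLARGE = 16

# The two configurations this document describes.
SUPPORTED_LNC = (1, 2)


def logical_cores_per_device(lnc: int) -> int:
    """Return how many logical NeuronCores one Trainium2 device exposes."""
    if lnc not in SUPPORTED_LNC:
        raise ValueError(f"unsupported LNC value: {lnc}")
    # Each logical NeuronCore groups `lnc` physical NeuronCores.
    return PHYSICAL_CORES_PER_DEVICE // lnc


def logical_cores_per_instance(lnc: int, devices: int = DEVICES_PER_TRN2_48XLARGE) -> int:
    """Return how many logical NeuronCores an instance with `devices` devices exposes."""
    return devices * logical_cores_per_device(lnc)


if __name__ == "__main__":
    for lnc in (2, 1):
        print(f"LNC={lnc}: {logical_cores_per_device(lnc)} per device, "
              f"{logical_cores_per_instance(lnc)} per instance")
```

Because each worker process typically drives one logical NeuronCore, halving the exposed core count at LNC=2 is what reduces the number of worker processes.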
Compiler and runtime settings#
LNC configuration is controlled with the following runtime and compiler settings:
The NEURON_LOGICAL_NC_CONFIG runtime environment variable controls how many physical NeuronCores are grouped to make up a logical NeuronCore.
The --logical-nc-config or -lnc command-line options control the degree of model sharding the compiler performs on an input graph.
You must compile your models to use the LNC configuration set by the Neuron Runtime environment variable. AWS Neuron currently doesn’t support setting the compiler flag to a different LNC configuration than the Neuron Runtime environment variable.
Logical NeuronCore configurations#
AWS Neuron supports the following Logical NeuronCore configurations:
A Logical NeuronCore configuration (LNC) of two is the default setting on Trainium2 devices. It combines two physical NeuronCore-v3 into a logical NeuronCore with the software id NC_V3d. When you set the Logical NeuronCore configuration to two, it directs Trainium2 devices to expose four NC_v3d to your machine learning applications. On this setting, a Trn2.48xlarge instance presents 64 available NeuronCores. The following high-level diagram shows a Trn2.48xlarge instance, connected in a 2D torus topology, with the Logical NeuronCore configuration set to two.
Trainium2 devices contain four 24GB HBM banks. Each bank is shared by two physical NeuronCore-v3. When LNC=2, the two physical NeuronCores share a single address space. Workers on each of the two physical NeuronCores can access tensors and perform local collective operations without accessing the network. The following diagram shows how a logical NeuronCore is presented to the software under this configuration.
To set the Logical NeuronCore configuration to two, use the following runtime and compiler flag combination:
NEURON_LOGICAL_NC_CONFIG = 2
-lnc = 2
When you set the Logical NeuronCore configuration to one, it assigns each physical NeuronCore-v3 to a single logical NeuronCore with the software id NC_V3. This directs Trainium2 devices to expose eight NC_v3 to your machine learning applications. On this setting, a Trn2.48xlarge instance presents 128 available NeuronCores. The following high-level diagram shows a Trn2.48xlarge instance, connected in a 2D torus topology, with the Logical NeuronCore configuration set to one.
Trainium2 devices contain four 24GB HBM banks. Each bank is shared by two physical NeuronCore-v3. When the Logical NeuronCore configuration is set to one, both physical NeuronCores have access to the entire 24GB HBM bank. The following diagram shows how logical NeuronCores are presented to the software under this configuration.
To set the Logical NeuronCore configuration to one, use the following runtime and compiler flag combination:
NEURON_LOGICAL_NC_CONFIG = 1
-lnc = 1
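Since the compiler flag and the runtime environment variable must agree, it can help to derive both from a single value. The following Python sketch shows one way to do that. The helper names are hypothetical and not part of the Neuron SDK, and passing the flag and its value as two separate arguments is an assumption about how the compiler parses the option. The treatment of an unset variable as LNC=2 follows this document's statement that two is the default on Trainium2.

```python
import os

# The two configurations this document describes.
SUPPORTED_LNC = (1, 2)

# Stated in this document: LNC=2 is the default setting on Trainium2 devices.
DEFAULT_LNC = 2


def lnc_settings(lnc):
    """Return (runtime environment entries, compiler arguments) for one LNC value."""
    if lnc not in SUPPORTED_LNC:
        raise ValueError(f"unsupported LNC value: {lnc}")
    env = {"NEURON_LOGICAL_NC_CONFIG": str(lnc)}
    # Assumption: the flag and its value are passed as two separate arguments.
    compiler_args = ["--logical-nc-config", str(lnc)]
    return env, compiler_args


def runtime_matches(compiled_lnc, environ=None):
    """Check that the runtime variable agrees with the LNC a model was compiled for."""
    if environ is None:
        environ = os.environ
    runtime_lnc = environ.get("NEURON_LOGICAL_NC_CONFIG", str(DEFAULT_LNC))
    return runtime_lnc.strip() == str(compiled_lnc)


if __name__ == "__main__":
    env, args = lnc_settings(1)
    print("environment:", env)
    print("compiler arguments:", args)
    print("runtime matches:", runtime_matches(1, env))
```

A launcher script could merge `env` into the process environment and append `compiler_args` to its compile command, then call `runtime_matches` before loading a compiled model to catch a mismatch early.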