This document is relevant for: Trn1, Trn2, Trn3

nki.isa.register_alloc

nki.isa.register_alloc(x=None)

Allocate a virtual register and optionally initialize it with a value.

Each engine sequencer (Tensor/Scalar/Vector/GpSimd/Sync Engine) within a NeuronCore maintains its own set of physical registers for scalar operations (64x 32-bit registers per engine sequencer in NeuronCore v2-v4). This API conceptually allocates a register within a virtual register space. Users do not need to explicitly free a register through nisa APIs. The NKI compiler handles physical register allocation (and deallocation) across the appropriate engine sequencers based on the dynamic program flow.
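The split between virtual and physical registers can be pictured with a small toy model in plain Python. This is only a sketch for intuition: `ToyRegisterSpace`, `bind`, and `release` are hypothetical names, not NKI APIs, and the slot-reuse policy is an assumption rather than a description of how the NKI compiler actually allocates registers. The only facts taken from the text above are the 64-register limit per engine sequencer and the absence of a user-facing free operation.

```python
# Toy model only: none of these names are NKI APIs.
from typing import Dict, List, Optional

# Per engine sequencer in NeuronCore v2-v4, as stated in the text above.
PHYSICAL_REGS_PER_ENGINE = 64


class ToyRegisterSpace:
    """Hypothetical stand-in for the virtual register space of one engine sequencer."""

    def __init__(self) -> None:
        self._next_virtual_id = 0
        # virtual id -> value (None means uninitialized)
        self.values: Dict[int, Optional[int]] = {}
        # virtual id -> physical slot, only while the register is live
        self.physical: Dict[int, int] = {}
        self._free_slots: List[int] = list(range(PHYSICAL_REGS_PER_ENGINE))

    def alloc(self, x: Optional[int] = None) -> int:
        """Hand out a fresh virtual register id; user code never frees it."""
        vid = self._next_virtual_id
        self._next_virtual_id += 1
        self.values[vid] = x
        return vid

    def bind(self, vid: int) -> int:
        """Assumed compiler-side step: give a live virtual register a physical slot."""
        if vid not in self.physical:
            if not self._free_slots:
                # Toy behavior; the real compiler's strategy is not described here.
                raise RuntimeError("more than 64 registers live at once")
            self.physical[vid] = self._free_slots.pop(0)
        return self.physical[vid]

    def release(self, vid: int) -> None:
        """Assumed compiler-side step: reclaim the slot once the register is dead."""
        slot = self.physical.pop(vid, None)
        if slot is not None:
            self._free_slots.insert(0, slot)


space = ToyRegisterSpace()
loop_bound = space.alloc(8)   # initialized, analogous to register_alloc(8)
scratch = space.alloc()       # uninitialized, analogous to register_alloc(None)

slot_a = space.bind(loop_bound)
space.release(loop_bound)     # the physical slot becomes reusable
slot_b = space.bind(scratch)  # a different virtual register reuses that slot
```

In this model the two virtual registers have distinct ids but end up sharing one physical slot because their lifetimes do not overlap, which is the kind of bookkeeping the user is spared by not having to free registers explicitly.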

NKI provides the following APIs to manipulate allocated registers:

  • nisa.register_move: Move a constant integer or another register’s value into a register

  • nisa.register_load: Load a scalar (32-bit) value from HBM/SBUF into a register

  • nisa.register_store: Store register contents to HBM/SBUF

In the current NKI release, these registers are primarily used to specify dynamic loop boundaries and while-loop conditions. The NKI compiler compiles such dynamic looping constructs into branching instructions executed by the engine sequencers. For additional details, see nl.dynamic_range. For more information on the engine sequencer and its capabilities, see the Trainium/Inferentia2 architecture guide.

Parameters:

x

optional initialization value. Can be one of:

  • None (default): allocate an uninitialized register

  • int: allocate a register initialized with this immediate integer value

Example:

Two ways to allocate a register initialized to zero:

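# Conventional module aliases assumed by the snippets below
import nki.isa as nisa
import nki.language as nl

# Approach 1: Using an immediate value
reg1 = nisa.register_alloc(0)

# Approach 2: Two-step with register_load
# Stage a 32-bit zero in SBUF, then load it into an uninitialized register
zero_tensor = nl.zeros([1, 1], dtype=nl.int32, buffer=nl.sbuf)
reg2 = nisa.register_alloc(None)
nisa.register_load(reg2, zero_tensor)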
