This document is relevant for: Inf2, Trn1, Trn1n
nki.isa.iota
- nki.isa.iota(expr, dtype, mask=None, **kwargs)
Build a constant literal in SBUF using the GpSimd Engine, rather than transferring the constant literal values from the host to the device.
The iota instruction takes an affine expression of nki.language.arange() indices as the input pattern to generate constant index values (see the examples below for more explanation). The index values are computed in 32-bit integer math. The GpSimd Engine can cast the integer results into any desired data type (specified by dtype) before writing them back to SBUF, at no additional performance cost.
Estimated instruction cost: 150 + N GpSimd Engine cycles, where N is the number of elements per partition in the output tile.
- Parameters:
expr – an input affine expression of nki.language.arange() indices
dtype – output data type of the generated constant literal (see Supported Data Types for more information)
mask – (optional) a compile-time constant predicate that controls whether/how this instruction is executed (see NKI API Masking for details)
- Returns:
an output tile in SBUF
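The cost estimate above can be turned into a quick back-of-the-envelope calculation. The sketch below is plain Python and not part of NKI: `estimate_iota_cycles` and `IOTA_FIXED_CYCLES` are hypothetical names, and the helper assumes the first dimension of the tile shape is the partition dimension, so that N is the product of the remaining (free) dimensions.

```python
from math import prod

# Fixed overhead from the documented estimate "150 + N" (assumed constant here).
IOTA_FIXED_CYCLES = 150


def estimate_iota_cycles(tile_shape):
    """Hypothetical helper: estimated GpSimd Engine cycles for one iota call.

    tile_shape is (partitions, free_dim_0, ...). N is the number of elements
    per partition, taken as the product of the free dimensions.
    """
    elements_per_partition = prod(tile_shape[1:])
    return IOTA_FIXED_CYCLES + elements_per_partition


print(estimate_iota_cycles((1, 512)))    # 150 + 512
print(estimate_iota_cycles((128, 1)))    # 150 + 1
print(estimate_iota_cycles((128, 512)))  # 150 + 512
```

Because N counts elements per partition, a [128, 512] tile and a [1, 512] tile have the same estimate under this formula.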
Example:
import neuronxcc.nki.isa as nisa
import neuronxcc.nki.language as nl
import numpy as np
...

##################################################################
# Example 1: Generate tile a of 512 constant values in SBUF partition 0
# that start at 0 and increment by 1:
# a = [0, 1, ..., 511]
##################################################################
expr_a = nl.arange(0, 512)[None, :]
a = nisa.iota(expr_a, dtype=np.int32)

##################################################################
# Example 2: Generate tile b of 128 constant values across SBUF partitions
# that start at 0 and increment by 1, with one value per partition:
# b = [[0],
#      [1],
#      ...,
#      [127]]
##################################################################
expr_b = nl.arange(0, 128)[:, None]
b = nisa.iota(expr_b, dtype=np.int32)

##################################################################
# Example 3: Generate tile c of 512 constant values in SBUF partition 0
# that start at 0 and decrement by 1:
# c = [0, -1, ..., -511]
##################################################################
expr_c = expr_a * -1
c = nisa.iota(expr_c, dtype=np.int32)

##################################################################
# Example 4: Generate tile d of 128 constant values across SBUF
# partitions that start at 5 and increment by 2:
# d = [[5],
#      [7],
#      ...,
#      [259]]
##################################################################
expr_d = 5 + expr_b * 2
d = nisa.iota(expr_d, dtype=np.int32)

##################################################################
# Example 5: Generate tile e of shape [128, 512] by
# broadcast-adding expr_a and expr_b:
# e = [[0, 1, ..., 511],
#      [1, 2, ..., 512],
#      ...,
#      [127, 128, ..., 638]]
##################################################################
e = nisa.iota(expr_a + expr_b, dtype=np.int32)
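To check the expected values in the examples without a Neuron device, the affine patterns can be reproduced on the host with NumPy. This is a minimal sketch and not the NKI implementation: `iota_reference`, `idx_free`, and `idx_part` are hypothetical names, `np.arange` stands in for `nl.arange`, and the only behavior modeled is the documented one (evaluate the expression in 32-bit integer math, then cast to `dtype`).

```python
import numpy as np

# Host-side stand-ins for the nl.arange() index patterns (assumption:
# axis 0 plays the role of the partition dimension, axis 1 the free dimension).
idx_free = np.arange(512, dtype=np.int32)[None, :]  # like nl.arange(0, 512)[None, :]
idx_part = np.arange(128, dtype=np.int32)[:, None]  # like nl.arange(0, 128)[:, None]


def iota_reference(expr, dtype):
    """Hypothetical reference: hold the index values as int32, then cast."""
    return np.asarray(expr, dtype=np.int32).astype(dtype)


a = iota_reference(idx_free, np.int32)             # Example 1: shape (1, 512)
b = iota_reference(idx_part, np.int32)             # Example 2: shape (128, 1)
c = iota_reference(idx_free * -1, np.int32)        # Example 3: decrement by 1
d = iota_reference(5 + idx_part * 2, np.int32)     # Example 4: start 5, step 2
e = iota_reference(idx_free + idx_part, np.int32)  # Example 5: broadcast add

# The same pattern cast to a different output dtype.
e_f32 = iota_reference(idx_free + idx_part, np.float32)

print(a.shape, b.shape, e.shape)
print(c[0, -1], d[-1, 0], e[-1, 0], e[-1, -1])
```

The last row of `e` starts at 127 and ends at 638, which matches the corner values listed in Example 5. The sketch only mirrors the arithmetic; it says nothing about SBUF layout or instruction timing.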