This document is relevant for: Inf2, Trn1, Trn1n

nki.isa.iota

nki.isa.iota(expr, dtype, mask=None, **kwargs)

Build a constant literal in SBUF using the GpSimd Engine, rather than transferring the constant values from the host to the device.

The iota instruction takes an affine expression of nki.language.arange() indices as the input pattern and generates the corresponding constant index values (see the examples below for more explanation). The index values are computed in 32-bit integer math. The GpSimd Engine can cast the integer results to any desired data type (specified by dtype) before writing them back to SBUF, at no additional performance cost.
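To make the "integer math first, cast second" behavior concrete, here is a minimal host-side sketch in NumPy. It is only an illustration of the arithmetic described above, not the device implementation: NumPy's arange stands in for nki.language.arange(), and the affine expression 3 * idx + 1 is an arbitrary example chosen here.

```python
import numpy as np

# NumPy stand-in for an nl.arange() index along the free dimension
# (assumption: host-side model only, not NKI code).
idx = np.arange(0, 8, dtype=np.int32)[None, :]   # shape (1, 8)

# Affine expression evaluated in 32-bit integer math.
int_vals = 3 * idx + 1

# Cast to the requested output dtype as a final step.
out = int_vals.astype(np.float32)
```

Here int_vals stays int32 throughout, and out holds the same values as float32, mirroring what the dtype argument requests.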

Estimated instruction cost:

150 + N GpSimd Engine cycles, where N is the number of elements per partition in the output tile.
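As a quick worked example of this estimate, the sketch below evaluates 150 + N for a given output tile shape. The helper name estimated_iota_cycles is hypothetical and not part of NKI; it assumes the first dimension of the shape is the partition dimension and the remaining dimensions are the elements per partition.

```python
from math import prod

def estimated_iota_cycles(tile_shape):
    """Hypothetical helper: apply the 150 + N estimate to a tile shape.

    Assumes tile_shape[0] is the partition dimension, so N is the
    product of the remaining (free) dimensions.
    """
    elements_per_partition = prod(tile_shape[1:])
    return 150 + elements_per_partition

cycles_row = estimated_iota_cycles((1, 512))      # one partition, 512 elements
cycles_col = estimated_iota_cycles((128, 1))      # 128 partitions, 1 element each
cycles_full = estimated_iota_cycles((128, 512))   # 128 partitions, 512 elements each
```

Under this estimate, a [128, 512] tile costs the same as a [1, 512] tile, because only the per-partition element count enters the formula.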

Parameters:
  • expr – an affine expression of nki.language.arange() indices that defines the values to generate

  • dtype – output data type of the generated constant literal (see Supported Data Types for more information)

  • mask – (optional) a compile-time constant predicate that controls whether/how this instruction is executed (see NKI API Masking for details)

Returns:

an output tile in SBUF

Example:

import neuronxcc.nki.isa as nisa
import neuronxcc.nki.language as nl
import numpy as np
...
  ##################################################################
  # Example 1: Generate tile a of 512 constant values in SBUF partition 0
  # that start at 0 and increment by 1:
  ##################################################################
  # a = [0, 1, ..., 511]
  expr_a = nl.arange(0, 512)[None, :]
  a = nisa.iota(expr_a, dtype=np.int32)

  ##################################################################
  # Example 2: Generate tile b of 128 constant values across SBUF partitions
  # that start at 0 and increment by 1, with one value per partition:
  # b = [[0],
  #      [1],
  #      ...,
  #      [127]]
  ##################################################################
  expr_b = nl.arange(0, 128)[:, None]
  b = nisa.iota(expr_b, dtype=np.int32)
  
  ##################################################################
  # Example 3: Generate tile c of 512 constant values in SBUF partition 0
  # that start at 0 and decrement by 1:
  # c = [0, -1, ..., -511]
  ##################################################################
  expr_c = expr_a * -1
  c = nisa.iota(expr_c, dtype=np.int32)

  ##################################################################
  # Example 4: Generate tile d of 128 constant values across SBUF
  # partitions that start at 5 and increment by 2
  ##################################################################
  # d = [[5],
  #      [7],
  #      ...,
  #      [259]]
  expr_d = 5 + expr_b * 2
  d = nisa.iota(expr_d, dtype=np.int32)

  ##################################################################
  # Example 5: Generate tile e of shape [128, 512] by
  # broadcast-adding expr_a and expr_b
  # e = [[0, 1, ..., 511],
  #      [1, 2, ..., 512],
  #      ...
  #      [127, 128, ..., 638]]
  ##################################################################
  e = nisa.iota(expr_a + expr_b, dtype=np.int32)
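
The values listed in the example comments can be checked on the host with a small NumPy sketch. This is an assumption-laden reference model, not NKI code: np.arange stands in for nl.arange(), and NumPy broadcasting is assumed to mirror the tile shapes produced on device. The *_ref names are introduced here for illustration only.

```python
import numpy as np

# NumPy stand-ins for the nl.arange() index expressions used above.
expr_a = np.arange(0, 512, dtype=np.int32)[None, :]   # shape (1, 512)
expr_b = np.arange(0, 128, dtype=np.int32)[:, None]   # shape (128, 1)

a_ref = expr_a               # Example 1: 0, 1, ..., 511 in one row
b_ref = expr_b               # Example 2: 0, 1, ..., 127, one value per row
c_ref = expr_a * -1          # Example 3: 0, -1, ..., -511
d_ref = 5 + expr_b * 2       # Example 4: 5, 7, ..., 259
e_ref = expr_a + expr_b      # Example 5: broadcast to shape (128, 512)
```

In this model, row i of e_ref runs from i to i + 511, so the last row starts at 127 and ends at 638.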
