This document is relevant for: Inf2, Trn1, Trn1n

nki.language.atomic_rmw

nki.language.atomic_rmw(dst, value, op, mask=None, **kwargs)

Perform an atomic read-modify-write operation on HBM data: dst = op(dst, value).

Parameters:
  • dst – HBM tensor with subscripts. Currently, only indirect dynamic indexing is supported.

  • value – tile or scalar value that is the operand to op.

  • op – atomic operation to perform. Currently, only np.add is supported.

  • mask – (optional) a compile-time constant predicate that controls whether and how this instruction is executed (see NKI API Masking for details).

Returns:

import numpy as np
import neuronxcc.nki.language as nl
...

  i_p = nl.arange(128)[:, None]
  i_f = nl.arange(512)[None, :]

  ########################################################################
  # Atomic read-modify-write example:
  #   - read: values of rmw_tensor are indexed by values from indices_tile
  #   - modify: incremented by value
  #   - write: saved back into rmw_tensor
  # resulting in rmw_tensor = rmw_tensor + value
  ########################################################################
  nl.atomic_rmw(rmw_tensor[indices_tile[i_p, 0], i_f], value=value, op=np.add)
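
To make the dst = op(dst, value) semantics concrete, the following is a host-side NumPy sketch of an indexed add, not NKI code and not the library's implementation. The helper name atomic_add_reference and the small shapes are hypothetical, chosen only for illustration. It assumes that rows selected more than once accumulate every contribution, which is the usual reason for making the update atomic; the description above does not state this explicitly.

```python
import numpy as np


def atomic_add_reference(dst, row_indices, value):
    """Hypothetical NumPy model of dst[row_indices, :] = dst[row_indices, :] + value.

    np.add.at performs an unbuffered in-place add, so a row index that
    appears several times receives every contribution (assumed behaviour).
    """
    out = dst.copy()  # leave the caller's array untouched
    np.add.at(out, row_indices, value)
    return out


# Small stand-ins for rmw_tensor, indices_tile[:, 0] and value.
dst = np.zeros((4, 3), dtype=np.float32)
row_indices = np.array([0, 2, 2])           # row 2 is selected twice
value = np.ones((3, 3), dtype=np.float32)   # one row of updates per index

result = atomic_add_reference(dst, row_indices, value)
```

In this sketch, row 0 receives one update and row 2 receives two, while rows 1 and 3 are unchanged. A plain fancy-indexed assignment such as out[row_indices] += value would apply only one update to row 2, which is the lost-update case an atomic operation is meant to avoid.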
