This document is relevant for: Trn2, Trn3

nki.collectives.all_to_all

nki.collectives.all_to_all(srcs: List, dsts: List, replica_group: ReplicaGroup, collective_dim: int, priority: Optional[int] = None) -> None

Perform an all-to-all on the given replica group and input/output tensors.

The srcs and dsts parameters accept lists of tensors to support coalesced collective communication, which allows multiple tensors to be redistributed in a single collective operation for improved efficiency.

Tensors must reside on HBM. SBUF is not currently supported for all-to-all.

Parameters:
  • srcs – List of input tensors to redistribute

  • dsts – List of output tensors to store results

  • replica_group – ReplicaGroup defining rank groups for the collective

  • collective_dim – Dimension along which input tensors are split and output tensors are concatenated. Currently only 0 is supported.

  • priority – DMA quality-of-service priority level, from 0 to 3; lower values mean higher priority (NeuronCore-v4+ only)
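
The split-and-concatenate behavior described for collective_dim can be illustrated with a small host-side model. The sketch below is plain NumPy, not NKI code: simulate_all_to_all is a hypothetical helper, and it assumes the usual all-to-all convention in which rank r receives the r-th chunk from every rank, concatenated in rank order. It does not model replica groups, coalescing, HBM placement, or DMA priority.

```python
import numpy as np


def simulate_all_to_all(per_rank_inputs, collective_dim=0):
    """Hypothetical host-side model of all-to-all semantics.

    per_rank_inputs[r] stands in for the source tensor held by rank r.
    Returns a list whose r-th entry stands in for rank r's destination tensor.
    """
    num_ranks = len(per_rank_inputs)
    # Each rank splits its input into num_ranks equal chunks along collective_dim.
    chunks = [
        np.split(x, num_ranks, axis=collective_dim) for x in per_rank_inputs
    ]
    # Rank r gathers chunk r from every sender and concatenates in rank order.
    return [
        np.concatenate(
            [chunks[sender][r] for sender in range(num_ranks)],
            axis=collective_dim,
        )
        for r in range(num_ranks)
    ]


# Two simulated ranks, each holding a (4, 2) tensor.
inputs = [np.arange(8).reshape(4, 2), np.arange(8).reshape(4, 2) + 100]
outputs = simulate_all_to_all(inputs)
print(outputs[0])  # rows 0-1 of rank 0, then rows 0-1 of rank 1
print(outputs[1])  # rows 2-3 of rank 0, then rows 2-3 of rank 1
```

In this model the size of each tensor along collective_dim must be divisible by the number of ranks, and each output has the same shape as its input.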
