This document is relevant for: Trn2, Trn3

nki.language.tile_size#

class nki.language.tile_size#

Hardware tile size constants (pmax, psum_fmax, gemm_stationary_fmax, etc.).

Attributes

pmax#

Maximum partition dimension of a tile.

psum_fmax#

Maximum free dimension of a tile on PSUM buffer, in FP32 elements.

psum_fmax_bytes#

Maximum free dimension of a tile on PSUM buffer, in bytes.

psum_num_banks#

Number of usable PSUM banks per partition.

Returns 7 when dma_transpose is lowered to a PE transpose (NKI_DMA_TRANSPOSE_AS_PE_TRANSPOSE=true on Trn2 and later), because bank 7 is then reserved for nc_transpose. Otherwise returns 8.
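A minimal sketch of how the PSUM limits might be used to check whether a set of FP32 accumulators fits in the usable banks. It assumes that one bank holds psum_fmax FP32 elements per partition, which this page does not state explicitly. The numeric values are placeholders rather than values read from hardware, and `banks_needed` and `fits_in_psum` are hypothetical helpers.

```python
import math

# Placeholder values for illustration only; on a real system read them from
# nki.language.tile_size (psum_fmax, psum_num_banks).
PSUM_FMAX = 512        # assumed FP32 elements per bank, per partition
PSUM_NUM_BANKS = 8     # would be 7 when one bank is reserved for nc_transpose


def banks_needed(free_elems: int, psum_fmax: int = PSUM_FMAX) -> int:
    """Number of PSUM banks needed to hold `free_elems` FP32 accumulators."""
    return math.ceil(free_elems / psum_fmax)


def fits_in_psum(free_elems: int, num_banks: int = PSUM_NUM_BANKS) -> bool:
    """True if the accumulators fit in the usable banks."""
    return banks_needed(free_elems) <= num_banks


# The same request fits with 8 usable banks but not with 7.
full = fits_in_psum(8 * 512)
reduced = fits_in_psum(8 * 512, num_banks=7)
```

Querying psum_num_banks rather than hard-coding 8 keeps a kernel correct when a bank is reserved for nc_transpose.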

sbuf_size_bytes#

Total SBUF capacity in bytes (all partitions combined).

sbuf_fmax#

Maximum free dimension of a tile on SBUF buffer, in FP32 elements.

sbuf_fmax_bytes#

Maximum free dimension of a tile on SBUF buffer, in bytes.
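The byte-based limit is the one to compare against when a tile's dtype is not FP32. The sketch below checks whether a tile's free dimension fits within the per-partition budget. The value used for the limit is a placeholder, and `tile_fits` is a hypothetical helper rather than part of NKI.

```python
# Placeholder value for illustration only; on a real system read
# nki.language.tile_size.sbuf_fmax_bytes instead.
SBUF_FMAX_BYTES = 192 * 1024


def tile_fits(free_elems: int, itemsize: int,
              fmax_bytes: int = SBUF_FMAX_BYTES) -> bool:
    """True if `free_elems` elements of `itemsize` bytes fit in one partition."""
    if free_elems < 0 or itemsize <= 0:
        raise ValueError("free_elems must be >= 0 and itemsize must be > 0")
    return free_elems * itemsize <= fmax_bytes


fp32_ok = tile_fits(49152, 4)      # exactly at the placeholder limit
fp32_too_big = tile_fits(49153, 4)
bf16_ok = tile_fits(49153, 2)      # same element count, half the bytes
```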

gemm_stationary_fmax#

Maximum free dimension of the stationary operand of a general matrix multiplication (GEMM) on the Tensor Engine.

gemm_moving_fmax#

Maximum free dimension of the moving operand of a general matrix multiplication (GEMM) on the Tensor Engine.
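As an illustration of how pmax and the two GEMM limits combine, the sketch below counts the tiles needed to cover a larger matrix multiplication. It assumes the contraction dimension K is mapped to the partition dimension, with a [K, M] stationary operand and a [K, N] moving operand. The numeric limits are placeholders, and `matmul_tile_counts` is a hypothetical helper.

```python
import math

# Placeholder limits for illustration only; on a real system read them from
# nki.language.tile_size (pmax, gemm_stationary_fmax, gemm_moving_fmax).
PMAX = 128
GEMM_STATIONARY_FMAX = 128
GEMM_MOVING_FMAX = 512


def matmul_tile_counts(k: int, m: int, n: int,
                       pmax: int = PMAX,
                       stationary_fmax: int = GEMM_STATIONARY_FMAX,
                       moving_fmax: int = GEMM_MOVING_FMAX) -> tuple:
    """Return (K tiles, M tiles, N tiles) for a [K, M] x [K, N] matmul.

    K is assumed to lie on the partition dimension of both operands.
    """
    return (math.ceil(k / pmax),
            math.ceil(m / stationary_fmax),
            math.ceil(n / moving_fmax))


counts = matmul_tile_counts(k=1024, m=256, n=2048)
```

Ceiling division is used so that a partial trailing tile is still counted; the kernel would then need to mask or shorten that last tile.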

bn_stats_fmax#

Maximum free dimension of the input to a BN_STATS instruction.
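When an input's free dimension exceeds this limit, one common approach is to process it in chunks. The sketch below only computes the chunk boundaries and does not call any NKI instruction. The limit of 512 is a placeholder, and `free_dim_chunks` is a hypothetical helper.

```python
# Placeholder value for illustration only; on a real system read
# nki.language.tile_size.bn_stats_fmax instead.
BN_STATS_FMAX = 512


def free_dim_chunks(free_size: int, fmax: int = BN_STATS_FMAX) -> list:
    """Split [0, free_size) into half-open (start, stop) ranges of at most fmax."""
    if fmax <= 0:
        raise ValueError("fmax must be positive")
    return [(start, min(start + fmax, free_size))
            for start in range(0, free_size, fmax)]


chunks = free_dim_chunks(1200)
```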

psum_min_align#

Minimum byte alignment requirement for PSUM free dimension address.

sbuf_min_align#

Minimum byte alignment requirement for SBUF free dimension address.
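A short sketch of rounding a byte offset up to an alignment boundary, as might be done when laying out buffers manually. The alignment value of 64 is an arbitrary placeholder and not the actual psum_min_align or sbuf_min_align. `align_up` is a hypothetical helper.

```python
# Arbitrary placeholder for illustration only; on a real system read
# nki.language.tile_size.sbuf_min_align or psum_min_align instead.
MIN_ALIGN = 64


def align_up(offset: int, alignment: int = MIN_ALIGN) -> int:
    """Round `offset` up to the next multiple of `alignment`."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    return (offset + alignment - 1) // alignment * alignment


aligned = align_up(100)          # rounded up to the next boundary
already_aligned = align_up(128)  # unchanged, already on a boundary
```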

total_available_sbuf_size#

Usable SBUF free dimension per partition, in bytes.

Deprecated since version 0.4.0b4: Despite the name, this returns the usable SBUF capacity per partition, not the total SBUF capacity. Use sbuf_size_bytes for the total SBUF capacity across all partitions, or sbuf_fmax_bytes for the usable per-partition size.
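For code that must run on versions both before and after the deprecation, one option is a small accessor that prefers the new attribute. The sketch below uses `SimpleNamespace` stand-ins so that it runs without NKI installed. `per_partition_sbuf_bytes` is a hypothetical helper and the byte values are placeholders.

```python
from types import SimpleNamespace


def per_partition_sbuf_bytes(tile_size) -> int:
    """Prefer sbuf_fmax_bytes; fall back to the deprecated attribute."""
    value = getattr(tile_size, "sbuf_fmax_bytes", None)
    if value is None:
        # Older versions only expose the deprecated name.
        value = tile_size.total_available_sbuf_size
    return value


# Stand-ins for nki.language.tile_size with placeholder values.
new_style = SimpleNamespace(sbuf_fmax_bytes=196608,
                            total_available_sbuf_size=196608)
old_style = SimpleNamespace(total_available_sbuf_size=196608)

new_value = per_partition_sbuf_bytes(new_style)
old_value = per_partition_sbuf_bytes(old_style)
```

In a real kernel the argument would be the `nki.language.tile_size` object itself rather than a stand-in.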
