This document is relevant for: Trn2, Trn3

nki.isa.matmul_perf_mode

class nki.isa.matmul_perf_mode(value)

Performance mode for matmul.

Attributes

none

Default mode; no performance optimization is applied.

double_row

Double FP8 mode. Achieves 2x matmul throughput by packing two FP8 weight/ifmap element pairs.
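The two attributes above can be illustrated with a small, self-contained sketch. It does not import NKI: `MatmulPerfMode` is a hypothetical stand-in enum that mirrors the documented member names, its member values are assumed, and `choose_perf_mode` and the FP8 dtype names are illustrative helpers, not part of the NKI API. The only behavior taken from this page is that `double_row` targets FP8 operands and `none` is the default.

```python
from enum import Enum


class MatmulPerfMode(Enum):
    """Hypothetical stand-in for nki.isa.matmul_perf_mode (member values assumed)."""

    none = "none"              # default mode, no performance optimization
    double_row = "double_row"  # double FP8 mode


# Assumed FP8 dtype names, used only for this illustration.
FP8_DTYPE_NAMES = {"float8_e4m3", "float8_e5m2"}


def choose_perf_mode(dtype_name: str) -> MatmulPerfMode:
    """Pick double_row only for FP8 operands; otherwise fall back to none."""
    if dtype_name in FP8_DTYPE_NAMES:
        return MatmulPerfMode.double_row
    return MatmulPerfMode.none


if __name__ == "__main__":
    for name in ("float8_e4m3", "bfloat16"):
        print(name, choose_perf_mode(name).name)
```

In real kernel code the enum would come from `nki.isa` directly; how the selected mode is passed to a matmul instruction is not described on this page, so the sketch stops at mode selection.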
