This document is relevant for: Inf2, Trn1, Trn2
nki.language.matmul
- nki.language.matmul(x, y, *, transpose_x=False, mask=None, **kwargs)
x @ y matrix multiplication of x and y. (Similar to numpy.matmul)
Note
For optimal performance on hardware, use nki.isa.nc_matmul() or call nki.language.matmul with transpose_x=True. Also use nki.isa.nc_matmul to access low-level features of the Tensor Engine.
Note
Implementation details:
nki.language.matmul calls nki.isa.nc_matmul under the hood. nc_matmul is a Neuron-specific, customized implementation of matmul that computes x.T @ y. As a result, matmul(x, y) lowers to nc_matmul(transpose(x), y). To avoid inserting this extra transpose instruction, pass x.T together with transpose_x=True as inputs to this matmul.
- Parameters:
x – a tile on SBUF (partition dimension <= 128, free dimension <= 128). x's free dimension must match y's partition dimension.
y – a tile on SBUF (partition dimension <= 128, free dimension <= 512).
transpose_x – Defaults to False. If True, x is treated as already transposed. If False, an additional transpose is inserted to make x's partition dimension the contraction dimension of the matmul, to align with the Tensor Engine.
mask – (optional) a compile-time constant predicate that controls whether/how this instruction is executed (see NKI API Masking for details).
- Returns:
x @ y, or x.T @ y if transpose_x=True
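The relationship between matmul, nc_matmul and transpose_x described in the implementation note can be illustrated with a small pure-Python reference model. This is only a sketch of the documented semantics, not NKI code: the helpers transpose, nc_matmul_ref and matmul_ref are hypothetical names introduced here, they operate on plain lists rather than SBUF tiles, and they ignore the tile size limits and the mask parameter.

```python
# Pure-Python reference model of the semantics described above.
# These helpers are hypothetical and are NOT part of the NKI API.

def transpose(a):
    """Return the transpose of a 2D list-of-lists matrix."""
    return [list(col) for col in zip(*a)]


def nc_matmul_ref(stationary, moving):
    """Model of the documented nc_matmul behaviour: stationary.T @ moving.

    The contraction runs over the first (partition) dimension of both inputs.
    """
    k = len(stationary)
    if len(moving) != k:
        raise ValueError("partition (contraction) dimensions must match")
    m, n = len(stationary[0]), len(moving[0])
    return [
        [sum(stationary[p][i] * moving[p][j] for p in range(k)) for j in range(n)]
        for i in range(m)
    ]


def matmul_ref(x, y, transpose_x=False):
    """Model of matmul(x, y, transpose_x=...).

    transpose_x=False: returns x @ y, at the cost of an extra transpose of x.
    transpose_x=True:  x is taken as already transposed, so returns x.T @ y.
    """
    if not transpose_x:
        x = transpose(x)  # the extra transpose the implementation note warns about
    return nc_matmul_ref(x, y)


a = [[1, 2, 3], [4, 5, 6]]      # shape (2, 3)
b = [[1, 0], [0, 1], [1, 1]]    # shape (3, 2)

# Default path: a @ b, with the transpose inserted inside matmul_ref.
plain = matmul_ref(a, b)

# Pre-transposed path: the caller supplies a.T and sets transpose_x=True,
# so no transpose happens inside matmul_ref.
pre_transposed = matmul_ref(transpose(a), b, transpose_x=True)

print(plain)           # [[4, 5], [10, 11]]
print(pre_transposed)  # [[4, 5], [10, 11]]
```

Both calls produce the same product. In the second call the transpose is the caller's responsibility, which in a real kernel would typically mean keeping the data in the transposed layout from the start rather than transposing it on every call.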