nki.isa.tensor_tensor
- nki.isa.tensor_tensor(dst, data1, data2, op, engine=engine.unknown, name=None)
Perform an element-wise operation on two input tiles using the Vector Engine or the GpSimd Engine. The two tiles must have the same partition axis size and the same number of elements per partition.
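The shape constraint above can be sketched as a small host-side check. This is an illustrative helper, not part of NKI: the name check_tile_shapes is hypothetical, and it assumes that axis 0 of a tile shape is the partition axis and that all remaining axes are free dimensions. The compiler may impose further constraints that this sketch does not model.

```python
from math import prod


def check_tile_shapes(shape1, shape2):
    """Return True if two tile shapes satisfy the stated constraint.

    Assumption (not from the NKI API): axis 0 is the partition axis and
    the remaining axes are free dimensions.
    """
    if not shape1 or not shape2:
        raise ValueError("a tile shape needs at least a partition axis")
    # Same partition axis size.
    same_partitions = shape1[0] == shape2[0]
    # Same number of elements per partition (product of the free dims).
    same_elements = prod(shape1[1:]) == prod(shape2[1:])
    return same_partitions and same_elements


if __name__ == "__main__":
    print(check_tile_shapes((128, 512), (128, 512)))
    print(check_tile_shapes((128, 512), (64, 512)))
```

Under this reading, a (128, 512) tile and a (128, 2, 256) tile would also pass, because both hold 512 elements per partition.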
The element-wise operator is specified using the op field and can be any binary operator supported by NKI (see Supported Math Operators for NKI ISA for details) that runs on the Vector Engine, or it can be np.power/nl.power, which runs on the GpSimd Engine. For bitvec operators, the input/output data types must be integer types, and the Vector Engine treats all input elements as bit patterns without any data type casting. For arithmetic operators, there is no restriction on the input/output data types, but the engine automatically casts the input data to float32 and performs the element-wise operation in float32 math. The float32 results are cast to the target data type specified in the dtype field before being written into the output tile. If the dtype field is not specified, it defaults to the data type of data1 or data2, whichever has the higher precision.
Since the GpSimd Engine cannot access PSUM, neither the input tiles nor the output tile can be in PSUM if op is np.power/nl.power (see NeuronCore-v2 Compute Engines for details). Otherwise, the output tile can be in either SBUF or PSUM. However, the two input tiles, data1 and data2, cannot both reside in PSUM. The three legal cases are:
- Both data1 and data2 are in SBUF.
- data1 is in SBUF, while data2 is in PSUM.
- data1 is in PSUM, while data2 is in SBUF.
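The memory placement rules above can be summarized as a small validity check. The sketch below is a host-side illustration only: check_placement, the string labels "sbuf" and "psum", and the "power" operator name are hypothetical stand-ins, not NKI identifiers.

```python
# Hypothetical stand-in for np.power / nl.power, which run on the GpSimd Engine.
POWER_OPS = {"power"}
VALID_BUFFERS = ("sbuf", "psum")


def check_placement(op, data1_buf, data2_buf, dst_buf):
    """Return (ok, reason) for a proposed tile placement.

    Buffers are given as the strings "sbuf" or "psum" (illustrative labels).
    """
    buffers = (data1_buf, data2_buf, dst_buf)
    for buf in buffers:
        if buf not in VALID_BUFFERS:
            raise ValueError(f"unknown buffer: {buf!r}")
    # The GpSimd Engine cannot access PSUM, so power forbids PSUM entirely.
    if op in POWER_OPS and "psum" in buffers:
        return False, "power runs on the GpSimd Engine, which cannot access PSUM"
    # The two inputs may not both reside in PSUM.
    if data1_buf == "psum" and data2_buf == "psum":
        return False, "data1 and data2 cannot both reside in PSUM"
    return True, "ok"


if __name__ == "__main__":
    print(check_placement("add", "sbuf", "psum", "psum"))
    print(check_placement("add", "psum", "psum", "sbuf"))
    print(check_placement("power", "sbuf", "sbuf", "psum"))
```

The output buffer is only restricted in the power case; for other operators it can be either SBUF or PSUM regardless of where the inputs live.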
Note: if you need broadcasting capability in the free dimension for either input tile, consider using the nki.isa.tensor_scalar API instead, which generally has better performance than nki.isa.tensor_tensor.
- Parameters:
dst – an output tile of the element-wise operation
data1 – lhs input operand of the element-wise operation
data2 – rhs input operand of the element-wise operation
op – a binary math operator (see Supported Math Operators for NKI ISA for supported operators)
engine – (optional) the engine to use for the operation: nki.isa.vector_engine, nki.isa.gpsimd_engine or nki.isa.unknown_engine (default; lets the compiler select the best engine based on the input tile shape).
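The arithmetic data path described in the overview (cast inputs to float32, compute in float32, cast the result to the output type) can be approximated on the host with the standard library. This is a rough model for reasoning about precision, not the hardware behavior: to_float32, emulate_arithmetic and the cast_out parameter are hypothetical names, and the rounding mode the engine uses for the final cast is not stated in the source, so cast_out is left to the caller.

```python
import operator
import struct


def to_float32(x):
    """Round a Python number to the nearest float32-representable value.

    Raises OverflowError if x is too large for float32.
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]


def emulate_arithmetic(op, data1, data2, cast_out=float):
    """Apply a binary op element-wise with float32 inputs and results.

    data1 and data2 are flat lists of equal length (a stand-in for tiles).
    cast_out is a hypothetical stand-in for the output dtype cast.
    """
    if len(data1) != len(data2):
        raise ValueError("inputs must have the same number of elements")
    result = []
    for a, b in zip(data1, data2):
        # Cast both inputs to float32, then round the result to float32.
        value = to_float32(op(to_float32(a), to_float32(b)))
        result.append(cast_out(value))
    return result


if __name__ == "__main__":
    print(emulate_arithmetic(operator.add, [1, 2], [3, 4], cast_out=int))
    print(emulate_arithmetic(operator.add, [16777216.0], [0.5]))
```

The second call illustrates why the float32 intermediate matters: near 2^24, float32 cannot represent a 0.5 increment, so the addend is lost even when the inputs are wider types.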