nki.isa.tensor_copy

nki.isa.tensor_copy(dst, src, engine=engine.unknown, name=None)

Copy the src tile into the dst tile within the NeuronCore on-chip SRAMs using the Vector, Scalar, or GpSimd Engine.

The output tile dst has the same partition axis size and the same number of elements per partition as the input tile src.
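As a rough illustration of this shape rule, the plain-Python sketch below compares two tile shapes. The helper name copy_shapes_compatible is hypothetical and not part of NKI. It assumes that the first dimension is the partition axis and that "elements per partition" means the product of the remaining dimensions.

```python
from math import prod


def copy_shapes_compatible(src_shape, dst_shape):
    """Hypothetical helper (not an NKI API).

    Assumes dimension 0 is the partition axis and that the number of
    elements per partition is the product of all remaining dimensions.
    It only encodes the element-count rule stated above; it does not
    model any other layout requirement the hardware may impose.
    """
    same_partitions = src_shape[0] == dst_shape[0]
    same_elems_per_partition = prod(src_shape[1:]) == prod(dst_shape[1:])
    return same_partitions and same_elems_per_partition
```

For example, under these assumptions the shapes (128, 512) and (128, 512) pass the check, while (64, 512) and (128, 256) fail it because the partition axis sizes differ.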

tensor_copy casting behavior depends on the input and output data types.

  1. When src and dst data types are the same: tensor_copy performs a bit-accurate copy.

  2. When src and dst data types differ: tensor_copy casts through an intermediate FP32 value. The src data is first cast to FP32, and the FP32 value is then cast to the dst data type.
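The two cases above can be modeled in plain Python to show why the intermediate FP32 cast matters. This is only a sketch of the documented rule, not of the hardware. The function names to_fp32 and model_tensor_copy, the use of strings for data types, and the truncation used for integer outputs are all assumptions made for illustration.

```python
import struct


def to_fp32(x):
    """Round a Python number to the nearest IEEE-754 single-precision value."""
    return struct.unpack("f", struct.pack("f", float(x)))[0]


def model_tensor_copy(values, src_dtype, dst_dtype):
    """Hypothetical model of the casting rule (not an NKI API).

    Data types are plain strings such as "int32" or "float32".
    """
    if src_dtype == dst_dtype:
        # Case 1: same data type, so the values are copied unchanged.
        return list(values)
    # Case 2: different data types, so every value passes through FP32.
    staged = [to_fp32(v) for v in values]
    if dst_dtype.startswith(("int", "uint")):
        # Assumption of this sketch: integer outputs truncate toward zero.
        return [int(v) for v in staged]
    return staged
```

In this model, the int32 value 16777217 (2^24 + 1) survives an int32-to-int32 copy unchanged, but it has no exact FP32 representation, so a copy into a different data type yields 16777216.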

In addition, because the GpSimd Engine cannot access PSUM in NeuronCore, the Scalar or Vector Engine must be chosen when the input or output tile is in PSUM (see NeuronCore-v2 Compute Engines for details).

On NeuronCore v2, tensor_copy is not supported on the Scalar Engine. Instead, use nisa.activation with op=nl.copy.

Constraints.

  • Supported engines:
    • NeuronCore v2: Vector, GpSimd

    • NeuronCore v3+: Vector, Scalar, GpSimd

  • Since GpSimd cannot access PSUM, src and dst must be in SBUF when using GpSimd Engine.
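The constraints above can be summarized as a small lookup. The sketch below is plain Python, not NKI code. The function name allowed_engines, the integer NeuronCore version, and the string labels for engines and buffers are hypothetical and chosen only for illustration.

```python
def allowed_engines(neuroncore_version, src_buffer, dst_buffer):
    """Hypothetical helper (not an NKI API) encoding the listed constraints.

    neuroncore_version: integer such as 2 or 3.
    src_buffer, dst_buffer: "sbuf" or "psum".
    Returns the set of engine labels that satisfy the constraints.
    """
    # NeuronCore v2 supports Vector and GpSimd; v3+ also supports Scalar.
    engines = {"vector", "gpsimd"}
    if neuroncore_version >= 3:
        engines.add("scalar")
    # GpSimd cannot access PSUM, so both tiles must be in SBUF to use it.
    if "psum" in (src_buffer, dst_buffer):
        engines.discard("gpsimd")
    return engines
```

Under these assumptions, a copy that touches PSUM on NeuronCore v2 leaves only the Vector Engine, which matches the note that the Scalar Engine path on v2 goes through nisa.activation with op=nl.copy instead.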

Parameters:
  • dst – the output tile; after the copy it holds the same content as the src tile and has the same partition axis size.

  • src – the source of copy, must be a tile in SBUF or PSUM.

  • engine – (optional) the engine to use for the operation: nki.isa.vector_engine, nki.isa.scalar_engine, nki.isa.gpsimd_engine or nki.isa.unknown_engine (default, compiler selects best engine based on engine workload).