This document is relevant for: Inf2, Trn1, Trn1n

NKI FAQ#

When should I use NKI?#

NKI enables customers to self-serve, onboard novel deep learning architectures, and implement operators that are not currently supported by traditional ML framework operators. With NKI, customers can experiment with models and operators and create unique differentiation. Additionally, in cases where the compiler’s optimizations are too generalized for a developer’s particular use case, NKI enables customers to program directly against the Neuron primitives and thereby optimize the performance of existing operators that are not being compiled efficiently.

Which AWS chips does NKI support?#

NKI supports the chip families used in AWS custom-built machine learning accelerators, Trainium and Inferentia. This includes NeuronCore-v2, the second-generation NeuronCore, and the following instances: Inf2, Trn1, Trn1n.

Which hardware engines are supported?#

The following AWS Trainium and Inferentia hardware engines are supported: Tensor Engine, Vector Engine, Scalar Engine, and GpSIMD Engine. For more details, see the Trainium/Inferentia2 Architecture Guide.

What ML Frameworks support NKI kernels?#

NKI is integrated with the PyTorch and JAX frameworks. For more details, see NKI Kernel as a Framework Custom Operator.
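
As a minimal sketch of that integration, loosely following the NKI getting-started material: a kernel decorated with nki.jit can be called like a regular Python function on tensors placed on the XLA device from PyTorch. The kernel body, tensor shapes, and import paths below are illustrative assumptions and may differ between Neuron SDK releases.

```python
import torch
from torch_xla.core import xla_model as xm

from neuronxcc import nki
import neuronxcc.nki.language as nl


@nki.jit
def nki_tensor_add_kernel(a_input, b_input):
    # Allocate the kernel output in device memory (HBM)
    c_output = nl.ndarray(a_input.shape, dtype=a_input.dtype, buffer=nl.shared_hbm)
    a_tile = nl.load(a_input)   # HBM -> on-chip SBUF
    b_tile = nl.load(b_input)   # HBM -> on-chip SBUF
    nl.store(c_output, value=nl.add(a_tile, b_tile))  # compute, then write back to HBM
    return c_output


# Call the kernel like any other function on XLA-device tensors from PyTorch
device = xm.xla_device()
a = torch.rand((128, 512), dtype=torch.bfloat16).to(device)
b = torch.rand((128, 512), dtype=torch.bfloat16).to(device)
c = nki_tensor_add_kernel(a, b)
print(c.cpu())
```

The same jitted kernel can also be invoked from JAX on arrays placed on the Neuron device; see the linked guide for the framework-specific details.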

Where can I find NKI sample kernels?#

NKI hosts an open-source sample repository, nki-samples, which includes a set of reference kernels and tutorial kernels built by the Neuron team and external contributors. For more information, see nki.kernels and NKI tutorials.

What should I do if I have trouble resolving a kernel compilation error?#

Refer to the NKI Error Manual for detailed guidance on how to resolve common NKI compilation errors.

If you encounter compilation errors from the Neuron Compiler that you cannot understand or resolve, check the nki-samples GitHub issues and open a new issue if no similar one exists.

How can I debug numerical issues in NKI kernels?#

We encourage NKI programmers to build kernels incrementally and verify the output of small operators one at a time. NKI also provides a CPU simulation mode that supports printing kernel intermediate tensor values to the console. See nki.simulate for a code example.
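
The sketch below illustrates that workflow under the assumption that the nki.simulate_kernel entry point and nl.device_print behave as described in the nki.simulate reference; exact API names may vary across Neuron SDK releases. It runs the kernel on CPU against NumPy inputs and prints an intermediate tile, with no Neuron device required.

```python
import numpy as np
from neuronxcc import nki
import neuronxcc.nki.language as nl


@nki.jit
def scale_kernel(in_tensor):
    out_tensor = nl.ndarray(in_tensor.shape, dtype=in_tensor.dtype, buffer=nl.shared_hbm)
    tile = nl.load(in_tensor)
    scaled = nl.multiply(tile, 2.0)
    nl.device_print("scaled tile", scaled)  # printed to the console when simulating on CPU
    nl.store(out_tensor, value=scaled)
    return out_tensor


# Simulate on CPU with NumPy inputs to inspect intermediate values
a = np.arange(128 * 4, dtype=np.float32).reshape(128, 4)
out = nki.simulate_kernel(scale_kernel, a)
print(out[:2])
```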

How can I optimize my NKI kernel?#

To learn how to optimize your NKI kernel, see the NKI Performance Guide.

Does NKI support the entire Neuron instruction set?#

Not yet. Neuron will iteratively add support for the Neuron instruction set by adding more nki.isa (Instruction Set Architecture) APIs in upcoming Neuron releases.
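
For reference, the sketch below uses one nki.isa API that is already available, nc_matmul, to issue a single Tensor Engine matrix multiplication. The operand shapes and the PSUM-to-SBUF copy follow the pattern in the NKI matrix multiplication tutorial and are illustrative assumptions, not a definitive recipe.

```python
from neuronxcc import nki
import neuronxcc.nki.isa as nisa
import neuronxcc.nki.language as nl


@nki.jit
def single_matmul_kernel(a, b):
    # a: [K, M] stationary operand, b: [K, N] moving operand
    # (K and M up to 128, N up to 512 for a single nc_matmul instruction)
    c = nl.ndarray((a.shape[1], b.shape[1]), dtype=a.dtype, buffer=nl.shared_hbm)
    a_tile = nl.load(a)
    b_tile = nl.load(b)
    c_psum = nisa.nc_matmul(a_tile, b_tile)  # Tensor Engine: a_tile.T @ b_tile, result lands in PSUM
    c_sbuf = nl.copy(c_psum, dtype=c.dtype)  # move the accumulation out of PSUM into SBUF
    nl.store(c, value=c_sbuf)
    return c
```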

Will NKI APIs guarantee backwards compatibility?#

The NKI APIs follow the Neuron Software Maintenance policy for Neuron APIs. For more information, please see the SDK Maintenance Policy.

This document is relevant for: Inf2, Trn1, Trn1n