This document is relevant for: Inf2, Trn1, Trn1n

Neuron Kernel Interface (NKI) release notes

Neuron Kernel Interface (NKI) (Beta) [2.20]

Date: 09/16/2024

  • This release is the beta launch of the Neuron Kernel Interface (NKI). NKI is a programming interface that enables developers to build optimized compute kernels on top of Trainium and Inferentia. NKI empowers developers to enhance deep learning models with new capabilities, performance optimizations, and scientific innovation. It natively integrates with PyTorch and JAX, and provides a Python-based programming environment with Triton-like syntax and tile-level semantics, offering a familiar programming experience for developers. Additionally, to enable bare-metal access for precisely programming the instructions used by the chip, this release includes a set of NKI APIs (nki.isa) that directly emit Neuron Instruction Set Architecture (ISA) instructions in NKI kernels.

  • In addition to documentation, we’ve included many of the innovative kernels used within the neuron-compiler, such as Mamba and flash attention, as open-source samples in a new nki-samples GitHub repository. New kernel contributions are welcome via GitHub pull requests, and feature requests and bug reports are welcome as GitHub issues. For more information, see the latest documentation. This initial beta release includes in-depth getting started, architecture, profiling, and performance guides, along with multiple tutorials, API reference documents, documented known issues, and frequently asked questions.
