NKI Developer How-To Guides

These guides cover developing high-performance kernels with the Neuron Kernel Interface (NKI), from basic NKI concepts to advanced performance optimization techniques for AWS Trainium and Inferentia accelerators.

For details on programming with NKI APIs and syntax, see NKI Programming Model (Legacy).

NKI Kernel Optimization Guide

Learn the basics of NKI kernel optimization with this code-backed guide.

Framework Custom Operators

Integrate NKI kernels as custom operators in the PyTorch and JAX frameworks.

How to Profile an NKI Kernel

Learn how to profile an NKI kernel with Neuron Explorer.

Profiling NKI Kernels (Legacy Guide)

Performance analysis and debugging techniques using the Neuron Profile tools.

NKI Performance Guide

Advanced optimization strategies for maximizing kernel performance and efficiency.

Direct Allocation Guide

Manual memory management techniques for fine-grained control over data placement.

Block Dimension Migration

Migration guide for updating kernels to use the new block dimension features.