This document is relevant for: Inf1, Inf2, Trn1, Trn2, Trn3

Neuron Explorer#

Important

Neuron Explorer is the recommended profiling tool for AWS Neuron workloads. It provides end-to-end profiling support along with the latest features and an improved user experience.

Note: Neuron will end support for Neuron Profiler and Neuron Profiler 2.0 in the Neuron 2.29 release. Users are encouraged to migrate to Neuron Explorer. For details, see the Migration Guide from Neuron Profiler to Neuron Explorer and the Neuron Explorer FAQ.

Neuron Explorer is a suite of tools designed to support ML engineers throughout their development journey on AWS Trainium. Neuron Explorer helps developers maintain context, iterate efficiently, and focus on building and optimizing high-performance models. Developers can access Neuron Explorer from the CLI, the web UI, or inside their IDE through the VSCode integration.

Profiling Viewers#

Neuron Explorer lets ML performance engineers trace execution from source code down to hardware operations and analyze model behavior at every layer of the stack. The tools support both single-node and distributed applications, so developers can analyze workloads at scale.

Getting Started#

Get Started

Set up Neuron Explorer, launch the web UI, and configure SSH tunneling for secure access to profiling data.

Capture and View Profiles

Learn how to capture and view profiles in the Neuron Explorer UI or directly in your IDE via VSCode Integration.
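
As a rough illustration of the SSH tunneling step mentioned under Get Started, the sketch below builds a standard OpenSSH local port-forward command. The host name, user name, and port 8080 are placeholders chosen for this example, not values taken from the Neuron Explorer documentation; check the setup guide for the port the web UI actually listens on.

```python
import shlex


def build_tunnel_command(remote_host, user, local_port, remote_port):
    """Return an argv list for an OpenSSH local port forward.

    -L forwards local_port on this machine to remote_port on the remote host.
    -N tells ssh not to run a remote command (tunnel only).
    """
    for port in (local_port, remote_port):
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
    forward = f"{local_port}:localhost:{remote_port}"
    return ["ssh", "-N", "-L", forward, f"{user}@{remote_host}"]


# Placeholder values (assumptions): replace with your instance, user, and UI port.
command = build_tunnel_command("trn-instance.example.com", "ubuntu", 8080, 8080)
print(shlex.join(command))
```

With the tunnel running, a browser on the local machine would reach the remote UI at localhost on the chosen local port, and traffic stays inside the SSH connection.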

Visualization and Analysis#

Device Trace Viewer

Explore hardware-level execution with timeline view, operator table, event details, annotations, dependency highlighting, search, and more analysis features.

System Trace Viewer

Explore system-level execution with timeline view and more analysis features.

Hierarchy Viewer

Visualize the entire execution from model layers down to hardware execution, with interactive links to the device viewer and to source code.

Source Code Viewer

Navigate between NKI and PyTorch source code and profile data with bidirectional linking and highlighting.

Summary Viewer

Get streamlined performance insights and optimization recommendations with high-level metrics and visualizations.

Database Viewer

Develop your own analyses, examine profiling data stored in database tables, or run ad-hoc queries during performance analysis.

Tensor Viewer

View tensor information, including names, sizes, shapes, and memory usage details.

AI Recommendation Viewer

Get AI-powered bottleneck analysis and optimization recommendations for NKI profiles.
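
To give a feel for the kind of ad-hoc query the Database Viewer description refers to, here is a minimal sketch that aggregates event durations per engine. Everything in it is an assumption made for illustration: the `events` table, its columns, and the sample rows are invented, and an in-memory SQLite database stands in for whatever storage Neuron Explorer actually uses. Consult the Database Viewer page for the real tables and schema.

```python
import sqlite3


def total_time_per_engine(conn):
    """Return (engine, event_count, total_ns) rows, busiest engine first."""
    query = """
        SELECT engine, COUNT(*) AS event_count, SUM(duration_ns) AS total_ns
        FROM events
        GROUP BY engine
        ORDER BY total_ns DESC
    """
    return conn.execute(query).fetchall()


# Hypothetical schema and made-up sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (engine TEXT, name TEXT, duration_ns INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("tensor", "matmul", 1200),
        ("tensor", "matmul", 800),
        ("vector", "add", 300),
        ("dma", "load", 500),
    ],
)

rows = total_time_per_engine(conn)
for engine, event_count, total_ns in rows:
    print(f"{engine}: {event_count} events, {total_ns} ns")
```

Grouping by engine and sorting by total time is one simple way to see where time is concentrated before drilling into individual events.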

Tutorials#

Profile a NKI Kernel

Learn how to profile a NKI kernel with Neuron Explorer.

Multi-node Training

Profile multi-node training jobs with SLURM scheduling and visualize distributed workload performance.

vLLM Performance

Capture and analyze system-level and device-level profiles for vLLM inference workloads on Trainium.

Additional Resources#

Viewing Profiles with Perfetto

Learn how to view Neuron Explorer profiles using the Perfetto UI for trace analysis.

Download the Neuron Explorer Visual Studio Code Extension#

Get the Neuron Explorer VSCode Extension

After downloading the extension, open the Command Palette by pressing Cmd+Shift+P (macOS) or Ctrl+Shift+P (Windows/Linux), type > Extensions: Install from VSIX... and press Enter. When prompted to select a file, select aws-neuron.neuron-explorer-2.28.0.vsix, then click the Install button (or press Enter) to install the extension.
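
As an alternative to the Command Palette steps, VS Code's command-line launcher accepts a `.vsix` path through its `--install-extension` option. The sketch below wraps that call. It assumes the `code` launcher is on your PATH and that the file sits in the current directory; the helper names are hypothetical and not part of Neuron Explorer.

```python
import shutil
import subprocess
from pathlib import Path

VSIX_NAME = "aws-neuron.neuron-explorer-2.28.0.vsix"


def build_install_command(vsix_path, code_binary="code"):
    """Return the argv list that asks VS Code to install a local VSIX file."""
    return [code_binary, "--install-extension", str(vsix_path)]


def install_extension(vsix_path):
    """Install the VSIX after checking that the file and the launcher exist."""
    vsix_path = Path(vsix_path)
    if not vsix_path.is_file():
        raise FileNotFoundError(f"VSIX file not found: {vsix_path}")
    if shutil.which("code") is None:
        raise RuntimeError("the 'code' launcher is not on PATH")
    subprocess.run(build_install_command(vsix_path), check=True)


# Show the command without running it; call install_extension(VSIX_NAME) to install.
print(" ".join(build_install_command(VSIX_NAME)))
```

Separating command construction from execution makes it easy to inspect the command first, or to paste it into a terminal directly.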

Neuron Explorer FAQ#

What can I expect from the Neuron Explorer?#

Neuron Explorer provides comprehensive device-level and system-level profiling, along with hierarchical profiling, bidirectional code linking, AI-powered recommendations, IDE integration, and more. Future releases will add profiling viewers and features, debugging capabilities, and enhanced recommendation and analysis tools to support the entire ML development journey on Trainium.

What is the difference between device-level and system-level profiling?#

Device-level profiling captures hardware execution data from NeuronCores, including compute engine instructions, DMA operations, and hardware utilization. Use device-level profiling to analyze hardware performance, identify compute or memory bottlenecks, and optimize kernel implementations.

System-level profiling captures software execution data, including framework operations, Neuron Runtime API calls, CPU utilization, and memory usage. Use system-level profiling to analyze framework overhead, identify CPU bottlenecks, and debug runtime issues.

Is Neuron Explorer going to replace Neuron Profiler and Neuron Profiler 2.0?#

Yes. Neuron Explorer is the recommended profiling tool and replaces both Neuron Profiler and Profiler 2.0.

Neuron Profiler and Profiler 2.0 are supported for one final release. In the Neuron 2.29 release, they will reach end of support and will no longer receive updates or technical support, though they will remain available through the neuron-profile package in previous releases. Users should migrate to Neuron Explorer now.

Are my existing profiles compatible with Neuron Explorer?#

Yes. Neuron Explorer is backwards compatible with profile data captured using Neuron Profiler or Profiler 2.0. Existing profile files must be reprocessed before viewing in Neuron Explorer, but you do not need to recapture them. See Get Started with Neuron Explorer.

For detailed migration guidance, including CLI command mappings and feature comparisons, see the Migration Guide from Neuron Profiler to Neuron Explorer.
