This document is relevant for: Inf1, Inf2, Trn1, Trn2, Trn3

Neuron Explorer#

Important

Neuron Explorer is in active development and does not yet support system-level profiling. For a stable user experience and system-level profiling, see Neuron Profiler 2.0 and Neuron Profiler.

Neuron Explorer is a suite of tools designed to support ML engineers throughout their development journey on AWS Trainium. Neuron Explorer helps developers maintain context, iterate efficiently, and focus on building and optimizing high-performance models. Developers can access Neuron Explorer from the CLI, the web UI, or their IDE through the VSCode integration.

Advanced Profiling Viewers#

Note

Neuron Explorer will replace Neuron Profiler and Neuron Profiler 2.0 in a future release. Please see Neuron Explorer FAQ for more details.

Neuron Explorer improves on the profiling workflows supported by Neuron Profiler and Neuron Profiler 2.0. ML performance engineers can trace execution from source code down to hardware operations and analyze model behavior in detail at every layer of the stack. The suite of tools supports both single-node and distributed applications, allowing developers to analyze workloads at scale.

Getting Started#

Get Started

Set up Neuron Explorer, launch the web UI, and configure SSH tunneling for secure access to profiling data.

Launch Profiles via CLI, UI, or IDE

Learn how to capture profiles, launch the Neuron Explorer UI, use the Profile Manager, and view results in VSCode.

Visualization and Analysis#

Device Viewer

Explore hardware-level execution with a timeline view, operator table, event details, annotations, dependency highlighting, search, and other analysis features.

Hierarchy Viewer

Visualize the entire execution from model layers down to hardware operations, with interactive links to the Device Viewer and to source code.

Source Code Viewer

Navigate between NKI and PyTorch source code and profile data with bidirectional linking and highlighting.

Summary Viewer

Get streamlined performance insights and optimization recommendations with high-level metrics and visualizations.

AI Recommendation Viewer

Get AI-powered bottleneck analysis and optimization recommendations for NKI profiles.

Tutorials#

Profile a NKI Kernel

Learn how to profile a NKI kernel with Neuron Explorer.

Multi-node Training

Profile multi-node training jobs with SLURM scheduling and visualize distributed workload performance.

vLLM Performance

Capture and analyze system-level and device-level profiles for vLLM inference workloads on Trainium.

Neuron Explorer FAQ#

What can I expect from the Neuron Explorer?#

At this time, Neuron Explorer features an enhanced device-level profiling experience. In future releases, Neuron Explorer will expand to support the entire ML development journey on Trainium, with additional system-level and device-level profiling viewers and features, debugging capabilities, IDE tooling, and enhanced recommendation and analysis tools.

What is the difference between device-level and system-level profiling?#

Device-level profiling captures hardware execution data from NeuronCores, including compute engine instructions, DMA operations, and hardware utilization. Use device-level profiling to analyze hardware performance, identify compute or memory bottlenecks, and optimize kernel implementations.

System-level profiling captures software execution data, including framework operations, Neuron Runtime API calls, CPU utilization, and memory usage. Use system-level profiling to analyze framework overhead, identify CPU bottlenecks, and debug runtime issues.

For comprehensive performance analysis, you must consider both profiling levels to understand the complete picture from application code to hardware execution.

Should I continue using Neuron Profiler or migrate to Neuron Explorer?#

Use Neuron Profiler or Profiler 2.0 if you need both device-level and system-level profiling in a single workflow. These are the current default tools and provide the most comprehensive profiling experience with a stable, proven interface.

Use Neuron Explorer if your analysis focuses on hardware-level performance and you want enhanced capabilities such as hierarchical profiling, bidirectional code linking, or AI-powered recommendations. Neuron Explorer is particularly effective for NKI kernel development, hardware bottleneck analysis, and iterative optimization workflows that benefit from IDE integration and faster performance.
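The guidance above can be summarized as a small decision helper. This is only an illustrative sketch of the FAQ's reasoning, not part of any Neuron tool: the function name `recommend_tool` and its parameter are hypothetical, and the rule reflects the current state described on this page (Neuron Explorer does not yet include system-level profiling).

```python
def recommend_tool(needs_system_level: bool) -> str:
    """Hypothetical helper mirroring the FAQ guidance on this page.

    needs_system_level: True if the analysis requires system-level data
    (framework operations, Neuron Runtime API calls, CPU and memory usage).
    """
    if needs_system_level:
        # Neuron Explorer does not yet support system-level profiling,
        # so the FAQ points to the existing profilers for this case.
        return "Neuron Profiler or Neuron Profiler 2.0"
    # Device-only analysis (for example NKI kernel tuning) is where
    # Neuron Explorer's enhanced viewers apply.
    return "Neuron Explorer"


if __name__ == "__main__":
    print(recommend_tool(needs_system_level=True))
    print(recommend_tool(needs_system_level=False))
```

The rule is expected to change once system-level profiling is integrated into Neuron Explorer, at which point it becomes the default for both cases.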

How do I see an end-to-end profile of my workload with the latest features for both system and device profiling?#

Neuron Explorer currently provides next-generation device-level profiling features. For the latest system-level profiling support, use Neuron Profiler 2.0 until Neuron Explorer includes this capability.

For guidance on using Neuron Explorer for device-level profiling and Neuron Profiler 2.0 for system-level profiling, see the tutorials.

Is Neuron Explorer going to replace Neuron Profiler and Neuron Profiler 2.0? When will this happen?#

Currently, Neuron Profiler and Profiler 2.0 are fully supported as the default tools. Neuron Explorer is in public beta with device-level profiling capabilities.

Neuron Explorer will become the default profiling tool once system-level profiling is integrated. Neuron Profiler and Profiler 2.0 will remain supported until Neuron Explorer enters GA classification. When Neuron Profiler and Profiler 2.0 enter end-of-support, they will no longer receive updates or technical support, though they will remain accessible through the neuron-profile package in previous releases. Users should plan to migrate to Neuron Explorer before the end-of-support date.

Are my existing profiles compatible with Neuron Explorer?#

Yes. Neuron Explorer is backwards compatible with profile data captured using Neuron Profiler or Profiler 2.0. Existing profile files must be reprocessed before viewing in Neuron Explorer, but you do not need to recapture them. See Get Started with Neuron Explorer.
