This document is relevant for: Inf1, Inf2, Trn1, Trn2, Trn3

System Tools

Neuron system tools provide essential utilities for monitoring, debugging, and managing AWS Neuron devices and workloads. These command-line tools offer real-time insights into device utilization, process management, hardware health, and performance metrics across Neuron instances.

Neuron-Monitor User Guide

Real-time monitoring tool for tracking NeuronCore utilization, memory usage, and thermal metrics across Neuron devices with customizable output formats.
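As a rough illustration of consuming the monitor's output from a script, the sketch below parses a stream of JSON reports. It assumes that each report arrives as one JSON object per line on standard output; that framing is an assumption here, and the Neuron-Monitor User Guide is the authority on the actual format and field names. The helper name `iter_reports` is hypothetical, and no report fields are assumed.

```python
import json
from typing import Iterable, Iterator


def iter_reports(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line.

    Assumption: the producer writes one JSON object per line.
    Blank lines, unparsable lines, and non-object values are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            report = json.loads(line)
        except json.JSONDecodeError:
            continue  # partial or non-JSON line; ignore it
        if isinstance(report, dict):
            yield report
```

One way to use it would be to pipe the tool into a script that calls `iter_reports(sys.stdin)` and prints `sorted(report.keys())` for each report, which shows the top-level sections without relying on any particular schema.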

Neuron-Top User Guide

Interactive process viewer similar to htop that displays running processes on Neuron devices with real-time resource consumption metrics.

Neuron-LS User Guide

Device discovery and listing tool that provides detailed information about available Neuron devices, their capabilities, and current status.

Neuron-Sysfs User Guide

Low-level system interface tool for accessing Neuron device information through the Linux sysfs filesystem interface.
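Because sysfs exposes attributes as small text files, a snapshot can be taken with ordinary file reads. The sketch below walks a directory tree and collects file contents into a dictionary. The function name `read_sysfs_tree` and the `root` argument are hypothetical; the actual location and layout of the Neuron sysfs tree are described in the Neuron-Sysfs User Guide and are not assumed here.

```python
import os
from typing import Dict


def read_sysfs_tree(root: str, max_bytes: int = 4096) -> Dict[str, str]:
    """Return {relative_path: stripped contents} for readable files under root.

    `root` is a placeholder for wherever the device tree is mounted.
    os.walk does not follow directory symlinks by default, which avoids
    looping through the cross-links that sysfs trees commonly contain.
    """
    values: Dict[str, str] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            try:
                with open(full, "r", errors="replace") as fh:
                    values[rel] = fh.read(max_bytes).strip()
            except OSError:
                continue  # attribute is write-only or needs more privileges
    return values
```

Calling the function twice a few seconds apart and comparing the two dictionaries is a simple way to see which attributes change over time.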

NCCOM-TEST User Guide

Collective communication testing and benchmarking tool for validating and measuring performance of multi-device communication patterns.

TensorBoard

Guide to the TensorBoard Neuron plugin for Trn1 instances, covering installation, configuration, and advanced visualization features.

Tutorials

Tutorials on using the Neuron system tools suite.

What’s New

Latest updates, new features, and improvements to the Neuron system tools suite.
