This document is relevant for: Inf1, Inf2, Trn1, Trn2
Neuron Sysfs User Guide#
Introduction#
The kernel provides several ways for userspace programs to get system information from kernel space. Sysfs is a common one. It is a virtual filesystem, typically mounted on the /sys directory, that contains information about the hardware devices attached to the system and the drivers handling those devices. By navigating the hierarchical structure of sysfs and reading its files and directories, you can gather information that helps diagnose and resolve a wide range of hardware and system issues.
A sysfs filesystem is set up for each Neuron Device under /sys/devices/virtual/neuron_device to give you insight into the Neuron Driver and Runtime at the system level. By running a few simple commands that read or write a sysfs file, you can get information such as Runtime status, memory usage, and Driver info. You can also write your own shell scripts that query Runtime and Driver statistics from sysfs and generate customized reports.
This user guide first explains the Neuron sysfs structure and then introduces several ways to perform diagnostics with Neuron sysfs.
Neuron Sysfs Filesystem Structure#
High Level Overview#
Here is the high level structure of the Neuron sysfs filesystem, where the total and present counters are not shown:
/sys/devices/virtual/neuron_device/
├── neuron0/
│   ├── subsystem
│   ├── uevent
│   ├── connected_devices
│   ├── core_count
│   ├── reset
│   ├── power/
│   │   ├── async
│   │   ├── control
│   │   ├── runtime_active_time
│   │   ├── runtime_active_kids
│   │   └── ...
│   ├── info/
│   │   ├── notify_delay
│   │   ├── serial_number
│   │   └── architecture/
│   │       ├── arch_type
│   │       ├── device_name
│   │       └── instance_type
│   ├── stats
│   │   ├── hardware
│   │   │   ├── mem_ecc_uncorrected
│   │   │   └── sram_ecc_uncorrected
│   │   └── memory_usage
│   │       └── host_mem
│   │           ├── application_memory
│   │           ├── constants
│   │           ├── dma_buffers
│   │           ├── dma_rings
│   │           ├── driver_memory
│   │           ├── notifications
│   │           ├── tensors
│   │           └── uncategorized
│   ├── neuron_core0/
│   │   ├── info/
│   │   │   └── architecture/
│   │   │       └── arch_type
│   │   ├── stats/
│   │   │   ├── status/
│   │   │   │   ├── exec_bad_input
│   │   │   │   ├── hw_error
│   │   │   │   ├── infer_failed_to_queue
│   │   │   │   ├── resource_nc_error
│   │   │   │   ├── unsupported_neff_version
│   │   │   │   ├── failure
│   │   │   │   ├── infer_completed_with_error
│   │   │   │   ├── invalid_error
│   │   │   │   ├── oob_error
│   │   │   │   ├── success
│   │   │   │   ├── generic_error
│   │   │   │   ├── infer_completed_with_num_error
│   │   │   │   ├── resource_error
│   │   │   │   └── timeout
│   │   │   ├── memory_usage/
│   │   │   │   ├── device_mem/
│   │   │   │   │   ├── collectives
│   │   │   │   │   ├── constants
│   │   │   │   │   ├── dma_rings
│   │   │   │   │   ├── driver_memory
│   │   │   │   │   ├── model_code
│   │   │   │   │   ├── model_shared_scratchpad
│   │   │   │   │   ├── nonshared_scratchpad
│   │   │   │   │   ├── notifications
│   │   │   │   │   ├── runtime_memory
│   │   │   │   │   ├── tensors
│   │   │   │   │   └── uncategorized
│   │   │   │   └── host_mem
│   │   │   └── other_info/
│   │   │       ├── flop_count
│   │   │       ├── inference_count
│   │   │       ├── model_load_count
│   │   │       ├── reset_fail_count
│   │   │       └── reset_req_count
│   │   └── ...
│   ├── neuron_core1/
│   │   ├── info/
│   │   │   └── ...
│   │   └── stats/
│   │       └── ...
│   └── ...
├── neuron1
├── neuron2
├── neuron3
└── ...
Each Neuron Device is represented as a directory under /sys/devices/virtual/neuron_device/, where neuron0/ represents Neuron Device 0, neuron1/ represents Neuron Device 1, and so on. Each NeuronCore is represented as a directory under a Neuron Device directory, named neuron_core{0,1,2,...}. Metrics such as Runtime and Driver info and statistics are collected per NeuronCore in two directories under the NeuronCore directory: info/ and stats/.
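The layout described above can be walked with a short shell function. This is a minimal sketch, not part of the Neuron tooling: the function name list_neuron_cores and the optional root argument are hypothetical, and the default path is the one given in this guide.

```shell
# Print one "device core" pair per line, for example "neuron0 neuron_core1".
# The optional first argument overrides the sysfs root (an assumption made
# here so the function can be tried against a scratch directory).
list_neuron_cores() (
    root="${1:-/sys/devices/virtual/neuron_device}"
    for core_dir in "$root"/neuron[0-9]*/neuron_core[0-9]*; do
        # When nothing matches, the pattern stays literal; skip it.
        [ -d "$core_dir" ] || continue
        device_dir="${core_dir%/*}"
        printf '%s %s\n' "${device_dir##*/}" "${core_dir##*/}"
    done
)
```

Calling list_neuron_cores with no argument lists every NeuronCore directory found under the default root.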
Most of the metrics belong to a category called “counter.” Each counter is represented as a directory that holds two numerical values as two files: total and present. Each memory usage counter has an additional value called peak. The total value starts accumulating when the Driver is loaded. The present value records the most recent change to the metric. The peak value records the maximum value seen so far. Every counter has the same filesystem structure, for example:
/sys/devices/virtual/neuron_device/neuron0/neuron_core0/stats/status/
├── exec_bad_input/
│   ├── total
│   └── present
├── hw_error/
│   ├── total
│   └── present
├── infer_failed_to_queue/
│   ├── total
│   └── present
└── ...
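Because every counter directory has the same shape, one helper can read any of them. The following is a sketch under that assumption; read_counter is a hypothetical name, and peak is treated as optional because only memory usage counters provide it.

```shell
# Print "field=value" for each value file that a counter directory provides.
# total and present exist for every counter; peak exists only for memory
# usage counters, so files that are missing are skipped.
read_counter() (
    counter_dir="$1"
    for field in total present peak; do
        if [ -r "$counter_dir/$field" ]; then
            printf '%s=%s\n' "$field" "$(cat "$counter_dir/$field")"
        fi
    done
)
```

For example, read_counter /sys/devices/virtual/neuron_device/neuron0/neuron_core0/stats/status/failure prints the total and present values of the failure counter. Reading these files may require root privileges, as in the sudo cat examples in this guide.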
Description for Each Field#
info/: This directory stores general information about hardware and software. None of its entries are counter types.
  notify_delay: The delay between notifications from the Neuron Device. Current settings are on (0) or off (-1). Off by default.
  serial_number: The unique device identifier.
  architecture/: This directory stores hardware architecture information.
    arch_type: The architecture type of the Neuron Device. Sample architecture types are v1, v2, and v3. This value is read-only.
    instance_type: The instance type of the Neuron Device. Sample instance types are Inf1, Inf2, and Trn1. This value is read-only.
    device_name: The Neuron Device type. Sample Neuron Device types are Inferentia, Inferentia2, and Trainium1. This value is read-only.
stats/: This directory stores Neuron Runtime and Driver statistics. It contains three subdirectories: status/, memory_usage/, and other_info/.
status/: This directory stores the number of times each API return status has occurred. As explained in The LIBNRT API Return Codes, every API call returns an NRT_STATUS value, which represents the return status of that API call. The sysfs filesystem stores all NRT_STATUS values as subdirectories under the status/ directory. They all have the counter structure, so each NRT_STATUS subdirectory holds two values (total and present) and records the number of times a given NRT_STATUS was returned. The meaning of each NRT_STATUS subdirectory matches its description in The LIBNRT API Return Codes.
memory_usage/: This directory contains memory usage statistics for both device and host, represented as counters. In this directory, the total counters indicate the current memory usage, the present counters represent the amount allocated or deallocated in the previous operation, and the peak counters indicate the maximum memory usage observed. This directory also provides a detailed breakdown of device and host memory usage. The breakdown corresponds to the Memory Usage Summary section displayed in Neuron Monitor.
  device_mem/: The amount of memory that Neuron Runtime uses for weights, instructions, and DMA rings. The device memory per NeuronCore is further divided into the following categories: collectives/, constants/, dma_rings/, driver_memory/, model_code/, model_shared_scratchpad/, nonshared_scratchpad/, notifications/, runtime_memory/, tensors/, and uncategorized/. Each of these categories has total, present, and peak.
    - collectives: amount of device memory used for collective communication between workers
    - constants: amount of device memory used for constants (for applications running training) or weights (for applications running inference)
    - dma_rings: amount of device memory used for storing model executable code used for data movements
    - driver_memory: amount of device memory used by the Neuron Driver
    - model_code: amount of device memory used for storing model executable code
    - model_shared_scratchpad: amount of device memory used for the shared model scratchpad, a buffer shared between models on the same NeuronCore and used for internal model variables and other auxiliary buffers
    - nonshared_scratchpad: amount of device memory used for the non-shared model scratchpad, a buffer used by a single model for internal model variables and other auxiliary buffers
    - notifications: amount of device memory used to store instruction-level trace information for profiling workloads run on the device
    - runtime_memory: amount of device memory used by the Neuron Runtime (outside of the previous categories)
    - tensors: amount of device memory used for tensors
    - uncategorized: amount of device memory that does not belong to any other category in this list
  host_mem/: The amount of memory that Neuron Runtime uses for input and output tensors. The host memory per Neuron Device is further divided into the following categories: application_memory/, constants/, dma_buffers/, dma_rings/, driver_memory/, notifications/, tensors/, and uncategorized/. These categories provide a more granular host memory classification than the Host Used Memory section. Each of these categories has total, present, and peak.
hardware/: Hardware statistics.
  mem_ecc_uncorrected: The number of uncorrected ECC events in the Neuron Device’s DRAM.
  sram_ecc_uncorrected: The number of uncorrected ECC events in the Neuron Device’s SRAM.
other_info/: This directory contains statistics that are not covered by status/ and memory_usage/. None of them are counter types.
  flop_count: The number of flops. You can use it to calculate TFLOP/s from flop_count / time interval.
  inference_count: The number of successful inferences.
  model_load_count: The number of successful model loads.
  reset_fail_count: The number of failed device resets.
  reset_req_count: The number of device reset requests.
Other fields:
connected_devices: The list of connected devices’ IDs. The output should match the CONNECTED DEVICES column of neuron-ls.
reset: Writing to this file resets the corresponding Neuron Device.
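The flop_count / time interval calculation mentioned for flop_count can be sketched as follows. This assumes that flop_count is cumulative, so the rate comes from the difference between two readings taken a known number of seconds apart. The function name tflops_between is hypothetical, and the readings are passed as arguments so the sketch makes no assumption about how the value is laid out on disk.

```shell
# Compute TFLOP/s from two flop_count readings.
#   $1: first reading, $2: second reading, $3: seconds between the readings
# awk is used because shell arithmetic is integer-only.
tflops_between() {
    awk -v first="$1" -v second="$2" -v interval_s="$3" \
        'BEGIN { printf "%.3f\n", (second - first) / interval_s / 1e12 }'
}
```

For instance, tflops_between 1000000000000 4000000000000 2 prints 1.500, because 3e12 floating-point operations over 2 seconds is 1.5 TFLOP/s.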
Read and Write to Sysfs#
Reading a sysfs file gives the value of the corresponding metric. You can use the cat command to view the contents of sysfs files:
ubuntu@ip-xxx-xx-xx-xxx:~$ sudo cat /sys/devices/virtual/neuron_device/neuron0/neuron_core0/stats/status/failure/total
0
ubuntu@ip-xxx-xx-xx-xxx:~$ sudo cat /sys/devices/virtual/neuron_device/neuron0/neuron_core0/info/architecture/arch_type
NCv2
Counter-type sysfs metrics are write-to-clear. Writing any value to the file sets the metric to 0:
ubuntu@ip-xxx-xx-xx-xxx:~$ echo 1 | sudo tee /sys/devices/virtual/neuron_device/neuron0/neuron_core0/stats/status/failure/total
1
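The write-to-clear behavior can be applied to every status counter of a NeuronCore in one pass. This is a sketch only: clear_status_counters is a hypothetical helper, and it writes 0 purely as a convention, since any written value clears the counter.

```shell
# Clear the total value of every status counter under one NeuronCore.
#   $1: NeuronCore directory, e.g. .../neuron0/neuron_core0
clear_status_counters() (
    core_dir="$1"
    for total_file in "$core_dir"/stats/status/*/total; do
        # Skip the literal pattern (no match) and files we cannot write.
        [ -w "$total_file" ] || continue
        printf '0\n' > "$total_file"
    done
)
```

The function needs write access to the sysfs files, so it would typically be run from a root shell, just as the tee example above uses sudo.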
Writing to reset resets the corresponding Neuron Device. For example, the following command resets Neuron Device 0:
ubuntu@ip-xxx-xx-xx-xxx:~$ echo 1 | sudo tee /sys/devices/virtual/neuron_device/neuron0/reset
1
Note#
All files under /sys/devices/virtual/neuron_device/neuron0/power, such as runtime_active_kids or runtime_status, are related to generic device power management. They are not created or controlled by the Neuron sysfs metrics. The word runtime in these file names does not refer to Neuron Runtime.
How to Troubleshoot via Sysfs#
You can troubleshoot your ML jobs with one or a few commands that read or write the sysfs filesystem. With your own scripts, you can also aggregate values across all NeuronCores and all Neuron Devices to get a summarized view.
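An aggregation of this kind might look like the sketch below, which adds up the total value of one status counter across every NeuronCore of every Neuron Device. The function name sum_status_total and its optional root argument are hypothetical; the paths follow the structure described in this guide.

```shell
# Sum the total value of one status counter over all devices and cores.
#   $1: status counter name, e.g. failure or timeout
#   $2: optional sysfs root (defaults to the path used in this guide)
sum_status_total() (
    status_name="$1"
    root="${2:-/sys/devices/virtual/neuron_device}"
    sum=0
    for total_file in "$root"/neuron*/neuron_core*/stats/status/"$status_name"/total; do
        # Skip the literal pattern when no counter of that name exists.
        [ -r "$total_file" ] || continue
        value=$(cat "$total_file")
        sum=$((sum + value))
    done
    printf '%s\n' "$sum"
)
```

Running sum_status_total timeout, for example, reports how many timeouts have been recorded across the whole instance since the counters were last cleared or the Driver was loaded.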
You can also use the Sysfs notification feature to wait passively (without wasting CPU cycles) for changes to the values of Sysfs files. To use this feature, you need to implement a user-space program that calls the poll() function on the Sysfs file that you want to wait on.
The user-space poll() function has the following signature: int poll(struct pollfd *fds, nfds_t nfds, int timeout).
By default, the Sysfs notification feature is turned off when the driver is loaded. To enable notifications, set the value of /sys/devices/virtual/neuron_device/neuron0/info/notify_delay to 0. To disable notifications, set it to -1. Note that enabling this feature can impact performance.
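Toggling notify_delay can be wrapped in a small helper so that the two values do not have to be remembered. This is a sketch; set_neuron_notifications is a hypothetical name, and the values 0 and -1 are the ones described above.

```shell
# Turn sysfs notifications on or off for one Neuron Device.
#   $1: Neuron Device directory, e.g. /sys/devices/virtual/neuron_device/neuron0
#   $2: "on" (writes 0) or "off" (writes -1)
set_neuron_notifications() (
    device_dir="$1"
    case "$2" in
        on)  value=0 ;;
        off) value=-1 ;;
        *)   echo "usage: set_neuron_notifications DEVICE_DIR on|off" >&2
             exit 2 ;;
    esac
    printf '%s\n' "$value" > "$device_dir/info/notify_delay"
)
```

Like the other write examples in this guide, this needs root privileges on a real instance. Because enabling notifications can impact performance, it may be worth switching them off again once polling is finished.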
Here is a sample user space program using poll():
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char * argv[])
{
    char readbuf[128];
    int attr_fd = -1;
    struct pollfd pfd;
    int retval = 0;
    ssize_t read_bytes;

    if (argc < 2) {
        fprintf(stderr, "Error: Please specify sysfs file path\n");
        exit(1);
    }

    attr_fd = open(argv[1], O_RDONLY, 0);
    if (attr_fd < 0) {
        perror(argv[1]);
        exit(2);
    }

    /* Read and print the current value before waiting for changes. */
    read_bytes = read(attr_fd, readbuf, sizeof(readbuf));
    if (read_bytes < 0) {
        perror(argv[1]);
        exit(3);
    }
    printf("%.*s", (int)read_bytes, readbuf);

    /* A changed sysfs attribute is reported as POLLERR | POLLPRI. */
    pfd.fd = attr_fd;
    pfd.events = POLLERR | POLLPRI;
    pfd.revents = 0;

    /* Wait up to 100 ms per call; on timeout poll() returns 0 and we wait again. */
    while ((retval = poll(&pfd, 1, 100)) >= 0) {
        if (pfd.revents & (POLLERR | POLLPRI)) {
            pfd.revents = 0;
            /* Rewind to the start of the file and read the new value. */
            lseek(attr_fd, 0, SEEK_SET);
            read_bytes = read(attr_fd, readbuf, sizeof(readbuf));
            if (read_bytes < 0) {
                perror(argv[1]);
                exit(4);
            }
            printf("%.*s", (int)read_bytes, readbuf);
        }
    }

    /* poll() returned a negative value: report the error and clean up. */
    perror("poll");
    close(attr_fd);
    return 5;
}