Getting Started: Neuron Plugin for TensorBoard
This guide is for developers who want to use TensorBoard to better understand how their model is executed by the Neuron SDK.
TensorBoard is an open source visualization tool for machine learning projects. It supports custom plugins, which provide new ways to visualize information within TensorBoard.
The Neuron plugin for TensorBoard helps users better understand the performance of their machine learning workload when using the Neuron SDK. It is compatible with TensorBoard versions 1.15 and higher, and is supported for Neuron tools version 1.5 and higher. Neuron tools version 1.5 was introduced in the Neuron v1.13.0 release.
Neuron Plugin for TensorBoard provides visualizations and profiling results for graphs executed on NeuronCores.
Graph visualization is currently only supported for TensorFlow-Neuron. Support for MXNet-Neuron and PyTorch-Neuron visualization will be added in a future release.
- Launch an Inf1 Instance:
Follow the instructions at Launch an Amazon EC2 Instance to launch an Inf1 instance. When choosing the instance type at the EC2 console, make sure to select the correct instance type. For more information about Inf1 instance sizes and pricing, see the Inf1 web page.
When choosing an Amazon Machine Image (AMI), make sure to select Deep Learning AMI with Conda Options. Note that Neuron Conda environments are supported only in the Ubuntu 18 DLAMI and the Amazon Linux 2 DLAMI; they are not supported in the Amazon Linux DLAMI.
After launching the instance, follow the instructions in Connect to your instance to connect to the instance.
You can also launch the instance from the AWS CLI; see AWS CLI commands to launch inf1 instances.
- Set up a development environment:
To compile and run inference from the instance terminal, first enable the ML framework conda environment of your choice by running one of the following from the terminal:
Enable PyTorch-Neuron Conda environment:
source activate aws_neuron_pytorch_p36
Enable TensorFlow-Neuron Conda environment:
To activate TensorFlow 1.x:
source activate aws_neuron_tensorflow_p36
To activate TensorFlow 2.x:
# Setup a new Python virtual environment
sudo apt-get install python3-venv
python3 -m venv neuron_tf2_env
source neuron_tf2_env/bin/activate
pip install --upgrade pip
pip config set global.extra-index-url https://pip.repos.neuron.amazonaws.com
pip install tensorflow-neuron[cc]
pip install ipykernel
python -m ipykernel install --user --name neuron_tf2 --display-name "Python (Neuron TensorFlow 2)"
pip install jupyter notebook
pip install environment_kernels
Enable MXNet-Neuron Conda environment:
source activate aws_neuron_mxnet_p36
Refer to the following guides on how to compile a graph using Neuron SDK.
“Compile model for Neuron” in PyTorch-Neuron Resnet50 Tutorial
In this step, we enable Neuron profile data collection and collect results from executing an inference.
4.1. To start profiling the neural network and collect inference traces, create a
directory where profile data will be dumped and set the
NEURON_PROFILE environment variable. In this example, we will assume this directory is
$HOME/profile.
mkdir -p $HOME/profile
export NEURON_PROFILE=$HOME/profile
4.2. Ensure Neuron Tools are executable by setting the
PATH environment variable.
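One way to confirm that the PATH change took effect is to look the tool up from Python. This is a minimal sketch, assuming the profiler binary is named `neuron-profile` (the name TensorBoard prints in Step 5.3); the helper name `tool_on_path` is illustrative and not part of the Neuron SDK.

```python
import shutil
from typing import Optional


def tool_on_path(name: str = "neuron-profile") -> Optional[str]:
    """Return the full path of `name` if it is found on PATH, else None.

    The default name is an assumption taken from the TensorBoard console
    output shown in Step 5.3.
    """
    return shutil.which(name)


if __name__ == "__main__":
    location = tool_on_path()
    if location is None:
        print("neuron-profile not found; check the PATH environment variable")
    else:
        print(f"neuron-profile found at {location}")
```

If the lookup returns None, the directory containing the Neuron tools is most likely missing from PATH in the current shell.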
4.3. Execute inference!
Please run the inference script outside of Jupyter notebook. Profiling in Jupyter notebook is not supported at this time.
Please ensure the inference script executes only one inference, as profiling results are currently only supported for a single inference.
For more info on how to execute inference, refer to the following guides:
“Run inference on Single Core” in Resnet50 model for Inferentia
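Steps 4.1 and 4.3 can also be driven from a small launcher, which keeps the run outside of Jupyter and makes sure NEURON_PROFILE is set for the child process. The sketch below is only an illustration: `run_with_profiling` is a hypothetical helper, and the inference script it launches is assumed to be your own script that performs a single inference.

```python
import os
import subprocess
import sys
from pathlib import Path


def run_with_profiling(script: str, profile_dir: str) -> int:
    """Run `script` once in a child process with NEURON_PROFILE set.

    Returns the exit code of the child process.
    """
    target = Path(profile_dir)
    target.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    # Copy the current environment and add NEURON_PROFILE for the child only.
    env = dict(os.environ, NEURON_PROFILE=str(target.resolve()))
    completed = subprocess.run([sys.executable, script], env=env)
    return completed.returncode
```

For example, `run_with_profiling("infer.py", os.path.expanduser("~/profile"))` would run a hypothetical `infer.py` with the profile directory used in Step 4.1.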
4.4. Check if profiling results were successfully saved. In the directory
pointed to by the
NEURON_PROFILE environment variable set in Step 4.1, there
should be at least two files, one with the
.neff extension and one with the
.ntff extension. For TensorFlow-Neuron users, the graph file will
also be in this directory.
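The check in Step 4.4 can be automated with a few lines of Python. This is a sketch under the assumption that the only requirement is at least one `.neff` and one `.ntff` file directly inside the profile directory; the function names are illustrative.

```python
from pathlib import Path
from typing import Dict, List


def find_profile_files(profile_dir: str) -> Dict[str, List[str]]:
    """Group file names in `profile_dir` by the extensions Step 4.4 expects."""
    found: Dict[str, List[str]] = {".neff": [], ".ntff": []}
    for entry in sorted(Path(profile_dir).iterdir()):
        if entry.is_file() and entry.suffix in found:
            found[entry.suffix].append(entry.name)
    return found


def profile_looks_complete(profile_dir: str) -> bool:
    """True when at least one .neff and one .ntff file are present."""
    return all(find_profile_files(profile_dir).values())
```

Calling `profile_looks_complete(os.environ["NEURON_PROFILE"])` after the inference run (with `os` imported) gives a quick yes or no before moving on to Step 5.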
In this step, we will process the Neuron profile data and launch TensorBoard.
5.1. Install the Neuron plugin for TensorBoard.
If you are using the DLAMI TensorFlow-Neuron Conda environment, please run the following to update TensorBoard before installing the Neuron plugin.
pip install "tensorboard<=2.4.0" --force-reinstall
Modify Pip repository configurations to point to the Neuron repository:
tee $VIRTUAL_ENV/pip.conf > /dev/null <<EOF
[global]
extra-index-url = https://pip.repos.neuron.amazonaws.com
EOF
pip install tensorboard-plugin-neuron
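After installation, it can be useful to confirm which versions ended up in the active environment. The sketch below uses `importlib.metadata`, which is part of the standard library only from Python 3.8 onward, so it may not run in the Python 3.6 Conda environments mentioned earlier; `installed_version` is an illustrative helper name.

```python
from importlib import metadata  # standard library in Python 3.8+
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names as used in the pip commands of Step 5.1.
    for name in ("tensorboard", "tensorboard-plugin-neuron"):
        print(name, installed_version(name) or "not installed")
```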
5.2. After collecting the raw profile data, we need to post-process it to create the
log files used by the Neuron plugin. This can be done when launching TensorBoard
by passing an extra flag
--run_neuron_profiler. Using this flag will create the
directory specified by
--logdir and populate it with Neuron plugin data. Please
note that the
NEURON_PROFILE environment variable set in Step 4.1 must still point
to the same directory as before.
tensorboard --logdir results --run_neuron_profiler
If using TensorBoard >= 2.5, please use the
--load_fast=false option when launching.
tensorboard --logdir results --run_neuron_profiler --load_fast=false
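The choice between the two launch commands depends only on the TensorBoard version, which can be captured in a small helper. This is a sketch, assuming a plain numeric `major.minor` prefix in the version string; it uses only the flags shown above, and the function names are illustrative.

```python
from typing import List, Tuple


def parse_version(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.4.1'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def tensorboard_command(logdir: str, tb_version: str) -> List[str]:
    """Build the launch command described in Step 5.2."""
    cmd = ["tensorboard", "--logdir", logdir, "--run_neuron_profiler"]
    # TensorBoard >= 2.5 needs --load_fast=false, per the note above.
    if parse_version(tb_version) >= (2, 5):
        cmd.append("--load_fast=false")
    return cmd
```

The resulting list can be passed to `subprocess.run`, with NEURON_PROFILE still pointing at the directory from Step 4.1.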
5.3. After you see the following message, TensorBoard is ready to use. By default,
TensorBoard will be launched at
localhost:6006 on the Deployment Instance.
...
Running neuron-profile
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.4.1 at http://localhost:6006/ (Press CTRL+C to quit)
In this step, we will view the Neuron plugin for TensorBoard from a browser on your local development machine.
6.1. Connect to the Deployment Instance while enabling port forwarding. In this example, we
assume TensorBoard has been launched using the default address
localhost:6006 on the Deployment Instance.
# if Ubuntu-based AMI
ssh -i <PEM key file> ubuntu@<instance DNS> -L 6006:localhost:6006
# if AL2-based AMI
ssh -i <PEM key file> ec2-user@<instance DNS> -L 6006:localhost:6006
6.2. In a browser, visit localhost:6006.
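If the page does not load, a quick way to tell whether the forwarded port is reachable from the local machine is to attempt a TCP connection. The following is a minimal sketch using only the standard library; `port_is_open` is an illustrative name, and a successful connection only shows that something is listening on the port, not that it is TensorBoard.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 6006, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    state = "reachable" if port_is_open() else "not reachable"
    print(f"localhost:6006 is {state}")
```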
6.3. In the top navigation bar, switch to
Neuron. If it does not show up,
please wait a while and refresh the page while the plugin loads. If the issue persists, check the
Inactive dropdown list on the right for Neuron.
6.4. If TensorBoard failed to find the generated logs, you will see the following message:
In this case, please check the console output on the Deployment Instance where TensorBoard was
launched for any warnings or error messages, and make sure the version of the
Neuron tools package is compatible.
To view how the graph was partitioned to run on NeuronCores, select “Device” under “Graph Color Schemes” in the left navigation bar.
Each operator will be colored according to the device used. In this example, light blue indicates an operator was executed on CPU, and orange indicates the operator was executed on NeuronCores. Operators that are white may have been optimized by the Neuron compiler and fused into another operation.
You can also view how long each operator took by changing to the “Compute time” color scheme.
This view will show time taken by each layer and will be colored according to how much relative time the layer took to compute. A lighter shade of red means that a relatively small portion of compute time was spent in this layer, while a darker red shows that more compute time was used.
The “Compatibility” color scheme allows you to better understand what operators are currently supported by the Neuron compiler - green for compatible ops, red for incompatible ops, and yellow for subgraphs that contain both compatible and incompatible ops.
Additionally, you can choose to filter by CPU and NeuronCores, which will only color ops that match the selected device(s).
Each rectangular node in the graph represents a subgraph that can be expanded or collapsed by clicking on the name. Operators are represented by ellipses, and can be clicked to reveal more information on that operator, such as inputs and execution device.
The Expand All and
Collapse All buttons can be used to expand or collapse every subgraph.
When using these features, the positioning of the graph may change when redrawing the new graph.
Use the Reset Position button and zoom out by scrolling if the graph appears to be missing.
On the right side of the Neuron plugin, information on the profiled inference will be displayed.
First is the “Neuron Performance Summary,” which gives a quick overview of how Neuron executed the graph, including information on the number of NeuronCores and both on-NeuronCore time and on-CPU time.
Next, the “Neuron Execution” section gives more details on how a graph was partitioned for Neuron. Each entry in the table shows the order it was executed in, what type of device was used, the compute time (in microseconds), and the percentage of total time spent. To dive deeper into subgraphs, you can check the “Show Details” box to display the breakdown per NeuronCore.
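To make the relationship between the compute time and percentage columns concrete, the sketch below computes each entry's share of the total from its compute time in microseconds. It is not the plugin's own code, and the device names and timings in the example are made-up illustrative values, not measured results.

```python
from typing import List, Tuple


def percent_of_total(entries: List[Tuple[str, float]]) -> List[Tuple[str, float, float]]:
    """Given (device, compute_time_us) pairs in execution order,
    return (device, compute_time_us, percent_of_total_time) triples."""
    total = sum(time_us for _, time_us in entries)
    if total == 0:
        return [(device, time_us, 0.0) for device, time_us in entries]
    return [(device, time_us, 100.0 * time_us / total) for device, time_us in entries]


if __name__ == "__main__":
    # Hypothetical timings, for illustration only.
    example = [("CPU", 50.0), ("NeuronCore", 150.0)]
    for device, time_us, pct in percent_of_total(example):
        print(f"{device}: {time_us} us, {pct}% of total")
```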
The “Op Time Table” section shows the cycle count per operator, much like the “Compute time” coloring
for graph visualization. This table can be sorted by clicking the column names, and searched using the
provided text box in the top right corner. Due to Neuron compiler optimizations, some of the compute may
not be associated with any specific operator and will be categorized as
unknown. Additionally, time
spent moving data to and from NeuronCores will fall under