Getting Started: TensorBoard-Neuron¶
This guide is for developers who want to better understand how their model runs on Neuron Cores.
TensorBoard-Neuron is adapted to provide useful information related to Neuron devices, such as compatibility and profiling. It also preserves TensorBoard’s existing features, including the Debugger plugin, which may be useful in finding numerical mismatches.
When profiling with PyTorch in a conda environment, please re-install the correct version of torch as a workaround for https://github.com/aws/aws-neuron-sdk/issues/230. See the issue for details.
Note: on DLAMI v26.0, please do
conda install numpy=1.17.2 --yes --quiet before following the Conda
installation instructions, as the installed numpy version prevents the
update. See DLAMI with Neuron Release Notes for more info.
This section assumes the Neuron repos have been configured as shown here: Setup Guide
By default, TensorBoard-Neuron will be installed when you install TensorFlow-Neuron.
If using Conda, there is no standalone package for
tensorboard-neuron at this time. It is currently packaged together
with the tensorflow-neuron conda package.
$ pip install tensorflow-neuron
It can also be installed separately.
$ pip install tensorboard-neuron
Additionally, if you would like to profile your model (see below), you will also need to have Neuron tools installed.
$ sudo apt install aws-neuron-tools
Note: TensorBoard does not need to be installed to use TensorBoard-Neuron, and should be replaced with TensorBoard-Neuron if already installed.
$ pip uninstall tensorboard
$ pip install tensorboard-neuron
$ pip install tensorboard-neuron --force-reinstall
If TensorBoard-Neuron is not properly installed, the added
functionalities for AWS Neuron may not work. For example, errors such as
tensorboard: error: unrecognized arguments: --run_neuron_profile may
occur when attempting to profile an inference.
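One quick way to catch a missing installation before profiling is to check that the `tensorboard_neuron` command used later in this guide is on the PATH. The sketch below uses only the Python standard library. The helper name `find_tensorboard_neuron` is hypothetical, and finding the executable does not by itself guarantee that the Neuron-specific options work.

```python
import shutil


def find_tensorboard_neuron(command="tensorboard_neuron"):
    """Return the full path of the command if it is on PATH, else None."""
    return shutil.which(command)


if __name__ == "__main__":
    path = find_tensorboard_neuron()
    if path is None:
        print("tensorboard_neuron not found; try reinstalling tensorboard-neuron")
    else:
        print("tensorboard_neuron found at", path)
```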
TensorBoard-Neuron is included in the tensorflow-neuron conda package, so updating that package also updates TensorBoard-Neuron:
$ conda update tensorflow-neuron
Profile the network and collect inference traces¶
When using TensorFlow-Neuron, MXNet-Neuron, or PyTorch-Neuron, raw profile data is collected if the NEURON_PROFILE environment variable is set. The raw profile is dumped into the directory that NEURON_PROFILE points to.
The steps to do this:
Set the NEURON_PROFILE environment variable to the directory where the profile data should be written.
NOTE: this directory must exist before you move on to the next step. Otherwise, profile data will not be emitted.
Run inference through the framework. See the tutorials for each framework for more info.
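The steps above can be sketched in Python as follows. This is a minimal sketch, not the only way to do it: the directory path and the helper name `enable_neuron_profile` are placeholders chosen for illustration, and only the `NEURON_PROFILE` variable name comes from this guide. It assumes the variable is read when the framework loads and runs the model, so it is set before any inference code executes.

```python
import os
import tempfile


def enable_neuron_profile(profile_dir):
    """Create profile_dir if needed and point NEURON_PROFILE at it."""
    # The directory must exist, otherwise no profile data is emitted.
    os.makedirs(profile_dir, exist_ok=True)
    os.environ["NEURON_PROFILE"] = profile_dir
    return profile_dir


if __name__ == "__main__":
    # Placeholder location; replace with the directory you want to use.
    target = os.path.join(tempfile.gettempdir(), "neuron_profile_example")
    enable_neuron_profile(target)
    print("NEURON_PROFILE =", os.environ["NEURON_PROFILE"])
    # Run inference through your framework after this point.
```

Exporting the variable in the shell that launches the inference script achieves the same thing, as long as the directory already exists.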
Visualizing data with TensorBoard-Neuron¶
To view data in TensorBoard-Neuron, run the command below, where
“logdir” is the directory where TensorFlow logs are kept. This logdir
may or may not contain existing logs, and may not even exist yet. AWS
Neuron will populate this directory when using the
--run_neuron_profile option. (Note that this “logdir” is not the
same as the NEURON_PROFILE directory that you set during inference, and
in fact, depending on your configuration, you may not have any TensorFlow
logs.) For this step, NEURON_PROFILE still needs to be set to the same
directory you used during your inference run, because TensorBoard-Neuron
will process the neuron profile data from the NEURON_PROFILE directory.
$ tensorboard_neuron --logdir /path/to/logdir --run_neuron_profile
By default, TensorBoard-Neuron is launched at “localhost:6006.” The URL can be changed by specifying the “--host” and “--port” options.
Now, in a browser, visit localhost:6006 to view the visualization, or enter the host and port you specified above.
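When TensorBoard-Neuron is launched from a script rather than by hand, it can help to assemble the command in one place. The sketch below only builds the argument list from the options named in this section (`--logdir`, `--run_neuron_profile`, `--host`, `--port`). The function name `build_tensorboard_neuron_command` and the example values are hypothetical.

```python
import os


def build_tensorboard_neuron_command(logdir, host=None, port=None):
    """Build the argument list for launching TensorBoard-Neuron with profiling."""
    command = ["tensorboard_neuron", "--logdir", logdir, "--run_neuron_profile"]
    if host is not None:
        command += ["--host", host]
    if port is not None:
        command += ["--port", str(port)]
    return command


if __name__ == "__main__":
    # NEURON_PROFILE must still point at the directory used during inference.
    if "NEURON_PROFILE" not in os.environ:
        print("warning: NEURON_PROFILE is not set")
    # Placeholder logdir and port; pass the list to subprocess.run() to launch.
    print(" ".join(build_tensorboard_neuron_command("/path/to/logdir", port=6007)))
```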
How to: Check Neuron compatibility¶
TensorBoard-Neuron can visualize which operators are supported on Neuron devices. All Neuron-compatible operators run on Neuron Cores, and all other operators run on CPU.
Step 1: Generate the EVENT files¶
Use the TensorFlow APIs to create the event file. See the sample Python code snippet below for TensorFlow:
import tensorflow as tf

graph_file = '/path/to/graph_def.pb'  # Change path here
graph_def = tf.GraphDef()
with open(graph_file, 'rb') as f:
    graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name='')

fw = tf.summary.FileWriter(graph=graph, logdir='/path/to/logdir')  # Change logdir here
fw.flush()
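To confirm that an event file was actually written before moving on, the logdir can be inspected with the standard library alone. This sketch assumes the usual TensorFlow naming scheme, in which event files start with `events.out.tfevents.`. The helper name `find_event_files` is hypothetical.

```python
import glob
import os


def find_event_files(logdir):
    """Return the sorted event files found directly inside logdir."""
    pattern = os.path.join(logdir, "events.out.tfevents.*")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    files = find_event_files("/path/to/logdir")  # Change logdir here
    print("found %d event file(s)" % len(files))
```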
Step 3: Select “Neuron Compatibility”¶
In the navigation pane on the left, under the “Color” section, select “Neuron Compatibility.”
Step 4: View compatible operators¶
Now, the graph should be colored red and/or green. Green indicates that an operator is compatible with Neuron devices, while red indicates that the operator is currently not supported. If there are unsupported operators, their names will all be listed under the “Incompatible Operations” section.
How to: Visualize graphs run on a Neuron device¶
After successfully analyzing the profiled run on a Neuron device, you can launch TensorBoard-Neuron to view the graph and see how much time each operator is taking.
Step 1: Generate the Files¶
This step requires Neuron tools in order to work.
Step 3: Select the “neuron_profile” tag¶
The “neuron_profile” tag contains timing information regarding the inference you profiled.
Step 4: Select “Compute Time”¶
In the navigation pane on the left, under the “Color” section, select “Compute time.”
Step 5: View time taken by various layers¶
This view will show time taken by each layer and will be colored according to how much relative time the layer took to compute. A lighter shade of red means that a relatively small portion of compute time was spent in this layer, while a darker red shows that more compute time was used. Some layers may also be blank, which indicates that these layers may have been optimized out to improve inference performance. Clicking on a node will show the compute time, if available.
How to: View detailed profile using the Neuron Profile plugin¶
To get a better understanding of the profile, you can check out the Neuron Profile plugin. Here, you will find more information on the inference, including an overview, a list of the most time-consuming operators (op profile tool), and an execution timeline view (Chrome trace).
Step 1: Generate the files¶
This step requires Neuron tools in order to work.
Step 3: Select the “Neuron Profile” plugin¶
On the navigation bar at the top of the page, there will be a list of active plugins. In this case, you will need to use the “Neuron Profile” plugin. The plugin may take a while to register on first load. If this tab does not show initially, please refresh the page.
Step 4a: The profile overview¶
The first page you will land on in the Neuron Profile plugin is the overview page. It contains various information regarding the inference. In the “Performance Summary” section, you will see execution stats, such as the total execution time, the average layer execution time, and the utilization of NeuronMatrix Units.
The “Neuron Time Graph” shows how long a portion of the graph (a NeuronOp) took to execute.
The “Top TensorFlow operations executed on Neuron Cores” section gives a quick summary of the most time-consuming operators that were executed on the device.
“Run Environment” shows the information on devices used during this inference.
Finally, the “Recommendation for Next Steps” section gives helpful pointers to places where you can learn more about what to do next.
Step 4b: The operator profile¶
In the “Tools” dropdown menu, select “op_profile.”
The “op profile” tool displays the percentage of overall time taken for each operator, sorted by the most expensive operators at the top. It gives a better understanding of where the bottlenecks in a model may be.
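The ranking described above amounts to dividing each operator's time by the total and sorting in descending order. The sketch below illustrates that arithmetic only. It is not the plugin's implementation, and the operator names and timings are made up for illustration.

```python
def op_time_percentages(op_times):
    """Return (operator, percent_of_total) pairs, most expensive first."""
    total = sum(op_times.values())
    if total == 0:
        return [(name, 0.0) for name in sorted(op_times)]
    shares = [(name, 100.0 * t / total) for name, t in op_times.items()]
    return sorted(shares, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up timings in microseconds, for illustration only.
    example = {"conv1": 600.0, "relu1": 100.0, "matmul": 300.0}
    for name, percent in op_time_percentages(example):
        print("%-8s %5.1f%%" % (name, percent))
```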
Step 4c: Chrome trace¶
In the “Tools” dropdown menu, select “trace_viewer.”
The Chrome trace view helps developers better understand the timeline of the inference. It shows the history of execution, organized by operator name.
Please note that this tool can only be used in Chrome browsers.
How to: Debug an inference¶
To make use of the Debugger plugin, you must specify your desired output tensors before creating the saved model. See Neuron TensorFlow Serving for how to create the saved model. Essentially, adding these tensors to the “outputs” dictionary will allow you to view them in the debugger later on.
Please note that this feature is currently only available for TensorFlow users.
Step 2: Modify and run your inference script¶
In order to run the inference in “debug mode,” you must use TensorFlow’s debug wrapper. The following lines will need to be added to your script.
from tensorflow.python import debug as tf_debug

# The port must be the same as the one used for --debugger_port above
# in this example, PORT is 7000
DEBUG_SERVER_ADDRESS = 'localhost:7000'

# create your TF session here

sess = tf_debug.TensorBoardDebugWrapperSession(
    sess, DEBUG_SERVER_ADDRESS)

# run inference using the wrapped session
After adding these modifications, run the script to begin inference. The execution will be paused before any calculation starts.
Step 3: Select the “Debugger” plugin¶
On the navigation bar at the top of the page, there will be a list of active plugins. In this case, you will need to use the “Debugger” plugin.
Step 4: Enable watchpoints¶
In the “Runtime Node List” on the left, there will be a list of operators with a checkbox next to each. Select all of the operators whose tensor output you would like to view.
Step 5: Execute inference¶
On the bottom left of the page, there will be a “Continue…” button that will resume the inference execution. As the graph is executed, output tensors will be saved for later viewing.
Step 6: View tensors¶
At the bottom of the page, there will be a “Tensor Value Overview” section that shows a summary of all the output tensors that were selected as watchpoints in Step 4. To view more specific information on a tensor, you can click on the tensor’s value. You may also hover over the bar in the “Health Pill” column for a more detailed summary of values.