This document is relevant for: Inf2, Trn1, Trn2, Trn3
PyTorch NeuronX Analyze API for Inference
- torch_neuronx.analyze(func, example_inputs, compiler_workdir=None)
Checks the support of the operations in the func by checking each operator against neuronx-cc.
- Parameters:
func (Module, callable) – The function/module that will be run using the example_inputs arguments in order to record the computation graph.
example_inputs (Tensor, tuple[Tensor]) – A tuple of example inputs that will be passed to the func while tracing.
- Keyword Arguments:
compiler_workdir (str) – Work directory used by neuronx-cc. This can be useful for debugging and/or inspecting intermediate neuronx-cc outputs.
additional_ignored_ops (set) – A set of aten operators to not analyze. Default is an empty set.
max_workers (int) – The maximum number of worker threads to spawn. Default is 4.
is_hf_transformers (bool) – If the model is a Hugging Face transformers model, enabling this option is recommended to prevent deadlocks. Default is False.
cleanup (bool) – Specifies whether to delete the compiler artifact directories generated after running analyze. Default is False.
- Returns:
A JSON-like Dict with the supported operators and their counts, and the unsupported operators with the failure mode and the location of each operator in the Python code.
- Return type:
Dict
Note
This function is meant to evaluate operator support for a model that is intended to be traced. The resulting information can be used to replace unsupported operators with supported ones, or to guide custom partitioning of the model.
Note that this API does not return a traced model.
Just like torch_neuronx.trace, this API can be used on any EC2 machine with sufficient memory and compute resources.
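As a minimal sketch of how the keyword arguments listed above might be combined, the helper below collects them into a dictionary that can be unpacked into the call. The helper name `build_analyze_kwargs`, the work directory path, and the chosen values are illustrative assumptions and not part of torch_neuronx; only the keyword names come from the parameter list.

```python
def build_analyze_kwargs(workdir, ignored_ops=(), hf_model=False):
    """Collect the documented keyword arguments of torch_neuronx.analyze.

    Hypothetical helper: only the keyword names are taken from the
    parameter list; the values chosen here are illustrative.
    """
    return {
        "compiler_workdir": workdir,                  # where neuronx-cc writes its outputs
        "additional_ignored_ops": set(ignored_ops),   # aten operators to skip
        "max_workers": 4,                             # the documented default
        "is_hf_transformers": hf_model,               # recommended for Hugging Face models
        "cleanup": False,                             # keep artifacts for inspection
    }


kwargs = build_analyze_kwargs("./analyze_workdir")

# On a machine with torch_neuronx installed, the call would look like:
#   model_support = torch_neuronx.analyze(model, ex_input, **kwargs)
```

Keeping the options in one dictionary makes it easy to reuse the same settings across several models.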
Examples
Fully supported model
    import json

    import torch
    import torch.nn as nn
    import torch_neuronx

    class MLP(nn.Module):
        def __init__(self, input_size=28*28, output_size=10, layers=[120,84]):
            super(MLP, self).__init__()
            self.fc1 = nn.Linear(input_size, layers[0])
            self.relu = nn.ReLU()
            self.fc2 = nn.Linear(layers[0], layers[1])
            # Third linear layer, required by forward() below
            self.fc3 = nn.Linear(layers[1], output_size)

        def forward(self, x):
            f1 = self.fc1(x)
            r1 = self.relu(f1)
            f2 = self.fc2(r1)
            r2 = self.relu(f2)
            f3 = self.fc3(r2)
            return torch.log_softmax(f3, dim=1)

    model = MLP()
    ex_input = torch.rand([32,784])

    # Analyze operator support; this does not return a traced model
    model_support = torch_neuronx.analyze(model,ex_input)
    print(json.dumps(model_support,indent=4))
    {
        "torch_neuronx_version": "1.13.0.1.5.0",
        "neuronx_cc_version": "2.0.0.11796a0+24a26e112",
        "support_percentage": "100.00%",
        "supported_operators": {
            "aten::linear": 3,
            "aten::relu": 2,
            "aten::log_softmax": 1
        },
        "unsupported_operators": []
    }
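The report shown above is a plain dictionary, so it can also be inspected programmatically instead of being printed. The following is a minimal sketch that assumes the keys appear exactly as in this example output; the function name `summarize_support` is hypothetical, and the report is copied by hand so that the snippet runs without torch_neuronx installed.

```python
def summarize_support(report):
    """Return (fully_supported, total_supported_ops, unsupported_kinds)."""
    supported = report.get("supported_operators", {})
    unsupported = report.get("unsupported_operators", [])
    # Sum the per-operator counts of everything that compiled
    total_supported = sum(supported.values())
    # Collect the distinct operator kinds that failed
    unsupported_kinds = sorted({op["kind"] for op in unsupported})
    return len(unsupported) == 0, total_supported, unsupported_kinds


# Report copied from the example output above.
report = {
    "torch_neuronx_version": "1.13.0.1.5.0",
    "neuronx_cc_version": "2.0.0.11796a0+24a26e112",
    "support_percentage": "100.00%",
    "supported_operators": {
        "aten::linear": 3,
        "aten::relu": 2,
        "aten::log_softmax": 1,
    },
    "unsupported_operators": [],
}

fully_supported, total_ops, missing = summarize_support(report)
print(fully_supported, total_ops, missing)
```

A check like this could gate a later call to torch_neuronx.trace, so that tracing is only attempted when no unsupported operators are reported.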
Unsupported Model/Operator
    import json
    import torch
    import torch_neuronx

    def fft(x):
        return torch.fft.fft(x)

    model = fft
    ex_input = torch.arange(4)

    model_support = torch_neuronx.analyze(model,ex_input)
    print(json.dumps(model_support,indent=4))
    {
        "torch_neuronx_version": "1.13.0.1.5.0",
        "neuronx_cc_version": "2.0.0.11796a0+24a26e112",
        "support_percentage": "0.00%",
        "supported_operators": {},
        "unsupported_operators": [
            {
                "kind": "aten::fft_fft",
                "failureAt": "neuronx-cc",
                "call": "test.py(6): fft\n/home/ubuntu/testdir/venv/lib/python3.8/site-packages/torch_neuronx/xla_impl/analyze.py(35): forward\n/home/ubuntu/testdir/venv/lib/python3.8/site-packages/torch/nn/modules/module.py(1182): _slow_forward\n/home/ubuntu/testdir/venv/lib/python3.8/site-packages/torch/nn/modules/module.py(1194): _call_impl\n/home/ubuntu/testdir/venv/lib/python3.8/site-packages/torch/jit/_trace.py(976): trace_module\n/home/ubuntu/testdir/venv/lib/python3.8/site-packages/torch/jit/_trace.py(759): trace\n/home/ubuntu/testdir/venv/lib/python3.8/site-packages/torch_neuronx/xla_impl/analyze.py(302): analyze\ntest.py(11): <module>\n",
                "opGraph": "graph(%x : Long(4, strides=[1], requires_grad=0, device=cpu),\n %neuron_4 : NoneType,\n %neuron_5 : int,\n %neuron_6 : NoneType):\n %neuron_7 : ComplexFloat(4, strides=[1], requires_grad=0, device=cpu) = aten::fft_fft(%x, %neuron_4, %neuron_5, %neuron_6)\n return (%neuron_7)\n"
            }
        ]
    }
Note: the failureAt field can be either “neuronx-cc” or “Lowering to HLO”. If the field is “neuronx-cc”, the provided operator configuration failed to compile with neuronx-cc. This could indicate either that the operator configuration is unsupported or that there is a bug with that operator configuration.
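When a report lists several unsupported operators, grouping them by failureAt can help separate the two failure modes. The sketch below assumes the entry layout shown in the example output, and that the first line of the call field is the frame where the operator was invoked, as it is there. The function name `group_failures` is hypothetical, and the sample entry uses an abbreviated call stack.

```python
from collections import defaultdict


def group_failures(report):
    """Group unsupported operators by their failureAt value."""
    grouped = defaultdict(list)
    for op in report.get("unsupported_operators", []):
        # Assumption: the first line of "call" is the frame that invoked the operator
        frames = op.get("call", "").splitlines()
        location = frames[0] if frames else "<unknown>"
        grouped[op["failureAt"]].append((op["kind"], location))
    return dict(grouped)


# Entry modeled on the example output above, with the call stack abbreviated.
report = {
    "unsupported_operators": [
        {
            "kind": "aten::fft_fft",
            "failureAt": "neuronx-cc",
            "call": "test.py(6): fft\ntest.py(11): <module>\n",
        }
    ]
}

grouped = group_failures(report)
for stage, failures in grouped.items():
    for kind, location in failures:
        print(f"{stage}: {kind} at {location}")
```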