Index
A | B | C | D | E | F | G | I | L | M | N | O | P | R | S | T | V | W | Z
A
abs (C++ function)
abs_out (C++ function)
accessor (C++ function), [1]
add (C++ function), [1]
add_out (C++ function), [1]
B
benchmark()
    built-in function
BF16
bitwise_and (C++ function), [1], [2]
bitwise_and_out (C++ function), [1], [2]
bitwise_not (C++ function)
bitwise_not_out (C++ function)
bitwise_or (C++ function), [1], [2]
bitwise_or_out (C++ function), [1], [2]
built-in function
    benchmark()
    compile()
    get_reports()
    model_index.append()
    model_index.copy()
    model_index.create()
    model_index.filter()
    model_index.load()
    model_index.move()
    model_index.save()
    print_reports()
    torch.neuron.DataParallel()
    torch.neuron.DataParallel.disable_dynamic_batching()
    torch_neuron.trace()
    torch_neuronx.analyze()
    torch_neuronx.dynamic_batch()
    torch_neuronx.experimental.multicore_context()
    torch_neuronx.experimental.neuron_cores_context()
    torch_neuronx.experimental.profiler.profile()
    torch_neuronx.experimental.profiler.profile.start()
    torch_neuronx.experimental.set_multicore()
    torch_neuronx.experimental.set_neuron_cores()
    torch_neuronx.trace()
    write_csv()
    write_json()
C
CCE
ceil (C++ function)
ceil_out (C++ function)
cFP8
clamp (C++ function)
clamp_out (C++ function)
close (C++ function), [1]
Collective Communication Engine
compile()
    built-in function
cos (C++ function)
cos_out (C++ function)
CustomOps
D
div (C++ function), [1]
div_out (C++ function), [1]
DP
DPr
E
empty (C++ function)
exp (C++ function)
exp_out (C++ function)
eye (C++ function)
F
fill_ (C++ function)
FLOAT32_TO_FLOAT16 (torch_neuron.Optimization attribute)
floor (C++ function)
floor_out (C++ function)
FP16
FP32
full (C++ function)
G
get_accessor_coherence_policy (C++ function)
get_reports()
    built-in function
GPSIMD Engine
I
Inf1
Inferentia
L
log (C++ function)
log10 (C++ function)
log10_out (C++ function)
log2 (C++ function)
log2_out (C++ function)
log_out (C++ function)
M
model_index.append()
    built-in function
model_index.copy()
    built-in function
model_index.create()
    built-in function
model_index.filter()
    built-in function
model_index.load()
    built-in function
model_index.move()
    built-in function
model_index.save()
    built-in function
module
    placement
mul (C++ function), [1]
mul_out (C++ function), [1]
N
NC
ND
Neuron Device
neuron-cc
    neuron-cc command line option, [1], [2]
neuron-cc command line option
    neuron-cc, [1], [2]
neuron-ls
    neuron-ls command line option
neuron-ls command line option
    neuron-ls
neuron-monitor
    neuron-monitor command line option
neuron-monitor command line option
    neuron-monitor
NeuronCore, [1]
NeuronCore-v1
NeuronCore-v2
NeuronDevice
NeuronLink
NeuronLink-v1
NeuronLink-v2
neuronx-cc
    neuronx-cc command line option, [1], [2]
neuronx-cc command line option
    neuronx-cc, [1], [2]
nrt_add_tensor_to_tensor_set (C function)
nrt_allocate_tensor_set (C function)
nrt_close (C function)
nrt_destroy_tensor_set (C function)
nrt_execute (C function)
nrt_execute_repeat (C function)
nrt_free_model_tensor_info (C function)
nrt_get_model_instance_count (C function)
nrt_get_model_nc_count (C function)
nrt_get_model_tensor_info (C function)
nrt_get_tensor_from_tensor_set (C function)
nrt_get_total_nc_count (C function)
nrt_get_version (C function)
nrt_get_visible_nc_count (C function)
nrt_init (C function)
nrt_load (C function)
nrt_load_collectives (C function)
nrt_profile_start (C function)
nrt_profile_stop (C function)
nrt_tensor_allocate (C function)
nrt_tensor_allocate_empty (C function)
nrt_tensor_allocate_slice (C function)
nrt_tensor_attach_buffer (C function)
nrt_tensor_free (C function)
nrt_tensor_get_size (C function)
nrt_tensor_get_va (C function)
nrt_tensor_read (C function)
nrt_tensor_write (C function)
nrt_unload (C function)
O
ones (C++ function)
operator= (C++ function), [1]
P
placement
    module
pow (C++ function), [1], [2]
pow_out (C++ function), [1], [2]
PP
PPr
print_reports()
    built-in function
R
read (C++ function)
read_stream_accessor (C++ function)
RNE
RT
S
Scalar Engine
ScalEng
set_accessor_coherence_policy (C++ function)
sin (C++ function)
sin_out (C++ function)
SR
sub (C++ function)
sub_out (C++ function), [1]
Sync Engine
SyncEng
T
tan (C++ function)
tan_out (C++ function)
tcm_accessor (C++ function), [1]
tcm_to_tensor (C++ function)
TensEng
Tensor Engine
tensor_to_tcm (C++ function)
TF32
torch.neuron.DataParallel()
    built-in function
torch.neuron.DataParallel.disable_dynamic_batching()
    built-in function
torch::neuron::tcm_free (C++ function)
torch::neuron::tcm_malloc (C++ function)
torch_neuron.experimental.multicore_context() (in module placement)
torch_neuron.experimental.neuron_cores_context() (in module placement)
torch_neuron.experimental.set_multicore() (in module placement)
torch_neuron.experimental.set_neuron_cores() (in module placement)
torch_neuron.Optimization (built-in class)
torch_neuron.trace()
    built-in function
torch_neuronx.analyze()
    built-in function
torch_neuronx.dynamic_batch()
    built-in function
torch_neuronx.experimental.multicore_context()
    built-in function
torch_neuronx.experimental.neuron_cores_context()
    built-in function
torch_neuronx.experimental.profiler.profile()
    built-in function
torch_neuronx.experimental.profiler.profile.start()
    built-in function
torch_neuronx.experimental.set_multicore()
    built-in function
torch_neuronx.experimental.set_neuron_cores()
    built-in function
torch_neuronx.trace()
    built-in function
TP
TPr
Trainium
Trn1
V
VecEng
Vector Engine
W
write (C++ function)
write_csv()
    built-in function
write_json()
    built-in function
write_stream_accessor (C++ function)
Z
zeros (C++ function)