Welcome to Neuron SDK’s documentation!
AWS Neuron is the SDK for AWS Inferentia, Amazon's custom-designed machine learning chip. It includes a deep learning compiler, a runtime, and tools that are natively integrated into TensorFlow, PyTorch, and MXNet to deliver high-performance deep learning inference applications.
With Neuron, you can develop, profile, and deploy high-performance inference workloads on Inferentia-based EC2 Inf1 instances.
Neuron developer flow
Because Neuron is pre-integrated with popular frameworks, ML applications can migrate to Neuron with little effort and serve high-performance inference predictions. Neuron lets customers train anywhere and run their production predictions on Inferentia. Developers can train their models in FP16, or keep training in 32-bit floating point for best accuracy, in which case Neuron auto-casts the 32-bit trained model to bfloat16 so that it runs at 16-bit speed.
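To give a feel for what casting a 32-bit value to bfloat16 means numerically, here is a minimal sketch in plain Python. It relies only on the general definition of bfloat16 (the upper 16 bits of an IEEE 754 float32: 1 sign bit, 8 exponent bits, 7 mantissa bits). The helper names are hypothetical, and round-to-nearest-even is an assumption made for illustration; this is not the Neuron compiler's implementation, and the rounding it actually applies is not stated here.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for x (hypothetical helper).

    Assumes round-to-nearest-even; Neuron's actual rounding may differ.
    """
    # Reinterpret the value as its 32-bit IEEE 754 single-precision pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]

    # NaN: keep the top 16 bits and force a mantissa bit so it stays a NaN.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return ((bits >> 16) | 0x0040) & 0xFFFF

    # Round to nearest, ties to even, then drop the low 16 mantissa bits.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bfloat16_bits_to_float32(b: int) -> float:
    """Expand a bfloat16 bit pattern back to a float by zero-filling the low bits."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


def round_trip(x: float) -> float:
    """Value that x takes after being cast to bfloat16 and back."""
    return bfloat16_bits_to_float32(float32_to_bfloat16_bits(x))


if __name__ == "__main__":
    for value in (1.0, 0.1, 3.14159, -2.5e-3):
        print(f"{value!r:>10} -> {round_trip(value)!r}")
```

Because bfloat16 keeps the full 8-bit exponent of float32, the dynamic range of the trained weights is preserved and only mantissa precision is reduced, which is why a model trained in 32-bit floating point can usually be cast without retraining.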
Once a model is trained to the required accuracy, it is compiled to an optimized binary in the Neuron Executable File Format (NEFF), which the Neuron runtime driver then loads to execute inference requests on the Inferentia chips. The compilation step can be performed on any EC2 instance or on-premises.
If none of the GitHub and online resources answer your question, check out the AWS Neuron support forum.
- Getting started
- Neuron Install Guide
- What’s New
- Neuron Fundamentals
- Neuron Compiler
- Neuron Runtime
- Neuron Tools
- Neuron Frameworks
- Performance Optimization
- Profiling and debugging
- Neuron Containers
- Tutorials & Examples
- Tutorial: Docker environment setup for Neuron
- Tutorial: Docker environment setup for Neuron Runtime 1.0
- Tutorial: Kubernetes environment setup for Neuron
- Example: Run containerized neuron application
- Example: Deploy BERT as a k8s service
- Neuron K8 Scheduler Extension
- Neuron Runtime Dockerfile
- tensorflow-model-server-neuron Dockerfile
- What’s New