AWS Neuron Documentation

Developer Guide (torch-neuron)

This document is relevant for: Inf1

  • Running Inference on Variable Input Shapes with Bucketing

  • Data Parallel Inference on PyTorch Neuron

  • Developer Guide - PyTorch Neuron (torch-neuron) LSTM Support

  • PyTorch Neuron (torch-neuron) Core Placement

By AWS
© Copyright 2023, Amazon.com.