.. _model_samples_inference_inf2_trn1:

Inference Samples/Tutorials (Inf2/Trn1)
=======================================

.. contents:: Table of contents
   :local:
   :depth: 1

.. _encoder_model_samples_inference_inf2_trn1:

Encoders
--------

.. list-table::
   :widths: 20 15 45
   :header-rows: 1
   :align: left
   :class: table-smaller-font-size

   * - Model
     - Frameworks/Libraries
     - Samples and Tutorials

   * - bert-base-cased-finetuned-mrpc
     - torch-neuronx
     - * :ref:`BERT TorchServe tutorial `
       * HuggingFace pretrained BERT tutorial :ref:`[html] ` :pytorch-neuron-src:`[notebook] `
       * `LibTorch C++ Tutorial for HuggingFace Pretrained BERT `_
       * `Compiling and Deploying HuggingFace Pretrained BERT on Inf2 on Amazon SageMaker `_

   * - bert-base-cased-finetuned-mrpc
     - neuronx-distributed
     - * :ref:`tp_inference_tutorial`

   * - bert-base-uncased
     - torch-neuronx
     - * `HuggingFace Pretrained BERT Inference on Trn1 `_

   * - distilbert-base-uncased
     - torch-neuronx
     - * `HuggingFace Pretrained DistilBERT Inference on Trn1 `_

   * - roberta-base
     - tensorflow-neuronx
     - * HuggingFace Roberta-Base :ref:`[html]` :github:`[notebook] `

   * - roberta-large
     - torch-neuronx
     - * `HuggingFace Pretrained RoBERTa Inference on Trn1 `_

.. _decoder_model_samples_inference_inf2_trn1:

Decoders
--------

.. list-table::
   :widths: 20 15 45
   :header-rows: 1
   :align: left
   :class: table-smaller-font-size

   * - Model
     - Frameworks/Libraries
     - Samples and Tutorials

   * - gpt2
     - torch-neuronx
     - * `HuggingFace Pretrained GPT2 Feature Extraction on Trn1 `_

   * - meta-llama/Llama-3-8b
     - transformers-neuronx
     - * `Run Hugging Face meta-llama/Llama-3-8b autoregressive sampling on Inf2 & Trn1 `_

   * - meta-llama/Llama-3-70b
     - transformers-neuronx
     - * `Run Hugging Face meta-llama/Llama-3-70b autoregressive sampling on Inf2 & Trn1 `_

   * - meta-llama/Llama-2-13b
     - transformers-neuronx
     - * `Run Hugging Face meta-llama/Llama-2-13b autoregressive sampling on Inf2 & Trn1 `_

   * - meta-llama/Llama-2-70b
     - transformers-neuronx
     - * `Run Hugging Face meta-llama/Llama-2-70b autoregressive sampling on Inf2 & Trn1 `_
       * `Run speculative sampling on Meta Llama models [Beta] `_

   * - meta-llama/Llama-2-7b
     - neuronx-distributed
     - * Run Hugging Face meta-llama/Llama-2-7b autoregressive sampling on Inf2 & Trn1 (:ref:`[html] ` :pytorch-neuron-src:`[notebook] `)

   * - mistralai/Mistral-7B-Instruct-v0.1
     - transformers-neuronx
     - * :ref:`Run Mistral-7B-Instruct-v0.1 autoregressive sampling on Inf2 & Trn1 `

   * - mistralai/Mistral-7B-Instruct-v0.2
     - transformers-neuronx
     - * `Run Hugging Face mistralai/Mistral-7B-Instruct-v0.2 autoregressive sampling on Inf2 & Trn1 [Beta] `_

   * - Mixtral-8x7B-v0.1
     - transformers-neuronx
     - * `Run Hugging Face mistralai/Mixtral-8x7B-v0.1 autoregressive sampling on Inf2 & Trn1 `_

   * - codellama/CodeLlama-13b-hf
     - transformers-neuronx
     - * `Run Hugging Face codellama/CodeLlama-13b-hf autoregressive sampling on Inf2 & Trn1 `_

.. _encoder_decoder_model_samples_inference_inf2_trn1:

Encoder-Decoders
----------------

.. list-table::
   :widths: 20 15 45
   :header-rows: 1
   :align: left
   :class: table-smaller-font-size

   * - Model
     - Frameworks/Libraries
     - Samples and Tutorials

   * - t5-large
     - * torch-neuronx
       * optimum-neuron
     - * T5 inference tutorial :ref:`[html] ` :pytorch-neuron-src:`[notebook] `

   * - t5-3b
     - neuronx-distributed
     - * T5 inference tutorial :ref:`[html] ` :pytorch-neuron-src:`[notebook] `

   * - google/flan-t5-xl
     - neuronx-distributed
     - * flan-t5-xl inference tutorial :ref:`[html] ` :pytorch-neuron-src:`[notebook] `

.. _vision_transformer_model_samples_inference_inf2_trn1:

Vision Transformers
-------------------

.. list-table::
   :widths: 20 15 45
   :header-rows: 1
   :align: left
   :class: table-smaller-font-size

   * - Model
     - Frameworks/Libraries
     - Samples and Tutorials

   * - google/vit-base-patch16-224
     - torch-neuronx
     - * `HuggingFace Pretrained ViT Inference on Trn1 `_

   * - clip-vit-base-patch32
     - torch-neuronx
     - * `HuggingFace Pretrained CLIP Base Inference on Inf2 `_

   * - clip-vit-large-patch14
     - torch-neuronx
     - * `HuggingFace Pretrained CLIP Large Inference on Inf2 `_

.. _cnn_model_samples_inference_inf2_trn1:

Convolutional Neural Networks (CNN)
-----------------------------------

.. list-table::
   :widths: 20 15 45
   :header-rows: 1
   :align: left
   :class: table-smaller-font-size

   * - Model
     - Frameworks/Libraries
     - Samples and Tutorials

   * - resnet50
     - torch-neuronx
     - * `Torchvision Pretrained ResNet50 Inference on Trn1 / Inf2 `_
       * Torchvision ResNet50 tutorial :ref:`[html] ` :pytorch-neuron-src:`[notebook] `

   * - resnet50
     - tensorflow-neuronx
     - * :ref:`tensorflow-servingx-neuronrt-visible-cores`

   * - unet
     - torch-neuronx
     - * `Pretrained UNet Inference on Trn1 / Inf2 `_

   * - vgg
     - torch-neuronx
     - * `Torchvision Pretrained VGG Inference on Trn1 / Inf2 `_

.. _sd_model_samples_inference_inf2_trn1:

Stable Diffusion
----------------

.. list-table::
   :widths: 20 15 45
   :header-rows: 1
   :align: left
   :class: table-smaller-font-size

   * - Model
     - Frameworks/Libraries
     - Samples and Tutorials

   * - stable-diffusion-v1-5
     - torch-neuronx
     - * `HuggingFace Stable Diffusion 1.5 (512x512) Inference on Trn1 / Inf2 `_

   * - stable-diffusion-2-1-base
     - torch-neuronx
     - * `HuggingFace Stable Diffusion 2.1 (512x512) Inference on Trn1 / Inf2 `_

   * - stable-diffusion-2-1
     - torch-neuronx
     - * `HuggingFace Stable Diffusion 2.1 (768x768) Inference on Trn1 / Inf2 `_
       * `Deploy & Run Stable Diffusion on SageMaker and Inferentia2 `_

   * - stable-diffusion-xl-base-1.0
     - torch-neuronx
     - * `HuggingFace Stable Diffusion XL 1.0 (1024x1024) Inference on Inf2 `_
       * `HuggingFace Stable Diffusion XL 1.0 Base and Refiner (1024x1024) Inference on Inf2 `_

   * - stable-diffusion-2-inpainting
     - torch-neuronx
     - * `stable-diffusion-2-inpainting model Inference on Trn1 / Inf2 `_

.. _multi_modal_model_samples_inference_inf2_trn1:

Multi Modal
-----------

.. list-table::
   :widths: 20 15 45
   :header-rows: 1
   :align: left
   :class: table-smaller-font-size

   * - Model
     - Frameworks/Libraries
     - Samples and Tutorials

   * - multimodal-perceiver
     - torch-neuronx
     - * `HuggingFace Multimodal Perceiver Inference on Trn1 / Inf2 `_

   * - language-perceiver
     - torch-neuronx
     - * `HF Pretrained Perceiver Language Inference on Trn1 / Inf2 `_

   * - vision-perceiver-conv
     - torch-neuronx
     - * `HF Pretrained Perceiver Image Classification Inference on Trn1 / Inf2 `_