Bring Your Own Neuron Container to SageMaker Hosting

Neuron developer flow on SageMaker

You can use a SageMaker Notebook or an EC2 instance to compile models and build your own containers for deployment on SageMaker Hosting using ml.inf1 instances. In this developer flow, you provision a SageMaker Notebook or an EC2 instance to train and compile your model for Inferentia. Then you deploy your model to SageMaker Hosting using the SageMaker Python SDK. Follow the steps below to set up your environment. Once your environment is set up, you'll be able to follow the BYOC HuggingFace pretrained BERT container to SageMaker tutorial.

Setup Environment

  1. Create a Compilation Instance:

    If using an EC2 instance for compilation, you can use an Inf1 instance to compile and test a model. Follow these steps to launch an Inf1 instance:

    • Follow the instructions at Launch an Amazon EC2 Instance to launch an Inf1 instance. When choosing the instance type in the EC2 console, make sure to select an Inf1 instance type. For more information about Inf1 instance sizes and pricing, see the Inf1 web page.

    • When choosing an Amazon Machine Image (AMI), make sure to select a Deep Learning AMI with Conda options. Note that Neuron Conda environments are supported only in the Ubuntu 18 DLAMI and the Amazon Linux 2 DLAMI; they are not supported in the Amazon Linux DLAMI.

    • After launching the instance, follow the instructions in Connect to your instance to connect to it.


    You can also launch the instance from the AWS CLI; see AWS CLI commands to launch Inf1 instances.
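    As a sketch of the launch step above, the same parameters the AWS CLI takes can be passed to the EC2 API through boto3. The AMI ID and key pair name below are placeholders, not values from this document; look up the current Deep Learning AMI ID for your region before launching.

```python
# Hedged sketch: launching an Inf1 instance programmatically instead of
# through the EC2 console. All identifiers below are placeholders.
run_instances_kwargs = {
    "ImageId": "ami-0123456789abcdef0",  # placeholder: DLAMI ID for your region
    "InstanceType": "inf1.xlarge",       # smallest Inf1 size; see the Inf1 web page
    "KeyName": "my-key-pair",            # placeholder: your EC2 key pair
    "MinCount": 1,
    "MaxCount": 1,
}

# With AWS credentials configured, this call would launch the instance:
# import boto3
# boto3.client("ec2").run_instances(**run_instances_kwargs)
```

    The equivalent `aws ec2 run-instances` CLI command accepts the same parameter names in `--kebab-case` form.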

    If using a SageMaker Notebook for compilation, follow the instructions in Get Started with Notebook Instances to provision the environment.

    It is recommended that you start with an ml.c5.4xlarge instance for compilation. Also, increase the volume size of your SageMaker Notebook instance to accommodate the models and containers built locally. A volume of 10 GB is sufficient.


    To compile the model in the SageMaker Notebook instance, you'll need to update the Conda environments to include the Neuron compiler and Neuron framework extensions. Follow the installation guide in the section How to update to latest Neuron packages in DLAMI Conda Environments? to update the environments.

  2. Set up the environment to compile a model, build your own container and deploy:

    To compile your model on EC2 or SageMaker Notebook, follow the Set up a development environment section on the EC2 Setup Environment documentation.

    Refer to Adapting Your Own Inference Container documentation for information on how to bring your own containers to SageMaker Hosting.
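    The inference contract that the Adapting Your Own Inference Container documentation describes boils down to a model server that answers `GET /ping` (health check) and `POST /invocations` (inference) on port 8080. The stdlib sketch below illustrates that contract only; the echo response is a placeholder for real model code, and a production container would use a proper model server rather than `http.server`.

```python
# Minimal sketch of the HTTP contract SageMaker Hosting expects from a
# bring-your-own container. The inference logic here is a placeholder echo.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class SageMakerStyleHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # SageMaker probes /ping and expects HTTP 200 when the model is ready.
        if self.path == "/ping":
            self.send_response(200)
            self.end_headers()
        else:
            self.send_response(404)
            self.end_headers()

    def do_POST(self):
        # Inference requests arrive on /invocations with the request payload
        # in the body; real code would run the compiled Neuron model here.
        if self.path == "/invocations":
            length = int(self.headers.get("Content-Length", 0))
            payload = self.rfile.read(length)
            body = json.dumps({"echo": payload.decode()}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        # Keep request logging quiet in this sketch.
        pass


def make_server(port=8080):
    """Bind the placeholder model server; SageMaker expects port 8080."""
    return HTTPServer(("127.0.0.1", port), SageMakerStyleHandler)
```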

    Make sure to attach the AmazonEC2ContainerRegistryPowerUser managed policy to your IAM role, so you're able to build and push containers from your SageMaker Notebook instance.
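    The policy attachment can be scripted with boto3; a sketch follows. The role name is a placeholder, but the managed policy ARN is the standard AWS one.

```python
# Hedged sketch: attach the AmazonEC2ContainerRegistryPowerUser managed
# policy so the notebook instance can push images to Amazon ECR.
POWER_USER_POLICY_ARN = (
    "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryPowerUser"
)

attach_role_policy_kwargs = {
    "RoleName": "MySageMakerRole",  # placeholder: your notebook execution role
    "PolicyArn": POWER_USER_POLICY_ARN,
}

# With credentials that allow iam:AttachRolePolicy, this would apply it:
# import boto3
# boto3.client("iam").attach_role_policy(**attach_role_policy_kwargs)
```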


    You can use the Neuron version of the AWS Deep Learning Containers as a base to build your own container images. Refer to the Containers section of our documentation for more information.
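    Once the container is built and pushed to ECR, the deployment step from the flow above uses the SageMaker Python SDK. The sketch below only assembles the arguments; the ECR image URI, S3 model path, and role name are placeholders you must substitute with your own values.

```python
# Hedged sketch: deploying the custom Neuron container to SageMaker
# Hosting on an Inf1 instance. All names below are placeholders.
model_kwargs = {
    # Placeholder: ECR URI of the image built from the Neuron DLC base.
    "image_uri": "<account-id>.dkr.ecr.<region>.amazonaws.com/my-neuron-container:latest",
    "model_data": "s3://my-bucket/model/model.tar.gz",  # placeholder model artifact
    "role": "MySageMakerRole",                          # placeholder execution role
}

deploy_kwargs = {
    "initial_instance_count": 1,
    "instance_type": "ml.inf1.xlarge",  # Inferentia-backed hosting instance
}

# With the SageMaker Python SDK installed and AWS credentials configured:
# from sagemaker.model import Model
# predictor = Model(**model_kwargs).deploy(**deploy_kwargs)
```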