This document is relevant for: Inf1

Deploy Neuron Container on Elastic Kubernetes Service (EKS)#


Neuron developer flow for DLC on EKS

You can use the Neuron version of the AWS Deep Learning Containers to run inference on Amazon Elastic Kubernetes Service (EKS). In this developer flow, you set up an EKS cluster with Inf1 instances, create a Kubernetes manifest for your inference service and deploy it to your cluster. This developer flow assumes:

  1. The model has already been compiled through Compilation with Framework API on EC2 instance or through Compilation with Sagemaker Neo.

  2. Your container is already set up to retrieve the compiled model from storage.

Setup Environment#

  1. Install prerequisites:

    Follow these instructions to install or upgrade the eksctl command line utility on your local computer.

    Follow these instructions to install kubectl on the same computer. kubectl is a command line tool for working with Kubernetes clusters.

  2. Follow the instructions in this EKS documentation link to set up AWS Inferentia on your EKS cluster. In the YAML deployment manifest shown in that link, replace the image in the containers specification with the one you built using Tutorial How to Build and Run a Neuron Container above.
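As a minimal sketch, the containers specification described above might look like the following. All names, the image URI, and the replica count are placeholders you must replace with your own; the `aws.amazon.com/neuron` resource limit requests one Inferentia device per pod.

```yaml
# Hedged sketch of a deployment manifest for a Neuron container on EKS.
# Replace the placeholder image URI with the image you built and pushed to ECR.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: neuron-inference          # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: neuron-inference
  template:
    metadata:
      labels:
        app: neuron-inference
    spec:
      containers:
        - name: neuron-container
          image: <account-id>.dkr.ecr.<region>.amazonaws.com/<your-neuron-image>:latest
          resources:
            limits:
              aws.amazon.com/neuron: 1   # request one Inferentia device
```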


    Before deploying your manifest to your EKS cluster, make sure to push the image to ECR. Refer to Pushing a Docker image for more information.
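The push step can be sketched as follows. The account ID, region, and repository name are placeholders, and the actual login/tag/push commands (shown commented out) require AWS credentials and a local Docker daemon.

```shell
# Hedged sketch: push the Neuron container image to Amazon ECR before deploying.
# The values below are placeholders; substitute your own account, region, and repo.
AWS_ACCOUNT_ID=123456789012
AWS_REGION=us-west-2
REPO_NAME=neuron-container
ECR_URI="${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/${REPO_NAME}"
echo "Target image: ${ECR_URI}:latest"

# The actual push (requires AWS credentials and a running Docker daemon):
# aws ecr get-login-password --region "${AWS_REGION}" | \
#   docker login --username AWS --password-stdin "${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com"
# docker tag "${REPO_NAME}:latest" "${ECR_URI}:latest"
# docker push "${ECR_URI}:latest"
```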

Self-managed Kubernetes#

Please refer to Kubernetes environment setup for Neuron. In Deploy a TensorFlow Resnet50 model as a Kubernetes service, the container image referenced in the YAML manifest is created using Tutorial How to Build and Run a Neuron Container.
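For illustration, a Service exposing the ResNet50 deployment from the linked tutorial might look like the sketch below. The name, selector labels, and port are assumptions and must match your own deployment manifest; port 8500 is the gRPC port commonly used by TensorFlow Serving.

```yaml
# Hedged sketch: a Service fronting the ResNet50 pods from the linked tutorial.
# Labels and port are assumptions; align them with your deployment manifest.
apiVersion: v1
kind: Service
metadata:
  name: resnet50-service          # placeholder name
spec:
  selector:
    app: resnet50                 # must match the deployment's pod labels
  ports:
    - port: 8500                  # TensorFlow Serving gRPC port (assumed)
      targetPort: 8500
```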
