This document is relevant for: Inf1

Running SSD300 with AWS Neuron#

Update 11/16: The model checkpoint link https://api.ngc.nvidia.com/v2/models/nvidia/ssdpyt_fp32/versions/1/files/nvidia_ssdpyt_fp32_20190225.pt is currently broken and the AWS Neuron team is working on providing an alternative source.

This demo shows a Neuron-compatible SSD300 implementation that is functionally equivalent to the open-source SSD300 model. It uses TensorFlow-Neuron together with the PyTorch SSD300 model and checkpoint (https://pytorch.org/hub/nvidia_deeplearningexamples_ssd/) and also shows the performance achieved on an Inf1 instance.

Table of Contents#

  1. Launch EC2 instance and update AWS Neuron SDK software

  2. Generating Neuron compatible SSD300 TensorFlow SavedModel

    • Convert open source PyTorch SSD300 model and checkpoint into Neuron compatible SSD300 TensorFlow SavedModel

  3. Evaluate the generated SSD300 TensorFlow SavedModel for both accuracy and performance

    • Running threaded inference through the COCO 2017 validation dataset

Launch EC2 instances and update tensorflow-neuron and neuron-cc#

For this demo, launch one inf1.xlarge EC2 instance. We recommend using the latest Ubuntu 18 Deep Learning AMI (DLAMI).

Please configure your ubuntu16/ubuntu18/yum repo by following the steps in the Install TensorFlow Neuron guide in order to install tensorflow-model-server-neuron.

Generating Neuron compatible SSD300 TensorFlow SavedModel#

First, connect to your inf1.xlarge instance.

Compile the open-source PyTorch SSD300 model and checkpoint into a Neuron-compatible SSD300 TensorFlow SavedModel#

In the ssd300_demo directory, run the following steps:

  1. Create a Python venv and install dependencies:

sudo apt update
sudo apt install g++ python3-dev python3-venv unzip
sudo apt install tensorflow-model-server-neuron
python3 -m venv env
source ./env/bin/activate
pip install pip setuptools --upgrade
pip install -r ./requirements.txt --extra-index-url=https://pip.repos.neuron.amazonaws.com
  2. Clone NVIDIA’s DeepLearningExamples repo, which contains PyTorch SSD300.

git clone https://github.com/NVIDIA/DeepLearningExamples.git
cd DeepLearningExamples
git checkout a644350589f9abc91b203f73e686a50f5d6f3e96
cd ..
  3. Download the PyTorch SSD300 checkpoint file.

curl -LO https://api.ngc.nvidia.com/v2/models/nvidia/ssdpyt_fp32/versions/1/files/nvidia_ssdpyt_fp32_20190225.pt
  4. Download the COCO 2017 validation set and annotations.

curl -LO http://images.cocodataset.org/zips/val2017.zip
unzip ./val2017.zip
curl -LO http://images.cocodataset.org/annotations/annotations_trainval2017.zip
unzip ./annotations_trainval2017.zip
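The evaluation client reads ground-truth boxes from annotations/instances_val2017.json, which follows the COCO detection format. As a rough sketch of how a client indexes that file, here is a hypothetical miniature of its layout (the field names are real COCO fields; the specific values are made up for illustration):

```python
import json

# Hypothetical miniature of instances_val2017.json; the real file holds
# ~5,000 images and tens of thousands of annotations in this same shape.
sample = {
    "images": [{"id": 139, "file_name": "000000000139.jpg",
                "width": 640, "height": 426}],
    "annotations": [{"id": 1, "image_id": 139, "category_id": 62,
                     "bbox": [240.0, 170.0, 60.0, 80.0]}],  # [x, y, w, h]
    "categories": [{"id": 62, "name": "chair"}],
}

coco = json.loads(json.dumps(sample))

# Group ground-truth annotations by image id, as an evaluation
# client typically does before scoring detections.
by_image = {}
for ann in coco["annotations"]:
    by_image.setdefault(ann["image_id"], []).append(ann)

print(len(by_image[139]))  # ground-truth boxes for image 139
```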
  5. Convert the PyTorch SSD300 model and checkpoint into a Neuron-compatible TensorFlow SavedModel.

python ssd300_model.py --torch_checkpoint=./nvidia_ssdpyt_fp32_20190225.pt --output_saved_model=./ssd300_tf_neuron/1

This converts the PyTorch SSD300 model and checkpoint into a Neuron-compatible TensorFlow SavedModel using tensorflow-neuron and neuron-cc. The compilation output is stored in ./ssd300_tf_neuron.

  6. Launch the tensorflow-model-server-neuron gRPC server on the default port 8500 in the background.

tensorflow_model_server_neuron --model_base_path=$(pwd)/ssd300_tf_neuron &
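The server looks for numeric version subdirectories under --model_base_path and serves the highest one, which is why the conversion step above wrote the SavedModel to ./ssd300_tf_neuron/1 rather than ./ssd300_tf_neuron directly. A minimal sketch of that layout and version selection, using a placeholder file in place of a real SavedModel:

```python
import os
import tempfile

# Build the directory layout tensorflow_model_server expects:
# <model_base_path>/<version>/saved_model.pb
base = tempfile.mkdtemp()
version_dir = os.path.join(base, "1")
os.makedirs(version_dir)
# A real SavedModel directory contains saved_model.pb plus a variables/
# folder; an empty placeholder file stands in for it here.
open(os.path.join(version_dir, "saved_model.pb"), "w").close()

# The server serves the highest numeric subdirectory.
versions = [d for d in os.listdir(base) if d.isdigit()]
latest = max(versions, key=int)
print(latest)
```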
  7. From the client, evaluate the Neuron-compatible TensorFlow SavedModel for both accuracy and performance. Note that this client by default assumes a tensorflow-model-server-neuron listening at localhost:8500. On inf1.xlarge, the expected throughput is 100 images/second once the server is fully warmed up, and the expected mean average precision (mAP) is 0.253.

python ssd300_evaluation_client.py --val2017=./val2017 --instances_val2017_json=./annotations/instances_val2017.json
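The client reaches the quoted throughput by issuing requests from multiple threads so that the server always has work queued. The pattern can be sketched as follows, with a hypothetical stub standing in for the actual gRPC round trip to localhost:8500 (the 10 ms sleep is an assumed latency, not a measured value):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def predict(image_id):
    # Stand-in for sending one PredictRequest to
    # tensorflow-model-server-neuron; hypothetical fixed latency.
    time.sleep(0.01)
    return image_id

image_ids = list(range(64))
start = time.time()
# Overlapping in-flight requests keeps the server busy; the real
# evaluation client applies the same pattern to the COCO images.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(predict, image_ids))
elapsed = time.time() - start

print(f"{len(results) / elapsed:.0f} images/second")
```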
  8. After running the demo, please clean up the resources allocated in the Neuron runtime by gracefully killing the tensorflow_model_server_neuron process, e.g.:

killall tensorflow_model_server_neuron
