.. _tensorflow-ssd300:

.. meta::
   :noindex:
   :nofollow:
   :description: This tutorial for the AWS Neuron SDK is currently archived and not maintained. It is provided for reference only.

Running SSD300 with AWS Neuron
==============================

.. note:: 
   This page was archived on 7/31/2025.

*Update 11/16: The model checkpoint link*
https://api.ngc.nvidia.com/v2/models/nvidia/ssdpyt_fp32/versions/1/files/nvidia_ssdpyt_fp32_20190225.pt
*is currently broken, and the AWS Neuron team is working on providing an
alternative source.*


This demo shows a Neuron-compatible SSD300 implementation that is
functionally equivalent to the open source SSD300 model. It uses
TensorFlow-Neuron together with the PyTorch SSD300 model and checkpoint
(https://pytorch.org/hub/nvidia_deeplearningexamples_ssd/), and it
shows the performance achieved on an Inf1 instance.

Table of Contents
-----------------

1. Launch EC2 instance and update AWS Neuron SDK software
2. Generating Neuron-compatible SSD300 TensorFlow SavedModel

   -  Convert open source PyTorch SSD300 model and checkpoint into
      Neuron-compatible SSD300 TensorFlow SavedModel

3. Evaluate the generated SSD300 TensorFlow SavedModel for both accuracy
   and performance

   -  Running threaded inference through the COCO 2017 validation
      dataset

Launch EC2 instance and update tensorflow-neuron and neuron-cc
---------------------------------------------------------------

For this demo, launch one inf1.xlarge EC2 instance. We recommend using
the latest Ubuntu 18 Deep Learning AMI (DLAMI).

Configure your Ubuntu 16, Ubuntu 18, or yum repository by following the
steps in :ref:`install-neuron-tensorflow` so that you can install
``tensorflow-model-server-neuron``.

Generating Neuron-compatible SSD300 TensorFlow SavedModel
---------------------------------------------------------

First, connect to your inf1.xlarge instance.

Compile open source PyTorch SSD300 model and checkpoint into Neuron-compatible SSD300 TensorFlow SavedModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From the ``ssd300_demo`` directory, run the following steps:

1. Create a virtual environment (venv) and install dependencies.

.. code:: bash

   sudo apt update
   sudo apt install g++ python3-dev python3-venv unzip
   sudo apt install tensorflow-model-server-neuron
   python3 -m venv env
   source ./env/bin/activate
   pip install pip setuptools --upgrade
   pip install -r ./requirements.txt --extra-index-url=https://pip.repos.neuron.amazonaws.com

2. Clone NVIDIA's DeepLearningExamples repo that contains PyTorch
   SSD300.

.. code:: bash

   git clone https://github.com/NVIDIA/DeepLearningExamples.git
   cd DeepLearningExamples
   git checkout a644350589f9abc91b203f73e686a50f5d6f3e96
   cd ..

3. Download the PyTorch SSD300 checkpoint file.

.. code:: bash

   curl -LO https://api.ngc.nvidia.com/v2/models/nvidia/ssdpyt_fp32/versions/1/files/nvidia_ssdpyt_fp32_20190225.pt
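
Because ``curl -LO`` without ``--fail`` saves whatever body the server
returns, a broken link can leave an error page on disk under the
checkpoint's file name. The following is a minimal sketch of a pre-flight
check and is not part of the demo scripts. The helper name
``looks_like_error_page`` and the "starts with ``<`` or ``{``" heuristic
are assumptions made for illustration.

.. code:: python

   from pathlib import Path


   def looks_like_error_page(path, probe_bytes=64):
       """Heuristic: True if the file is missing, empty, or starts like HTML/JSON."""
       p = Path(path)
       if not p.is_file() or p.stat().st_size == 0:
           return True
       with p.open("rb") as f:
           head = f.read(probe_bytes).lstrip()
       # Assumption: a binary checkpoint does not begin with "<" or "{",
       # while an HTML or JSON error response usually does.
       return head.startswith((b"<", b"{"))


   if __name__ == "__main__":
       checkpoint = "./nvidia_ssdpyt_fp32_20190225.pt"
       if looks_like_error_page(checkpoint):
           print(f"{checkpoint} is missing or does not look like a checkpoint")
       else:
           print(f"{checkpoint} looks like a binary file")

A passing check does not prove that the checkpoint is valid; it only rules
out the most common failure of a text response saved in its place.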

4. Download the COCO 2017 validation set and annotations.

.. code:: bash

   curl -LO http://images.cocodataset.org/zips/val2017.zip
   unzip ./val2017.zip
   curl -LO http://images.cocodataset.org/annotations/annotations_trainval2017.zip
   unzip ./annotations_trainval2017.zip
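
Before running the evaluation, it can help to confirm that both archives
unpacked as expected. The sketch below is an illustration rather than part
of the demo, and the helper name ``summarize_coco`` is hypothetical. It
counts the ``.jpg`` files in the image directory and reads the top-level
``images``, ``annotations``, and ``categories`` lists of a COCO-format
annotation file. For the COCO 2017 validation set, the image count should
be 5,000.

.. code:: python

   import json
   from pathlib import Path


   def summarize_coco(val_dir, ann_json):
       """Count image files on disk and entries in a COCO-format annotation file."""
       jpg_files = sum(1 for _ in Path(val_dir).glob("*.jpg"))
       with open(ann_json) as f:
           ann = json.load(f)
       return {
           "jpg_files": jpg_files,
           "images": len(ann["images"]),
           "annotations": len(ann["annotations"]),
           "categories": len(ann["categories"]),
       }


   if __name__ == "__main__":
       val_dir = "./val2017"
       ann_json = "./annotations/instances_val2017.json"
       if Path(val_dir).is_dir() and Path(ann_json).is_file():
           print(summarize_coco(val_dir, ann_json))
       else:
           print("dataset not found; run the download step first")

If ``jpg_files`` and ``images`` disagree, the image archive was probably
only partially extracted.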

5. Convert the PyTorch SSD300 model and checkpoint into a
   Neuron-compatible TensorFlow SavedModel.

.. code:: bash

   python ssd300_model.py --torch_checkpoint=./nvidia_ssdpyt_fp32_20190225.pt --output_saved_model=./ssd300_tf_neuron/1

This converts the PyTorch SSD300 model and checkpoint to a
Neuron-compatible TensorFlow SavedModel using tensorflow-neuron and
neuron-cc. The compilation output is stored in ``./ssd300_tf_neuron``.
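
TensorFlow Serving loads models from numeric version subdirectories of the
model base path, which matches the ``/1`` suffix of the
``--output_saved_model`` argument above. As a hedged sketch (the helper
``find_servable_versions`` is hypothetical and not part of the demo), the
following lists the version directories that contain a ``saved_model.pb``
file:

.. code:: python

   from pathlib import Path


   def find_servable_versions(base_path):
       """Return sorted integer versions under base_path that hold saved_model.pb."""
       base = Path(base_path)
       if not base.is_dir():
           return []
       versions = []
       for child in base.iterdir():
           # Version directories are named with digits only, e.g. "1".
           if child.is_dir() and child.name.isdecimal():
               if (child / "saved_model.pb").is_file():
                   versions.append(int(child.name))
       return sorted(versions)


   if __name__ == "__main__":
       print(find_servable_versions("./ssd300_tf_neuron"))

An empty list suggests that the conversion did not finish or wrote its
output somewhere else.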

6. Launch the ``tensorflow-model-server-neuron`` gRPC server in the
   background on its default port, 8500.

.. code:: bash

   tensorflow_model_server_neuron --model_base_path=$(pwd)/ssd300_tf_neuron &
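
Because the server starts in the background, the client can be launched
before the port is open. A minimal sketch of a readiness check follows.
``wait_for_port`` is a hypothetical helper, and an open port only shows
that the process is listening, not that the model has finished loading.

.. code:: python

   import socket
   import time


   def wait_for_port(host, port, timeout=60.0, interval=1.0):
       """Poll until a TCP connection to host:port succeeds or timeout elapses."""
       deadline = time.monotonic() + timeout
       while True:
           try:
               with socket.create_connection((host, port), timeout=interval):
                   return True
           except OSError:
               if time.monotonic() >= deadline:
                   return False
               time.sleep(interval)


   if __name__ == "__main__":
       # Short timeout so the demo call returns quickly if nothing is listening.
       print(wait_for_port("localhost", 8500, timeout=2.0, interval=0.5))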

7. From the client, evaluate the Neuron-compatible TensorFlow SavedModel
   for both accuracy and performance. By default, the client assumes
   that a ``tensorflow-model-server-neuron`` instance is listening at
   ``localhost:8500``. On inf1.xlarge, the expected throughput is 100
   images/second once the server is fully warmed up, and the expected
   mean average precision (mAP) is 0.253.

.. code:: bash

   python ssd300_evaluation_client.py --val2017=./val2017 --instances_val2017_json=./annotations/instances_val2017.json
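
The throughput figure is the number of images processed divided by
wall-clock time, measured while several client threads keep the server
busy. The sketch below illustrates that measurement only and is not the
code in ``ssd300_evaluation_client.py``. ``measure_throughput`` and the
``infer`` callable are hypothetical stand-ins for a real gRPC request, and
the warm-up count is an assumed parameter.

.. code:: python

   import time
   from concurrent.futures import ThreadPoolExecutor


   def measure_throughput(infer, items, num_threads=4, warmup=0):
       """Run infer over items in a thread pool; return (results, items_per_second)."""
       items = list(items)
       # Warm-up calls are excluded from timing (assumed parameter).
       for item in items[:warmup]:
           infer(item)
       start = time.perf_counter()
       with ThreadPoolExecutor(max_workers=num_threads) as pool:
           # pool.map preserves input order in its results.
           results = list(pool.map(infer, items))
       elapsed = time.perf_counter() - start
       return results, len(items) / elapsed


   if __name__ == "__main__":
       def fake_infer(image_id):
           # Placeholder for a request to the model server.
           time.sleep(0.01)
           return image_id

       results, rate = measure_throughput(fake_infer, range(100), num_threads=8)
       print(f"{len(results)} items, {rate:.1f} items/second")

Excluding warm-up requests from the timed region mirrors the "fully warmed
up" condition attached to the expected throughput above.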

8. After running the demo, clean up the resources allocated in the
   Neuron runtime by gracefully killing the
   ``tensorflow_model_server_neuron`` process, e.g.:

.. code:: bash

   killall tensorflow_model_server_neuron