.. _tensorflow-ref-neuron-compile-api:

TensorFlow 1.x (``tensorflow-neuron``) Compilation API
==========================================================

The Neuron compilation API for TensorFlow 1.x enables compilation of a
``SavedModel`` for an Inferentia target.

Method
------

``tensorflow.neuron.saved_model.compile``

Description
-----------

Within the graph or subgraph, the compile method selects Neuron-supported
operations, sends them to the Neuron compiler for compilation, and saves
the compiled artifacts in the graph. Operations that cannot be compiled
are kept as original operations and executed by the framework.

The compiled graph can be exported to a ``SavedModel`` and served using
TensorFlow Serving. Please see :ref:`tensorflow-serving` for more
information about exporting to a ``SavedModel`` and serving with
TensorFlow Serving.

Options can be passed to the Neuron compiler via the compile function. For
example, the ``--neuroncore-pipeline-cores`` option directs the Neuron
compiler to compile each subgraph to fit within the specified number of
NeuronCores. This number can be less than the total number of NeuronCores
available on an Inf1 instance. See :ref:`neuron-compiler-cli-reference`
for more information about compiler options.
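As a sketch, compiler options are passed as a list of strings, exactly as
they would appear on the neuron-cc command line. The paths below are
hypothetical, and since the compile call itself requires tensorflow-neuron
and the neuron-cc toolchain, it is shown commented:

.. code:: python

   import shutil

   # Hypothetical paths -- substitute the locations of your own SavedModel.
   saved_model_path = "resnet50_saved_model"
   compiled_model_path = "resnet50_neuron"

   # Compiler options, written exactly as on the neuron-cc command line.
   compiler_args = ['--neuroncore-pipeline-cores', '4']

   # Remove stale output before compiling.
   shutil.rmtree(compiled_model_path, ignore_errors=True)

   # On an environment with tensorflow-neuron and neuron-cc installed:
   # import tensorflow.neuron as tfn
   # tfn.saved_model.compile(saved_model_path, compiled_model_path,
   #                         compiler_args=compiler_args)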

Arguments
---------

-  **model_dir:** The path of the original ``SavedModel``.
-  **new_model_dir:** The path to which the Neuron-optimized
   ``SavedModel`` will be stored.
-  **batch_size:** (Optional) Positive integer representing batch size
   used in inference. The default value is 1.
-  **model_shape_feed_dict:** (Optional) Dictionary {str: list} used for
   inferring tensor shapes. Keys should match model input names. Values
   are lists of positive integers representing model input tensor
   shapes.
-  **model_feed_dict:** (Optional) Dictionary {str: numpy.array} used
   for inference. Useful for inferring tensor shapes. Keys should match
   model input names. Values are numpy arrays that can be fed as inputs
   to the ``SavedModel``.
-  **tags:** (Optional) Iterable of strings to identify the required
   ``MetaGraphDef``. These should correspond to the tags used when
   saving the variables using the ``SavedModel`` ``save()`` API. Default
   is to use the first ``tag_set`` available in the ``SavedModel``.
-  **signature_def_key:** (Optional) String specifying the
   ``signature_def`` to use. Default is to use 'serving_default' or the
   first ``signature_def`` corresponding to ``tags``.
-  **minimum_segment_size:** (Optional) Integer indicating the minimum
   number of operations in a ``NeuronOp``.
-  **no_fuse_ops:** (Optional) None or iterable of strings (unordered)
   representing names of operations that are forcibly placed on CPU.
-  **compiler_args:** (Optional) List of strings representing neuron-cc
   compiler arguments. Note that these arguments apply to all subgraphs
   generated by whitelist partitioning. For example, use
   ``compiler_args=['--neuroncore-pipeline-cores', '4']`` to set number
   of NeuronCores per subgraph to 4. See :ref:`neuron-compiler-cli-reference`
   for more information about compiler options.
-  **compiler_workdir:** (Optional) String representing work directory
   of the neuron-cc compiler.
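The optional arguments above can be combined. The sketch below builds a
``model_feed_dict`` with NumPy, alongside the equivalent shape-only hint;
the input tensor name, shape, and ``no_fuse_ops`` operation name are
hypothetical and depend on your model's signature. The compile call
requires tensorflow-neuron, so it is shown commented:

.. code:: python

   import numpy as np

   # Hypothetical input tensor name and shape; inspect your model's
   # signature_def (e.g. with saved_model_cli) for the real values.
   model_feed_dict = {"input_1:0": np.zeros([1, 224, 224, 3], dtype=np.float32)}

   # Equivalent shape-only hint, if you prefer not to build example inputs.
   model_shape_feed_dict = {"input_1:0": [1, 224, 224, 3]}

   # On an Inf1 environment with tensorflow-neuron installed:
   # import tensorflow.neuron as tfn
   # tfn.saved_model.compile(
   #     "resnet50_saved_model", "resnet50_neuron",  # hypothetical paths
   #     model_feed_dict=model_feed_dict,
   #     no_fuse_ops=["bn_conv1/FusedBatchNorm"],    # hypothetical op kept on CPU
   #     minimum_segment_size=10,
   # )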

Returns
-------

-  Dictionary with operator counts before/after optimization.
-  Operator count statistics are displayed to show original count,
   post-optimization count, and the number placed on Neuron runtime. For
   example:

::

   INFO:tensorflow:Number of operations in TensorFlow session: 3978
   INFO:tensorflow:Number of operations after tf.neuron optimizations: 555
   INFO:tensorflow:Number of operations placed on Neuron runtime: 554

Example Usage
-------------

.. code:: python

   import shutil
   import tensorflow.neuron as tfn

   saved_model_path = "<saved model path>"
   compiled_saved_model_path = "<compiled saved model path>"

   # Remove any previous compilation output, then compile the SavedModel
   # and write the Neuron-optimized SavedModel to the new directory.
   shutil.rmtree(compiled_saved_model_path, ignore_errors=True)
   tfn.saved_model.compile(saved_model_path, compiled_saved_model_path)
