.. _tensorflow-ref-neuron-compile-api:

TensorFlow 1.x (``tensorflow-neuron``) Compilation API
======================================================

The Neuron compilation API for TensorFlow 1.x compiles a ``SavedModel``
for an Inferentia target.

Method
------

``tensorflow.neuron.saved_model.compile``

Description
-----------

Within the graph or subgraph, the compile method selects Neuron-supported
operations, sends them to the Neuron compiler for compilation, and saves
the compiled artifacts in the graph. Operations that cannot be compiled
are kept as the original operations for framework execution.

The compiled graph can be exported as a ``SavedModel`` and served using
TensorFlow Serving. Please see :ref:`tensorflow-serving` for more
information about exporting to ``SavedModel`` and serving using
TensorFlow Serving.

Options can be passed to the Neuron compiler via the compile function.
For example, the ``--neuroncore-pipeline-cores`` option directs the
Neuron compiler to compile each subgraph to fit in the specified number
of NeuronCores. This number can be less than the total number of
NeuronCores available on an Inf1 instance. See
:ref:`neuron-compiler-cli-reference` for more information about compiler
options.

Arguments
---------

-  **model_dir:** The path of the original ``SavedModel``.
-  **new_model_dir:** The path where the Neuron-optimized
   ``SavedModel`` will be stored.
-  **batch_size:** (Optional) Positive integer representing the batch
   size used in inference. The default value is 1.
-  **model_shape_feed_dict:** (Optional) Dictionary {str: list} used for
   inferring tensor shapes. Keys should match model input names. Values
   are lists of positive integers representing model input tensor
   shapes.
-  **model_feed_dict:** (Optional) Dictionary {str: numpy.array} used
   for inference. Useful for inferring tensor shapes. Keys should match
   model input names. Values are numpy arrays that can be fed as inputs
   to the ``SavedModel``.
-  **tags:** (Optional) Iterable of strings to identify the required
   ``MetaGraphDef``. These should correspond to the tags used when
   saving the variables using the ``SavedModel`` ``save()`` API. Default
   is to use the first ``tag_set`` available in the ``SavedModel``.
-  **signature_def_key:** (Optional) String specifying the
   ``signature_def`` to use. Default is to use 'serving_default' or the
   first ``signature_def`` corresponding to ``tags``.
-  **minimum_segment_size:** (Optional) Integer indicating the minimum
   number of operations in a NeuronOp.
-  **no_fuse_ops:** (Optional) None or iterable of strings (unordered)
   representing names of operations that are forcibly placed on CPU.
-  **compiler_args:** (Optional) List of strings representing neuron-cc
   compiler arguments. Note that these arguments apply to all subgraphs
   generated by whitelist partitioning. For example, use
   ``compiler_args=['--neuroncore-pipeline-cores', '4']`` to set the
   number of NeuronCores per subgraph to 4. See
   :ref:`neuron-compiler-cli-reference` for more information about
   compiler options.
-  **compiler_workdir:** (Optional) String representing the work
   directory of the neuron-cc compiler.

Returns
-------

-  Dictionary with operator counts before/after optimization.
-  Operator count statistics are displayed to show the original count,
   the post-optimization count, and the number placed on Neuron runtime.
   For example:

::

   INFO:tensorflow:Number of operations in TensorFlow session: 3978
   INFO:tensorflow:Number of operations after tf.neuron optimizations: 555
   INFO:tensorflow:Number of operations placed on Neuron runtime: 554

Example Usage
-------------

.. code:: python

   import shutil
   import tensorflow.neuron as tfn

   # Set these to the original and the Neuron-optimized SavedModel directories.
   saved_model_path = ""
   compiled_saved_model_path = ""

   # Remove any previous output so the compiled SavedModel is written to a clean path.
   shutil.rmtree(compiled_saved_model_path, ignore_errors=True)
   tfn.saved_model.compile(saved_model_path, compiled_saved_model_path)
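The optional arguments described above can be assembled before calling the compile function. The following is a minimal sketch, not part of the Neuron API: ``build_compile_kwargs`` and ``compile_saved_model`` are hypothetical helper names, and the input name ``"input"`` with shape ``[1, 128]`` is an assumed placeholder that must be replaced with the input names and shapes of the actual model. Only keyword names listed under Arguments are used.

```python
def build_compile_kwargs(input_shapes=None, pipeline_cores=None,
                         no_fuse_ops=None, compiler_workdir=None):
    """Assemble optional keyword arguments for saved_model.compile.

    Hypothetical helper: it only builds a dictionary and does not
    require tensorflow-neuron to be installed.
    """
    kwargs = {}
    if input_shapes:
        shape_feed = {}
        for name, shape in input_shapes.items():
            # The documentation requires lists of positive integers.
            if not all(isinstance(dim, int) and dim > 0 for dim in shape):
                raise ValueError(
                    "shape for %r must contain only positive integers" % name)
            shape_feed[name] = list(shape)
        kwargs["model_shape_feed_dict"] = shape_feed
    if pipeline_cores is not None:
        # Compiler arguments are passed as a flat list of strings.
        kwargs["compiler_args"] = ["--neuroncore-pipeline-cores",
                                   str(pipeline_cores)]
    if no_fuse_ops:
        kwargs["no_fuse_ops"] = list(no_fuse_ops)
    if compiler_workdir:
        kwargs["compiler_workdir"] = compiler_workdir
    return kwargs


def compile_saved_model(model_dir, new_model_dir, **kwargs):
    """Call the Neuron compile API with the assembled arguments."""
    # Imported lazily so the helper above works without tensorflow-neuron.
    import tensorflow.neuron as tfn
    return tfn.saved_model.compile(model_dir, new_model_dir, **kwargs)


# "input" and its shape are assumed placeholders, not names from a real model.
kwargs = build_compile_kwargs(input_shapes={"input": (1, 128)},
                              pipeline_cores=4)
print(kwargs)
```

On an instance with ``tensorflow-neuron`` installed, ``compile_saved_model(saved_model_path, compiled_saved_model_path, **kwargs)`` would forward these options to the compile function and return its operator count dictionary, which can be printed or logged alongside the ``INFO`` messages shown under Returns.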