This document is relevant for: Inf1

TensorFlow 2.x (tensorflow-neuron) Accelerated Python APIs and Graph Ops#

This page lists the TensorFlow 2.x Python APIs and graph operators that AWS Neuron accelerates. The lists are not exhaustive: an API or graph operator that is not listed here may still be accelerated if it is composed of accelerated primitives; otherwise it is executed on CPU without significant acceleration. The TensorFlow Neuron integration includes an automatic operator-device-placement mechanism that strives to maximize the execution efficiency of your deep learning models on AWS Machine Learning ASIC instances.

Accelerated Python APIs#

| Module | Accelerated Python API | Comments |
| --- | --- | --- |
| tf | tf.abs | |
| tf | tf.add | |
| tf | tf.add_n | |
| tf | tf.broadcast_static_shape | |
| tf | tf.cast | |
| tf | tf.constant | |
| tf | tf.convert_to_tensor | |
| tf | tf.cumsum | axis must be a compile-time constant. |
| tf | tf.einsum | |
| tf | tf.erf | |
| tf | tf.exp | |
| tf | tf.identity | |
| tf | tf.matmul | Uses float16/bfloat16 matmul with float32 accumulation. |
| tf | tf.maximum | |
| tf | tf.minimum | |
| tf | tf.multiply | |
| tf | tf.negative | |
| tf | tf.range | start, limit and delta arguments must be compile-time constants. |
| tf | tf.realdiv | |
| tf | tf.reciprocal | |
| tf | tf.reduce_all | axis must be a compile-time constant. |
| tf | tf.reduce_any | axis must be a compile-time constant. |
| tf | tf.reduce_max | axis must be a compile-time constant. |
| tf | tf.reduce_min | axis must be a compile-time constant. |
| tf | tf.reduce_prod | axis must be a compile-time constant. |
| tf | tf.reduce_sum | axis must be a compile-time constant. |
| tf | tf.reshape | shape argument must be a compile-time constant. |
| tf | tf.rsqrt | |
| tf | tf.scalar_mul | |
| tf | tf.shape | |
| tf | tf.shape_n | |
| tf | tf.sigmoid | |
| tf | tf.size | |
| tf | tf.slice | size must be a compile-time constant. In addition, either begin must be a compile-time constant or size must be non-negative. |
| tf | tf.sqrt | |
| tf | tf.square | |
| tf | tf.squared_difference | |
| tf | tf.squeeze | |
| tf | tf.stack | |
| tf | tf.stop_gradient | |
| tf | tf.strided_slice | |
| tf | tf.tanh | |
| tf | tf.tensordot | |
| tf | tf.to_bfloat16 | |
| tf | tf.to_float | |
| tf | tf.truediv | |
| tf.layers | tf.layers.batch_normalization | |
| tf.layers | tf.layers.dense | |
| tf.layers | tf.layers.flatten | |
| tf.nn | tf.nn.batch_normalization | |
| tf.nn | tf.nn.bias_add | |
| tf.nn | tf.nn.dropout | Always treated as tf.identity during inference. |
| tf.nn | tf.nn.fused_batch_norm | |
| tf.nn | tf.nn.leaky_relu | |
| tf.nn | tf.nn.relu | |
| tf.nn | tf.nn.relu6 | |
| tf.nn | tf.nn.relu_layer | |
| tf.nn | tf.nn.softmax | |
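
The tf.matmul entry above notes that matrix multiplication uses float16/bfloat16 operands with float32 accumulation. The sketch below emulates that idea in plain Python to show why the accumulator precision matters. It is an illustration only, not Neuron's implementation: the helper names (to_float16, to_float32, dot_mixed, dot_half) are hypothetical, and only the float16 case is emulated, using the standard library's struct module.

```python
import struct


def to_float16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_float32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot_mixed(a, b):
    """Dot product with float16 operands and a float32 accumulator."""
    acc = 0.0
    for x, y in zip(a, b):
        # Operands are rounded to float16; the running sum is kept in float32.
        acc = to_float32(acc + to_float16(x) * to_float16(y))
    return acc


def dot_half(a, b):
    """Same dot product, but the accumulator is also rounded to float16."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_float16(acc + to_float16(x) * to_float16(y))
    return acc


ones = [1.0] * 4096
print(dot_mixed(ones, ones))  # 4096.0
print(dot_half(ones, ones))   # 2048.0
```

With 4096 ones, the float16 accumulator stalls at 2048 because 2049 is not representable in float16 and rounds back down, while the float32 accumulator reaches the exact result.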

Accelerated graph operators#

Add
AddN
AddV2
BatchMatMul
BatchMatMulV2
BiasAdd
Cast
Const
Cumsum
Einsum
Erf
Exp
ExpandDims
FusedBatchNorm
FusedBatchNormV2
FusedBatchNormV3
Greater
Identity
LeakyRelu
MatMul
Max
Maximum
Minimum
Mean
Mul
Neg
Pack
RealDiv
Relu
Relu6
Reshape
Rsqrt
Sigmoid
Softmax
Split
SplitV
Sqrt
Square
SquaredDifference
Squeeze
StridedSlice
Sub
Sum
Tanh
Transpose
Unpack
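
To estimate how much of a model is covered by the list above, one option is to compare a graph's operator types against it. The following is a minimal sketch, not the Neuron placement mechanism itself: ACCELERATED_OPS is copied from the list on this page, and split_by_listing and graph_op_types are hypothetical helper names. An operator that is absent from the list may still be accelerated if it is composed of accelerated primitives, so the second group means "not listed", not "runs on CPU".

```python
# Operator names copied from the "Accelerated graph operators" list on this page.
ACCELERATED_OPS = frozenset("""
Add AddN AddV2 BatchMatMul BatchMatMulV2 BiasAdd Cast Const Cumsum Einsum
Erf Exp ExpandDims FusedBatchNorm FusedBatchNormV2 FusedBatchNormV3 Greater
Identity LeakyRelu MatMul Max Maximum Minimum Mean Mul Neg Pack RealDiv
Relu Relu6 Reshape Rsqrt Sigmoid Softmax Split SplitV Sqrt Square
SquaredDifference Squeeze StridedSlice Sub Sum Tanh Transpose Unpack
""".split())


def split_by_listing(op_types):
    """Split op type names into (listed, not_listed), keeping first-seen order."""
    listed, not_listed = [], []
    for op_type in dict.fromkeys(op_types):  # de-duplicate while preserving order
        (listed if op_type in ACCELERATED_OPS else not_listed).append(op_type)
    return listed, not_listed


def graph_op_types(concrete_function):
    """Return op type names from a TensorFlow 2.x ConcreteFunction.

    Assumes the argument exposes .graph.get_operations(), as the result of
    tf.function(...).get_concrete_function(...) does.
    """
    return [op.type for op in concrete_function.graph.get_operations()]


# Stand-alone usage with hand-written op types; "TopKV2" is not on the list.
listed, not_listed = split_by_listing(["MatMul", "BiasAdd", "Relu", "TopKV2", "Relu"])
print(listed)      # ['MatMul', 'BiasAdd', 'Relu']
print(not_listed)  # ['TopKV2']
```

With TensorFlow installed, the output of graph_op_types for a traced tf.function can be passed to split_by_listing; expect structural operators such as Placeholder to appear in the not-listed group.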

The lists share many commonalities with Available TensorFlow Ops. Portions of this page are modifications based on work created and shared by Google and used according to terms described in the Creative Commons 4.0 Attribution License.
