This document is relevant for: Inf2, Trn1, Trn2, Trn3
Developer Guide
Learn how to optimize your models with the Neuron Compiler (neuronx-cc). These guides cover mixed precision training, performance-accuracy tuning, and custom kernel implementations for AWS Trainium and Inferentia instances.
Mixed Precision and Performance-Accuracy Tuning
Learn how to use FP32, TF32, FP16, and BF16 data types with the Neuron Compiler’s auto-cast options to balance performance and accuracy. Understand the tradeoffs between different data types and how to configure compiler settings for optimal model execution.
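As a sketch of how these settings are applied, the auto-cast options can be passed directly on the neuronx-cc command line (flag values here assume the `--auto-cast` and `--auto-cast-type` options described in the tuning guide; verify the exact spellings against your compiler version):

```shell
# Compile a model with neuronx-cc, down-casting FP32 matrix multiplies to BF16.
# --auto-cast selects which operators are eligible for casting;
# --auto-cast-type selects the target data type (bf16, fp16, or tf32).
neuronx-cc compile model.hlo \
    --framework XLA \
    --target trn1 \
    --auto-cast matmult \
    --auto-cast-type bf16 \
    --output model.neff

# Disable all down-casting to keep full FP32 accuracy (typically slower):
neuronx-cc compile model.hlo \
    --framework XLA \
    --target trn1 \
    --auto-cast none \
    --output model.neff
```

The first invocation trades some numerical precision for throughput; the second preserves FP32 end to end, which is useful as an accuracy baseline when tuning.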