nki.isa.bn_aggr
- nki.isa.bn_aggr(dst, data, name=None)
Aggregate the outputs of one or more bn_stats instructions to generate a mean and a variance per partition using the Vector Engine.

The input data tile effectively holds an array of (count, mean, variance*count) tuples per partition, produced by bn_stats instructions. Therefore, the number of elements per partition of data must be a multiple of three.

Note: if you need to aggregate the outputs of multiple bn_stats instructions, it is recommended to declare an SBUF tensor and then make each bn_stats instruction write its output into that tensor at a different offset.

The Vector Engine performs the statistics aggregation in float32 precision. The engine automatically casts the input data to float32 before performing the computation. The float32 results are cast to dst.dtype at no additional performance cost.

- Parameters:
dst – an output tile with two elements per partition: a mean followed by a variance
data – an input tile with the results of one or more bn_stats instructions
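The aggregation described above can be modelled on the host with NumPy. This is a minimal reference sketch, not the Vector Engine implementation and not an NKI kernel. It assumes the (count, mean, variance*count) tuple layout stated above, assumes the variance is the population variance (divided by count), and combines tuples with the standard pooled mean and variance formula. The helper names `bn_stats_reference` and `bn_aggr_reference` are hypothetical and exist only for this illustration.

```python
import numpy as np


def bn_stats_reference(chunk):
    """Hypothetical host-side model of one bn_stats output.

    Returns one (count, mean, variance*count) tuple per partition (row).
    """
    chunk = np.asarray(chunk, dtype=np.float32)
    n = chunk.shape[1]
    count = np.full(chunk.shape[0], n, dtype=np.float32)
    mean = chunk.mean(axis=1)
    var_times_count = chunk.var(axis=1) * n  # population variance (ddof=0)
    return np.stack([count, mean, var_times_count], axis=1)


def bn_aggr_reference(data, dst_dtype=np.float32):
    """Hypothetical host-side model of bn_aggr.

    data: array of shape (partitions, 3 * k) holding k tuples per partition.
    Returns an array of shape (partitions, 2): mean followed by variance.
    """
    data = np.asarray(data, dtype=np.float32)  # computation is done in float32
    partitions, width = data.shape
    if width % 3 != 0:
        raise ValueError("elements per partition must be a multiple of three")

    stats = data.reshape(partitions, width // 3, 3)
    count = stats[..., 0]
    mean = stats[..., 1]
    m2 = stats[..., 2]  # variance * count of each tuple

    total = count.sum(axis=1)
    agg_mean = (count * mean).sum(axis=1) / total
    # Add the between-tuple spread to the within-tuple sums of squares.
    delta = mean - agg_mean[:, None]
    agg_var = (m2 + count * delta**2).sum(axis=1) / total

    # Cast the float32 results to the destination dtype.
    return np.stack([agg_mean, agg_var], axis=1).astype(dst_dtype)


# Usage: two bn_stats results written side by side at different offsets,
# mirroring the recommendation to collect them in a single tensor.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 12)).astype(np.float32)
data = np.concatenate(
    [bn_stats_reference(x[:, :5]), bn_stats_reference(x[:, 5:])], axis=1
)
out = bn_aggr_reference(data)  # shape (4, 2): mean, variance per partition
```

Under these assumptions, `out` should agree with the mean and population variance computed directly over each full row of `x`, up to float32 rounding. The sketch does not guard against a total count of zero.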