nki.isa.select_reduce
- nki.isa.select_reduce(dst, predicate, on_true, on_false, reduce_res=None, reduce_cmd=reduce_cmd.idle, reduce_op=<function maximum>, reverse_pred=False, name=None)
Selectively copy elements from either on_true or on_false to the destination tile based on a predicate using Vector Engine, with optional reduction (max).
The operation can be expressed in NumPy as:

    # Select:
    predicate = ~predicate if reverse_pred else predicate
    result = np.where(predicate, on_true, on_false)

    # With Reduce:
    reduction_result = np.max(result, axis=1, keepdims=True)

Memory constraints:
- Both on_true and predicate are permitted to be in SBUF
- Either on_true or predicate may be in PSUM, but not both simultaneously
- The destination dst can be in either SBUF or PSUM
Shape and data type constraints:
- on_true, dst, and predicate must have identical shapes (same number of partitions and elements per partition)
- on_true can be any supported dtype except tfloat32, int32, uint32
- on_false dtype must be float32 if on_false is a scalar
- on_false has to be either a scalar or a vector of shape (on_true.shape[0], 1)
- predicate dtype can be any supported integer type: int8, uint8, int16, uint16
- reduce_res must be a vector of shape (on_true.shape[0], 1)
- reduce_res dtype must be a float type
- reduce_op only supports max
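The shape and dtype constraints above can be sketched as a host-side check on NumPy arrays. This is only an illustration: check_select_reduce_args is a hypothetical helper, not part of the NKI API, and it skips the tfloat32 exclusion and the float32 requirement on scalar on_false because plain NumPy has no equivalent for either.

```python
import numpy as np

# Predicate dtypes listed in the constraints above.
PREDICATE_DTYPES = (np.int8, np.uint8, np.int16, np.uint16)
# on_true dtypes excluded above that NumPy can represent (tfloat32 has no NumPy equivalent).
EXCLUDED_ON_TRUE_DTYPES = (np.int32, np.uint32)


def check_select_reduce_args(dst, predicate, on_true, on_false, reduce_res=None):
    """Hypothetical check of the documented constraints; raises ValueError on violation."""
    if not (dst.shape == on_true.shape == predicate.shape):
        raise ValueError("dst, on_true, and predicate must have identical shapes")
    if predicate.dtype.type not in PREDICATE_DTYPES:
        raise ValueError("predicate must be int8, uint8, int16, or uint16")
    if on_true.dtype.type in EXCLUDED_ON_TRUE_DTYPES:
        raise ValueError("on_true must not be int32 or uint32")

    per_partition_shape = (on_true.shape[0], 1)
    # on_false is either a scalar or one value per partition.
    if not np.isscalar(on_false) and on_false.shape != per_partition_shape:
        raise ValueError("vector on_false must have shape (on_true.shape[0], 1)")
    if reduce_res is not None:
        if reduce_res.shape != per_partition_shape:
            raise ValueError("reduce_res must have shape (on_true.shape[0], 1)")
        if not np.issubdtype(reduce_res.dtype, np.floating):
            raise ValueError("reduce_res must be a float type")


# Example: 4 partitions with 8 elements each.
on_true = np.ones((4, 8), dtype=np.float32)
predicate = np.ones((4, 8), dtype=np.uint8)
dst = np.empty((4, 8), dtype=np.float32)
reduce_res = np.empty((4, 1), dtype=np.float32)
check_select_reduce_args(dst, predicate, on_true, np.float32(0.0), reduce_res)
```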
Behavior:
- Where predicate is True: The corresponding elements from on_true are copied to dst
- Where predicate is False: The corresponding elements from on_false are copied to dst
- When reduction is enabled, the max value from each partition of the result is computed and stored in reduce_res
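A small worked example may help make the behavior concrete. The sketch below emulates it in NumPy on a 2-partition tile; select_reduce_ref is a hypothetical reference function, and it assumes that any nonzero predicate value counts as True, which the text above does not state explicitly.

```python
import numpy as np

on_true = np.array([[1.0, 5.0, 3.0],
                    [7.0, 2.0, 9.0]], dtype=np.float32)
predicate = np.array([[1, 0, 1],
                      [0, 0, 1]], dtype=np.uint8)
on_false = np.float32(-1.0)  # scalar fallback value


def select_reduce_ref(predicate, on_true, on_false, reverse_pred=False):
    """Hypothetical NumPy reference: select, then max along each partition (row)."""
    mask = predicate != 0  # assumption: nonzero means True
    if reverse_pred:
        mask = ~mask
    result = np.where(mask, on_true, on_false)
    return result, np.max(result, axis=1, keepdims=True)


result, row_max = select_reduce_ref(predicate, on_true, on_false)
# result  -> [[ 1., -1.,  3.], [-1., -1.,  9.]]
# row_max -> [[3.], [9.]]

flipped, flipped_max = select_reduce_ref(predicate, on_true, on_false, reverse_pred=True)
# flipped     -> [[-1.,  5., -1.], [ 7.,  2., -1.]]
# flipped_max -> [[5.], [7.]]
```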
Accumulator behavior:
The Vector Engine maintains internal accumulator registers that can be controlled via the reduce_cmd parameter:
- nisa.reduce_cmd.reset_reduce: Reset accumulators to -inf, then accumulate the current results
- nisa.reduce_cmd.reduce: Continue accumulating without resetting (useful for multi-step reductions)
- nisa.reduce_cmd.idle: No accumulation performed (default)
Note
Even when reduce_cmd is set to idle, the accumulator state may still be modified. Always use reset_reduce after any operations that ran with idle mode to ensure consistent behavior.
Note
The accumulator registers are shared with other Vector Engine accumulation instructions, such as nki.isa.range_select.
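The multi-step reduction pattern described above (reset_reduce on the first step, reduce on later steps) can be modeled in NumPy as follows. ReduceCmd and AccumulatorModel are hypothetical stand-ins, not NKI types. The model assumes that the reduction output equals the accumulator value after the current step, and it leaves the accumulator untouched in idle mode, whereas the note above warns that real hardware may modify it.

```python
import numpy as np
from enum import Enum


class ReduceCmd(Enum):
    """Hypothetical stand-in for nisa.reduce_cmd."""
    idle = 0
    reset_reduce = 1
    reduce = 2


class AccumulatorModel:
    """Simplified model of per-partition max accumulators (an assumption, not hardware)."""

    def __init__(self, num_partitions):
        self.acc = np.full((num_partitions, 1), -np.inf, dtype=np.float32)

    def select_reduce(self, predicate, on_true, on_false, cmd=ReduceCmd.idle):
        result = np.where(predicate != 0, on_true, on_false)
        if cmd is ReduceCmd.idle:
            # Simplification: no accumulation and no change to the accumulator.
            return result, None
        if cmd is ReduceCmd.reset_reduce:
            self.acc[:] = -np.inf  # reset before accumulating this step
        self.acc = np.maximum(self.acc, result.max(axis=1, keepdims=True))
        return result, self.acc.copy()


# Reduce one logical row in two chunks: reset on the first, accumulate on the second.
model = AccumulatorModel(num_partitions=2)
chunk_a = np.array([[1.0, 4.0], [8.0, 2.0]], dtype=np.float32)
chunk_b = np.array([[6.0, 0.0], [3.0, 5.0]], dtype=np.float32)
keep_all = np.ones((2, 2), dtype=np.uint8)

_, max_after_a = model.select_reduce(keep_all, chunk_a, np.float32(0.0), ReduceCmd.reset_reduce)
_, max_after_b = model.select_reduce(keep_all, chunk_b, np.float32(0.0), ReduceCmd.reduce)
# max_after_a -> [[4.], [8.]]
# max_after_b -> [[6.], [8.]]
```

Starting each new reduction with reset_reduce, rather than relying on whatever state an earlier instruction left behind, matches the advice in the notes above, since the registers are shared with other accumulation instructions.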
- Parameters:
dst – The destination tile to write the selected values to
predicate – Tile that determines which value to select (on_true or on_false)
on_true – Tile to select from when predicate is True
on_false – Value to use when predicate is False, can be a scalar value or a vector tile of shape (on_true.shape[0], 1)
reduce_res – (optional) Tile to store reduction results, must have shape (on_true.shape[0], 1)
reduce_cmd – (optional) Control accumulator behavior using nisa.reduce_cmd values, defaults to idle
reduce_op – (optional) Reduction operator to apply (only nl.maximum is supported)
reverse_pred – (optional) Reverse the meaning of the predicate condition, defaults to False