Index _ __init__() (nki.isa.nc_version method) A abs (C++ function) abs() (in module nki.language) abs_out (C++ function) accessor (C++ function), [1] activation() (in module nki.isa) activation_reduce() (in module nki.isa) add (C++ function), [1] add() (in module nki.language) add_out (C++ function), [1] affine_range() (in module nki.language) affine_select() (in module nki.isa) align_stack_curr_addr() (nkilib.core.utils.allocator.SbufManager method) all() (in module nki.language) all_gather() (in module nki.collectives) all_reduce() (in module nki.collectives) all_to_all() (in module nki.collectives) all_to_all_v() (in module nki.collectives) allgather_compute_matmul() (in module nkilib.experimental.collectives) allgather_sb2sb() (in module nkilib.experimental.collectives) allgather_sb2sb_tiled() (in module nkilib.experimental.collectives) alloc() (nkilib.core.utils.allocator.SbufManager method) alloc_heap() (nkilib.core.utils.allocator.SbufManager method) alloc_stack() (nkilib.core.utils.allocator.SbufManager method) arctan() (in module nki.language) attention_block_tkg() (in module nkilib.core.attention_block_tkg.attention_block_tkg) attention_cte() (in module nkilib.core.attention.attention_cte) attention_tkg() (in module nkilib.core.attention_tkg) AttnTKGConfig (class in nkilib.core.attention_tkg) B benchmark() built-in function BF16 bfloat16 (in module nki.language) bitwise_and (C++ function), [1], [2] bitwise_and() (in module nki.language) bitwise_and_out (C++ function), [1], [2] bitwise_not (C++ function) bitwise_not_out (C++ function) bitwise_or (C++ function), [1], [2] bitwise_or() (in module nki.language) bitwise_or_out (C++ function), [1], [2] bitwise_xor() (in module nki.language) block_len (nkilib.core.attention_tkg.AttnTKGConfig attribute) blockwise_mm_bwd() (in module nkilib.experimental.moe.bwd) bn_aggr() (in module nki.isa) bn_stats() (in module
nki.isa) bool_ (in module nki.language) broadcast() (nkilib.core.utils.tensor_view.TensorView method) broadcast_to() (in module nki.language) bs (nkilib.core.attention_tkg.AttnTKGConfig attribute) built-in function benchmark() compile() get_reports() model_index.append() model_index.copy() model_index.create() model_index.filter() model_index.load() model_index.move() model_index.save() print_reports() torch.neuron.DataParallel() torch.neuron.DataParallel.disable_dynamic_batching(), [1] torch_neuron.trace() torch_neuronx.analyze() torch_neuronx.async_load() torch_neuronx.bucket_model_trace() torch_neuronx.DataParallel() torch_neuronx.dynamic_batch() torch_neuronx.experimental.profiler.profile() torch_neuronx.experimental.profiler.profile.start() torch_neuronx.lazy_load() torch_neuronx.move_trace_to_device() torch_neuronx.multicore_context() torch_neuronx.neuron_cores_context() torch_neuronx.PartitionerConfig() torch_neuronx.replace_weights() torch_neuronx.set_multicore() torch_neuronx.set_neuron_cores() torch_neuronx.trace() write_csv() write_json() C CCE ceil (C++ function) ceil() (in module nki.language) ceil_out (C++ function) cFP8 clamp (C++ function) clamp_out (C++ function) close (C++ function), [1] close_scope() (nkilib.core.utils.allocator.SbufManager method) Collective Communication Engine collective_permute() (in module nki.collectives) collective_permute_implicit() (in module nki.collectives) collective_permute_implicit_current_processing_rank_id() (in module nki.collectives) collective_permute_implicit_reduce() (in module nki.collectives) compile() built-in function conv1d() (in module nkilib.experimental.conv) copy() (in module nki.language) core_barrier() (in module nki.isa) cos (C++ function) cos() (in module nki.language) cos_out (C++ function) create_auto_alloc_manager() (in module nkilib.core.utils.allocator) cross_entropy_backward() (in module nkilib.experimental.loss) cross_entropy_forward() (in module nkilib.experimental.loss) cumsum() (in 
module nkilib.core.cumsum) curr_sprior (nkilib.core.attention_tkg.AttnTKGConfig attribute) CustomOps D d_head (nkilib.core.attention_tkg.AttnTKGConfig attribute) depthwise_conv1d_implicit_gemm() (in module nkilib.experimental.conv) device_print() (in module nki.language) dge_mode (class in nki.isa) div (C++ function), [1] div_out (C++ function), [1] dma_compute() (in module nki.isa) dma_copy() (in module nki.isa) dma_transpose() (in module nki.isa) DP DPr dropout() (in module nki.isa) (in module nki.language) ds() (in module nki.language) dynamic_elementwise_add() (in module nkilib.experimental.dynamic_shapes) dynamic_range() (in module nki.language) E empty (C++ function) empty_like() (in module nki.language) engine (class in nki.isa) eps (nkilib.core.rmsnorm_quant.rmsnorm_quant.RmsNormQuantKernelArgs attribute) equal() (in module nki.language) erf() (in module nki.language) erf_dx() (in module nki.language) exp (C++ function) exp() (in module nki.language) exp_out (C++ function) expand_dim() (nkilib.core.utils.tensor_view.TensorView method) expand_dims() (in module nki.language) eye (C++ function) F fill_ (C++ function) find_nonzero_indices() (in module nkilib.core.subkernels) fine_grained_allgather() (in module nkilib.experimental.collectives) flatten_dims() (nkilib.core.utils.tensor_view.TensorView method) float16 (in module nki.language) float32 (in module nki.language) FLOAT32_TO_FLOAT16 (torch_neuron.Optimization attribute) float4_e2m1fn_x4 (in module nki.language) float8_e4m3 (in module nki.language) float8_e4m3fn (in module nki.language) float8_e4m3fn_x4 (in module nki.language) float8_e5m2 (in module nki.language) float8_e5m2_x4 (in module nki.language) floor (C++ function) floor() (in module nki.language) floor_out (C++ function) flush_logs() (nkilib.core.utils.allocator.SbufManager method) FP16 FP32 full (C++ function) full() (in module nki.language) full_sprior (nkilib.core.attention_tkg.AttnTKGConfig attribute) fuse_rope 
(nkilib.core.attention_tkg.AttnTKGConfig attribute) G gather_flattened() (in module nki.language) gelu() (in module nki.language) gelu_apprx_sigmoid() (in module nki.language) gelu_apprx_sigmoid_dx() (in module nki.language) gelu_apprx_tanh() (in module nki.language) gelu_dx() (in module nki.language) get_accessor_coherence_policy (C++ function) get_cpu_count (C++ function) get_cpu_id (C++ function) get_dst_tensor (C++ function) get_free_space() (nkilib.core.utils.allocator.SbufManager method) get_heap_curr_addr() (nkilib.core.utils.allocator.SbufManager method) get_name_prefix() (nkilib.core.utils.allocator.SbufManager method) get_nc_version() (in module nki.isa) get_reports() built-in function get_stack_curr_addr() (nkilib.core.utils.allocator.SbufManager method) get_total_space() (nkilib.core.utils.allocator.SbufManager method) get_used_space() (nkilib.core.utils.allocator.SbufManager method) get_view() (nkilib.core.utils.tensor_view.TensorView method) GPSIMD Engine GpSimdE greater() (in module nki.language) greater_equal() (in module nki.language) H has_dynamic_access() (nkilib.core.utils.tensor_view.TensorView method) has_lower_bound() (nkilib.core.rmsnorm_quant.rmsnorm_quant.RmsNormQuantKernelArgs method) HBM hbm (in module nki.language) High Bandwidth Memory I increment_section() (nkilib.core.utils.allocator.SbufManager method) Inf1 Inf2 Inferentia int16 (in module nki.language) int32 (in module nki.language) int8 (in module nki.language) invert() (in module nki.language) iota() (in module nki.isa) is_hbm() (in module nki.language) is_on_chip() (in module nki.language) is_psum() (in module nki.language) is_sbuf() (in module nki.language) J jit() (in module nki) K k_out_in_sb (nkilib.core.attention_tkg.AttnTKGConfig attribute) L left_shift() (in module nki.language) less() (in module nki.language) less_equal() (in module nki.language) load() (in module nki.language) load_transpose2d() (in module nki.language) local_gather() (in module nki.isa) log (C++ 
function) log() (in module nki.language) log10 (C++ function) log10_out (C++ function) log2 (C++ function) log2_out (C++ function) log_out (C++ function) logical_and() (in module nki.language) logical_not() (in module nki.language) logical_or() (in module nki.language) logical_xor() (in module nki.language) lower_bound (nkilib.core.rmsnorm_quant.rmsnorm_quant.RmsNormQuantKernelArgs attribute) M matmul() (in module nki.language) max() (in module nki.language) max8() (in module nki.isa) maximum() (in module nki.language) mean() (in module nki.language) memset() (in module nki.isa) min() (in module nki.language) minimum() (in module nki.language) mish() (in module nki.language) mlp() (in module nkilib.core.mlp) model_index.append() built-in function model_index.copy() built-in function model_index.create() built-in function model_index.filter() built-in function model_index.load() built-in function model_index.move() built-in function model_index.save() built-in function module placement moe_cte() (in module nkilib.core.moe_cte) moe_tkg() (in module nkilib.core.moe_tkg) mul (C++ function), [1] mul_out (C++ function), [1] multiply() (in module nki.language) N NC nc_find_index8() (in module nki.isa) nc_match_replace8() (in module nki.isa) nc_matmul() (in module nki.isa) nc_matmul_mx() (in module nki.isa) nc_n_gather() (in module nki.isa) nc_stream_shuffle() (in module nki.isa) nc_transpose() (in module nki.isa) nc_version (class in nki.isa) ND ndarray() (in module nki.language) needs_rms_normalization() (nkilib.core.rmsnorm_quant.rmsnorm_quant.RmsNormQuantKernelArgs method) negative() (in module nki.language) Neuron Device Neuron Kernel Interface neuron-cc neuron-cc command line option, [1], [2] neuron-cc command line option neuron-cc, [1], [2] neuron-monitor neuron-monitor command line option neuron-monitor command line option neuron-monitor NeuronCore, [1] NeuronCore-v1 NeuronCore-v2 NeuronCore-v3 NeuronDevice NeuronLink NeuronLink-v1 NeuronLink-v2 NeuronLink-v3 
neuronx-cc neuronx-cc command line option, [1], [2] neuronx-cc command line option neuronx-cc, [1], [2] NKI no_reorder() (in module nki.language) norm_type (nkilib.core.rmsnorm_quant.rmsnorm_quant.RmsNormQuantKernelArgs attribute) not_equal() (in module nki.language) nrt_add_tensor_to_tensor_set (C function) nrt_allocate_tensor_set (C function) nrt_close (C function) nrt_debug_client_connect (C function) nrt_debug_client_connect_close (C function) nrt_debug_client_read_one_event (C function) nrt_destroy_tensor_set (C function) nrt_execute (C function) nrt_execute_repeat (C function) nrt_free_model_tensor_info (C function) nrt_get_model_instance_count (C function) nrt_get_model_nc_count (C function) nrt_get_model_tensor_info (C function) nrt_get_tensor_from_tensor_set (C function) nrt_get_total_nc_count (C function) nrt_get_version (C function) nrt_get_visible_nc_count (C function) nrt_init (C function) nrt_load (C function) nrt_load_collectives (C function) nrt_profile_start (C function) nrt_profile_stop (C function) nrt_tensor_allocate (C function) nrt_tensor_allocate_empty (C function) nrt_tensor_allocate_slice (C function) nrt_tensor_attach_buffer (C function) nrt_tensor_check_output_completion (C function) nrt_tensor_copy (C function) nrt_tensor_free (C function) nrt_tensor_get_size (C function) nrt_tensor_get_va (C function) nrt_tensor_read (C function) nrt_tensor_write (C function) nrt_unload (C function) num_programs() (in module nki.language) NxD Core NxD Inference NxD Training O ones (C++ function) ones() (in module nki.language) open_scope() (nkilib.core.utils.allocator.SbufManager method) operator= (C++ function), [1] out_in_sb (nkilib.core.attention_tkg.AttnTKGConfig attribute) output_projection_cte() (in module nkilib.core.output_projection.output_projection_cte) output_projection_tkg() (in module nkilib.core.output_projection.output_projection_tkg) P Partial Sum Buffer permute() (nkilib.core.utils.tensor_view.TensorView method) placement module 
pop_heap() (nkilib.core.utils.allocator.SbufManager method) pow (C++ function), [1], [2] pow_out (C++ function), [1], [2] power() (in module nki.language) PP PPr print_reports() built-in function private_hbm (in module nki.language) prod() (in module nki.language) program_id() (in module nki.language) program_ndim() (in module nki.language) PSUM psum (in module nki.language) Q q_head (nkilib.core.attention_tkg.AttnTKGConfig attribute) qk_in_sb (nkilib.core.attention_tkg.AttnTKGConfig attribute) qkv() (in module nkilib.core.qkv) quantization_type (nkilib.core.rmsnorm_quant.rmsnorm_quant.RmsNormQuantKernelArgs attribute) quantize_mx() (in module nki.isa) R rand() (in module nki.language) rand2() (in module nki.isa) rand_get_state() (in module nki.isa) rand_set_state() (in module nki.isa) random_seed() (in module nki.language) range_select() (in module nki.isa) rank_id() (in module nki.collectives) read (C++ function) read_stream_accessor (C++ function) rearrange() (nkilib.core.utils.tensor_view.TensorView method) reciprocal() (in module nki.isa) (in module nki.language) reduce_cmd (class in nki.isa) reduce_scatter() (in module nki.collectives) register_alloc() (in module nki.isa) register_load() (in module nki.isa) register_move() (in module nki.isa) register_store() (in module nki.isa) relu() (in module nki.language) ReplicaGroup (class in nki.collectives) reshape() (nkilib.core.utils.tensor_view.TensorView method) reshape_dim() (nkilib.core.utils.tensor_view.TensorView method) right_shift() (in module nki.language) rms_norm() (in module nki.language) rmsnorm_quant_kernel() (in module nkilib.core.rmsnorm_quant.rmsnorm_quant) RmsNormQuantKernelArgs (class in nkilib.core.rmsnorm_quant.rmsnorm_quant) RNE rng() (in module nki.isa) RoPE() (in module nkilib.core.rope) RoPE_sbuf() (in module nkilib.core.rope) router_topk() (in module nkilib.core.router_topk) router_topk_input_w_load() (in module nkilib.core.router_topk) router_topk_input_x_load() (in module 
nkilib.core.router_topk) rsqrt() (in module nki.language) RT S s_active (nkilib.core.attention_tkg.AttnTKGConfig attribute) SBUF sbuf (in module nki.language) SbufManager (class in nkilib.core.utils.allocator) Scalar Engine scalar_tensor_tensor() (in module nki.isa) ScalarE select() (nkilib.core.utils.tensor_view.TensorView method) select_reduce() (in module nki.isa) sendrecv() (in module nki.isa) sequence_bounds() (in module nki.isa) sequential_range() (in module nki.language) set_accessor_coherence_policy (C++ function) set_name_prefix() (nkilib.core.utils.allocator.SbufManager method) set_rng_seed() (in module nki.isa) shape (nkilib.core.utils.tensor_view.TensorView attribute) shared_hbm (in module nki.language) shared_identity_matrix() (in module nki.language) sigmoid() (in module nki.language) sign() (in module nki.language) silu() (in module nki.language) silu_dx() (in module nki.language) simulate() (in module nki) sin (C++ function) sin() (in module nki.language) sin_out (C++ function) slice() (nkilib.core.utils.tensor_view.TensorView method) softmax() (in module nki.language) softplus() (in module nki.language) sqrt() (in module nki.language) square() (in module nki.language) squeeze_dim() (nkilib.core.utils.tensor_view.TensorView method) SR State Buffer static_range() (in module nki.language) store() (in module nki.language) strided_mm1 (nkilib.core.attention_tkg.AttnTKGConfig attribute) strides (nkilib.core.utils.tensor_view.TensorView attribute) sub (C++ function), [1] sub_out (C++ function), [1] subtract() (in module nki.language) sum() (in module nki.language) Sync Engine T tan (C++ function) tan() (in module nki.language) tan_out (C++ function) tanh() (in module nki.language) tcm_accessor (C++ function), [1] tcm_to_tensor (C++ function) Tensor Engine tensor_copy() (in module nki.isa) tensor_copy_predicated() (in module nki.isa) tensor_partition_reduce() (in module nki.isa) tensor_reduce() (in module nki.isa) tensor_scalar() (in module nki.isa) 
tensor_scalar_cumulative() (in module nki.isa) tensor_scalar_reduce() (in module nki.isa) tensor_tensor() (in module nki.isa) tensor_tensor_scan() (in module nki.isa) tensor_to_tcm (C++ function) TensorE TensorView (class in nkilib.core.utils.tensor_view) TF32 tfloat32 (in module nki.language) tile_size (class in nki.language) topk_reduce() (in module nkilib.experimental.subkernels) torch.neuron.DataParallel() built-in function torch.neuron.DataParallel.disable_dynamic_batching() built-in function, [1] torch::neuron::tcm_free (C++ function) torch::neuron::tcm_malloc (C++ function) torch_neuron.experimental.multicore_context() (in module placement) torch_neuron.experimental.neuron_cores_context() (in module placement) torch_neuron.experimental.set_multicore() (in module placement) torch_neuron.experimental.set_neuron_cores() (in module placement) torch_neuron.Optimization (built-in class) torch_neuron.trace() built-in function torch_neuronx.analyze() built-in function torch_neuronx.async_load() built-in function torch_neuronx.bucket_model_trace() built-in function torch_neuronx.BucketModelConfig (built-in class) torch_neuronx.DataParallel() built-in function torch_neuronx.dynamic_batch() built-in function torch_neuronx.experimental.profiler.profile() built-in function torch_neuronx.experimental.profiler.profile.start() built-in function torch_neuronx.lazy_load() built-in function torch_neuronx.move_trace_to_device() built-in function torch_neuronx.multicore_context() built-in function torch_neuronx.neuron_cores_context() built-in function torch_neuronx.PartitionerConfig() built-in function torch_neuronx.replace_weights() built-in function torch_neuronx.set_multicore() built-in function torch_neuronx.set_neuron_cores() built-in function torch_neuronx.trace() built-in function TP tp_k_prior (nkilib.core.attention_tkg.AttnTKGConfig attribute) TPr Trainium/Inferentia2 Trainium2 transformer_tkg() (in module nkilib.experimental.transformer) transpose() (in module 
nki.language) Trn1 Trn2 trunc() (in module nki.language) U uint16 (in module nki.language) uint32 (in module nki.language) uint8 (in module nki.language) use_gpsimd_sb2sb (nkilib.core.attention_tkg.AttnTKGConfig attribute) use_pos_id (nkilib.core.attention_tkg.AttnTKGConfig attribute) V var() (in module nki.language) Vector Engine VectorE W where() (in module nki.language) write (C++ function) write_csv() built-in function write_json() built-in function write_stream_accessor (C++ function) Z zeros (C++ function) zeros() (in module nki.language) zeros_like() (in module nki.language)