nki.profile
- nki.profile(func=None, **kwargs)
- Profile a NKI kernel on a NeuronDevice by using nki.profile as a decorator.
- Note: Similar to nki.baremetal, a function decorated with nki.profile expects numpy.ndarray objects as input/output tensors instead of ML framework tensor objects.
- Parameters:
- working_directory – Path to the working directory where profile artifacts are saved. This must be specified and must be an absolute path.
- save_neff_name – Name of the saved neff file if specified (file.neff by default). 
- save_trace_name – Name of the saved trace (profile) file if specified (profile.ntff by default).
- additional_compile_opt – Additional Neuron compiler flags to pass in when compiling the kernel. 
- overwrite – Overwrite existing profile artifacts if set to True. Default is False. 
- profile_nth – Profiles the profile_nth execution. Default is 1. 
 
- Returns:
- None 
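Because working_directory must be specified and must be an absolute path, it can help to check the arguments before handing them to the decorator. The following is a minimal sketch, not part of the NKI API: `build_profile_kwargs` is a hypothetical helper that uses only the Python standard library and fills in the defaults listed above.

```python
import os


def build_profile_kwargs(working_directory,
                         save_neff_name="file.neff",
                         save_trace_name="profile.ntff",
                         overwrite=False,
                         profile_nth=1):
    """Hypothetical helper (not part of NKI): validate and collect
    keyword arguments intended for nki.profile."""
    # working_directory is required and must be an absolute path.
    if not working_directory:
        raise ValueError("working_directory must be specified")
    if not os.path.isabs(working_directory):
        raise ValueError(
            f"working_directory must be an absolute path, got {working_directory!r}"
        )
    # The remaining values are passed through unchanged.
    return {
        "working_directory": working_directory,
        "save_neff_name": save_neff_name,
        "save_trace_name": save_trace_name,
        "overwrite": overwrite,
        "profile_nth": profile_nth,
    }


# Intended use (requires neuronxcc, so it is left as a comment here):
# @nki.profile(**build_profile_kwargs("/home/ubuntu/profiles", overwrite=True))
# def my_kernel(a_tensor, b_tensor): ...
```

Validating up front surfaces a relative path as a plain ValueError in user code rather than as a failure inside the profiling run.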
```python
from neuronxcc import nki
import neuronxcc.nki.language as nl

@nki.profile(working_directory="/home/ubuntu/profiles", save_neff_name='file.neff', save_trace_name='profile.ntff')
def nki_tensor_tensor_add(a_tensor, b_tensor):
    c_tensor = nl.ndarray(a_tensor.shape, dtype=a_tensor.dtype, buffer=nl.shared_hbm)

    a = nl.load(a_tensor)
    b = nl.load(b_tensor)

    c = a + b

    nl.store(c_tensor, c)

    return c_tensor
```

- nki.profile will save file.neff, profile.ntff, along with json files containing a profile summary inside of the working_directory.
- See Profiling NKI kernels with Neuron Profile for more information on how to visualize the execution trace for profiling purposes.
- In addition, more information about neuron-profile can be found in its documentation.
- Note: nki.profile does not use the actual inputs passed into the profiled function when running the neff file. For instance, in the above example, the output c tensor is undefined and should not be used for numerical accuracy checks. The input tensors are used mainly to specify the shape of inputs.
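After a profiled run, one way to confirm that the artifacts described above were written is to inspect the working directory. This is a minimal sketch under stated assumptions: `list_profile_artifacts` is a hypothetical helper (not part of NKI), it assumes the json summary files sit directly inside working_directory, and it makes no assumption about their names.

```python
from pathlib import Path


def list_profile_artifacts(working_directory,
                           neff_name="file.neff",
                           trace_name="profile.ntff"):
    """Hypothetical helper (not part of NKI): report which of the
    expected profile artifacts exist in working_directory."""
    root = Path(working_directory)
    return {
        # True if the compiled kernel file is present.
        "neff": (root / neff_name).is_file(),
        # True if the execution trace file is present.
        "trace": (root / trace_name).is_file(),
        # Names of any json files found at the top level (assumed location).
        "json_summaries": sorted(p.name for p in root.glob("*.json")),
    }
```

For the example above, `list_profile_artifacts("/home/ubuntu/profiles")` would be called after `nki_tensor_tensor_add` has been invoked once with numpy arrays of the desired shapes.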
