sparseml.onnx.utils package¶
Submodules¶
sparseml.onnx.utils.data module¶
Utilities for data loading into numpy for use in ONNX supported systems
-
class
sparseml.onnx.utils.data.
DataLoader
(data: Union[str, List[Dict[str, numpy.ndarray]]], labels: Union[None, str, List[Union[numpy.ndarray, Dict[str, numpy.ndarray]]]], batch_size: int, iter_steps: int = 0)[source]¶ Bases:
object
Data loader instance that supports loading numpy arrays from file or memory and creating an iterator to go through batches of that data.
Iterator returns a tuple containing (data, label). label is only returned if label data was passed in.
- Parameters
data – a file glob pointing to numpy files, path to a tar ball of numpy files, or loaded numpy data
labels – a file glob pointing to numpy files, path to a tar ball of numpy files, or loaded numpy data
batch_size – the size of batches to create for the iterator
iter_steps – the number of steps (batches) to create. Set to -1 for infinite, 0 for running through the loaded data once, or a positive integer for the desired number of steps
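Example usage (a minimal sketch; the glob path is hypothetical and labels are omitted)::

    from sparseml.onnx.utils.data import DataLoader

    loader = DataLoader(
        data="./samples/inp*.npz",  # hypothetical glob of numpy input files
        labels=None,
        batch_size=8,
        iter_steps=0,  # single pass over the loaded data
    )
    for batch in loader:
        # batch maps input names to numpy arrays; when labels are supplied,
        # the iterator yields (data, label) tuples instead
        ...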
-
property
batch_size
¶ the size of batches to create for the iterator
-
static
from_model_random
(model: Union[str, onnx.onnx_ml_pb2.ModelProto], batch_size: int, iter_steps: int = 0, num_samples: int = 100, create_labels: bool = False, strip_first_dim: bool = True)[source]¶ Create a DataLoader from random data for a model’s input and output sizes
- Parameters
model – the loaded model or a file path to the onnx model to create random data for
batch_size – the size of batches to create for the iterator
iter_steps – the number of steps (batches) to create. Set to -1 for infinite, 0 for running through the loaded data once, or a positive integer for the desired number of steps
num_samples – number of random samples to create
create_labels – True to create random label data as well, False otherwise
strip_first_dim – True to strip the first dimension from the inputs and outputs, typically the batch dimension
- Returns
the created DataLoader instance with the random data
-
static
from_random
(data_shapes: Dict[str, Tuple[int, …]], label_shapes: Union[None, Dict[str, Tuple[int, …]]], batch_size: int, iter_steps: int = 0, num_samples: int = 100, data_types: Optional[Dict[str, numpy.dtype]] = None)[source]¶ Create a DataLoader from random data
- Parameters
data_shapes – shapes to create for the data items
label_shapes – shapes to create for the label items
batch_size – the size of batches to create for the iterator
iter_steps – the number of steps (batches) to create. Set to -1 for infinite, 0 for running through the loaded data once, or a positive integer for the desired number of steps
num_samples – number of random samples to create
data_types – optional numpy data types for each of the data items
- Returns
the created DataLoader instance with the random data
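Example (a minimal sketch; the input name and shape are hypothetical)::

    import numpy
    from sparseml.onnx.utils.data import DataLoader

    loader = DataLoader.from_random(
        data_shapes={"input": (3, 224, 224)},  # hypothetical input name/shape
        label_shapes=None,
        batch_size=4,
        iter_steps=10,
        data_types={"input": numpy.dtype("float32")},
    )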
-
property
infinite
¶ True if the loader instance is set up to continually create batches, False otherwise
-
property
iter_steps
¶ the number of steps (batches) to create. Set to -1 for infinite, 0 for running through the loaded data once, or a positive integer for the desired number of steps
-
property
labeled_data
¶ the loaded data and labels
sparseml.onnx.utils.graph_editor module¶
Helper functions to edit ONNX Graphs.
-
class
sparseml.onnx.utils.graph_editor.
ONNXGraph
(model: onnx.onnx_ml_pb2.ModelProto)[source]¶ Bases:
object
Class for quick look-up of ONNX graph nodes and initializers. If graph state changes outside of ONNXGraph class functions, update() should be called.
- Parameters
model – the ONNX graph to represent
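Example usage (a minimal sketch; the model path is hypothetical)::

    import onnx
    from sparseml.onnx.utils.graph_editor import ONNXGraph

    model = onnx.load("model.onnx")  # hypothetical path
    graph = ONNXGraph(model)
    first = next(iter(graph.nodes))
    children = graph.get_node_children(first)
    # after editing model.graph outside of the ONNXGraph helpers,
    # refresh the cached state
    graph.update()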
-
add_node
(node: onnx.onnx_ml_pb2.NodeProto)[source]¶ Adds the given node to the model and graph state
- Parameters
node – node to add to the model
-
delete_initializers
(initializers: List[Union[str, onnx.onnx_ml_pb2.TensorProto]])[source]¶ Deletes the given initializers from the model
- Parameters
initializers – list of initializers or initializer names to delete
-
delete_node
(node: onnx.onnx_ml_pb2.NodeProto)[source]¶ Deletes the given node from the graph
- Parameters
node – node to delete
-
delete_nodes
(nodes: List[onnx.onnx_ml_pb2.NodeProto])[source]¶ Deletes the given nodes from the graph
- Parameters
nodes – list of nodes to delete
-
delete_unused_initializers
()[source]¶ Deletes tensors in the initializer list that are not listed as inputs to any node in the current graph state or directly passed as model outputs
-
get_init_by_name
(name: str, allow_optional: bool = True) → Optional[onnx.onnx_ml_pb2.TensorProto][source]¶ - Parameters
name – name of initializer
allow_optional – if True and the given name is not found as an initializer, None will be returned. Otherwise a KeyError will be raised
- Returns
tensor of the initializer with the given name; returns None if the name does not exist in the cached graph and allow_optional is True
-
get_node_by_output_id
(id: str) → Optional[onnx.onnx_ml_pb2.NodeProto][source]¶ - Parameters
id – name of output id of node
- Returns
the associated node if it is present in the graph, None otherwise
-
get_node_children
(node: onnx.onnx_ml_pb2.NodeProto) → List[onnx.onnx_ml_pb2.NodeProto][source]¶ - Parameters
node – the node to get the child nodes of
- Returns
list of nodes that take an output of this node as an input
-
get_node_parents
(node: onnx.onnx_ml_pb2.NodeProto) → List[Optional[Union[onnx.onnx_ml_pb2.NodeProto, onnx.onnx_ml_pb2.TensorProto]]][source]¶ - Parameters
node – node to get the input objects for
- Returns
input nodes or tensors of this node in order. If an input does not exist, None will be returned in its place
-
get_node_single_child
(node: onnx.onnx_ml_pb2.NodeProto) → Optional[onnx.onnx_ml_pb2.NodeProto][source]¶ - Parameters
node – the node to get the child node of
- Returns
child of node if it only has one child, otherwise None
-
get_node_single_parent
(node: onnx.onnx_ml_pb2.NodeProto, index: int) → Optional[onnx.onnx_ml_pb2.NodeProto][source]¶ - Parameters
node – the node to get the parent node of
index – choose which input to search
- Returns
parent of node if it only has one parent, otherwise None
-
property
nodes
¶ ordered collection of nodes in this graph
- Type
return
-
sort_nodes_topologically
()[source]¶ Sorts the graph’s node repeated field in place into topological order, as specified by the ONNX model proto
-
update
(model: Optional[onnx.onnx_ml_pb2.ModelProto] = None)[source]¶ Update the graph state based on the model this graph represents or the given model.
- Parameters
model – model to represent. Defaults to the currently loaded model state
-
update_node_input
(node: onnx.onnx_ml_pb2.NodeProto, input_id: str, input_idx: Optional[int] = None)[source]¶ - Parameters
node – node to update the inputs of
input_id – new input_id to attach to the node
input_idx – optional index of the node input list to update; if none is given, the new input id will be appended to the input list
-
sparseml.onnx.utils.graph_editor.
override_model_batch_size
(model: onnx.onnx_ml_pb2.ModelProto, batch_size: int) → onnx.onnx_ml_pb2.ModelProto[source]¶ Rewrites any positive batch dimensions in the model inputs or outputs to the given batch_size
- Parameters
model – Model to modify
batch_size – Batch size to enforce
- Returns
the given model with inputs and outputs set to batch_size if the batch dimensions are not -1.
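Example (a minimal sketch; the model path is hypothetical)::

    import onnx
    from sparseml.onnx.utils.graph_editor import override_model_batch_size

    model = onnx.load("model.onnx")  # hypothetical path
    model = override_model_batch_size(model, batch_size=16)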
-
sparseml.onnx.utils.graph_editor.
prune_model_one_shot
(model: onnx.onnx_ml_pb2.ModelProto, nodes: List[onnx.onnx_ml_pb2.NodeProto], sparsity: Union[float, List[float]])[source]¶ Prune a model in-place with one shot pruning (no retraining) according to magnitude pruning. Does so in an unstructured way currently
- Parameters
model – the model to apply pruning to
nodes – the nodes within the model to prune to the desired sparsities
sparsity – the sparsity level to prune all nodes to if a float, or the sparsity level to prune each node to if a list of floats
- Returns
the new, pruned model
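Example (a minimal sketch pruning every prunable node to 90% sparsity; the model path is hypothetical)::

    from sparseml.onnx.utils.graph_editor import prune_model_one_shot
    from sparseml.onnx.utils.helpers import check_load_model, get_prunable_nodes

    model = check_load_model("model.onnx")  # hypothetical path
    nodes = get_prunable_nodes(model)
    prune_model_one_shot(model, nodes, sparsity=0.9)  # prunes in place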
-
sparseml.onnx.utils.graph_editor.
prune_model_one_shot_iter
(model: onnx.onnx_ml_pb2.ModelProto, nodes: List[onnx.onnx_ml_pb2.NodeProto], sparsity: Union[float, List[float]])[source]¶ Iteratively prune a model in-place with one shot pruning (no retraining) according to magnitude pruning. Does so in an unstructured way currently
- Parameters
model – the model to apply pruning to
nodes – the nodes within the model to prune to the desired sparsities
sparsity – the sparsity level to prune all nodes to if a float, or the sparsity level to prune each node to if a list of floats
-
sparseml.onnx.utils.graph_editor.
prune_unstructured
(array: numpy.ndarray, sparsity: float) → numpy.ndarray[source]¶ Prune a numpy array with unstructured sparsity according to magnitude pruning
- Parameters
array – the array to prune (introduce zeros), will remove the lowest absolute values in the array
sparsity – the sparsity value, as a decimal, to impose in the array
- Returns
the pruned array
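Example (a minimal sketch on random data)::

    import numpy
    from sparseml.onnx.utils.graph_editor import prune_unstructured

    array = numpy.random.randn(64, 64).astype(numpy.float32)
    pruned = prune_unstructured(array, sparsity=0.8)
    # roughly 80% of the values (those with the smallest magnitudes) are now zero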
-
sparseml.onnx.utils.graph_editor.
remove_node_and_params_from_graph
(model: onnx.onnx_ml_pb2.ModelProto, node: onnx.onnx_ml_pb2.NodeProto, keep_params: Optional[Iterable[str]] = None) → None[source]¶ Deletes a node from the model graph as well as its parameters listed in node.input
- Parameters
model – Model to delete from
node – Node to delete
keep_params – names of node input initializers not to remove from the graph. Default is None
-
sparseml.onnx.utils.graph_editor.
swap_node_output
(node: onnx.onnx_ml_pb2.NodeProto, output: str) → None[source]¶ Deletes the current output of the node and replaces it with the provided value. Assumes that the node only has one output
- Parameters
node – Node to change the output of
output – New output value
-
sparseml.onnx.utils.graph_editor.
update_model_param
(model: onnx.onnx_ml_pb2.ModelProto, param_name: str, val: numpy.ndarray) → None[source]¶ Removes the parameter named param_name from the model, creates a new parameter from val, and adds it back to the model under the name param_name as an update
- Parameters
model – The model to update
param_name – The parameter name in the model to update
val – The new value of the parameter
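Example (a minimal sketch; the model path, parameter name, and shape are hypothetical)::

    import numpy
    import onnx
    from sparseml.onnx.utils.graph_editor import update_model_param

    model = onnx.load("model.onnx")  # hypothetical path
    new_val = numpy.zeros((64, 64), dtype=numpy.float32)  # hypothetical shape
    update_model_param(model, "fc.weight", new_val)  # hypothetical param name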
sparseml.onnx.utils.graph_optimizer module¶
Helper functions to optimize ONNX Graphs.
-
sparseml.onnx.utils.graph_optimizer.
fold_conv_bns
(onnx_file: str) → onnx.onnx_ml_pb2.ModelProto[source]¶ When a batch norm op is the only child operator of a conv op, this function will fold the batch norm into the conv and return the processed graph
- Parameters
onnx_file – file path to ONNX model to process
- Returns
A loaded ONNX model with BatchNormalization ops folded into Conv ops where possible
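Example (a minimal sketch; the file paths are hypothetical)::

    import onnx
    from sparseml.onnx.utils.graph_optimizer import fold_conv_bns

    folded = fold_conv_bns("model.onnx")  # hypothetical path
    onnx.save(folded, "model-folded.onnx")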
-
sparseml.onnx.utils.graph_optimizer.
quantize_resnet_identity_add_inputs
(quantized_model: onnx.onnx_ml_pb2.ModelProto) → bool[source]¶ To avoid storing the identity value of a ResNet block in fp32, this optimization will pass the identity value through the same quantize operation as the ResNet block and add a de-quantize operation for the identity before the add.
Function will match any add operation whose inputs are the output of a relu or add op and a quantize -> de-quantize block that takes the same relu as input. Performs this optimization in place.
- Parameters
quantized_model – A loaded quantized model to perform this optimization on
- Returns
True if an in-place optimization was made
-
sparseml.onnx.utils.graph_optimizer.
quantized_residual_add_optim
(quantized_model: onnx.onnx_ml_pb2.ModelProto) → bool[source]¶ This optimization adds a quant/dequant block to the identity branch of a residual whose non-identity branch is quantized. This enables the add at the end of the residual to be fused at runtime.
Function will match any node that has two child nodes: one add node and one quantize node whose branch eventually leads to that same add node.
- Parameters
quantized_model – A loaded quantized model to perform this optimization on
- Returns
True if an in-place optimization was made
sparseml.onnx.utils.helpers module¶
Utility / helper functions
-
class
sparseml.onnx.utils.helpers.
BatchNormParams
(epsilon, momentum, scale, bias, mean, var)¶ Bases:
tuple
-
property
bias
¶ Alias for field number 3
-
property
epsilon
¶ Alias for field number 0
-
property
mean
¶ Alias for field number 4
-
property
momentum
¶ Alias for field number 1
-
property
scale
¶ Alias for field number 2
-
property
var
¶ Alias for field number 5
-
class
sparseml.onnx.utils.helpers.
NodeParam
(name, val)¶ Bases:
tuple
-
property
name
¶ Alias for field number 0
-
property
val
¶ Alias for field number 1
-
class
sparseml.onnx.utils.helpers.
NodeShape
(id, input_shapes, output_shapes)¶ Bases:
tuple
-
property
id
¶ Alias for field number 0
-
property
input_shapes
¶ Alias for field number 1
-
property
output_shapes
¶ Alias for field number 2
-
class
sparseml.onnx.utils.helpers.
SparsityMeasurement
(node_id, params_count, params_zero_count, sparsity, density)¶ Bases:
tuple
-
property
density
¶ Alias for field number 4
-
property
node_id
¶ Alias for field number 0
-
property
params_count
¶ Alias for field number 1
-
property
params_zero_count
¶ Alias for field number 2
-
property
sparsity
¶ Alias for field number 3
-
sparseml.onnx.utils.helpers.
calculate_flops
(op_type: str, input_shape: Optional[List[List]] = None, output_shape: Optional[List[List]] = None, weight_shape: Optional[List] = None, kernel_shape: Optional[List] = None, bias_shape: Optional[List] = None, attributes: Optional[Dict[str, Any]] = None) → Optional[float][source]¶ Calculate FLOPs based on the operation type and the shapes of certain attributes. If any field necessary for the operation is None, returns None
- Parameters
op_type – Operation type of flop calculation
input_shape – List of input shapes of operation
output_shape – List of output shapes of operation
weight_shape – Shape of weights in operation if any, else None
kernel_shape – Shape of kernel in operation if any, else None
bias_shape – Shape of bias in operation if any, else None
attributes – The node attributes if any, else None
- Returns
The amount of floating point operations in the operation
-
sparseml.onnx.utils.helpers.
check_load_model
(model: Union[str, onnx.onnx_ml_pb2.ModelProto]) → onnx.onnx_ml_pb2.ModelProto[source]¶ Load an ONNX model from a file path if one is supplied; if already a ModelProto, return it as is.
- Parameters
model – the model proto or path to the model ONNX file to check for loading
- Returns
the loaded ONNX ModelProto
-
sparseml.onnx.utils.helpers.
conv_node_params
(model: onnx.onnx_ml_pb2.ModelProto, node: onnx.onnx_ml_pb2.NodeProto, include_values: bool = True) → Tuple[sparseml.onnx.utils.helpers.NodeParam, Optional[sparseml.onnx.utils.helpers.NodeParam]][source]¶ Get the params (weight and bias) for a conv node in an ONNX ModelProto
- Parameters
model – the model proto loaded from the ONNX file
node – the conv node to get the params for
include_values – True to include the param values as NumPy arrays in the returned NodeParam objects. False to not load the values – in this event NodeParam.val will be None
- Returns
a tuple containing the weight, bias (if it is present)
-
sparseml.onnx.utils.helpers.
extract_node_id
(node: onnx.onnx_ml_pb2.NodeProto) → str[source]¶ Get the node id for a given node from an ONNX model. Uses the first output id as the node id, since the ONNX spec guarantees it to be unique for this node.
- Parameters
node – the node to grab an id for
- Returns
the id for the node
-
sparseml.onnx.utils.helpers.
extract_node_shapes
(model: onnx.onnx_ml_pb2.ModelProto) → Dict[str, sparseml.onnx.utils.helpers.NodeShape][source]¶ Extracts the shape information for each node as a NodeShape object.
- Parameters
model – the loaded onnx.ModelProto to extract node shape information from
- Returns
a mapping of node id to a NodeShape object
-
sparseml.onnx.utils.helpers.
extract_nodes_shapes_ort
(model: onnx.onnx_ml_pb2.ModelProto) → Dict[str, List[List[int]]][source]¶ Creates a modified model to expose intermediate outputs and runs an ONNX Runtime InferenceSession to obtain the output shape of each node.
- Parameters
model – an ONNX model
- Returns
a mapping of node id to the output shapes of that node as computed by ONNX Runtime
-
sparseml.onnx.utils.helpers.
extract_nodes_shapes_shape_inference
(model: onnx.onnx_ml_pb2.ModelProto) → Dict[str, List[Union[None, List[int]]]][source]¶ Creates a modified model to expose intermediate outputs and runs an ONNX shape inference to obtain the output shape of each node.
NOTE: The ONNX docs include the following disclaimer on shape inference: shape inference is not guaranteed to be complete. In particular, some dynamic behaviors block the flow of shape inference, for example a Reshape to a dynamically-provided shape. Also, not all operators are required to have a shape inference implementation.
- Parameters
model – an ONNX model
- Returns
a mapping of node id to the output shapes of that node as determined by shape inference, with None where a shape could not be inferred
-
sparseml.onnx.utils.helpers.
extract_shape
(proto: Any) → Union[None, Tuple[Optional[int], …]][source]¶ Extract the shape info from a proto. Convenient for model inputs, for example, to get the tensor dimensions.
- Parameters
proto – the proto to get tensor shape info for
- Returns
a tuple containing shape info if found, else None
-
sparseml.onnx.utils.helpers.
gemm_node_params
(model: onnx.onnx_ml_pb2.ModelProto, node: onnx.onnx_ml_pb2.NodeProto, include_values: bool = True) → Tuple[sparseml.onnx.utils.helpers.NodeParam, Optional[sparseml.onnx.utils.helpers.NodeParam]][source]¶ Get the params (weight and bias) for a gemm node in an ONNX ModelProto
- Parameters
model – the model proto loaded from the ONNX file
node – the gemm node to get the params for
include_values – True to include the param values as NumPy arrays in the returned NodeParam objects. False to not load the values – in this event NodeParam.val will be None
- Returns
a tuple containing the weight, bias (if it is present)
-
sparseml.onnx.utils.helpers.
get_attr_float_val_for_node
(node: onnx.onnx_ml_pb2.NodeProto, attr: str) → Optional[float][source]¶ - Parameters
node – Node to get the attribute value of
attr – Attribute name to match in the node
- Returns
The value of the attribute if the attribute is found in the node and is a float type. Otherwise returns None
-
sparseml.onnx.utils.helpers.
get_batch_norm_params
(model: onnx.onnx_ml_pb2.ModelProto, bn_node: onnx.onnx_ml_pb2.NodeProto) → sparseml.onnx.utils.helpers.BatchNormParams[source]¶ Get the params and relevant attributes of a batch normalization operator. Following the ONNX operators spec, will default epsilon and momentum to 1e-5 and 0.9 respectively when not defined.
- Parameters
model – the model proto loaded from the ONNX file
bn_node – the batch normalization node to get the params for
- Returns
a BatchNormParams named tuple
-
sparseml.onnx.utils.helpers.
get_init_by_name
(model: onnx.onnx_ml_pb2.ModelProto, init_name: str) → Optional[Any][source]¶ Get an initializer by name from the ONNX model proto graph
- Parameters
model – the model proto loaded from the ONNX file
init_name – the name of the initializer to retrieve
- Returns
the initializer retrieved by name from the model
-
sparseml.onnx.utils.helpers.
get_kernel_shape
(attributes: Dict[str, Any]) → Optional[List[float]][source]¶ Get the kernel shape from a dictionary of a model’s attributes
- Parameters
attributes – a dictionary of a model’s attributes
- Returns
the kernel shape if attribute contains either the kernel or kernel_shape field, otherwise None
-
sparseml.onnx.utils.helpers.
get_node_attributes
(node: onnx.onnx_ml_pb2.NodeProto) → Dict[str, Any][source]¶ - Parameters
node – the ONNX node to get the attributes for
- Returns
a dictionary containing all attributes for the node
-
sparseml.onnx.utils.helpers.
get_node_by_id
(model: onnx.onnx_ml_pb2.ModelProto, node_id: str) → Optional[onnx.onnx_ml_pb2.NodeProto][source]¶ Get a node from a model by the node_id generated from extract_node_id
- Parameters
model – the model proto loaded from the ONNX file
node_id – id of the node to get from the model
- Returns
the retrieved node or None if no node found
-
sparseml.onnx.utils.helpers.
get_node_input_nodes
(model: onnx.onnx_ml_pb2.ModelProto, node: onnx.onnx_ml_pb2.NodeProto) → List[onnx.onnx_ml_pb2.NodeProto][source]¶ Get all of the nodes whose outputs feed the inputs of a given node
- Parameters
model – the model the node is from
node – the node to get all input nodes for
- Returns
the list of nodes whose outputs feed the inputs of the given node
-
sparseml.onnx.utils.helpers.
get_node_inputs
(model: onnx.onnx_ml_pb2.ModelProto, node: onnx.onnx_ml_pb2.NodeProto) → List[str][source]¶ - Parameters
model – the model the node is from
node – the node to get all inputs (non initializers) for
- Returns
the names of all the inputs to the node that are not initializers
-
sparseml.onnx.utils.helpers.
get_node_output_nodes
(model: onnx.onnx_ml_pb2.ModelProto, node: onnx.onnx_ml_pb2.NodeProto) → List[onnx.onnx_ml_pb2.NodeProto][source]¶ Get all of the nodes that consume the outputs of a given node
- Parameters
model – the model the node is from
node – the node to get all output nodes for
- Returns
the list of nodes that consume the outputs of the given node
-
sparseml.onnx.utils.helpers.
get_node_outputs
(model: onnx.onnx_ml_pb2.ModelProto, node: onnx.onnx_ml_pb2.NodeProto) → List[str][source]¶ - Parameters
model – the model the node is from
node – the node to get all outputs (non initializers) for
- Returns
the names of all the outputs of the node that are not initializers
-
sparseml.onnx.utils.helpers.
get_node_params
(model: onnx.onnx_ml_pb2.ModelProto, node: onnx.onnx_ml_pb2.NodeProto, include_values: bool = True) → Tuple[sparseml.onnx.utils.helpers.NodeParam, Optional[sparseml.onnx.utils.helpers.NodeParam]][source]¶ Get the params (weight and bias) for a node in an ONNX ModelProto. Must be an op type of one of [conv, gemm, matmul]
- Parameters
model – the model proto loaded from the ONNX file
node – the node to get the params for
include_values – True to include the param values as NumPy arrays in the returned NodeParam objects. False to not load the values – in this event NodeParam.val will be None
- Returns
a tuple containing the weight, bias (if it is present)
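Example (a minimal sketch; the model path is hypothetical)::

    from sparseml.onnx.utils.helpers import (
        check_load_model,
        get_node_params,
        get_prunable_nodes,
    )

    model = check_load_model("model.onnx")  # hypothetical path
    for node in get_prunable_nodes(model):
        weight, bias = get_node_params(model, node)  # bias may be None
        print(weight.name, weight.val.shape if weight.val is not None else None)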
-
sparseml.onnx.utils.helpers.
get_nodes_by_input_id
(model: onnx.onnx_ml_pb2.ModelProto, input_id: str) → List[onnx.onnx_ml_pb2.NodeProto][source]¶ Get all the nodes in a model that have a given id as one of the inputs
- Parameters
model – the model proto loaded from the ONNX file
input_id – id of the input to get nodes by
- Returns
the retrieved nodes
-
sparseml.onnx.utils.helpers.
get_nodes_by_output_id
(model: onnx.onnx_ml_pb2.ModelProto, output_id: str) → List[onnx.onnx_ml_pb2.NodeProto][source]¶ Get all the nodes in a model that have a given id as one of the outputs
- Parameters
model – the model proto loaded from the ONNX file
output_id – id of the output to get nodes by
- Returns
the retrieved nodes
-
sparseml.onnx.utils.helpers.
get_numpy_dtype
(tensor: onnx.onnx_ml_pb2.TensorProto) → Union[None, numpy.dtype][source]¶ Extract the NumPy dtype of an ONNX tensor. Returns None if there is not a direct mapping from the ONNX data type to a NumPy dtype.
- Parameters
tensor – the tensor to get the dtype of
- Returns
a NumPy dtype for the tensor if available otherwise None
-
sparseml.onnx.utils.helpers.
get_prunable_node_from_foldable
(model: onnx.onnx_ml_pb2.ModelProto, foldable_node: Union[str, onnx.onnx_ml_pb2.NodeProto], traverse_previous: bool = True, max_node_distance: int = 3) → Union[None, onnx.onnx_ml_pb2.NodeProto][source]¶ Get a prunable node that is attached by foldable nodes to a given foldable node. Returns None if nothing could be found. Ex: get the convolution that would be folded for an attached BatchNormalization
- Parameters
model – the model the node is from
foldable_node – the foldable node or node id to find prunable node from
traverse_previous – True to only search for previous prunable nodes that the foldable node could have been attached to for Conv -> BN patterns. False to only search for following prunable nodes that the foldable node could have been attached to for BN -> Conv patterns.
max_node_distance – The maximum distance (and therefore number of foldable nodes) the prunable node must be within to match. Ex: max_node_distance = 3, the prunable node must be within 3 other foldable nodes of the foldable node passed in to match
- Returns
the found prunable node
-
sparseml.onnx.utils.helpers.
get_prunable_nodes
(model: Union[str, onnx.onnx_ml_pb2.ModelProto]) → List[Any][source]¶ Get the prunable nodes in an ONNX model proto. Prunable nodes are defined as any conv, gemm, or matmul
- Parameters
model – the model proto loaded from the ONNX file
- Returns
a list of nodes from the model proto
-
sparseml.onnx.utils.helpers.
get_quantize_parent_for_dequantize_node
(quantized_model: onnx.onnx_ml_pb2.ModelProto, dequantize_node: onnx.onnx_ml_pb2.NodeProto) → Optional[onnx.onnx_ml_pb2.NodeProto][source]¶ Returns the first quantize node found by traversing the first node input of the given de-quantize node’s ancestors. All inputs to de-quantize nodes should have a quantize node ancestor.
- Parameters
quantized_model – the model the de-quantize node is from
dequantize_node – the node to get an associated quantize node for
- Returns
the first quantize node found by traversing the first node input of the given de-quantize node’s ancestors. If no quantize node is found, returns None
-
sparseml.onnx.utils.helpers.
get_tensor_dim_shape
(tensor: onnx.onnx_ml_pb2.TensorProto, dim: int) → int[source]¶ - Parameters
tensor – ONNX tensor to get the shape of a dimension of
dim – dimension index of the tensor to get the shape of
- Returns
shape of the tensor at the given dimension
-
sparseml.onnx.utils.helpers.
is_foldable_node
(node: Union[str, onnx.onnx_ml_pb2.NodeProto]) → bool[source]¶ Check whether a node is foldable as defined by ONNX Runtime and the layerwise folding it supports in ONNX graphs. More info can be found in their docs: https://www.onnxruntime.ai/docs/resources/graph-optimizations.html
- Parameters
node – the node or node type to check if it is foldable or not according to the ONNX Runtime specs
- Returns
True if the node is foldable and therefore can be combined with other nodes, False otherwise
-
sparseml.onnx.utils.helpers.
is_prunable_node
(model: onnx.onnx_ml_pb2.ModelProto, node: onnx.onnx_ml_pb2.NodeProto) → bool[source]¶ - Parameters
model – the model the node is from
node – an ONNX node or op_type string
- Returns
True if the given node is prunable, False otherwise
-
sparseml.onnx.utils.helpers.
matmul_node_params
(model: onnx.onnx_ml_pb2.ModelProto, node: onnx.onnx_ml_pb2.NodeProto, include_values: bool = True) → Tuple[sparseml.onnx.utils.helpers.NodeParam, Optional[sparseml.onnx.utils.helpers.NodeParam]][source]¶ Get the params (weight) for a matmul node in an ONNX ModelProto. In the future will retrieve a following bias addition as the bias for the matmul.
- Parameters
model – the model proto loaded from the ONNX file
node – the matmul node to get the params for
include_values – True to include the param values as NumPy arrays in the returned NodeParam objects. False to not load the values – in this event NodeParam.val will be None
- Returns
a tuple containing the weight, bias (if it is present)
-
sparseml.onnx.utils.helpers.
model_inputs
(model: Union[str, onnx.onnx_ml_pb2.ModelProto]) → List[source]¶ Get the inputs to an ONNX model
- Parameters
model – the loaded model or a file path to the ONNX model to get the model inputs for
- Returns
the input to the model
-
sparseml.onnx.utils.helpers.
model_outputs
(model: Union[str, onnx.onnx_ml_pb2.ModelProto]) → List[source]¶ Get the output from an ONNX model
- Parameters
model – the loaded model or a file path to the ONNX model to get the model outputs for
- Returns
the output from the model
-
sparseml.onnx.utils.helpers.
onnx_nodes_sparsities
(model: Union[str, onnx.onnx_ml_pb2.ModelProto]) → Tuple[sparseml.onnx.utils.helpers.SparsityMeasurement, Dict[str, sparseml.onnx.utils.helpers.SparsityMeasurement]][source]¶ Retrieve the sparsities for each Conv or Gemm op in an ONNX graph for the associated weight inputs.
- Parameters
model – ONNX model to use
- Returns
a tuple containing the overall sparsity measurement for the model and a dictionary of sparsity measurements for each Conv or Gemm node found in the model
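Example (a minimal sketch; the model path is hypothetical)::

    from sparseml.onnx.utils.helpers import onnx_nodes_sparsities

    overall, per_node = onnx_nodes_sparsities("model.onnx")  # hypothetical path
    print("overall sparsity:", overall.sparsity)
    for node_id, measurement in per_node.items():
        print(node_id, measurement.sparsity)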
-
sparseml.onnx.utils.helpers.
set_tensor_dim_shape
(tensor: onnx.onnx_ml_pb2.TensorProto, dim: int, value: int)[source]¶ Sets the shape of the tensor at the given dimension to the given value
- Parameters
tensor – ONNX tensor to modify the shape of
dim – dimension index of the tensor to modify the shape of
value – new shape for the given dimension
sparseml.onnx.utils.loss module¶
-
sparseml.onnx.utils.loss.
kl_divergence
(predicted: numpy.ndarray, expected: numpy.ndarray, zero_point: float = 0.0, min_value: float = 1.0) → float[source]¶ Calculate the KL divergence (relative entropy) between two input arrays.
Shifts all values such that the zero_point is at one. If a value is lower, then sets it equal to 1.
- Parameters
predicted – the first array to compare with
expected – the second array to compare with
zero_point – the zero point that should be used to shift values above 1
min_value – the minimum value that all values will be truncated to if they are below
- Returns
the calculated KL divergence
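Example (a minimal sketch on random data)::

    import numpy
    from sparseml.onnx.utils.loss import kl_divergence

    predicted = numpy.random.rand(1, 1000).astype(numpy.float32)
    expected = numpy.random.rand(1, 1000).astype(numpy.float32)
    divergence = kl_divergence(predicted, expected)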
sparseml.onnx.utils.model module¶
Utilities for ONNX models and running inference with them
-
class
sparseml.onnx.utils.model.
DeepSparseAnalyzeModelRunner
(model: Union[str, onnx.onnx_ml_pb2.ModelProto], batch_size: int, num_cores: Optional[int] = None)[source]¶ Bases:
sparseml.onnx.utils.model._DeepSparseBaseModelRunner
Class for handling running inference for an ONNX model through Neural Magic’s analyze_model api
- Parameters
model – the path to the ONNX model file or the loaded onnx.ModelProto
batch_size – the size of the batch to create the model for
num_cores – the number of physical cores to run the model on. Defaults to run on all available cores
-
batch_forward
(batch: Dict[str, numpy.ndarray], num_iterations: int = 1, num_warmup_iterations: int = 0, optimization_level: int = 1, imposed_ks: Union[None, float] = None, *args, **kwargs) → Tuple[Dict[str, numpy.ndarray], float][source]¶ - Parameters
batch – the batch to run through the ONNX model for inference benchmarking analysis in the neural magic system
num_iterations – number of iterations to run the analysis benchmark for
num_warmup_iterations – number of iterations to run warmup for before benchmarking
optimization_level – the optimization level to use in neural magic; 1 for optimizations on, 0 for limited optimizations
imposed_ks – kernel sparsity value to impose on all the prunable layers in the model. None for no imposed sparsity
- Returns
a tuple containing the result of the inference, the time to perform the inference
-
run
(data_loader: sparseml.onnx.utils.data.DataLoader, desc: str = '', show_progress: bool = True, max_steps: int = 1, num_iterations: int = 20, num_warmup_iterations: int = 5, optimization_level: int = 1, imposed_ks: Union[None, float] = None, *args, **kwargs) → Tuple[List[Dict], List[float]][source]¶ Run inference for a model for the data given in the data_loader iterator through the Neural Magic inference engine model analysis function. The analysis function allows more granular control over how the model is executed, such as optimization levels and imposed kernel sparsity. In addition, it returns layer-by-layer timings.
- Parameters
data_loader – the data_loader used to load batches of data to run through the model
desc – str to display if show_progress is True
show_progress – True to show a tqdm bar when running, False otherwise
max_steps – maximum number of steps to take for the data_loader instead of running over all the data
num_iterations – number of iterations to run the analysis benchmark for
num_warmup_iterations – number of iterations to run warmup for before benchmarking
optimization_level – the optimization level to use in neural magic; 1 for optimizations on, 0 for limited optimizations
imposed_ks – kernel sparsity value to impose on all the prunable layers in the model. None for no imposed sparsity
- Returns
a tuple containing the performance results for the run as returned from the analyze_model function and the total time to run them
-
class
sparseml.onnx.utils.model.
DeepSparseModelRunner
(model: Union[str, onnx.onnx_ml_pb2.ModelProto], batch_size: int, num_cores: Optional[int] = None, loss: Optional[Callable[[Dict[str, numpy.ndarray], Dict[str, numpy.ndarray]], Any]] = None)[source]¶ Bases:
sparseml.onnx.utils.model._DeepSparseBaseModelRunner
Class for handling running inference for an ONNX model through Neural Magic’s DeepSparse Engine
- Parameters
model – the path to the ONNX model file or the loaded onnx.ModelProto
batch_size – the size of the batch to create the model for
num_cores – the number of physical cores to run the model on. Defaults to run on all available cores
loss – the loss function, if any, to run for evaluation of the model
-
batch_forward
(batch: Dict[str, numpy.ndarray], *args, **kwargs) → Tuple[Dict[str, numpy.ndarray], float][source]¶ - Parameters
batch – the batch to run through the ONNX model for inference in the DeepSparse Engine
- Returns
a tuple containing the result of the inference, the time to perform the inference
-
run
(data_loader: sparseml.onnx.utils.data.DataLoader, desc: str = '', show_progress: bool = True, max_steps: int = -1, *args, **kwargs) → Tuple[List[Any], List[float]][source]¶ Run inference for a model for the data given in the data_loader iterator through the Neural Magic inference engine.
- Parameters
data_loader – the data_loader used to load batches of data to run through the model
desc – str to display if show_progress is True
show_progress – True to show a tqdm bar when running, False otherwise
max_steps – maximum number of steps to take for the data_loader instead of running over all the data
- Returns
a tuple containing the list of outputs and the list of times for running the data
-
class
sparseml.onnx.utils.model.
ModelRunner
(loss: Optional[Callable[[Dict[str, numpy.ndarray], Dict[str, numpy.ndarray]], Any]] = None)[source]¶ Bases:
abc.ABC
Abstract class for handling running inference for a model
- Parameters
loss – the loss function, if any, to run for evaluation of the model
-
abstract
batch_forward
(batch: Dict[str, numpy.ndarray], *args, **kwargs) → Tuple[Any, float][source]¶ Abstract method for subclasses to override to run a batch through the inference engine for the ONNX model it was constructed with
- Parameters
batch – the batch to run through the ONNX model for inference
- Returns
a tuple containing the result of the inference, the time to perform the inference
-
run
(data_loader: sparseml.onnx.utils.data.DataLoader, desc: str = '', show_progress: bool = True, max_steps: int = -1, *args, **kwargs) → Tuple[List[Any], List[float]][source]¶ Run inference for a model for the data given in the data_loader iterator
- Parameters
data_loader – the data_loader used to load batches of data to run through the model
desc – str to display if show_progress is True
show_progress – True to show a tqdm bar when running, False otherwise
max_steps – maximum number of steps to take for the data_loader instead of running over all the data
- Returns
a tuple containing the list of outputs and the list of times for running the data
-
run_iter
(data_loader: sparseml.onnx.utils.data.DataLoader, desc: str = '', show_progress: bool = True, max_steps: int = -1, *args, **kwargs)[source]¶ Iteratively runs inference for a model for the data given in the data_loader iterator
- Parameters
data_loader – the data_loader used to load batches of data to run through the model
desc – str to display if show_progress is True
show_progress – True to show a tqdm bar when running, False otherwise
max_steps – maximum number of steps to take for the data_loader instead of running over all the data
- Returns
an iterator to go through the tuples containing the list of outputs and the list of times for running the data
-
class
sparseml.onnx.utils.model.
ORTModelRunner
(model: Union[str, onnx.onnx_ml_pb2.ModelProto], loss: Optional[Callable[[Dict[str, numpy.ndarray], Dict[str, numpy.ndarray]], Any]] = None, overwrite_input_names: bool = True, nthreads: int = 0, batch_size: int = None, providers: List[str] = None, **kwargs)[source]¶ Bases:
sparseml.onnx.utils.model.ModelRunner
Class for handling running inference for an ONNX model through onnxruntime
- Parameters
model – the path to the ONNX model file or the loaded onnx.ModelProto
loss – the loss function, if any, to run for evaluation of the model
overwrite_input_names – True to overwrite the input data names to match what is found for the model inputs, False to keep them as found in the loaded data
nthreads – number of threads used to run the model (single node); defaults to 0 to let onnxruntime choose
batch_size – if provided, and the model has a hardcoded batch size, will rewrite the model proto so that the model batch size matches batch_size
providers – list of ORT provider names. Will default to ort.get_available_providers()
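Example usage (a minimal sketch; the model path is hypothetical)::

    from sparseml.onnx.utils.data import DataLoader
    from sparseml.onnx.utils.model import ORTModelRunner

    model_path = "model.onnx"  # hypothetical path
    runner = ORTModelRunner(model_path, batch_size=1)
    loader = DataLoader.from_model_random(model_path, batch_size=1, iter_steps=10)
    outputs, times = runner.run(loader, desc="benchmark")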
-
batch_forward
(batch: Dict[str, numpy.ndarray], *args, **kwargs) → Tuple[Dict[str, numpy.ndarray], float][source]¶ - Parameters
batch – the batch to run through the ONNX model for inference in onnxruntime
- Returns
a tuple containing the result of the inference, the time to perform the inference
-
run
(data_loader: sparseml.onnx.utils.data.DataLoader, desc: str = '', show_progress: bool = True, max_steps: int = -1, *args, **kwargs) → Tuple[List[Any], List[float]][source]¶ Run inference for a model for the data given in the data_loader iterator through ONNX Runtime.
- Parameters
data_loader – the data_loader used to load batches of data to run through the model
desc – str to display if show_progress is True
show_progress – True to show a tqdm bar when running, False otherwise
max_steps – maximum number of steps to take for the data_loader instead of running over all the data
- Returns
a tuple containing the list of outputs and the list of times for running the data
-
class
sparseml.onnx.utils.model.
OpenVINOModelRunner
(model: str, loss: Optional[Callable[[Dict[str, numpy.ndarray], Dict[str, numpy.ndarray]], Any]] = None, nthreads: int = 1, batch_size: int = 0, shape: str = '')[source]¶ Bases:
sparseml.onnx.utils.model.ModelRunner
OpenVINO model runner class
- Parameters
model – The path to the IR xml file after conversion
loss – loss function to run evaluation
nthreads – number of threads to run the model
batch_size – batch size value. If not specified, the batch size value is determined from the Intermediate Representation
shape – shape to be set for the input(s). For example, “input1[1,3,224,224],input2[1,4]” or “[1,3,224,224]” in case of one input size.
-
sparseml.onnx.utils.model.
correct_nm_analyze_model_node_ids
(nm_result: Dict, model: Union[str, onnx.onnx_ml_pb2.ModelProto])[source]¶ Correct the node ids returned from the deepsparse.analyze_model api. In some cases it will return the ids for folded nodes due to ONNX Runtime folding. This finds the corrected node ids for those folded nodes. Additionally, ops that did not have an id are changed from the returned string <none> to the proper Python None type
- Parameters
nm_result – the result from the deepsparse.analyze_model api
model – the onnx model proto or path to the onnx file that the nm_result was for
-
sparseml.onnx.utils.model.
max_available_cores
() → int[source]¶ - Returns
the maximum number of physical cores detected on the system
-
sparseml.onnx.utils.model.
split_canonical_names
(nm_result: Dict)[source]¶ Splits analysis layer results from grouped canonical names by individual nodes. Stores the original grouped canonical name in the ‘meta_canonical_name’ field.
Will split on any canonical_name that includes ‘,’.
- Parameters
nm_result – the result from the deepsparse.analyze_model api
sparseml.onnx.utils.sparse_tensor module¶
Helper functions for handling ONNX SparseTensorProto objects. onnx >= 1.6.0 is a requirement for using sparse tensors
-
sparseml.onnx.utils.sparse_tensor.
convert_model_initializers_to_sparse
(model: onnx.onnx_ml_pb2.ModelProto, sparsity_threshold: float = 0.6, inplace: bool = True) → onnx.onnx_ml_pb2.ModelProto[source]¶ - Parameters
model – ONNX model with initializers to convert to sparse
sparsity_threshold – the minimum sparsity of a tensor to be converted to sparse representation. Default is 0.6
inplace – True to do model conversion in place. Default is True
- Returns
the given model with initializers above the sparsity threshold converted to sparse initializers
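Example (a minimal sketch; the model path is hypothetical)::

    import onnx
    from sparseml.onnx.utils.sparse_tensor import (
        convert_model_initializers_to_sparse,
        convert_sparse_initializers_to_dense,
    )

    model = onnx.load("pruned.onnx")  # hypothetical path
    sparse = convert_model_initializers_to_sparse(model, sparsity_threshold=0.6)
    # round-trip back to dense for runtimes without sparse tensor support
    dense = convert_sparse_initializers_to_dense(sparse)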
-
sparseml.onnx.utils.sparse_tensor.
convert_sparse_initializers_to_dense
(model: onnx.onnx_ml_pb2.ModelProto, inplace: bool = True) → onnx.onnx_ml_pb2.ModelProto[source]¶ - Parameters
model – ONNX model with sparse initializers to convert to dense representation
inplace – True to do model conversion in place. Default is True
- Returns
The given model with all sparse initializers converted to dense initializers
-
sparseml.onnx.utils.sparse_tensor.
create_sparse_tensor
(array: Union[numpy.ndarray, onnx.onnx_ml_pb2.TensorProto], name: Optional[str] = None) → Optional[onnx.onnx_ml_pb2.SparseTensorProto][source]¶ - Parameters
array – numpy array or TensorProto object to convert to sparse representation
name – name of this sparse tensor. Will be stored in SparseTensorProto.values.name. If the given array is a TensorProto, name will default to TensorProto.name
- Returns
SparseTensorProto object built from the sparse representation of the input array
Module contents¶
Generic code used as utilities and helpers for ONNX