sparseml.onnx.utils package

Submodules

sparseml.onnx.utils.data module

Utilities for data loading into numpy for use in ONNX supported systems

class sparseml.onnx.utils.data. DataLoader ( data : Union [ str , List [ Dict [ str , numpy.ndarray ] ] ] , labels : Union [ None , str , List [ Union [ numpy.ndarray , Dict [ str , numpy.ndarray ] ] ] ] , batch_size : int , iter_steps : int = 0 ) [source]

Bases: object

Data loader instance that supports loading numpy arrays from file or memory and creating an iterator to go through batches of that data.

Iterator returns a tuple containing (data, label). label is only returned if label data was passed in.

Parameters
  • data – a file glob pointing to numpy files, path to a tar ball of numpy files, or loaded numpy data

  • labels – a file glob pointing to numpy files, path to a tar ball of numpy files, or loaded numpy data

  • batch_size – the size of batches to create for the iterator

  • iter_steps – the number of steps (batches) to create. Set to -1 for infinite, 0 for running through the loaded data once, or a positive integer for the desired number of steps
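The iter_steps rules above can be illustrated with a minimal sketch. The helper iter_batches below is hypothetical and is not part of sparseml. It uses plain Python sequences in place of numpy arrays, and it assumes that a trailing partial batch is dropped and that the data wraps around when more steps are requested than a single pass provides.

```python
from itertools import islice
from typing import Iterator, List, Sequence


def iter_batches(
    samples: Sequence, batch_size: int, iter_steps: int = 0
) -> Iterator[List]:
    """Yield batches following the documented iter_steps convention (sketch)."""
    if batch_size < 1 or len(samples) < batch_size:
        raise ValueError("need at least one full batch of samples")
    # Assumption: only full batches are produced, the remainder is dropped.
    steps_per_pass = len(samples) // batch_size
    if iter_steps == 0:
        total = steps_per_pass  # run through the loaded data once
    elif iter_steps > 0:
        total = iter_steps  # fixed number of steps
    else:
        total = None  # -1: keep producing batches forever
    step = 0
    while total is None or step < total:
        # Assumption: wrap around to the start once the data is exhausted.
        start = (step % steps_per_pass) * batch_size
        yield list(samples[start : start + batch_size])
        step += 1


samples = list(range(10))
once = list(iter_batches(samples, batch_size=4))
three = list(iter_batches(samples, batch_size=4, iter_steps=3))
endless = list(islice(iter_batches(samples, batch_size=4, iter_steps=-1), 5))
```

With iter_steps=-1 the generator never stops on its own, so the caller has to bound it, here with itertools.islice.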

property batch_size

Returns

the size of batches to create for the iterator

static from_model_random ( model : Union [ str , onnx.onnx_ml_pb2.ModelProto ] , batch_size : int , iter_steps : int = 0 , num_samples : int = 100 , create_labels : bool = False , strip_first_dim : bool = True ) [source]

Create a DataLoader from random data for a model’s input and output sizes

Parameters
  • model – the loaded model or a file path to the onnx model to create random data for

  • batch_size – the size of batches to create for the iterator

  • iter_steps – the number of steps (batches) to create. Set to -1 for infinite, 0 for running through the loaded data once, or a positive integer for the desired number of steps

  • num_samples – number of random samples to create

  • create_labels – True to create random label data as well, False otherwise

  • strip_first_dim – True to strip the first dimension from the inputs and outputs, typically the batch dimension

Returns

the created DataLoader instance with the random data

static from_random ( data_shapes : Dict [ str , Tuple [ int , ] ] , label_shapes : Union [ None , Dict [ str , Tuple [ int , ] ] ] , batch_size : int , iter_steps : int = 0 , num_samples : int = 100 , data_types : Optional [ Dict [ str , numpy.dtype ] ] = None ) [source]

Create a DataLoader from random data

Parameters
  • data_shapes – shapes to create for the data items

  • label_shapes – shapes to create for the label items

  • batch_size – the size of batches to create for the iterator

  • iter_steps – the number of steps (batches) to create. Set to -1 for infinite, 0 for running through the loaded data once, or a positive integer for the desired number of steps

  • num_samples – number of random samples to create

  • data_types – optional numpy data types for each of the data items

Returns

the created DataLoader instance with the random data

property infinite

Returns

True if the loader instance is set up to continually create batches, False otherwise

property iter_steps

Returns

the number of steps (batches) to create. Set to -1 for infinite, 0 for running through the loaded data once, or a positive integer for the desired number of steps

property labeled_data

Returns

the loaded data and labels

sparseml.onnx.utils.graph_editor module

Helper functions to edit ONNX Graphs.

class sparseml.onnx.utils.graph_editor. ONNXGraph ( model : onnx.onnx_ml_pb2.ModelProto ) [source]

Bases: object

Class for quick look-up of ONNX graph nodes and initializers. If graph state changes outside of ONNXGraph class functions, update() should be called.

Parameters

model – the ONNX graph to represent

add_node ( node : onnx.onnx_ml_pb2.NodeProto ) [source]

Adds the given node to the model and graph state

Parameters

node – node to add to the model

delete_initializers ( initializers : List [ Union [ str , onnx.onnx_ml_pb2.TensorProto ] ] ) [source]

deletes the given initializers from the model

Parameters

initializers – list of initializers or initializer names to delete

delete_node ( node : onnx.onnx_ml_pb2.NodeProto ) [source]

deletes the given node from the graph

Parameters

node – node to delete

delete_nodes ( nodes : List [ onnx.onnx_ml_pb2.NodeProto ] ) [source]

deletes the given nodes from the graph

Parameters

nodes – list of nodes to delete

delete_unused_initializers ( ) [source]

deletes tensors in the initializer list that are not listed as inputs to any node in the current graph state or directly passed as model outputs

get_init_by_name ( name : str ) Optional [ onnx.onnx_ml_pb2.TensorProto ] [source]
Parameters

name – name of initializer

Returns

tensor of initializer with given name, returns None if the name does not exist in the cached graph

get_node_by_output_id ( id : str ) Optional [ onnx.onnx_ml_pb2.TensorProto ] [source]
Parameters

id – name of output id of node

Returns

the associated node if it is present in the graph, None otherwise

get_node_children ( node : onnx.onnx_ml_pb2.NodeProto ) List [ onnx.onnx_ml_pb2.NodeProto ] [source]
Parameters

node – the node to get the children node of

Returns

list of nodes that take an output of this node as an input

get_node_parents ( node : onnx.onnx_ml_pb2.NodeProto ) List [ Optional [ Union [ onnx.onnx_ml_pb2.NodeProto , onnx.onnx_ml_pb2.TensorProto ] ] ] [source]
Parameters

node – node to get the input objects for

Returns

input nodes or tensors of this node in order. if an input does not exist, None will be returned in its place

get_node_single_child ( node : onnx.onnx_ml_pb2.NodeProto ) Optional [ onnx.onnx_ml_pb2.NodeProto ] [source]
Parameters

node – the node to get the child node of

Returns

child of node if it only has one child, otherwise None

get_node_single_parent ( node : onnx.onnx_ml_pb2.NodeProto , index : int ) Optional [ onnx.onnx_ml_pb2.NodeProto ] [source]
Parameters
  • node – the node to get the parent node of

  • index – choose which input to search

Returns

parent of node if it only has one parent, otherwise None

sort_nodes_topologically ( ) [source]

Sorts the order of the graph Node repeated field in place in topological order as per the ONNX Model proto specifications
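As an illustration of what a topological ordering of graph nodes means, here is a minimal sketch using Kahn's algorithm. It is not the sparseml implementation. The Node dataclass is a hypothetical stand-in for an ONNX NodeProto that keeps only a name and its input and output edge ids, and the function returns a new list rather than sorting in place.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for a NodeProto: name plus edge ids."""

    name: str
    inputs: List[str]
    outputs: List[str]


def sort_topologically(nodes: List[Node]) -> List[Node]:
    # Map each edge id to the index of the node that produces it.
    producer = {out: i for i, node in enumerate(nodes) for out in node.outputs}
    # pending[i] counts inputs of node i that are produced by other nodes.
    # Graph inputs and initializers have no producer and are ignored.
    pending = [0] * len(nodes)
    consumers: Dict[int, List[int]] = {i: [] for i in range(len(nodes))}
    for i, node in enumerate(nodes):
        for edge in node.inputs:
            if edge in producer:
                pending[i] += 1
                consumers[producer[edge]].append(i)
    ready = deque(i for i, count in enumerate(pending) if count == 0)
    ordered: List[Node] = []
    while ready:
        i = ready.popleft()
        ordered.append(nodes[i])
        for j in consumers[i]:
            pending[j] -= 1
            if pending[j] == 0:
                ready.append(j)
    if len(ordered) != len(nodes):
        raise ValueError("graph contains a cycle")
    return ordered


unsorted_nodes = [
    Node("relu", ["conv_out"], ["relu_out"]),
    Node("conv", ["x", "w"], ["conv_out"]),
    Node("add", ["relu_out", "conv_out"], ["y"]),
]
sorted_names = [node.name for node in sort_topologically(unsorted_nodes)]
```

After sorting, every node appears after the nodes that produce its inputs, which is the ordering the ONNX model specification expects.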

update ( model : Optional [ onnx.onnx_ml_pb2.ModelProto ] = None ) [source]

Update the graph state based on the model this graph represents or the given model.

Parameters

model – model to represent. defaults to current loaded model state

update_node_input ( node : onnx.onnx_ml_pb2.NodeProto , input_id : str , input_idx : Optional [ int ] = None ) [source]
Parameters
  • node – node to update the inputs of

  • input_id – new input_id to attach to the node

  • input_idx – optional index of the node input list to update, if none is given, the new input id will be appended to the input list

sparseml.onnx.utils.graph_editor. override_model_batch_size ( model : onnx.onnx_ml_pb2.ModelProto , batch_size : int ) onnx.onnx_ml_pb2.ModelProto [source]

Rewrites any positive batch dimensions in the model inputs or outputs to the given batch_size

Parameters
  • model – Model to modify

  • batch_size – Batch size to enforce

Returns

the given model with inputs and outputs set to batch_size if the batch dimensions are not -1.
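The rule described above (only positive batch dimensions are rewritten) can be sketched on plain shape lists. The helper override_batch_dim is hypothetical and works on Python lists instead of ONNX tensor shapes. Treating None as a dynamic dimension is an assumption of this sketch.

```python
from typing import List, Optional


def override_batch_dim(
    shape: List[Optional[int]], batch_size: int
) -> List[Optional[int]]:
    """Return a copy of shape with a positive leading dimension replaced."""
    if shape and isinstance(shape[0], int) and shape[0] > 0:
        return [batch_size] + list(shape[1:])
    # Dynamic (-1 or None) or empty shapes are left untouched.
    return list(shape)


fixed = override_batch_dim([1, 3, 224, 224], 16)
dynamic = override_batch_dim([-1, 3, 224, 224], 16)
unknown = override_batch_dim([None, 10], 16)
```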

sparseml.onnx.utils.graph_editor. prune_model_one_shot ( model : onnx.onnx_ml_pb2.ModelProto , nodes : List [ onnx.onnx_ml_pb2.NodeProto ] , sparsity : Union [ float , List [ float ] ] ) [source]

Prune a model in-place with one shot pruning (no retraining) according to magnitude pruning. Pruning is currently unstructured.

Parameters
  • model – the model to apply pruning to

  • nodes – the nodes within the model to prune to the desired sparsities

  • sparsity – the sparsity level to prune all nodes to if a float, or the sparsity level to prune each node to if a list of floats

Returns

the new, pruned model

sparseml.onnx.utils.graph_editor. prune_model_one_shot_iter ( model : onnx.onnx_ml_pb2.ModelProto , nodes : List [ onnx.onnx_ml_pb2.NodeProto ] , sparsity : Union [ float , List [ float ] ] ) [source]

Iteratively prune a model in-place with one shot pruning (no retraining) according to magnitude pruning. Pruning is currently unstructured.

Parameters
  • model – the model to apply pruning to

  • nodes – the nodes within the model to prune to the desired sparsities

  • sparsity – the sparsity level to prune all nodes to if a float, or the sparsity level to prune each node to if a list of floats

sparseml.onnx.utils.graph_editor. prune_unstructured ( array : numpy.ndarray , sparsity : float ) numpy.ndarray [source]

Prune a numpy array with unstructured sparsity according to magnitude pruning

Parameters
  • array – the array to prune (introduce zeros), will remove the lowest absolute values in the array

  • sparsity – the sparsity value, as a decimal, to impose in the array

Returns

the pruned array
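Unstructured magnitude pruning, as described above, zeroes the entries with the smallest absolute values until the requested fraction of the array is zero. The sketch below is a hypothetical, pure Python illustration on a flat list. It is not the library code, and the rounding of the number of pruned entries and the tie-breaking order are assumptions.

```python
from typing import List


def prune_by_magnitude(values: List[float], sparsity: float) -> List[float]:
    """Zero out the lowest-magnitude entries of a flat list (sketch)."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be a decimal between 0 and 1")
    # Assumption: the number of entries to prune is rounded to the nearest int.
    num_to_zero = int(round(sparsity * len(values)))
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(values)), key=lambda i: abs(values[i]))
    pruned = list(values)
    for i in order[:num_to_zero]:
        pruned[i] = 0.0
    return pruned


weights = [0.5, -0.1, 2.0, -3.0, 0.05, 1.0, -0.7, 0.2]
pruned = prune_by_magnitude(weights, sparsity=0.5)
```

The input list is copied, so the original weights stay unchanged in this sketch.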

sparseml.onnx.utils.graph_editor. remove_node_and_params_from_graph ( model : onnx.onnx_ml_pb2.ModelProto , node : onnx.onnx_ml_pb2.NodeProto , keep_params : Optional [ Iterable [ str ] ] = None ) None [source]

Deletes a node from the model graph as well as its parameters listed in node.input

Parameters
  • model – Model to delete from

  • node – Node to delete

  • keep_params – Names of node input initializers not to remove from the graph. Default is None.

sparseml.onnx.utils.graph_editor. swap_node_output ( node : onnx.onnx_ml_pb2.NodeProto , output : str ) None [source]

Deletes the current output of the node and replaces it with the provided value. Assumes that the node only has one output.

Parameters
  • node – Node to change the output of

  • output – New output value

sparseml.onnx.utils.graph_editor. update_model_param ( model : onnx.onnx_ml_pb2.ModelProto , param_name : str , val : numpy.ndarray ) None [source]

Removes the parameter with name param_name from the model, creates a new parameter using val, and adds it to the model with name param_name as an update.

Parameters
  • model – The model to update

  • param_name – The parameter name in the model to update

  • val – The new value of the parameter

sparseml.onnx.utils.graph_optimizer module

Helper functions to optimize ONNX Graphs.

sparseml.onnx.utils.graph_optimizer. fold_conv_bns ( onnx_file : str ) onnx.onnx_ml_pb2.ModelProto [source]

When a batch norm op is the only child operator of a conv op, this function will fold the batch norm into the conv and return the processed graph

Parameters

onnx_file – file path to ONNX model to process

Returns

A loaded ONNX model with BatchNormalization ops folded into Conv ops where possible
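The arithmetic behind folding a batch norm into a preceding conv can be sketched per output channel: the conv weights are multiplied by scale / sqrt(var + epsilon), and the bias becomes (conv_bias - mean) * scale / sqrt(var + epsilon) + bn_bias. The code below is a hypothetical illustration using a 1x1 conv on a single pixel with plain Python lists. It is not the sparseml implementation, and all function names are made up for the example.

```python
import math
from typing import List, Tuple


def fold_bn_into_conv(
    weights: List[List[float]],
    conv_bias: List[float],
    scale: List[float],
    bn_bias: List[float],
    mean: List[float],
    var: List[float],
    epsilon: float = 1e-5,
) -> Tuple[List[List[float]], List[float]]:
    """Return conv weights and bias that absorb the batch norm (sketch)."""
    folded_w, folded_b = [], []
    for c, channel_weights in enumerate(weights):
        # One scaling factor per output channel.
        factor = scale[c] / math.sqrt(var[c] + epsilon)
        folded_w.append([w * factor for w in channel_weights])
        folded_b.append((conv_bias[c] - mean[c]) * factor + bn_bias[c])
    return folded_w, folded_b


def conv_1x1(weights, bias, x):
    # 1x1 conv on one pixel: a dot product per output channel plus bias.
    return [sum(w * v for w, v in zip(ch, x)) + b for ch, b in zip(weights, bias)]


def batch_norm(y, scale, bn_bias, mean, var, epsilon=1e-5):
    return [
        s * (v - m) / math.sqrt(vr + epsilon) + b
        for v, s, b, m, vr in zip(y, scale, bn_bias, mean, var)
    ]


weights = [[1.0, 2.0], [0.5, -1.0]]
conv_bias = [0.1, -0.2]
scale, bn_bias = [2.0, 0.5], [0.3, 0.0]
mean, var = [1.0, -1.0], [4.0, 0.25]
x = [3.0, -2.0]

reference = batch_norm(conv_1x1(weights, conv_bias, x), scale, bn_bias, mean, var)
folded_w, folded_b = fold_bn_into_conv(weights, conv_bias, scale, bn_bias, mean, var)
folded = conv_1x1(folded_w, folded_b, x)
```

The folded conv alone reproduces the output of the conv followed by the batch norm, up to floating point error, which is why the batch norm op can be removed from the graph.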

sparseml.onnx.utils.graph_optimizer. quantize_resnet_identity_add_inputs ( quantized_model : onnx.onnx_ml_pb2.ModelProto ) bool [source]

To avoid storing the identity value of a ResNet block in fp32, this optimization will pass the identity value through the same quantize operation as the ResNet block and add a de-quantize operation for the identity before the add.

Function will match to any add operation whose inputs are the output of a relu or add op and a quantize -> de-quantize block that takes the same relu as input. Performs this optimization in place.

Parameters

quantized_model – A loaded quantized model to perform this optimization on

Returns

True if an in-place optimization was made

sparseml.onnx.utils.graph_optimizer. quantized_residual_add_optim ( quantized_model : onnx.onnx_ml_pb2.ModelProto ) bool [source]

This optimization adds a quant/dequant block to the identity branch of a residual whose non-identity branch is quantized. This enables the add at the end of the residual to be fused at runtime.

Function will match to any node who has two children nodes - one add node and one quantize node whose branch eventually leads to the other add node.

Parameters

quantized_model – A loaded quantized model to perform this optimization on

Returns

True if an in-place optimization was made

sparseml.onnx.utils.helpers module

Utility / helper functions

class sparseml.onnx.utils.helpers. BatchNormParams ( epsilon , momentum , scale , bias , mean , var )

Bases: tuple

property bias

Alias for field number 3

property epsilon

Alias for field number 0

property mean

Alias for field number 4

property momentum

Alias for field number 1

property scale

Alias for field number 2

property var

Alias for field number 5

class sparseml.onnx.utils.helpers. NodeParam ( name , val )

Bases: tuple

property name

Alias for field number 0

property val

Alias for field number 1

class sparseml.onnx.utils.helpers. NodeShape ( id , input_shapes , output_shapes )

Bases: tuple

property id

Alias for field number 0

property input_shapes

Alias for field number 1

property output_shapes

Alias for field number 2

class sparseml.onnx.utils.helpers. SparsityMeasurement ( node_id , params_count , params_zero_count , sparsity , density )

Bases: tuple

property density

Alias for field number 4

property node_id

Alias for field number 0

property params_count

Alias for field number 1

property params_zero_count

Alias for field number 2

property sparsity

Alias for field number 3
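Because SparsityMeasurement is a named tuple, its fields can be read by name or by the field numbers listed above. The sketch below builds a local stand-in with the same field order. The name SparsityMeasurementSketch and the helper measure_sparsity are hypothetical, and computing density as 1 - sparsity is an assumption.

```python
from collections import namedtuple
from typing import List

# Local stand-in that mirrors the documented field order.
SparsityMeasurementSketch = namedtuple(
    "SparsityMeasurementSketch",
    ["node_id", "params_count", "params_zero_count", "sparsity", "density"],
)


def measure_sparsity(node_id: str, weights: List[float]) -> SparsityMeasurementSketch:
    """Count zero-valued parameters in a flat list of weights (sketch)."""
    count = len(weights)
    zeros = sum(1 for w in weights if w == 0)
    sparsity = zeros / count if count else 0.0
    # Assumption: density is the complement of sparsity.
    return SparsityMeasurementSketch(node_id, count, zeros, sparsity, 1.0 - sparsity)


measurement = measure_sparsity("conv1", [0.0, 1.5, 0.0, -2.0])
by_name = measurement.sparsity
by_index = measurement[3]  # field number 3 is sparsity
```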

sparseml.onnx.utils.helpers. calculate_flops ( op_type : str , input_shape : Optional [ List [ List ] ] = None , output_shape : Optional [ List [ List ] ] = None , weight_shape : Optional [ List ] = None , kernel_shape : Optional [ List ] = None , bias_shape : Optional [ List ] = None , attributes : Union [ None , Dict [ str , Any ] ] = None ) Optional [ float ] [source]

Calculate flops based on operation type and shape of certain attributes. If any fields necessary in operation are set to None, will return None

Parameters
  • op_type – Operation type of flop calculation

  • input_shape – List of input shapes of operation

  • output_shape – List of output shapes of operation

  • weight_shape – Shape of weights in operation if any, else None

  • kernel_shape – Shape of kernel in operation if any, else None

  • bias_shape – Shape of bias in operation if any, else None

  • attributes – The node attributes if any, else None

Returns

The amount of floating point operations in the operation

sparseml.onnx.utils.helpers. check_load_model ( model : Union [ str , onnx.onnx_ml_pb2.ModelProto ] ) onnx.onnx_ml_pb2.ModelProto [source]

Load an ONNX model from a given file path if supplied. If the input is already a model proto, it is returned as is.

Parameters

model – the model proto or path to the model ONNX file to check for loading

Returns

the loaded ONNX ModelProto

sparseml.onnx.utils.helpers. conv_node_params ( model : onnx.onnx_ml_pb2.ModelProto , node : onnx.onnx_ml_pb2.NodeProto , include_values : bool = True ) Tuple [ sparseml.onnx.utils.helpers.NodeParam , Optional [ sparseml.onnx.utils.helpers.NodeParam ] ] [source]

Get the params (weight and bias) for a conv node in an ONNX ModelProto

Parameters
  • model – the model proto loaded from the ONNX file

  • node – the conv node to get the params for

  • include_values – True to include the param values as NumPy arrays in the returned NodeParam objects. False to not load the values – in this event NodeParam.val will be None

Returns

a tuple containing the weight, bias (if it is present)

sparseml.onnx.utils.helpers. extract_node_id ( node : onnx.onnx_ml_pb2.NodeProto ) str [source]

Get the node id for a given node from an ONNX model. Grabs the first output id as the node id, because it is guaranteed to be unique for this node by the ONNX spec.

Parameters

node – the node to grab an id for

Returns

the id for the node

sparseml.onnx.utils.helpers. extract_node_shapes ( model : onnx.onnx_ml_pb2.ModelProto ) Dict [ str , sparseml.onnx.utils.helpers.NodeShape ] [source]

Extracts the shape information for each node as a NodeShape object.

Parameters

model – the loaded onnx.ModelProto to extract node shape information from

Returns

a mapping of node id to a NodeShape object

sparseml.onnx.utils.helpers. extract_nodes_shapes_ort ( model : onnx.onnx_ml_pb2.ModelProto ) Dict [ str , List [ List [ int ] ] ] [source]

Creates a modified model to expose intermediate outputs and runs an ONNX Runtime InferenceSession to obtain the output shape of each node.

Parameters

model – an ONNX model

Returns

a list of NodeArg with their shape exposed

sparseml.onnx.utils.helpers. extract_nodes_shapes_shape_inference ( model : onnx.onnx_ml_pb2.ModelProto ) Dict [ str , List [ Union [ None , List [ int ] ] ] ] [source]

Creates a modified model to expose intermediate outputs and runs an ONNX shape inference to obtain the output shape of each node.

NOTE: The ONNX docs have the following disclaimer on shape inference: "Shape inference is not guaranteed to be complete. In particular, some dynamic behaviors block the flow of shape inference, for example a Reshape to a dynamically-provided shape. Also, all operators are not required to have a shape inference implementation."

Parameters

model – an ONNX model

Returns

a list of NodeProto with their shape exposed

sparseml.onnx.utils.helpers. extract_shape ( proto : Any ) Union [ None , Tuple [ Optional [ int ] , ] ] [source]

Extract the shape info from a proto. Convenient for inputs into a model for example to get the tensor dimension.

Parameters

proto – the proto to get tensor shape info for

Returns

a tuple containing shape info if found, else None

sparseml.onnx.utils.helpers. gemm_node_params ( model : onnx.onnx_ml_pb2.ModelProto , node : onnx.onnx_ml_pb2.NodeProto , include_values : bool = True ) Tuple [ sparseml.onnx.utils.helpers.NodeParam , Optional [ sparseml.onnx.utils.helpers.NodeParam ] ] [source]

Get the params (weight and bias) for a gemm node in an ONNX ModelProto

Parameters
  • model – the model proto loaded from the ONNX file

  • node – the gemm node to get the params for

  • include_values – True to include the param values as NumPy arrays in the returned NodeParam objects. False to not load the values – in this event NodeParam.val will be None

Returns

a tuple containing the weight, bias (if it is present)

sparseml.onnx.utils.helpers. get_attr_float_val_for_node ( node : onnx.onnx_ml_pb2.NodeProto , attr : str ) Optional [ float ] [source]
Parameters
  • node – Node to get the attribute value of

  • attr – Attribute name to match in the node

Returns

The value of the attribute if the attribute is found in the node and is a float type. Otherwise returns None

sparseml.onnx.utils.helpers. get_batch_norm_params ( model : onnx.onnx_ml_pb2.ModelProto , bn_node : onnx.onnx_ml_pb2.NodeProto ) sparseml.onnx.utils.helpers.BatchNormParams [source]

Get the params and relevant attributes of a batch normalization operator. Following the ONNX operators spec, will default epsilon and momentum to 1e-5 and 0.9 respectively when not defined.

Parameters
  • model – the model proto loaded from the ONNX file

  • bn_node – the batch normalization node to get the params for

Returns

a BatchNormParams named tuple

sparseml.onnx.utils.helpers. get_init_by_name ( model : onnx.onnx_ml_pb2.ModelProto , init_name : str ) Optional [ Any ] [source]

Get an initializer by name from the ONNX model proto graph

Parameters
  • model – the model proto loaded from the ONNX file

  • init_name – the name of the initializer to retrieve

Returns

the initializer retrieved by name from the model

sparseml.onnx.utils.helpers. get_kernel_shape ( attributes : Dict [ str , Any ] ) Optional [ List [ float ] ] [source]

Get the kernel shape from a dictionary of a model’s attributes

Parameters

attributes – a dictionary of a model’s attributes

Returns

the kernel shape if attribute contains either the kernel or kernel_shape field, otherwise None

sparseml.onnx.utils.helpers. get_node_attributes ( node : onnx.onnx_ml_pb2.NodeProto ) Dict [ str , Any ] [source]
Parameters

node – the ONNX node to get the attributes for

Returns

a dictionary containing all attributes for the node

sparseml.onnx.utils.helpers. get_node_by_id ( model : onnx.onnx_ml_pb2.ModelProto , node_id : str ) Optional [ onnx.onnx_ml_pb2.NodeProto ] [source]

Get a node from a model by the node_id generated from extract_node_id

Parameters
  • model – the model proto loaded from the ONNX file

  • node_id – id of the node to get from the model

Returns

the retrieved node or None if no node found

sparseml.onnx.utils.helpers. get_node_input_nodes ( model : onnx.onnx_ml_pb2.ModelProto , node : onnx.onnx_ml_pb2.NodeProto ) List [ onnx.onnx_ml_pb2.NodeProto ] [source]

Get all of the nodes that share an output edge for the inputs to a given node

Parameters
  • model – the model the node is from

  • node – the node to get all input nodes for

Returns

the list of nodes that share an output edge for the inputs to the given node

sparseml.onnx.utils.helpers. get_node_inputs ( model : onnx.onnx_ml_pb2.ModelProto , node : onnx.onnx_ml_pb2.NodeProto ) List [ str ] [source]
Parameters
  • model – the model the node is from

  • node – the node to get all inputs (non initializers) for

Returns

the names of all the inputs to the node that are not initializers

sparseml.onnx.utils.helpers. get_node_output_nodes ( model : onnx.onnx_ml_pb2.ModelProto , node : onnx.onnx_ml_pb2.NodeProto ) List [ onnx.onnx_ml_pb2.NodeProto ] [source]

Get all of the nodes that share an input edge for the outputs from a given node

Parameters
  • model – the model the node is from

  • node – the node to get all output nodes for

Returns

the list of nodes that share an input edge for the outputs from the given node

sparseml.onnx.utils.helpers. get_node_outputs ( model : onnx.onnx_ml_pb2.ModelProto , node : onnx.onnx_ml_pb2.NodeProto ) List [ str ] [source]
Parameters
  • model – the model the node is from

  • node – the node to get all outputs (non initializers) for

Returns

the names of all the outputs to the node that are not initializers

sparseml.onnx.utils.helpers. get_node_params ( model : onnx.onnx_ml_pb2.ModelProto , node : onnx.onnx_ml_pb2.NodeProto , include_values : bool = True ) Tuple [ sparseml.onnx.utils.helpers.NodeParam , Optional [ sparseml.onnx.utils.helpers.NodeParam ] ] [source]

Get the params (weight and bias) for a node in an ONNX ModelProto. Must be an op type of one of [conv, gemm, matmul]

Parameters
  • model – the model proto loaded from the ONNX file

  • node – the conv, gemm, or matmul node to get the params for

  • include_values – True to include the param values as NumPy arrays in the returned NodeParam objects. False to not load the values – in this event NodeParam.val will be None

Returns

a tuple containing the weight, bias (if it is present)

sparseml.onnx.utils.helpers. get_nodes_by_input_id ( model : onnx.onnx_ml_pb2.ModelProto , input_id : str ) List [ onnx.onnx_ml_pb2.NodeProto ] [source]

Get all the nodes in a model that have a given id as one of the inputs

Parameters
  • model – the model proto loaded from the ONNX file

  • input_id – id of the input to get nodes by

Returns

the retrieved nodes

sparseml.onnx.utils.helpers. get_nodes_by_output_id ( model : onnx.onnx_ml_pb2.ModelProto , output_id : str ) List [ onnx.onnx_ml_pb2.NodeProto ] [source]

Get all the nodes in a model that have a given id as one of the outputs

Parameters
  • model – the model proto loaded from the ONNX file

  • output_id – id of the output to get nodes by

Returns

the retrieved nodes
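The lookups by input and output id described in this module amount to scanning the graph's nodes for a matching edge name. A minimal sketch follows, with plain dictionaries as hypothetical stand-ins for NodeProto objects. The function names are illustrative and are not the sparseml API.

```python
from typing import Any, Dict, List

Node = Dict[str, Any]  # hypothetical node record: name, inputs, outputs


def nodes_by_input_id(nodes: List[Node], input_id: str) -> List[Node]:
    """All nodes that list input_id among their inputs."""
    return [node for node in nodes if input_id in node["inputs"]]


def nodes_by_output_id(nodes: List[Node], output_id: str) -> List[Node]:
    """All nodes that list output_id among their outputs."""
    return [node for node in nodes if output_id in node["outputs"]]


def output_nodes(nodes: List[Node], node: Node) -> List[Node]:
    """Nodes that consume any output of the given node, without duplicates."""
    found: List[Node] = []
    for output_id in node["outputs"]:
        for candidate in nodes_by_input_id(nodes, output_id):
            if candidate not in found:
                found.append(candidate)
    return found


graph = [
    {"name": "conv", "inputs": ["x", "w"], "outputs": ["conv_out"]},
    {"name": "bn", "inputs": ["conv_out", "scale"], "outputs": ["bn_out"]},
    {"name": "add", "inputs": ["conv_out", "bn_out"], "outputs": ["y"]},
]
consumers = [n["name"] for n in nodes_by_input_id(graph, "conv_out")]
producers = [n["name"] for n in nodes_by_output_id(graph, "bn_out")]
children = [n["name"] for n in output_nodes(graph, graph[0])]
```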

sparseml.onnx.utils.helpers. get_numpy_dtype ( tensor : onnx.onnx_ml_pb2.TensorProto ) Union [ None , numpy.dtype ] [source]

Extract the NumPy dtype of an ONNX tensor. Returns None if there is not a direct mapping from the ONNX data type to a NumPy dtype.

Parameters

tensor – the tensor to get the dtype of

Returns

a NumPy dtype for the tensor if available otherwise None

sparseml.onnx.utils.helpers. get_prunable_node_from_foldable ( model : onnx.onnx_ml_pb2.ModelProto , foldable_node : Union [ str , onnx.onnx_ml_pb2.NodeProto ] , traverse_previous : bool = True , max_node_distance : int = 3 ) Union [ None , onnx.onnx_ml_pb2.NodeProto ] [source]

Get a prunable node that is attached by foldable nodes to a given foldable node. Returns None if nothing could be found. Ex: get the convolution that would be folded for an attached BatchNormalization

Parameters
  • model – the model the node is from

  • foldable_node – the foldable node or node id to find prunable node from

  • traverse_previous – True to only search for previous prunable nodes that the foldable node could have been attached to for Conv -> BN patterns. False to only search for following prunable nodes that the foldable node could have been attached to for BN -> Conv patterns.

  • max_node_distance – The maximum distance (and therefore number of foldable nodes) the prunable node must be within to match. Ex: max_node_distance = 3, the prunable node must be within 3 other foldable nodes of the foldable node passed in to match

Returns

the found prunable node
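One way to picture the search described above is a bounded walk along a chain of ops: step away from the foldable node, pass through other foldable nodes, and stop at the first prunable node or when the distance budget runs out. The sketch below is hypothetical. It works on a linear list of op type strings rather than a real graph, returns an index instead of a node, and the set of foldable op types is an assumption made for the example.

```python
from typing import List, Optional

PRUNABLE_OPS = {"Conv", "Gemm", "MatMul"}
# Assumption: op types treated as foldable in this sketch only.
FOLDABLE_OPS = {"BatchNormalization", "Add", "Mul"}


def find_prunable(
    chain: List[str],
    start: int,
    traverse_previous: bool = True,
    max_node_distance: int = 3,
) -> Optional[int]:
    """Index of the nearest prunable op reachable via foldable ops, else None."""
    step = -1 if traverse_previous else 1
    index = start
    for _ in range(max_node_distance):
        index += step
        if index < 0 or index >= len(chain):
            return None
        if chain[index] in PRUNABLE_OPS:
            return index
        if chain[index] not in FOLDABLE_OPS:
            # A non-foldable op breaks the chain.
            return None
    return None


chain = ["Conv", "BatchNormalization", "Add", "Relu", "BatchNormalization", "Conv"]
direct = find_prunable(chain, start=1)
via_bn = find_prunable(chain, start=2)
too_far = find_prunable(chain, start=2, max_node_distance=1)
blocked = find_prunable(chain, start=4)
forward = find_prunable(chain, start=4, traverse_previous=False)
```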

sparseml.onnx.utils.helpers. get_prunable_nodes ( model : Union [ str , onnx.onnx_ml_pb2.ModelProto ] ) List [ Any ] [source]

Get the prunable nodes in an ONNX model proto. Prunable nodes are defined as any conv, gemm, or matmul

Parameters

model – the model proto loaded from the ONNX file

Returns

a list of nodes from the model proto

sparseml.onnx.utils.helpers. get_quantize_parent_for_dequantize_node ( quantized_model : onnx.onnx_ml_pb2.ModelProto , dequantize_node : onnx.onnx_ml_pb2.NodeProto ) Optional [ onnx.onnx_ml_pb2.NodeProto ] [source]

Returns the first quantize node found by traversing the first node input of the given de-quantize node’s ancestors. All inputs to de-quantize nodes should have a quantize node ancestor.

Parameters
  • quantized_model – the model the de-quantize node is from

  • dequantize_node – the node to get an associated quantize node for

Returns

the first quantize node found by traversing the first node input of the given de-quantize node’s ancestors. If no quantize node is found, returns None

sparseml.onnx.utils.helpers. get_tensor_dim_shape ( tensor : onnx.onnx_ml_pb2.TensorProto , dim : int ) int [source]
Parameters
  • tensor – ONNX tensor to get the shape of a dimension of

  • dim – dimension index of the tensor to get the shape of

Returns

shape of the tensor at the given dimension

sparseml.onnx.utils.helpers. is_foldable_node ( node : Union [ str , onnx.onnx_ml_pb2.NodeProto ] ) bool [source]

Checks whether a node is foldable as defined by ONNX Runtime, based on the layer-wise folding it supports in ONNX graphs. More info can be found in their docs: https://www.onnxruntime.ai/docs/resources/graph-optimizations.html

Parameters

node – the node or node type to check if it is foldable or not according to the ONNX Runtime specs

Returns

True if the node is foldable and therefore can be combined with other nodes, False otherwise

sparseml.onnx.utils.helpers. is_prunable_node ( model : onnx.onnx_ml_pb2.ModelProto , node : onnx.onnx_ml_pb2.NodeProto ) bool [source]
Parameters
  • model – the model the node is from

  • node – an ONNX node or op_type string

Returns

True if the given node is prunable, False otherwise

sparseml.onnx.utils.helpers. matmul_node_params ( model : onnx.onnx_ml_pb2.ModelProto , node : onnx.onnx_ml_pb2.NodeProto , include_values : bool = True ) Tuple [ sparseml.onnx.utils.helpers.NodeParam , Optional [ sparseml.onnx.utils.helpers.NodeParam ] ] [source]

Get the params (weight) for a matmul node in an ONNX ModelProto. In the future will retrieve a following bias addition as the bias for the matmul.

Parameters
  • model – the model proto loaded from the ONNX file

  • node – the matmul node to get the params for

  • include_values – True to include the param values as NumPy arrays in the returned NodeParam objects. False to not load the values – in this event NodeParam.val will be None

Returns

a tuple containing the weight, bias (if it is present)

sparseml.onnx.utils.helpers. model_inputs ( model : Union [ str , onnx.onnx_ml_pb2.ModelProto ] ) List [source]

Get the input to the model from an ONNX model

Parameters

model – the loaded model or a file path to the ONNX model to get the model inputs for

Returns

the input to the model

sparseml.onnx.utils.helpers. model_outputs ( model : Union [ str , onnx.onnx_ml_pb2.ModelProto ] ) List [source]

Get the output from an ONNX model

Parameters

model – the loaded model or a file path to the ONNX model to get the model outputs for

Returns

the output from the model

sparseml.onnx.utils.helpers. onnx_nodes_sparsities ( model : Union [ str , onnx.onnx_ml_pb2.ModelProto ] ) Tuple [ sparseml.onnx.utils.helpers.SparsityMeasurement , Dict [ str , sparseml.onnx.utils.helpers.SparsityMeasurement ] ] [source]

Retrieve the sparsities for each Conv or Gemm op in an ONNX graph for the associated weight inputs.

Parameters

model – ONNX model to use

Returns

a tuple containing the overall sparsity measurement for the model and a mapping of node id to the sparsity measurement for each conv or gemm node found in the model

sparseml.onnx.utils.helpers. set_tensor_dim_shape ( tensor : onnx.onnx_ml_pb2.TensorProto , dim : int , value : int ) [source]

Sets the shape of the tensor at the given dimension to the given value

Parameters
  • tensor – ONNX tensor to modify the shape of

  • dim – dimension index of the tensor to modify the shape of

  • value – new shape for the given dimension

sparseml.onnx.utils.helpers. validate_onnx_file ( path : str ) [source]

Validate that a file at a given path is a valid ONNX model

Parameters

path – the path of the file to validate

Raises

ValueError – if not a valid ONNX model

sparseml.onnx.utils.loss module

sparseml.onnx.utils.loss. kl_divergence ( predicted : numpy.ndarray , expected : numpy.ndarray , zero_point : float = 0.0 , min_value : float = 1.0 ) float [source]

Calculate the kl_divergence (entropy) between two input arrays.

Shifts all values such that the zero_point is at one. If a value is lower, then sets it equal to 1.

Parameters
  • predicted – the first array to compare with

  • expected – the second array to compare with

  • zero_point – the zero point that should be used to shift values above 1

  • min_value – the minimum value that all values will be truncated to if they are below

Returns

the calculated KL divergence
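The shifting and truncation described above can be sketched in pure Python. This is a hypothetical reading of the description, not the library code: each value is shifted so that zero_point lands on min_value (one by default), values below min_value are truncated to it, and the divergence is summed as expected * log(expected / predicted). The direction of the divergence and the absence of any normalization are assumptions.

```python
import math
from typing import Sequence


def kl_divergence_sketch(
    predicted: Sequence[float],
    expected: Sequence[float],
    zero_point: float = 0.0,
    min_value: float = 1.0,
) -> float:
    """KL-style divergence over shifted, truncated values (sketch)."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")

    def shift(value: float) -> float:
        # Move zero_point onto min_value, then truncate anything lower.
        return max(value - zero_point + min_value, min_value)

    total = 0.0
    for p, e in zip(predicted, expected):
        p_shifted, e_shifted = shift(p), shift(e)
        total += e_shifted * math.log(e_shifted / p_shifted)
    return total


same = kl_divergence_sketch([0.2, 0.8], [0.2, 0.8])
shifted = kl_divergence_sketch([0.0, 1.0], [1.0, 1.0])
clipped = kl_divergence_sketch([-5.0], [-3.0])
```

Because every shifted value is at least min_value, the logarithm never sees a zero or negative argument as long as min_value is positive.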

sparseml.onnx.utils.model module

Utilities for ONNX models and running inference with them

class sparseml.onnx.utils.model. DeepSparseAnalyzeModelRunner ( model : Union [ str , onnx.onnx_ml_pb2.ModelProto ] , batch_size : int , num_cores : Optional [ int ] = None ) [source]

Bases: sparseml.onnx.utils.model._DeepSparseBaseModelRunner

Class for handling running inference for an ONNX model through Neural Magic’s analyze_model api

Parameters
  • model – the path to the ONNX model file or the loaded onnx.ModelProto

  • batch_size – the size of the batch to create the model for

  • num_cores – the number of physical cores to run the model on. Defaults to run on all available cores

batch_forward ( batch : Dict [ str , numpy.ndarray ] , num_iterations : int = 1 , num_warmup_iterations : int = 0 , optimization_level : int = 1 , imposed_ks : Union [ None , float ] = None , * args , ** kwargs ) Tuple [ Dict [ str , numpy.ndarray ] , float ] [source]
Parameters
  • batch – the batch to run through the ONNX model for inference benchmarking analysis in the neural magic system

  • num_iterations – number of iterations to run the analysis benchmark for

  • num_warmup_iterations – number of iterations to run warmup for before benchmarking

  • optimization_level – the optimization level to use in neural magic; 1 for optimizations on, 0 for limited optimizations

  • imposed_ks – kernel sparsity value to impose on all the prunable layers in the model. None for no imposed sparsity

Returns

a tuple containing the result of the inference, the time to perform the inference

run ( data_loader : sparseml.onnx.utils.data.DataLoader , desc : str = '' , show_progress : bool = True , max_steps : int = 1 , num_iterations : int = 20 , num_warmup_iterations : int = 5 , optimization_level : int = 1 , imposed_ks : Union [ None , float ] = None , * args , ** kwargs ) Tuple [ List [ Dict ] , List [ float ] ] [source]

Run inference for a model for the data given in the data_loader iterator through the Neural Magic inference engine's model analysis function. The analysis function allows more granular control over how the model is executed, such as setting optimization levels and imposing kernel sparsity. It also returns layer-by-layer timings for the layers that were run.

Parameters
  • data_loader – the data_loader used to load batches of data to run through the model

  • desc – str to display if show_progress is True

  • show_progress – True to show a tqdm bar when running, False otherwise

  • max_steps – maximum number of steps to take for the data_loader instead of running over all the data

  • num_iterations – number of iterations to run the analysis benchmark for

  • num_warmup_iterations – number of iterations to run warmup for before benchmarking

  • optimization_level – the optimization level to use in neural magic; 1 for optimizations on, 0 for limited optimizations

  • imposed_ks – kernel sparsity value to impose on all the prunable layers in the model. None for no imposed sparsity

Returns

a tuple containing the performance results for the run as returned from the analyze_model function, total time to run them

class sparseml.onnx.utils.model. DeepSparseModelRunner ( model : Union [ str , onnx.onnx_ml_pb2.ModelProto ] , batch_size : int , num_cores : Optional [ int ] = None , loss : Optional [ Callable [ [ Dict [ str , numpy.ndarray ] , Dict [ str , numpy.ndarray ] ] , Any ] ] = None ) [source]

Bases: sparseml.onnx.utils.model._DeepSparseBaseModelRunner

Class for handling running inference for an ONNX model through Neural Magic

Parameters
  • model – the path to the ONNX model file or the loaded onnx.ModelProto

  • batch_size – the size of the batch to create the model for

  • num_cores – the number of physical cores to run the model on. Defaults to run on all available cores

  • loss – the loss function, if any, to run for evaluation of the model

batch_forward ( batch : Dict [ str , numpy.ndarray ] , * args , ** kwargs ) Tuple [ Dict [ str , numpy.ndarray ] , float ] [source]
Parameters

batch – the batch to run through the ONNX model for inference in the DeepSparse Engine

Returns

a tuple containing the result of the inference, the time to perform the inference

run ( data_loader : sparseml.onnx.utils.data.DataLoader , desc : str = '' , show_progress : bool = True , max_steps : int = - 1 , * args , ** kwargs ) Tuple [ List [ Any ] , List [ float ] ] [source]

Run inference for a model for the data given in the data_loader iterator through neural magic inference engine.

Parameters
  • data_loader – the data_loader used to load batches of data to run through the model

  • desc – str to display if show_progress is True

  • show_progress – True to show a tqdm bar when running, False otherwise

  • max_steps – maximum number of steps to take for the data_loader instead of running over all the data

Returns

a tuple containing the list of outputs and the list of times for running the data

class sparseml.onnx.utils.model. ModelRunner ( loss : Optional [ Callable [ [ Dict [ str , numpy.ndarray ] , Dict [ str , numpy.ndarray ] ] , Any ] ] = None ) [source]

Bases: abc.ABC

Abstract class for handling running inference for a model

Parameters

loss – the loss function, if any, to run for evaluation of the model

abstract batch_forward ( batch : Dict [ str , numpy.ndarray ] , * args , ** kwargs ) Tuple [ Any , float ] [source]

Abstract method for subclasses to override to run a batch through the inference engine for the ONNX model it was constructed with

Parameters

batch – the batch to run through the ONNX model for inference

Returns

a tuple containing the result of the inference, the time to perform the inference

run ( data_loader : sparseml.onnx.utils.data.DataLoader , desc : str = '' , show_progress : bool = True , max_steps : int = - 1 , * args , ** kwargs ) Tuple [ List [ Any ] , List [ float ] ] [source]

Run inference for a model for the data given in the data_loader iterator

Parameters
  • data_loader – the data_loader used to load batches of data to run through the model

  • desc – str to display if show_progress is True

  • show_progress – True to show a tqdm bar when running, False otherwise

  • max_steps – maximum number of steps to take for the data_loader instead of running over all the data

Returns

a tuple containing the list of outputs and the list of times for running the data

run_iter ( data_loader : sparseml.onnx.utils.data.DataLoader , desc : str = '' , show_progress : bool = True , max_steps : int = - 1 , * args , ** kwargs ) [source]

Iteratively runs inference for a model for the data given in the data_loader iterator

Parameters
  • data_loader – the data_loader used to load batches of data to run through the model

  • desc – str to display if show_progress is True

  • show_progress – True to show a tqdm bar when running, False otherwise

  • max_steps – maximum number of steps to take for the data_loader instead of running over all the data

Returns

an iterator to go through the tuples containing the list of outputs and the list of times for running the data
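The contract between batch_forward, run_iter, and run can be sketched with a small stand-in class. RunnerSketch and SumRunner are hypothetical names, not sparseml classes. The sketch assumes the loader yields (batch, label) pairs, that a negative max_steps means no limit, and that a configured loss replaces the raw prediction in the outputs; none of these details are confirmed by the documentation above.

```python
import time
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, Iterable, Iterator, List, Optional, Tuple

Batch = Dict[str, List[float]]


class RunnerSketch(ABC):
    """Hypothetical stand-in for the abstract runner pattern described above."""

    def __init__(self, loss: Optional[Callable[[Any, Any], Any]] = None):
        self._loss = loss

    @abstractmethod
    def batch_forward(self, batch: Batch) -> Tuple[Any, float]:
        """Return (prediction, seconds taken) for one batch."""

    def run_iter(
        self, data_loader: Iterable[Tuple[Batch, Any]], max_steps: int = -1
    ) -> Iterator[Tuple[Any, float]]:
        for step, (batch, label) in enumerate(data_loader):
            # Assumption: a negative max_steps means "no limit".
            if 0 <= max_steps <= step:
                break
            pred, seconds = self.batch_forward(batch)
            # Assumption: when a loss is set, it replaces the raw prediction.
            yield (self._loss(pred, label) if self._loss else pred), seconds

    def run(self, data_loader, max_steps: int = -1) -> Tuple[List[Any], List[float]]:
        outputs, times = [], []
        for output, seconds in self.run_iter(data_loader, max_steps):
            outputs.append(output)
            times.append(seconds)
        return outputs, times


class SumRunner(RunnerSketch):
    """Toy engine: the 'inference' just sums each input list."""

    def batch_forward(self, batch: Batch) -> Tuple[Any, float]:
        start = time.perf_counter()
        pred = {name: sum(values) for name, values in batch.items()}
        return pred, time.perf_counter() - start


loader = [({"x": [1.0, 2.0]}, None), ({"x": [3.0]}, None), ({"x": [4.0]}, None)]
outputs, times = SumRunner().run(loader, max_steps=2)
```

Only batch_forward is engine specific in this layout, which is why a subclass needs to override that single method while run and run_iter stay shared.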

class sparseml.onnx.utils.model. ORTModelRunner ( model : Union [ str , onnx.onnx_ml_pb2.ModelProto ] , loss : Optional [ Callable [ [ Dict [ str , numpy.ndarray ] , Dict [ str , numpy.ndarray ] ] , Any ] ] = None , overwrite_input_names : bool = True , nthreads : int = 0 , batch_size : int = None , providers : List [ str ] = None , ** kwargs ) [source]

Bases: sparseml.onnx.utils.model.ModelRunner

Class for handling running inference for an ONNX model through onnxruntime

Parameters
  • model – the path to the ONNX model file or the loaded onnx.ModelProto

  • loss – the loss function, if any, to run for evaluation of the model

  • overwrite_input_names – True to overwrite the input data names with the names found in the model inputs, False to keep them as found in the loaded data

  • nthreads – number of threads used to run the model (single node); default to 0 for onnxruntime to choose

  • batch_size – if provided, and the model has a hardcoded batch size, will rewrite the model proto so that the model batch size matches batch_size

  • providers – list of ORT provider names. Defaults to ort.get_available_providers()
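One plausible reading of overwrite_input_names is a positional rename of the loaded batch keys to the model's input names. The sketch below illustrates that reading only; overwrite_input_names_sketch is a hypothetical helper, and positional matching is an assumption rather than confirmed ORTModelRunner behaviour.

```python
from typing import Any, Dict, List


def overwrite_input_names_sketch(
    batch: Dict[str, Any], model_input_names: List[str]
) -> Dict[str, Any]:
    """Hypothetical helper: rename batch keys to the model's input names by position."""
    if len(batch) != len(model_input_names):
        raise ValueError(
            f"batch has {len(batch)} entries but the model expects "
            f"{len(model_input_names)} inputs"
        )
    # dicts preserve insertion order, so values pair up with names positionally.
    return dict(zip(model_input_names, batch.values()))


loaded = {"arr_0": [1, 2, 3], "arr_1": [0, 1]}
renamed = overwrite_input_names_sketch(loaded, ["input_ids", "attention_mask"])
```

A rename like this is useful when data was saved with generic array names that do not match the graph's input names.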

batch_forward ( batch : Dict [ str , numpy.ndarray ] , * args , ** kwargs ) Tuple [ Dict [ str , numpy.ndarray ] , float ] [source]
Parameters

batch – the batch to run through the ONNX model for inference in onnxruntime

Returns

a tuple containing the result of the inference, the time to perform the inference

run ( data_loader : sparseml.onnx.utils.data.DataLoader , desc : str = '' , show_progress : bool = True , max_steps : int = - 1 , * args , ** kwargs ) Tuple [ List [ Any ] , List [ float ] ] [source]

Run inference for a model for the data given in the data_loader iterator through ONNX Runtime.

Parameters
  • data_loader – the data_loader used to load batches of data to run through the model

  • desc – str to display if show_progress is True

  • show_progress – True to show a tqdm bar when running, False otherwise

  • max_steps – maximum number of steps to take for the data_loader instead of running over all the data

Returns

a tuple containing the list of outputs and the list of times for running the data

class sparseml.onnx.utils.model. OpenVINOModelRunner ( model : str , loss : Optional [ Callable [ [ Dict [ str , numpy.ndarray ] , Dict [ str , numpy.ndarray ] ] , Any ] ] = None , nthreads : int = 1 , batch_size : int = 0 , shape : str = '' ) [source]

Bases: sparseml.onnx.utils.model.ModelRunner

OpenVINO model runner class

Parameters
  • model – The path to the IR xml file after conversion

  • loss – loss function to run evaluation

  • nthreads – number of threads to run the model

  • batch_size – Batch size value. If not specified, the batch size value is determined from Intermediate Representation

  • shape – shape to be set for the input(s). For example, “input1[1,3,224,224],input2[1,4]” or “[1,3,224,224]” in case of one input size.
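The two accepted forms of the shape string can be made concrete with a small parser. parse_shape_string_sketch is a hypothetical helper written for illustration, not the parser used by OpenVINO or sparseml; storing an unnamed shape under the empty-string key is an assumption.

```python
import re
from typing import Dict, List

# An optional input name followed by a bracketed, comma-separated list of integers.
_SHAPE_PATTERN = re.compile(r"([^\[\],\s]*)\[([\d,\s]+)\]")


def parse_shape_string_sketch(shape: str) -> Dict[str, List[int]]:
    """Hypothetical parser; an unnamed shape is stored under the empty-string key."""
    parsed = {}
    for name, dims in _SHAPE_PATTERN.findall(shape):
        parsed[name] = [int(dim) for dim in dims.split(",")]
    return parsed


multi = parse_shape_string_sketch("input1[1,3,224,224],input2[1,4]")
single = parse_shape_string_sketch("[1,3,224,224]")
```

The first call yields one entry per named input, while the second yields a single unnamed shape.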

static available ( ) bool [source]
batch_forward ( batch : Dict [ str , numpy.ndarray ] , * args , ** kwargs ) Tuple [ Any , float ] [source]

Run a batch through the model

Parameters

batch – batch of data

Returns

result of the inference as dictionary, and the inference time

network_input_shapes ( ) [source]

Get network input shapes

Returns

dictionary of shapes for each input key

sparseml.onnx.utils.model. correct_nm_analyze_model_node_ids ( nm_result : Dict , model : Union [ str , onnx.onnx_ml_pb2.ModelProto ] ) [source]

Correct the node ids returned from the deepsparse.analyze_model api. In some cases, it will return the ids for folded nodes due to ONNXRuntime folding. This finds the corrected node ids from those folded nodes. Additionally, ops that did not have an id are changed from the returned string <none> to proper None python type

Parameters
  • nm_result – the result from the deepsparse.analyze_model api

  • model – the onnx model proto or path to the onnx file that the nm_result was for

sparseml.onnx.utils.model. max_available_cores ( ) int [source]
Returns

the maximum number of physical cores detected on the system

sparseml.onnx.utils.model. split_canonical_names ( nm_result : Dict ) [source]

Splits analysis layer results from grouped canonical names by individual nodes. Stores the original grouped canonical name in the ‘meta_canonical_name’ field.

Will split on any canonical_name that includes ‘,’.

Parameters

nm_result – the result from the deepsparse.analyze_model api
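The splitting rule described above can be sketched on a plain list of per-layer dictionaries. split_canonical_names_sketch is a hypothetical helper; the exact layout of nm_result is not shown here, so the list-of-dicts input and the time_ms field are illustrative assumptions. Only the canonical_name and meta_canonical_name keys come from the description.

```python
from copy import deepcopy
from typing import Any, Dict, List


def split_canonical_names_sketch(layers: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Hypothetical helper operating on a plain list of per-layer dicts."""
    split_layers = []
    for layer in layers:
        grouped_name = layer["canonical_name"]
        if "," not in grouped_name:
            split_layers.append(deepcopy(layer))
            continue
        for node_name in grouped_name.split(","):
            node_layer = deepcopy(layer)
            node_layer["canonical_name"] = node_name
            # Keep the original grouped name so the grouping can be recovered.
            node_layer["meta_canonical_name"] = grouped_name
            split_layers.append(node_layer)
    return split_layers


layers = [
    {"canonical_name": "conv1,relu1", "time_ms": 0.5},  # time_ms is illustrative
    {"canonical_name": "fc", "time_ms": 0.2},
]
result = split_canonical_names_sketch(layers)
```

Each node from a grouped entry inherits the group's measurements unchanged, so timings are duplicated across the split entries rather than divided between them in this sketch.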

sparseml.onnx.utils.sparse_tensor module

Helper functions for handling ONNX SparseTensorProto objects. onnx >= 1.6.0 is a requirement for using sparse tensors

sparseml.onnx.utils.sparse_tensor. convert_model_initializers_to_sparse ( model : onnx.onnx_ml_pb2.ModelProto , sparsity_threshold : float = 0.6 , inplace : bool = True ) onnx.onnx_ml_pb2.ModelProto [source]
Parameters
  • model – ONNX model with initializers to convert to sparse

  • sparsity_threshold – the minimum sparsity of a tensor to be converted to sparse representation. Default is 0.6

  • inplace – True to do model conversion in place. Default is True

Returns

the given model with initializers above the sparsity threshold converted to sparse initializers
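The role of sparsity_threshold can be illustrated by computing the fraction of zero entries in a tensor and comparing it with the threshold. Both helpers below are hypothetical and operate on flat Python lists rather than ONNX initializers; whether a tensor exactly at the threshold is converted is an assumption.

```python
from typing import Iterable


def sparsity_sketch(values: Iterable[float]) -> float:
    """Fraction of entries that are exactly zero (0.0 for an empty input)."""
    values = list(values)
    if not values:
        return 0.0
    return sum(1 for value in values if value == 0) / len(values)


def should_convert_sketch(values: Iterable[float], sparsity_threshold: float = 0.6) -> bool:
    # Assumption: a tensor exactly at the threshold is also converted.
    return sparsity_sketch(values) >= sparsity_threshold


mostly_zero = [0.0, 0.0, 0.0, 1.5]   # 3 of 4 entries are zero
mostly_dense = [0.0, 1.0, 2.0, 3.0]  # 1 of 4 entries is zero
```

With the default threshold of 0.6, the first tensor would be converted to a sparse representation and the second would stay dense.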

sparseml.onnx.utils.sparse_tensor. convert_sparse_initializers_to_dense ( model : onnx.onnx_ml_pb2.ModelProto , inplace : bool = True ) onnx.onnx_ml_pb2.ModelProto [source]
Parameters
  • model – ONNX model with sparse initializers to convert to dense representation

  • inplace – True to do model conversion in place. Default is True

Returns

The given model with all sparse initializers converted to dense initializers

sparseml.onnx.utils.sparse_tensor. create_sparse_tensor ( array : Union [ numpy.ndarray , onnx.onnx_ml_pb2.TensorProto ] , name : Optional [ str ] = None ) Optional [ onnx.onnx_ml_pb2.SparseTensorProto ] [source]
Parameters
  • array – numpy array or TensorProto object to convert to sparse representation

  • name – name of this sparse tensor. Will be stored in SparseTensorProto.values.name. If the given array is a TensorProto, name will default to TensorProto.name

Returns

SparseTensorProto object built from the sparse representation of the input array

sparseml.onnx.utils.sparse_tensor. sparse_tensor_to_dense ( sparse_tensor : onnx.onnx_ml_pb2.SparseTensorProto ) onnx.onnx_ml_pb2.TensorProto [source]
Parameters

sparse_tensor – SparseTensorProto object

Returns

TensorProto object that is the dense representation of the given sparse tensor.
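The sparse and dense conversions above are inverses of each other. A minimal sketch of that round trip follows, using plain Python lists in place of TensorProto and SparseTensorProto objects. It stores the non-zero values, their linearized row-major indices, and the dense dims, which mirrors the values, indices, and dims fields of a SparseTensorProto. The helper names are hypothetical, and the sketch needs Python 3.8 or later for math.prod.

```python
from math import prod
from typing import List, Sequence, Tuple


def to_sparse_sketch(
    flat: Sequence[float], dims: Sequence[int]
) -> Tuple[List[float], List[int], List[int]]:
    """Return (values, linearized indices, dims) for a row-major flat tensor."""
    if len(flat) != prod(dims):
        raise ValueError("flat data does not match dims")
    indices = [i for i, value in enumerate(flat) if value != 0]
    values = [flat[i] for i in indices]
    return values, indices, list(dims)


def to_dense_sketch(
    values: Sequence[float], indices: Sequence[int], dims: Sequence[int]
) -> List[float]:
    """Rebuild the row-major flat tensor, filling unspecified entries with zero."""
    dense = [0.0] * prod(dims)
    for value, index in zip(values, indices):
        dense[index] = value
    return dense


flat = [0.0, 0.0, 3.0, 0.0, 5.0, 0.0]  # a 2 x 3 tensor in row-major order
values, indices, dims = to_sparse_sketch(flat, (2, 3))
restored = to_dense_sketch(values, indices, dims)
```

Only two of the six entries are stored in the sparse form, which is where the size savings for highly sparse initializers come from.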

Module contents

Generic code used as utilities and helpers for ONNX