sparseml.utils package

Submodules

sparseml.utils.frameworks module

ML framework tokens

sparseml.utils.helpers module

General utility helper functions. Common functions for interfacing with python primitives and directories/files.

class sparseml.utils.helpers.NumpyArrayBatcher[source]

Bases: object

Batcher instance that takes in dictionaries of numpy arrays, appends multiple items to them to increase their batch size, and then stacks them into a single batched numpy array for all keys in the dicts.

append(item: Union[numpy.ndarray, Dict[str, numpy.ndarray]])[source]

Append a new item into the current batch. All keys and shapes must match the current state.

Parameters

item – the item to add for batching

stack() → Dict[str, numpy.ndarray][source]

Stack the current items into a batch along a new, zeroed dimension

Returns

the stacked items
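The documented append-then-stack behavior can be emulated with plain numpy: items sharing the same keys and shapes are collected, then each key is stacked along a new leading batch dimension. This is a sketch of the behavior described above, not the class itself:

```python
import numpy as np

# Collect per-item dicts of arrays, then stack each key along a new axis 0,
# mirroring NumpyArrayBatcher.append() followed by stack().
items = [
    {"input": np.ones((3, 224, 224)), "mask": np.zeros((3,))},
    {"input": np.ones((3, 224, 224)), "mask": np.zeros((3,))},
]
batched = {
    key: np.stack([item[key] for item in items])  # new, zeroed batch dimension
    for key in items[0]
}
print(batched["input"].shape)  # (2, 3, 224, 224)
```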

sparseml.utils.helpers.bucket_iterable(val: Iterable[Any], num_buckets: int = 3, edge_percent: float = 0.05, sort_highest: bool = True, sort_key: Optional[Callable[[Any], Any]] = None) → List[Tuple[int, Any]][source]

Bucket an iterable into sub-groups: the first edge percentage forms its own bucket, and the rest of the iterable is sliced into equally sized groups.

Parameters
  • val – The iterable to bucket

  • num_buckets – The number of buckets to group the iterable into, does not include the top bucket

  • edge_percent – Group the first percent into its own bucket. If sort_highest, then this is the top percent, else bottom percent. If <= 0, then will not create an edge bucket

  • sort_highest – True to sort such that the highest percent is first and will create buckets in descending order. False to sort so lowest is first and create buckets in ascending order.

  • sort_key – The sort_key, if any, to use for sorting the iterable after converting it to a list

Returns

a list of each value mapped to the bucket it was sorted into
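A rough sketch of the bucketing scheme described by the parameters above (a hypothetical simplification for illustration, not the actual implementation): sort the values, carve the edge percent into bucket -1, then slice the remainder into equal groups numbered from 0:

```python
from typing import Any, Iterable, List, Tuple

def bucket_sketch(
    val: Iterable[Any], num_buckets: int = 3, edge_percent: float = 0.05,
    sort_highest: bool = True,
) -> List[Tuple[int, Any]]:
    # Sort descending (or ascending), assign the edge slice to bucket -1,
    # then divide the rest into num_buckets roughly equal groups.
    vals = sorted(val, reverse=sort_highest)
    edge_count = int(len(vals) * edge_percent) if edge_percent > 0 else 0
    out = [(-1, v) for v in vals[:edge_count]]
    rest = vals[edge_count:]
    group = max(1, len(rest) // num_buckets)
    for idx, v in enumerate(rest):
        out.append((min(idx // group, num_buckets - 1), v))
    return out

buckets = bucket_sketch(range(100), num_buckets=3, edge_percent=0.05)
print(buckets[0])  # (-1, 99): top 5% goes into the edge bucket
```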

sparseml.utils.helpers.clean_path(path: str) → str[source]
Parameters

path – the directory or file path to clean

Returns

a cleaned version that expands the user path and creates an absolute path
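The described behavior likely corresponds to the standard os.path expansion chain (a sketch under that assumption, not the actual implementation):

```python
import os

def clean_path_sketch(path: str) -> str:
    # Expand "~" to the user's home directory, then resolve to an absolute path.
    return os.path.abspath(os.path.expanduser(path))

print(clean_path_sketch("~/models"))  # e.g. /home/user/models
```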

sparseml.utils.helpers.convert_to_bool(val: Any)[source]
Parameters

val – the value to be converted to a bool; supports logical values given as strings, i.e. 'True', 't', 'false', '0'

Returns

the boolean representation of the value, if it can’t be determined, falls back on returning True
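A minimal sketch of the conversion rules described above (the exact set of recognized strings is an assumption; only the documented examples and the True fallback are taken from the description):

```python
def convert_to_bool_sketch(val) -> bool:
    # Strings with a recognized "falsy" meaning map to False; anything
    # unrecognized falls back to True, per the documented behavior.
    if isinstance(val, bool):
        return val
    if isinstance(val, str):
        return val.strip().lower() not in ("false", "f", "0", "")
    return bool(val)

print(convert_to_bool_sketch("t"))      # True
print(convert_to_bool_sketch("false"))  # False
```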

sparseml.utils.helpers.create_dirs(path: str)[source]
Parameters

path – the directory path to try and create

sparseml.utils.helpers.create_parent_dirs(path: str)[source]
Parameters

path – the file path to try to create the parent directories for

sparseml.utils.helpers.create_unique_dir(path: str, check_number: int = 0) → str[source]
Parameters
  • path – the file path to create a unique version of (append numbers until one doesn’t exist)

  • check_number – the number to begin checking for unique versions at

Returns

the unique directory path

sparseml.utils.helpers.flatten_iterable(li: Iterable)[source]
Parameters

li – a possibly nested iterable of items to be flattened

Returns

a flattened version of the list where all elements are in a single list flattened in a depth first pattern
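The depth-first flattening described above can be sketched as a short recursive helper (an illustrative equivalent, not the library's code):

```python
def flatten_sketch(li):
    # Depth-first flatten: recurse into nested lists/tuples, emit leaves in order.
    flat = []
    for item in li:
        if isinstance(item, (list, tuple)):
            flat.extend(flatten_sketch(item))
        else:
            flat.append(item)
    return flat

print(flatten_sketch([1, [2, [3, 4]], 5]))  # [1, 2, 3, 4, 5]
```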

sparseml.utils.helpers.interpolate(x_cur: float, x0: float, x1: float, y0: Any, y1: Any, inter_func: str = 'linear') → Any[source]

Note: values are capped at the minimum x0 and maximum x1; by design, the function does not operate outside that range for implementation reasons.

Parameters
  • x_cur – the current value for x, should be between x0 and x1

  • x0 – the minimum for x to interpolate between

  • x1 – the maximum for x to interpolate between

  • y0 – the minimum for y to interpolate between

  • y1 – the maximum for y to interpolate between

  • inter_func – the type of function to interpolate with: linear, cubic, inverse_cubic

Returns

the interpolated value projecting x into y for the given interpolation function
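For the linear case, the projection with clamping works as in this sketch (the cubic variants are omitted):

```python
def interpolate_sketch(x_cur, x0, x1, y0, y1):
    # Clamp x_cur into [x0, x1], then project the fractional position
    # linearly into [y0, y1].
    x_cur = min(max(x_cur, x0), x1)
    frac = (x_cur - x0) / (x1 - x0)
    return y0 + frac * (y1 - y0)

print(interpolate_sketch(5.0, 0.0, 10.0, 0.0, 1.0))   # 0.5
print(interpolate_sketch(15.0, 0.0, 10.0, 0.0, 1.0))  # capped at 1.0
```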

sparseml.utils.helpers.interpolate_list_linear(measurements: List[Tuple[float, float]], x_val: Union[float, List[float]]) → List[Tuple[float, float]][source]

Linearly interpolate the output values for the given input values within a list of measurements

Parameters
  • measurements – the measurements to interpolate the output value between

  • x_val – the target values to interpolate to the second dimension

Returns

a list of tuples containing the target values, interpolated values

sparseml.utils.helpers.interpolated_integral(measurements: List[Tuple[float, float]])[source]

Calculate the interpolated integral for a group of measurements of the form [(x0, y0), (x1, y1), …]

Parameters

measurements – the measurements to calculate the integral for

Returns

the integral or area under the curve for the measurements given
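Given piecewise-linear interpolation between measurement points, the area under the curve reduces to the trapezoid rule; a sketch of that calculation (assuming this is the interpolation the function uses):

```python
def trapezoid_integral(measurements):
    # Area under the piecewise-linear curve through sorted (x, y) measurements:
    # sum the trapezoid area of each adjacent pair of points.
    pts = sorted(measurements)
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(pts, pts[1:])
    )

print(trapezoid_integral([(0.0, 0.0), (1.0, 1.0), (2.0, 1.0)]))  # 1.5
```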

sparseml.utils.helpers.is_url(val: str)[source]
Parameters

val – value to check if it is a url or not

Returns

True if value is a URL, False otherwise
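One common way to implement such a check is with urllib.parse, treating a value as a URL when it has both a scheme and a network location (a plausible sketch, not necessarily the library's exact rule):

```python
from urllib.parse import urlparse

def is_url_sketch(val: str) -> bool:
    # A value counts as a URL when it parses with both a scheme and a netloc.
    try:
        parsed = urlparse(val)
        return bool(parsed.scheme and parsed.netloc)
    except ValueError:
        return False

print(is_url_sketch("https://sparsezoo.neuralmagic.com"))  # True
print(is_url_sketch("/local/path"))                        # False
```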

sparseml.utils.helpers.load_labeled_data(data: Union[str, Iterable[Union[str, numpy.ndarray, Dict[str, numpy.ndarray]]]], labels: Union[None, str, Iterable[Union[str, numpy.ndarray, Dict[str, numpy.ndarray]]]], raise_on_error: bool = True) → List[Tuple[Union[numpy.ndarray, Dict[str, numpy.ndarray]], Union[None, numpy.ndarray, Dict[str, numpy.ndarray]]]][source]

Load labels and data from disk or from memory and group them together. Assumes sorted ordering for files on disk. Will match data and labels when a file glob is passed for either.

Parameters
  • data – the file glob, file path to numpy data tar ball, or list of arrays to use for data

  • labels – the file glob, file path to numpy data tar ball, or list of arrays to use for labels, if any

  • raise_on_error – True to raise on any error that occurs; False to log a warning, ignore, and continue

Returns

a list containing tuples of the data, labels. If labels was passed in as None, will now contain a None for the second index in each tuple

sparseml.utils.helpers.load_numpy(file_path: str) → Union[numpy.ndarray, Dict[str, numpy.ndarray]][source]

Load a numpy file into either an ndarray or an OrderedDict representing what was in the npz file

Parameters

file_path – the file_path to load

Returns

the loaded values from the file

sparseml.utils.helpers.parse_optimization_str(optim_full_name: str) → Tuple[str, str, Any][source]
Parameters

optim_full_name – A name of a pretrained model optimization. i.e. ‘pruned-moderate-deepsparse’, ‘pruned-aggressive’, ‘base’

Returns

A tuple representing the corresponding SparseZoo model sparse_name, sparse_category, and sparse_target values with appropriate defaults when not present.

sparseml.utils.helpers.path_file_count(path: str, pattern: str = '*') → int[source]

Return the number of files that match the given pattern under the given path

Parameters
  • path – the path to the directory to look for files under

  • pattern – the pattern the files must match to be counted

Returns

the number of files matching the pattern under the directory
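The count can be reproduced with a pathlib glob; a sketch of the documented behavior (whether the real implementation recurses into subdirectories is not stated, so this version matches only direct children):

```python
from pathlib import Path
import tempfile

def path_file_count_sketch(path: str, pattern: str = "*") -> int:
    # Count entries directly under path that match the glob pattern.
    return sum(1 for _ in Path(path).glob(pattern))

with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.npy", "b.npy", "c.txt"):
        (Path(tmp) / name).touch()
    count = path_file_count_sketch(tmp, "*.npy")
print(count)  # 2
```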

sparseml.utils.helpers.path_file_size(path: str) → int[source]

Return the total size, in bytes, for a path on the file system

Parameters

path – the path (directory or file) to get the size for

Returns

the size of the path, in bytes, as stored on disk
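A directory size of this kind is typically a walk over all contained files summing their sizes; a sketch under that assumption:

```python
import os
import tempfile

def path_file_size_sketch(path: str) -> int:
    # Single file: its own size. Directory: sum of all file sizes beneath it.
    if os.path.isfile(path):
        return os.path.getsize(path)
    return sum(
        os.path.getsize(os.path.join(root, name))
        for root, _, files in os.walk(path)
        for name in files
    )

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "data.bin"), "wb") as fh:
        fh.write(b"\x00" * 1024)
    total = path_file_size_sketch(tmp)
print(total)  # 1024
```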

sparseml.utils.helpers.save_numpy(array: Union[numpy.ndarray, Dict[str, numpy.ndarray], Iterable[numpy.ndarray]], export_dir: str, name: str, npz: bool = True)[source]

Save a numpy array or collection of numpy arrays to disk

Parameters
  • array – the array or collection of arrays to save

  • export_dir – the directory to export the numpy file into

  • name – the name of the file to export to (without extension)

  • npz – True to save as an npz compressed file, False for standard npy. Note, npy can only be used for single numpy arrays

Returns

the saved path
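The save behavior maps naturally onto numpy's own savers: dicts go to npz with their keys preserved, single arrays to npy. A sketch of that mapping (file-naming details are an assumption, not sparseml's exact output):

```python
import os
import tempfile
import numpy as np

def save_numpy_sketch(array, export_dir, name, npz=True):
    # Dict of arrays -> compressed npz with keys preserved;
    # single array -> npy (npz=False) or npz (npz=True).
    os.makedirs(export_dir, exist_ok=True)
    path = os.path.join(export_dir, name + (".npz" if npz else ".npy"))
    if npz:
        np.savez_compressed(path, **array) if isinstance(array, dict) \
            else np.savez_compressed(path, array)
    else:
        np.save(path, array)
    return path

with tempfile.TemporaryDirectory() as tmp:
    saved = save_numpy_sketch({"out": np.zeros((2, 2))}, tmp, "sample")
    with np.load(saved) as data:
        shape = data["out"].shape
print(shape)  # (2, 2)
```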

sparseml.utils.helpers.tensor_export(tensor: Union[numpy.ndarray, Dict[str, numpy.ndarray], Iterable[numpy.ndarray]], export_dir: str, name: str, npz: bool = True) → str[source]
Parameters
  • tensor – tensor to export to a saved numpy array file

  • export_dir – the directory to export the file in

  • name – the name of the file, .npy will be appended to it

  • npz – True to export as an npz file, False otherwise

Returns

the path of the numpy file the tensor was exported to

sparseml.utils.helpers.tensors_export(tensors: Union[numpy.ndarray, Dict[str, numpy.ndarray], Iterable[numpy.ndarray]], export_dir: str, name_prefix: str, counter: int = 0, break_batch: bool = False) → List[str][source]
Parameters
  • tensors – the tensors to export to a saved numpy array file

  • export_dir – the directory to export the files in

  • name_prefix – the prefix name for the tensors to save as, will append info about the position of the tensor in a list or dict in addition to the .npy file format

  • counter – the current counter to save the tensor at

  • break_batch – treat the tensor as a batch and break apart into multiple tensors

Returns

the exported paths

sparseml.utils.helpers.validate_str_iterable(val: Union[str, Iterable[str]], error_desc: str = '') → Union[str, Iterable[str]][source]
Parameters
  • val – the value to validate, check that it is a list (and flattens it), otherwise checks that it’s an __ALL__ or __ALL_PRUNABLE__ string, otherwise raises a ValueError

  • error_desc – the description to raise an error with in the event that the val wasn’t valid

Returns

the validated version of the param

sparseml.utils.singleton module

Code related to the Singleton design pattern

class sparseml.utils.singleton.Singleton[source]

Bases: type

A singleton class implementation meant to be added to others as a metaclass.

Ex: class Logger(metaclass=Singleton)
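A typical metaclass-based singleton looks like the following generic sketch (illustrating the pattern, not sparseml's exact code): the metaclass intercepts instantiation and returns a cached instance on every call after the first.

```python
class SingletonSketch(type):
    # Cache one instance per class; return it on every subsequent call.
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]

class Logger(metaclass=SingletonSketch):
    def __init__(self):
        self.messages = []

first = Logger()
second = Logger()
print(first is second)  # True
```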

sparseml.utils.worker module

General code for parallelizing the workers

class sparseml.utils.worker.ParallelWorker(worker_func: Callable, num_workers: int, indefinite: bool, max_source_size: int = -1)[source]

Bases: object

Multi-threaded worker to parallelize tasks

Parameters
  • worker_func – the function to parallelize across multiple tasks

  • num_workers – number of workers to use

  • indefinite – True to keep the thread pooling running so that more tasks can be added, False to stop after no more tasks are added

  • max_source_size – the maximum size for the source queue

add(vals: List[Any])[source]
Parameters

vals – the values to add for processing work

add_async(vals: List[Any])[source]
Parameters

vals – the values to add for async workers

add_async_generator(gen: Iterator[Any])[source]
Parameters

gen – add an async generator to pull values from for processing

add_item(val: Any)[source]
Parameters

val – add a single item for processing

property indefinite

True to keep the thread pooling running so that more tasks can be added, False to stop after no more tasks are added


shutdown()[source]

Stop the workers

start()[source]

Start the workers
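The overall pattern (a fixed pool of threads consuming submitted values through a worker function) can also be expressed with the standard library; this is a generic analogue for orientation, not ParallelWorker's implementation:

```python
from concurrent.futures import ThreadPoolExecutor

def worker_func(val):
    # Stand-in for the user-supplied worker function.
    return val * val

# num_workers threads pull values and apply worker_func; results keep
# submission order, similar to adding values and collecting outputs.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(worker_func, range(8)))

print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```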

sparseml.utils.wrapper module

Code for properly merging function attributes for decorated / wrapped functions. Merges docs, annotations, dicts, etc.

sparseml.utils.wrapper.wrapper_decorator(wrapped: Callable)[source]

A wrapper decorator to be applied as a decorator to a function. Merges the decorated function properties with wrapped.

Parameters

wrapped – the wrapped function to merge decorations with

Returns

the decorator to apply to the function

Module contents

General utility functions used throughout sparseml