sparsezoo.utils package
Submodules
sparsezoo.utils.data module
Utilities for data loading into numpy for use in ONNX supported systems
-
class
sparsezoo.utils.data.
DataLoader
(*datasets: sparsezoo.utils.data.Dataset, batch_size: int, iter_steps: int = 0, batch_as_list: bool = False)[source] Bases:
Iterable
Data loader instance that supports loading numpy arrays from file or memory and creating an iterator to go through batches of that data. Iterator returns a list containing all data originally loaded.
- Parameters
datasets – any number of datasets to load for the dataloader
batch_size – the size of batches to create for the iterator
iter_steps – the number of steps (batches) to create. Set to -1 for infinite, 0 for running through the loaded data once, or a positive integer for the desired number of steps
batch_as_list – True to create the items from each dataset as a list, False for an ordereddict
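The iter_steps values described above can be illustrated with a small stand-alone sketch. This is not the sparsezoo implementation: batch_indices is a hypothetical helper. It assumes that a trailing partial batch is dropped and that iteration wraps back to the first batch once a pass is exhausted.

```python
from itertools import count, islice
from typing import Iterator, List


def batch_indices(num_items: int, batch_size: int, iter_steps: int = 0) -> Iterator[List[int]]:
    """Yield the item indices of each batch (hypothetical helper, not sparsezoo code)."""
    # Assumption: only full batches are produced.
    batches_per_pass = num_items // batch_size
    if batches_per_pass == 0:
        return
    if iter_steps < 0:
        steps = count()  # -1 (any negative value in this sketch): never stop
    elif iter_steps == 0:
        steps = range(batches_per_pass)  # 0: a single pass over the data
    else:
        steps = range(iter_steps)  # positive: exactly that many batches
    for step in steps:
        # Assumption: wrap around to the first batch after each full pass.
        start = (step % batches_per_pass) * batch_size
        yield list(range(start, start + batch_size))


once = list(batch_indices(num_items=5, batch_size=2))
three = list(batch_indices(num_items=5, batch_size=2, iter_steps=3))
# An infinite iterator has to be bounded by the caller, here with islice.
endless = list(islice(batch_indices(num_items=5, batch_size=2, iter_steps=-1), 5))
```

With iter_steps=-1 the caller is responsible for stopping the loop, which is why the last line bounds the iterator explicitly.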
-
property
batch_as_list
True to create the items from each dataset as a list, False for an ordereddict
- Type
return
-
property
batch_size
the size of batches to create for the iterator
- Type
return
-
property
datasets
any number of datasets to load for the dataloader
- Type
return
-
get_batch
(bath_index: int) → Union[Dict[str, Union[List[numpy.ndarray], Dict[str, numpy.ndarray]]], List[numpy.ndarray], Dict[str, numpy.ndarray]][source] Get a batch from the data at the given index
- Parameters
bath_index – the index of the batch to get
- Returns
the created batch
-
property
infinite
True if the loader instance is set up to continually create batches, False otherwise
- Type
return
-
property
iter_steps
the number of steps (batches) to create. Set to -1 for infinite, 0 for running through the loaded data once, or a positive integer for the desired number of steps
- Type
return
-
property
num_items
the number of items in each dataset
- Type
return
-
class
sparsezoo.utils.data.
Dataset
(name: str, data: Union[str, Iterable[Union[str, numpy.ndarray, Dict[str, numpy.ndarray]]]])[source] Bases:
Iterable
A numpy dataset implementation
- Parameters
name – The name for the dataset
data – The data for the dataset. Can be one of [str - path to a folder containing numpy files, Iterable[str] - list of paths to numpy files, Iterable[ndarray], Iterable[Dict[str, ndarray]] ]
-
property
data
The list of data items for the dataset.
- Type
return
-
property
name
The name for the dataset
- Type
return
-
class
sparsezoo.utils.data.
RandomDataset
(name: str, typed_shapes: Dict[str, Tuple[Iterable[int], Optional[numpy.dtype]]], num_samples: int = 20)[source] Bases:
sparsezoo.utils.data.Dataset
A numpy dataset created from random data
- Parameters
name – The name for the dataset
typed_shapes – A dictionary containing the info for the random data to create. The names of the items in the data map to a tuple (shape, numpy type). If the numpy type is None, it defaults to float32. Ex: {"inp": ([3, 224, 224], None)}
num_samples – The number of random samples to create
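As a rough illustration of the typed_shapes format, the sketch below builds random samples with plain NumPy. random_samples is a hypothetical helper and not the RandomDataset source. The uniform [0, 1) distribution and the fixed seed are assumptions made only for this example.

```python
from typing import Dict, Iterable, List, Optional, Tuple

import numpy as np

TypedShapes = Dict[str, Tuple[Iterable[int], Optional[type]]]


def random_samples(typed_shapes: TypedShapes, num_samples: int = 20) -> List[Dict[str, np.ndarray]]:
    """Create num_samples dicts of random arrays (hypothetical helper)."""
    rng = np.random.default_rng(0)  # assumption: fixed seed for repeatability
    samples = []
    for _ in range(num_samples):
        sample = {}
        for name, (shape, dtype) in typed_shapes.items():
            # A dtype of None falls back to float32, as the parameter description states.
            dtype = np.float32 if dtype is None else dtype
            sample[name] = rng.random(tuple(shape)).astype(dtype)
        samples.append(sample)
    return samples


samples = random_samples({"inp": ([3, 224, 224], None)}, num_samples=2)
```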
sparsezoo.utils.downloader module
Code related to efficiently downloading multiple files with parallel workers
-
class
sparsezoo.utils.downloader.
DownloadProgress
(chunk_size, downloaded, content_length, path) Bases:
tuple
-
property
chunk_size
Alias for field number 0
-
property
content_length
Alias for field number 2
-
property
downloaded
Alias for field number 1
-
property
path
Alias for field number 3
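Because DownloadProgress is a plain tuple, its fields can be read by name or by position. The sketch below mirrors the documented field order with a local namedtuple so that it runs without sparsezoo installed. percent_complete is a hypothetical helper and the values are made up for illustration.

```python
from collections import namedtuple

# Field order follows the documented aliases
# (0: chunk_size, 1: downloaded, 2: content_length, 3: path).
# This local tuple only mirrors that layout; it is not imported from sparsezoo.
DownloadProgress = namedtuple(
    "DownloadProgress", ["chunk_size", "downloaded", "content_length", "path"]
)


def percent_complete(progress: DownloadProgress) -> float:
    """Hypothetical helper: percentage downloaded, 0.0 if the total is unknown."""
    if not progress.content_length:
        return 0.0
    return 100.0 * progress.downloaded / progress.content_length


# Made-up values for illustration only.
progress = DownloadProgress(
    chunk_size=1024, downloaded=2048, content_length=8192, path="/tmp/model.onnx"
)
pct = percent_complete(progress)
```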
-
exception
sparsezoo.utils.downloader.
PreviouslyDownloadedError
(*args: object)[source] Bases:
Exception
Error raised when a file has already been downloaded and overwrite is False
-
sparsezoo.utils.downloader.
download_file
(url_path: str, dest_path: str, overwrite: bool, num_retries: int = 3, show_progress: bool = True, progress_title: Optional[str] = None)[source] Download a file from the given url to the desired local path
- Parameters
url_path – the source url to download the file from
dest_path – the local file path to save the downloaded file to
overwrite – True to overwrite any previous files if they exist, False to not overwrite and raise an error if a file exists
num_retries – number of times to retry the download if it fails
show_progress – True to show a progress bar for the download, False otherwise
progress_title – The title to show with the progress bar
- Raises
PreviouslyDownloadedError – raised if a file already exists at dest_path and overwrite is False
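The overwrite rule can be pictured as a guard that runs before any bytes are fetched. The sketch below is an assumption about how such a guard could look, not the sparsezoo source. check_destination is hypothetical and the exception class is a local stand-in for the documented one.

```python
import os
import tempfile


class PreviouslyDownloadedError(Exception):
    """Local stand-in for the documented exception."""


def check_destination(dest_path: str, overwrite: bool) -> None:
    """Hypothetical guard: refuse to replace an existing file unless overwrite is True."""
    if os.path.exists(dest_path):
        if not overwrite:
            raise PreviouslyDownloadedError(f"{dest_path} already exists")
        # Assumption: the old file is removed before downloading again.
        os.remove(dest_path)


with tempfile.TemporaryDirectory() as tmp_dir:
    dest = os.path.join(tmp_dir, "weights.bin")
    check_destination(dest, overwrite=False)  # nothing there yet, so no error
    with open(dest, "wb") as handle:
        handle.write(b"data")
    try:
        check_destination(dest, overwrite=False)
        raised = False
    except PreviouslyDownloadedError:
        raised = True
    check_destination(dest, overwrite=True)  # existing file is cleared
    removed = not os.path.exists(dest)
```

Callers that want to reuse an existing file can catch PreviouslyDownloadedError and continue with the path they already have.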
-
sparsezoo.utils.downloader.
download_file_iter
(url_path: str, dest_path: str, overwrite: bool, num_retries: int = 3) → Iterator[sparsezoo.utils.downloader.DownloadProgress][source] Download a file from the given url to the desired local path
- Parameters
url_path – the source url to download the file from
dest_path – the local file path to save the downloaded file to
overwrite – True to overwrite any previous files if they exist, False to not overwrite and raise an error if a file exists
num_retries – number of times to retry the download if it fails
- Returns
an iterator representing the progress for the file download
- Raises
PreviouslyDownloadedError – raised if a file already exists at dest_path and overwrite is False
sparsezoo.utils.helpers module
Code related to helper functions for model zoo
-
sparsezoo.utils.helpers.
clean_path
(path: str) → str[source] - Parameters
path – the directory or file path to clean
- Returns
a cleaned version that expands the user path and creates an absolute path
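The description above corresponds to two standard-library calls. The following is a minimal sketch under that assumption. clean_path_sketch is a hypothetical name and the actual function body may differ.

```python
import os


def clean_path_sketch(path: str) -> str:
    """Expand a leading ~ and return an absolute path (sketch, not the library source)."""
    return os.path.abspath(os.path.expanduser(path))


# os.path.abspath also normalizes the path, so the ".." segment is collapsed.
cleaned = clean_path_sketch(os.path.join("~", "models", "..", "cache"))
```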
-
sparsezoo.utils.helpers.
convert_to_bool
(val: Any)[source] - Parameters
val – a value
- Returns
False if the value is falsy (e.g. 0, f, false, None), otherwise True.
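One way to read this contract is sketched below. The source lists only 0, f, false and None as examples, so the exact set of falsy strings, the case-insensitive comparison and the name convert_to_bool_sketch are all assumptions.

```python
from typing import Any

# Assumption: illustrative set; the documentation names only 0, f, false and None.
FALSY_STRINGS = {"0", "f", "false", "none", ""}


def convert_to_bool_sketch(val: Any) -> bool:
    """Hypothetical re-implementation of the documented behaviour."""
    if isinstance(val, str):
        # Strings are compared case-insensitively against the falsy set.
        return val.strip().lower() not in FALSY_STRINGS
    # Non-string values fall back to ordinary Python truthiness.
    return bool(val)
```

Treating strings separately matters because bool("false") is True in plain Python, which is rarely what a flag read from an environment variable or config file should mean.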
-
sparsezoo.utils.helpers.
create_dirs
(path: str)[source] - Parameters
path – the directory path to try and create
-
sparsezoo.utils.helpers.
create_parent_dirs
(path: str)[source] - Parameters
path – the file path to try to create the parent directories for
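The difference from create_dirs is that only the directories above the file are created, never the file itself. A minimal sketch of that behaviour, assuming standard-library semantics and using the hypothetical name create_parent_dirs_sketch:

```python
import os
import tempfile


def create_parent_dirs_sketch(path: str) -> None:
    """Create the directories above a file path if they are missing (sketch)."""
    parent = os.path.dirname(path)
    if parent:
        # exist_ok=True makes the call safe to repeat.
        os.makedirs(parent, exist_ok=True)


with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "a", "b", "sample.npz")
    create_parent_dirs_sketch(file_path)
    parent_exists = os.path.isdir(os.path.dirname(file_path))
    file_exists = os.path.exists(file_path)
```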
-
sparsezoo.utils.helpers.
create_tqdm_auto_constructor
() → Union[tqdm.std.tqdm, tqdm.tqdm_notebook][source] - Returns
the tqdm instance to use for progress. If ipywidgets is installed, auto.tqdm is returned; otherwise the standard tqdm is returned so that notebooks will not break
-
sparsezoo.utils.helpers.
tqdm_auto
alias of
tqdm.std.tqdm
sparsezoo.utils.numpy module
Code related to numpy array files
-
class
sparsezoo.utils.numpy.
NumpyArrayBatcher
[source] Bases:
object
Batcher instance to handle taking in dictionaries of numpy arrays, appending multiple items to them to increase their batch size, and then stacking them into a single batched numpy array for each key in the dicts.
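The page lists no methods for this class, so the sketch below only illustrates the idea of collecting per-key arrays and stacking them. DictBatcherSketch and its append and stack methods are hypothetical names and should not be read as the NumpyArrayBatcher API.

```python
from collections import OrderedDict
from typing import Dict, List

import numpy as np


class DictBatcherSketch:
    """Hypothetical stand-in; class and method names are not the sparsezoo API."""

    def __init__(self) -> None:
        self._items: Dict[str, List[np.ndarray]] = OrderedDict()

    def append(self, item: Dict[str, np.ndarray]) -> None:
        # Every item must carry the same keys so each key ends up with one array per item.
        if self._items and set(item) != set(self._items):
            raise ValueError("every item must provide the same keys")
        for key, array in item.items():
            self._items.setdefault(key, []).append(array)

    def stack(self) -> Dict[str, np.ndarray]:
        # np.stack adds a new leading axis, which becomes the batch dimension.
        return OrderedDict((key, np.stack(arrays)) for key, arrays in self._items.items())


batcher = DictBatcherSketch()
for _ in range(4):
    batcher.append({"inp": np.zeros((3, 8, 8), dtype=np.float32)})
batched = batcher.stack()
```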
-
sparsezoo.utils.numpy.
load_numpy
(file_path: str) → Union[numpy.ndarray, Dict[str, numpy.ndarray]][source] Load a numpy file into either an ndarray or an OrderedDict representing what was in the npz file
- Parameters
file_path – the file_path to load
- Returns
the loaded values from the file
-
sparsezoo.utils.numpy.
load_numpy_from_tar
(path: str) → List[Union[numpy.ndarray, Dict[str, numpy.ndarray]]][source] Load numpy data into a list from a tar file. All files contained in the tar are expected to be numpy files.
- Parameters
path – path to the tarfile to load the numpy data from
- Returns
the list of loaded numpy data, either arrays or ordereddicts of arrays
-
sparsezoo.utils.numpy.
load_numpy_list
(data: Union[str, Iterable[Union[str, numpy.ndarray, Dict[str, numpy.ndarray]]]]) → List[Union[numpy.ndarray, Dict[str, numpy.ndarray]]][source] Load numpy data into a list
- Parameters
data – the data to load, one of: [folder path, iterable of file paths, iterable of numpy arrays]
- Returns
the list of loaded data items
-
sparsezoo.utils.numpy.
save_numpy
(array: Union[numpy.ndarray, Dict[str, numpy.ndarray], Iterable[numpy.ndarray]], export_dir: str, name: str, npz: bool = True)[source] Save a numpy array or collection of numpy arrays to disk
- Parameters
array – the array or collection of arrays to save
export_dir – the directory to export the numpy file into
name – the name of the file to export to (without extension)
npz – True to save as an npz compressed file, False for standard npy. Note, npy can only be used for single numpy arrays
- Returns
the saved path
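The npy and npz distinction in the npz parameter comes from NumPy's own file formats. The sketch below uses plain NumPy calls rather than save_numpy to show why a single array fits in an .npy file while a named collection needs an .npz archive. File names and array contents are made up for illustration.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as export_dir:
    single = np.arange(6, dtype=np.float32).reshape(2, 3)
    named = {"inp": single, "mask": np.ones((2, 3), dtype=np.int64)}

    npy_path = os.path.join(export_dir, "single.npy")
    np.save(npy_path, single)  # .npy holds exactly one array

    npz_path = os.path.join(export_dir, "named.npz")
    np.savez_compressed(npz_path, **named)  # .npz holds several named arrays

    loaded_single = np.load(npy_path)
    # np.load on an .npz file returns a lazy archive; copy the arrays out before it closes.
    with np.load(npz_path) as archive:
        loaded_named = {key: archive[key] for key in archive.files}
```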
-
sparsezoo.utils.numpy.
tensor_export
(tensor: Union[numpy.ndarray, Dict[str, numpy.ndarray], Iterable[numpy.ndarray]], export_dir: str, name: str, npz: bool = True) → str[source] - Parameters
tensor – tensor to export to a saved numpy array file
export_dir – the directory to export the file in
name – the name of the file, .npy will be appended to it
npz – True to export as an npz file, False otherwise
- Returns
the path of the numpy file the tensor was exported to
-
sparsezoo.utils.numpy.
tensors_export
(tensors: Union[numpy.ndarray, Dict[str, numpy.ndarray], Iterable[numpy.ndarray]], export_dir: str, name_prefix: str, counter: int = 0, break_batch: bool = False) → List[str][source] - Parameters
tensors – the tensors to export to a saved numpy array file
export_dir – the directory to export the files in
name_prefix – the prefix name for the tensors to save as, will append info about the position of the tensor in a list or dict in addition to the .npy file format
counter – the current counter to save the tensor at
break_batch – treat the tensor as a batch and break apart into multiple tensors
- Returns
the exported paths
Module contents
Utils for working with the sparsezoo