sparseml.pytorch.datasets package

Submodules

sparseml.pytorch.datasets.generic module

class sparseml.pytorch.datasets.generic.CacheableDataset(original: torch.utils.data.dataset.Dataset)

Bases: torch.utils.data.dataset.Dataset

Generates a cacheable dataset, i.e., one that stores the data in a cache in CPU memory so it does not have to be loaded from disk every time.

Note: this can only be used with a data loader that has num_workers=0.

Parameters

original – the original dataset to cache
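
The caching idea described above can be sketched in plain Python. This is a minimal illustration, not sparseml's implementation: CachingWrapper and CountingSource are hypothetical names, and the toy source simply counts how often an item is fetched so the effect of the cache is visible.

```python
class CachingWrapper:
    """Hypothetical stand-in for a caching dataset wrapper (not sparseml's code)."""

    def __init__(self, original):
        self._original = original
        self._cache = {}  # index -> item, held in this process's memory

    def __len__(self):
        return len(self._original)

    def __getitem__(self, index):
        # Fetch from the wrapped dataset only on the first access.
        if index not in self._cache:
            self._cache[index] = self._original[index]
        return self._cache[index]


class CountingSource:
    """Toy dataset that counts how often an item is 'loaded'."""

    def __init__(self, size):
        self.size = size
        self.loads = 0

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        self.loads += 1
        return index * 10


source = CountingSource(4)
cached = CachingWrapper(source)
first_pass = [cached[i] for i in range(len(cached))]
second_pass = [cached[i] for i in range(len(cached))]
```

After both passes, source.loads is still 4: the second pass is served entirely from the cache. Because a cache like this lives in the memory of a single process, data loader worker processes would each fill their own copy, which is a plausible reason for the num_workers=0 restriction noted above.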

class sparseml.pytorch.datasets.generic.EarlyStopDataset(original: torch.utils.data.dataset.Dataset, early_stop: int)

Bases: torch.utils.data.dataset.Dataset

Dataset that applies an early stop when iterating through the original dataset, i.e., it allows indexing only in the range [0, early_stop).

Parameters
  • original – the original dataset to apply an early stop to

  • early_stop – the total number of data items to run through; if -1, the whole dataset is used
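
The early-stop behavior can be illustrated with a small plain-Python wrapper. This is a sketch under assumptions, not the library's code: EarlyStopWrapper is a hypothetical name, and both treating any negative value like -1 and clamping early_stop to the length of the original dataset are choices made here for illustration.

```python
class EarlyStopWrapper:
    """Hypothetical stand-in illustrating the early-stop behaviour."""

    def __init__(self, original, early_stop):
        self._original = original
        self._early_stop = early_stop

    def __len__(self):
        full = len(self._original)
        # Assumption: any negative value (the docs name -1) means "no limit".
        if self._early_stop < 0:
            return full
        # Assumption: never report more items than the original holds.
        return min(self._early_stop, full)

    def __getitem__(self, index):
        # Only indices in [0, len(self)) are valid.
        if not 0 <= index < len(self):
            raise IndexError(f"index {index} outside [0, {len(self)})")
        return self._original[index]


data = list(range(100))
limited = EarlyStopWrapper(data, 10)
everything = EarlyStopWrapper(data, -1)
limited_items = list(limited)  # iteration stops at the first IndexError
```

Here limited exposes only the first 10 items, while everything exposes all 100.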

class sparseml.pytorch.datasets.generic.NoisyDataset(original: torch.utils.data.dataset.Dataset, intensity: float)

Bases: torch.utils.data.dataset.Dataset

Adds random noise, drawn from a normal distribution with mean 0 and standard deviation intensity, on top of a dataset.

Parameters
  • original – the dataset to add noise on top of

  • intensity – the level of noise to add (creates the noise with this standard deviation)
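
A minimal sketch of the noise step, using only the standard library. Several things here are assumptions for illustration: NoisyWrapper is a hypothetical name, items are assumed to be (features, label) pairs with a flat list of floats as features, and the seed parameter exists only to make the sketch repeatable. It is not part of the documented signature.

```python
import random


class NoisyWrapper:
    """Hypothetical stand-in: adds Gaussian noise to the features of each item."""

    def __init__(self, original, intensity, seed=None):
        self._original = original
        self._intensity = intensity
        self._rng = random.Random(seed)  # seed is an illustrative extra

    def __len__(self):
        return len(self._original)

    def __getitem__(self, index):
        # Assumption: each item is a (features, label) pair.
        features, label = self._original[index]
        # Noise has mean 0 and standard deviation `intensity`.
        noisy = [value + self._rng.gauss(0.0, self._intensity) for value in features]
        return noisy, label


clean = [([1.0, 2.0, 3.0], 0), ([4.0, 5.0, 6.0], 1)]
noisy = NoisyWrapper(clean, intensity=0.1, seed=0)
noisy_features, noisy_label = noisy[0]
untouched = NoisyWrapper(clean, intensity=0.0)[1]
```

Only the features are perturbed; the label passes through unchanged, and an intensity of 0 leaves the data as it was.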

class sparseml.pytorch.datasets.generic.RandNDataset(length: int, shape: Union[int, Tuple[int, ...]], normalize: bool)

Bases: torch.utils.data.dataset.Dataset

Generates a random dataset

Parameters
  • length – the number of random items to create in the dataset

  • shape – the shape of the data to create

  • normalize – whether to normalize the data according to the ImageNet distribution (shape must match 3,x,x)

sparseml.pytorch.datasets.registry module

Code related to the PyTorch dataset registry for easily creating datasets.

class sparseml.pytorch.datasets.registry.DatasetRegistry

Bases: object

Registry class for creating datasets

static attributes(key: str) → Dict[str, Any]

Parameters

key – the dataset key (name) to look up attributes for

Returns

the specified attributes for the dataset

static create(key: str, *args, **kwargs) → torch.utils.data.dataset.Dataset

Create a new dataset for the given key

Parameters

key – the dataset key (name) to create

Returns

the instantiated dataset

static register(key: Union[str, List[str]], attributes: Dict[str, Any])

Register a dataset with the registry. Should be used as a decorator.

Parameters
  • key – the dataset key (name), or list of keys, to register the dataset under

  • attributes – the specified attributes for the dataset

Returns

the decorator
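
The three static methods above follow a common decorator-based registry pattern, sketched below in plain Python. SketchRegistry, ToyDataset, and the num_classes attribute are hypothetical names chosen for this illustration. The error handling is also an assumption, and none of it is taken from sparseml's implementation.

```python
from typing import Any, Callable, Dict, List, Union


class SketchRegistry:
    """Hypothetical decorator-based registry (not sparseml's implementation)."""

    _constructors: Dict[str, Callable[..., Any]] = {}
    _attributes: Dict[str, Dict[str, Any]] = {}

    @staticmethod
    def register(key: Union[str, List[str]], attributes: Dict[str, Any]):
        # Accept either a single key or a list of alias keys.
        keys = [key] if isinstance(key, str) else list(key)

        def decorator(constructor):
            for name in keys:
                if name in SketchRegistry._constructors:
                    raise ValueError(f"key {name} is already registered")
                SketchRegistry._constructors[name] = constructor
                SketchRegistry._attributes[name] = attributes
            # Return the class unchanged so it stays usable directly.
            return constructor

        return decorator

    @staticmethod
    def create(key: str, *args, **kwargs):
        if key not in SketchRegistry._constructors:
            raise ValueError(f"key {key} is not registered")
        return SketchRegistry._constructors[key](*args, **kwargs)

    @staticmethod
    def attributes(key: str) -> Dict[str, Any]:
        if key not in SketchRegistry._attributes:
            raise ValueError(f"key {key} is not registered")
        return SketchRegistry._attributes[key]


@SketchRegistry.register(key=["toy", "toy-dataset"], attributes={"num_classes": 2})
class ToyDataset:
    def __init__(self, length):
        self.length = length

    def __len__(self):
        return self.length


dataset = SketchRegistry.create("toy", length=5)
toy_attributes = SketchRegistry.attributes("toy-dataset")
```

Registering under a list of keys lets one dataset class be created through several aliases, and the extra arguments passed to create are forwarded to the constructor.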

Module contents

Code for creating and loading datasets in PyTorch