sparseml.pytorch.optim package¶
Submodules¶
sparseml.pytorch.optim.analyzer_as module¶
Code related to analyzing activation sparsity within PyTorch neural networks. More information can be found in the associated paper.
-
class
sparseml.pytorch.optim.analyzer_as.
ASResultType
(value)[source]¶ Bases:
enum.Enum
Result type to track for activation sparsity.
-
inputs_sample
= 'inputs_sample'¶
-
inputs_sparsity
= 'inputs_sparsity'¶
-
outputs_sample
= 'outputs_sample'¶
-
outputs_sparsity
= 'outputs_sparsity'¶
-
-
class
sparseml.pytorch.optim.analyzer_as.
ModuleASAnalyzer
(module: torch.nn.modules.module.Module, dim: Union[None, int, Tuple[int, …]] = None, track_inputs_sparsity: bool = False, track_outputs_sparsity: bool = False, inputs_sample_size: int = 0, outputs_sample_size: int = 0, enabled: bool = True)[source]¶ Bases:
object
An analyzer implementation used to monitor the activation sparsity within a module. Generally used to monitor an individual layer.
- Parameters
module – The module to analyze activation sparsity for
dim – Any dims within the tensor such as across batch, channel, etc. Ex: 0 for batch, 1 for channel, [0, 1] for batch and channel
track_inputs_sparsity – True to track the input sparsity to the module, False otherwise
track_outputs_sparsity – True to track the output sparsity to the module, False otherwise
inputs_sample_size – The number of samples to grab from the input tensor on each forward pass. If <= 0, then will not sample any values.
outputs_sample_size – The number of samples to grab from the output tensor on each forward pass. If <= 0, then will not sample any values.
enabled – True to enable the hooks for analyzing and actively track, False to disable and not track
-
static
analyze_layers
(module: torch.nn.modules.module.Module, layers: List[str], dim: Union[None, int, Tuple[int, …]] = None, track_inputs_sparsity: bool = False, track_outputs_sparsity: bool = False, inputs_sample_size: int = 0, outputs_sample_size: int = 0, enabled: bool = True)[source]¶ - Parameters
module – the module to analyze the activation sparsity of multiple layers in
layers – the names of the layers to analyze (from module.named_modules())
dim – Any dims within the tensor such as across batch, channel, etc. Ex: 0 for batch, 1 for channel, [0, 1] for batch and channel
track_inputs_sparsity – True to track the input sparsity to the module, False otherwise
track_outputs_sparsity – True to track the output sparsity to the module, False otherwise
inputs_sample_size – The number of samples to grab from the input tensor on each forward pass. If <= 0, then will not sample any values.
outputs_sample_size – The number of samples to grab from the output tensor on each forward pass. If <= 0, then will not sample any values.
enabled – True to enable the hooks for analyzing and actively track, False to disable and not track
- Returns
a list of the created analyzers, matching the ordering in layers
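As a usage illustration, the following sketch (assuming a torchvision ResNet-18 and two hypothetical layer names) attaches analyzers with analyze_layers, runs a forward pass so the hooks can record values, and reads back the aggregated sparsity:

    import torch
    from torchvision.models import resnet18

    from sparseml.pytorch.optim.analyzer_as import ModuleASAnalyzer

    # stand-in model and layer names chosen purely for illustration
    model = resnet18()
    layer_names = ["layer1.0.conv1", "layer1.0.conv2"]

    analyzers = ModuleASAnalyzer.analyze_layers(
        model,
        layers=layer_names,
        track_inputs_sparsity=True,
        track_outputs_sparsity=True,
    )

    # run a forward pass so the attached hooks can record activation sparsity
    with torch.no_grad():
        model(torch.randn(1, 3, 224, 224))

    for name, analyzer in zip(layer_names, analyzers):
        # the *_sparsity_mean properties aggregate the values tracked so far
        print(name, analyzer.inputs_sparsity_mean, analyzer.outputs_sparsity_mean)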
-
clear
(specific_result_type: Union[None, sparseml.pytorch.optim.analyzer_as.ASResultType] = None)[source]¶
-
property
dim
¶
-
property
enabled
¶
-
property
inputs_sample
¶
-
property
inputs_sample_max
¶
-
property
inputs_sample_mean
¶
-
property
inputs_sample_min
¶
-
property
inputs_sample_size
¶
-
property
inputs_sample_std
¶
-
property
inputs_sparsity
¶
-
property
inputs_sparsity_max
¶
-
property
inputs_sparsity_mean
¶
-
property
inputs_sparsity_min
¶
-
property
inputs_sparsity_std
¶
-
property
module
¶
-
property
outputs_sample
¶
-
property
outputs_sample_max
¶
-
property
outputs_sample_mean
¶
-
property
outputs_sample_min
¶
-
property
outputs_sample_size
¶
-
property
outputs_sample_std
¶
-
property
outputs_sparsity
¶
-
property
outputs_sparsity_max
¶
-
property
outputs_sparsity_mean
¶
-
property
outputs_sparsity_min
¶
-
property
outputs_sparsity_std
¶
-
results
(result_type: sparseml.pytorch.optim.analyzer_as.ASResultType) → List[torch.Tensor][source]¶
-
results_max
(result_type: sparseml.pytorch.optim.analyzer_as.ASResultType) → torch.Tensor[source]¶
-
results_mean
(result_type: sparseml.pytorch.optim.analyzer_as.ASResultType) → torch.Tensor[source]¶
-
results_min
(result_type: sparseml.pytorch.optim.analyzer_as.ASResultType) → torch.Tensor[source]¶
-
results_std
(result_type: sparseml.pytorch.optim.analyzer_as.ASResultType) → torch.Tensor[source]¶
-
property
track_inputs_sparsity
¶
-
property
track_outputs_sparsity
¶
sparseml.pytorch.optim.analyzer_module module¶
Code related to monitoring, analyzing, and reporting info for Modules in PyTorch. Records things like FLOPS, input and output shapes, kernel shapes, etc.
-
class
sparseml.pytorch.optim.analyzer_module.
ModuleAnalyzer
(module: torch.nn.modules.module.Module, enabled: bool = False)[source]¶ Bases:
object
An analyzer implementation for monitoring the execution profile and graph of a Module in PyTorch.
- Parameters
module – the module to analyze
enabled – True to enable the hooks for analyzing and actively track, False to disable and not track
-
property
enabled
¶ True if enabled and the hooks for analyzing are active, False otherwise
- Type
return
-
ks_layer_descs
() → List[sparseml.optim.analyzer.AnalyzedLayerDesc][source]¶ Get the descriptions for all layers in the module that support kernel sparsity (model pruning). Ex: all convolutions and linear layers.
- Returns
a list of descriptions for all layers in the module that support ks
-
layer_desc
(name: Optional[str] = None) → sparseml.optim.analyzer.AnalyzedLayerDesc[source]¶ Get a specific layer’s description within the Module. Set to None to get the overall Module’s description.
- Parameters
name – name of the layer to get a description for, None for an overall description
- Returns
the analyzed layer description for the given name
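A minimal sketch of the intended flow, assuming a torchvision ResNet-18 purely for illustration: enable the analyzer, run a forward pass so the execution profile is recorded, then read the layer descriptions back:

    import torch
    from torchvision.models import resnet18

    from sparseml.pytorch.optim.analyzer_module import ModuleAnalyzer

    model = resnet18()
    analyzer = ModuleAnalyzer(model, enabled=True)

    # execution info (shapes, FLOPS, etc.) is recorded during the forward pass
    with torch.no_grad():
        model(torch.randn(1, 3, 224, 224))

    overall_desc = analyzer.layer_desc()        # description of the whole module
    conv_desc = analyzer.layer_desc("conv1")    # description of a single named layer
    prunable_descs = analyzer.ks_layer_descs()  # all layers supporting kernel sparsity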
-
property
module
¶ The module that is being actively analyzed
- Type
return
sparseml.pytorch.optim.analyzer_pruning module¶
Code related to monitoring, analyzing, and reporting the kernel sparsity (model pruning) for a model’s layers and params. More info on kernel sparsity can be found at https://arxiv.org/abs/1902.09574.
-
class
sparseml.pytorch.optim.analyzer_pruning.
ModulePruningAnalyzer
(module: torch.nn.modules.module.Module, name: str, param_name: str = 'weight')[source]¶ Bases:
object
An analyzer implementation monitoring the kernel sparsity of a given param in a module.
- Parameters
module – the module containing the param to analyze the sparsity for
name – name of the layer, used for tracking
param_name – name of the parameter to analyze the sparsity for, defaults to weight
-
static
analyze_layers
(module: torch.nn.modules.module.Module, layers: List[str], param_name: str = 'weight')[source]¶ - Parameters
module – the module to create multiple analyzers for
layers – the names of the layers in the module to create analyzers for
param_name – the name of the param to monitor within each layer
- Returns
a list of analyzers, one for each layer passed in and in the same order
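A short sketch, again with a torchvision ResNet-18 and illustrative layer names, that reads the current weight sparsity of a few layers:

    from torchvision.models import resnet18

    from sparseml.pytorch.optim.analyzer_pruning import ModulePruningAnalyzer

    model = resnet18()
    analyzers = ModulePruningAnalyzer.analyze_layers(
        model, layers=["conv1", "layer1.0.conv1"], param_name="weight"
    )

    for analyzer in analyzers:
        # tag combines layer and param names; param_sparsity is the fraction of zeros
        print(analyzer.tag, float(analyzer.param_sparsity))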
-
property
module
¶ the module containing the param to analyze the sparsity for
- Type
return
-
property
name
¶ name of the layer, used for tracking
- Type
return
-
property
param
¶ the parameter that is being monitored for kernel sparsity
- Type
return
-
property
param_name
¶ name of the parameter to analyze the sparsity for, defaults to weight
- Type
return
-
property
param_sparsity
¶ the sparsity of the contained parameter (how many zeros are in it)
- Type
return
-
param_sparsity_dim
(dim: Union[None, int, Tuple[int, …]] = None) → torch.Tensor[source]¶ - Parameters
dim – the dimension(s) to calculate the sparsity over; ex: over channels
- Returns
the sparsity of the contained parameter structured according to the dim passed in
-
property
tag
combines the layer name and param name into a single string separated by a period
- Type
return
sparseml.pytorch.optim.manager module¶
Contains base code related to modifier managers: modifier managers handle grouping modifiers and running them together. Also handles loading modifiers from YAML files.
-
class
sparseml.pytorch.optim.manager.
RecipeManagerStepWrapper
(wrap: Any, optimizer: torch.optim.optimizer.Optimizer, module: torch.nn.modules.module.Module, manager: Any, epoch: float, steps_per_epoch: int)[source]¶ Bases:
object
A wrapper class to handle wrapping an optimizer or optimizer-like object and overriding its step function. The override calls into the ScheduledModifierManager when appropriate and enabled and then calls step() as usual on the wrapped object with the original arguments. All original attributes and methods are forwarded to the wrapped object so this class can be a direct substitute for it.
- Parameters
wrap – The object to wrap the step function and properties for.
optimizer – The optimizer used in the training process.
module – The model/module used in the training process.
manager – The manager to forward lifecycle calls into such as step.
epoch – The epoch to start the modifying process at.
steps_per_epoch – The number of optimizer steps (batches) in each epoch.
-
emulated_step
()[source]¶ Emulated step function to be called in place of step when the number of steps per epoch varies across epochs. The emulated function should be called to keep the steps_per_epoch the same. Does not call into the step function for the wrapped object, but does call into the manager to increment the steps.
-
loss_update
(loss: torch.Tensor) → torch.Tensor[source]¶ Optional call to update modifiers based on the calculated loss. Not needed unless one or more of the modifiers is using the loss to make a modification or is modifying the loss itself.
- Parameters
loss – the calculated loss after running a forward pass and loss_fn
- Returns
the modified loss tensor
-
step
(*args, **kwargs)[source]¶ Override for the step function. Calls into the base step function with the args and kwargs.
- Parameters
args – Any args to pass to the wrapped object’s step function.
kwargs – Any kwargs to pass to the wrapped object’s step function.
- Returns
The return, if any, from the wrapped object’s step function
-
property
wrapped
¶ The object to wrap the step function and properties for.
- Type
return
-
property
wrapped_epoch
¶ The current epoch the wrapped object is at.
- Type
return
-
property
wrapped_manager
¶ The manager to forward lifecycle calls into such as step.
- Type
return
-
property
wrapped_module
¶ The model/module used in the training process.
- Type
return
-
property
wrapped_optimizer
¶ The optimizer used in the training process.
- Type
return
-
property
wrapped_steps
¶ The current number of steps that have been called for the wrapped object.
- Type
return
-
property
wrapped_steps_per_epoch
¶ The number of optimizer steps (batches) in each epoch.
- Type
return
-
class
sparseml.pytorch.optim.manager.
ScheduledModifierManager
(modifiers: List[sparseml.pytorch.sparsification.modifier.ScheduledModifier], metadata: Optional[Dict[str, Any]] = None)[source]¶ Bases:
sparseml.optim.manager.BaseManager
,sparseml.pytorch.sparsification.modifier.Modifier
The base modifier manager; handles managing multiple ScheduledModifiers.
Lifecycle:
- initialize
- initialize_loggers
- modify
- finalize
- Parameters
modifiers – the modifiers to wrap
-
apply
(module: torch.nn.modules.module.Module, epoch: float = inf, loggers: Optional[sparseml.pytorch.utils.logger.LoggerManager] = None, finalize: bool = True, **kwargs)[source]¶ Applies the lifecycle of each stage in the manager/recipe by calling into initialize and finalize for each modifier for each stage
- Parameters
module – the PyTorch model/module to modify
epoch – the epoch to apply the modifier at, defaults to math.inf (end)
loggers – Optional logger manager to log the modification process to
finalize – True to invoke finalize after initialize, False otherwise. If training after one shot, set finalize=False to keep modifiers applied.
kwargs – Optional kwargs to support specific arguments for individual modifiers (passed to initialize and finalize).
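A minimal one-shot sketch; the recipe path is hypothetical and the model is a stand-in:

    from torchvision.models import resnet18

    from sparseml.pytorch.optim.manager import ScheduledModifierManager

    model = resnet18()  # stand-in model for illustration
    manager = ScheduledModifierManager.from_yaml("recipe.yaml")  # hypothetical recipe path
    manager.apply(model)  # initializes and finalizes every stage at math.inf (end of recipe)
    # to keep training afterwards, pass finalize=False and call manager.finalize(model) later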
-
apply_structure
(module: torch.nn.modules.module.Module, epoch: float = 0.0, loggers: Union[None, sparseml.pytorch.utils.logger.LoggerManager, List[sparseml.pytorch.utils.logger.BaseLogger]] = None, finalize: bool = False, **kwargs)[source]¶ Initialize/apply the modifier for a given model/module at the given epoch if the modifier affects the structure of the module such as quantization, layer pruning, or filter pruning. Calls into initialize(module, epoch, loggers, **kwargs) if structured.
- Parameters
module – the PyTorch model/module to modify
epoch – the epoch to apply the modifier at, defaults to 0.0 (start)
loggers – Optional logger manager to log the modification process to
finalize – True to invoke finalize after initialize, False otherwise. Set finalize to True and epoch to math.inf for one shot application.
kwargs – Optional kwargs to support specific arguments for individual modifiers (passed to initialize and finalize).
-
finalize
(module: Optional[torch.nn.modules.module.Module] = None, reset_loggers: bool = True, **kwargs)[source]¶ Handles any finalization of the modifier for the given model/module. Applies any remaining logic and cleans up any hooks or attachments to the model.
- Parameters
module – The model/module to finalize the modifier for. Marked optional so state can still be cleaned up on delete, but generally should always be passed in.
reset_loggers – True to remove any currently attached loggers (default), False to keep the loggers attached.
kwargs – Optional kwargs to support specific arguments for individual modifiers.
-
static
from_yaml
(file_path: Union[str, sparsezoo.objects.recipe.Recipe], add_modifiers: Optional[List[sparseml.pytorch.sparsification.modifier.Modifier]] = None, recipe_variables: Optional[Union[Dict[str, Any], str]] = None, metadata: Optional[Dict[str, Any]] = None)[source]¶ Convenience function used to create the manager of multiple modifiers from a recipe file.
- Parameters
file_path – the path to the recipe file to load the modifier from, or a SparseZoo model stub to load a recipe for a model stored in SparseZoo. SparseZoo stubs should be preceded by ‘zoo:’, and can contain an optional ‘?recipe_type=<type>’ parameter. Can also be a SparseZoo Recipe object. i.e. ‘/path/to/local/recipe.yaml’, ‘zoo:model/stub/path’, ‘zoo:model/stub/path?recipe_type=transfer’. Additionally, a raw YAML str is also supported in place of a file path.
add_modifiers – additional modifiers that should be added to the returned manager alongside the ones loaded from the recipe file
recipe_variables – additional arguments to override any root variables in the recipe with (i.e. num_epochs, init_lr)
metadata – additional data (beyond the information provided in the recipe) to be preserved and utilized in the future for reproducibility and completeness.
- Returns
ScheduledModifierManager() created from the recipe file
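A couple of hedged examples of the accepted inputs, reusing the placeholder paths and stubs from the description above:

    from sparseml.pytorch.optim.manager import ScheduledModifierManager

    # from a local recipe file, overriding a root variable defined in the recipe
    manager = ScheduledModifierManager.from_yaml(
        "/path/to/local/recipe.yaml", recipe_variables={"num_epochs": 10}
    )

    # from a SparseZoo stub with an optional recipe_type parameter
    zoo_manager = ScheduledModifierManager.from_yaml(
        "zoo:model/stub/path?recipe_type=transfer"
    )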
-
initialize
(module: torch.nn.modules.module.Module, epoch: float = 0, loggers: Union[None, sparseml.pytorch.utils.logger.LoggerManager, List[sparseml.pytorch.utils.logger.BaseLogger]] = None, **kwargs)[source]¶ Handles any initialization of the manager for the given model/module. epoch and steps_per_epoch can optionally be passed in to initialize the manager and module at a specific point in the training process. If loggers is not None, will additionally call initialize_loggers.
- Parameters
module – the PyTorch model/module to modify
epoch – The epoch to initialize the manager and module at. Defaults to 0 (start of the training process)
loggers – Optional logger manager to log the modification process to
kwargs – Optional kwargs to support specific arguments for individual modifiers.
-
initialize_loggers
(loggers: Union[None, sparseml.pytorch.utils.logger.LoggerManager, List[sparseml.pytorch.utils.logger.BaseLogger]])[source]¶ Handles initializing and setting up the loggers for the contained modifiers.
- Parameters
loggers – the logger manager to set up this manager with for logging important info and milestones
-
load_state_dict
(state_dict: Dict[str, Dict], strict: bool = True)[source]¶ Loads the given state dict into this manager. All modifiers that match will be loaded. If any are missing or extra and strict=True, then will raise a KeyError
- Parameters
state_dict – dictionary object as generated by this object’s state_dict function
strict – True to raise a KeyError for any missing or extra information in the state dict, False to ignore
- Raises
IndexError – If any keys in the state dict do not correspond to a valid index for this manager and strict=True
-
loss_update
(loss: torch.Tensor, module: torch.nn.modules.module.Module, optimizer: torch.optim.optimizer.Optimizer, epoch: float, steps_per_epoch: int, **kwargs) → torch.Tensor[source]¶ Optional call that can be made on the optimizer to update the contained modifiers once loss has been calculated
- Parameters
loss – The calculated loss tensor
module – module to modify
optimizer – optimizer to modify
epoch – current epoch and progress within the current epoch
steps_per_epoch – number of steps taken within each epoch (calculate batch number using this and epoch)
- Returns
the modified loss tensor
-
modify
(module: torch.nn.modules.module.Module, optimizer: torch.optim.optimizer.Optimizer, steps_per_epoch: int, wrap_optim: Optional[Any] = None, epoch: Optional[float] = None, allow_parallel_module: bool = True, **kwargs) → sparseml.pytorch.optim.manager.RecipeManagerStepWrapper[source]¶ Modify the given module and optimizer for training aware algorithms such as pruning and quantization. Initialize must be called first. After training is complete, finalize should be called.
- Parameters
module – The model/module to modify
optimizer – The optimizer to modify
steps_per_epoch – The number of optimizer steps (batches) in each epoch
wrap_optim – Optional object to wrap instead of the optimizer. Useful for cases like amp (fp16 training) where it should be wrapped in place of the original optimizer since it doesn’t always call into the optimizer.step() function.
epoch – Optional epoch that can be passed in to start modifying at. Defaults to the epoch that was supplied to the initialize function.
allow_parallel_module – if False, a DataParallel or DistributedDataParallel module passed to this function will be unwrapped to its base module during recipe initialization by referencing module.module. This is useful so a recipe may reference the base module parameters instead of the wrapped distributed ones. Set to True to not unwrap the distributed module. Default is True
kwargs – Keyword arguments that are passed to the initialize call if initialize has not been called yet
- Returns
A wrapped optimizer object. The wrapped object makes all the original properties for the wrapped object available so it can be used without any additional code changes.
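A sketch of the training-aware flow under stated assumptions: the recipe path, model, and data are placeholders, and max_epochs is assumed to be exposed by the manager from the recipe. The wrapped optimizer is used exactly like the original one:

    import torch
    import torch.nn.functional as F
    from torch.utils.data import DataLoader, TensorDataset
    from torchvision.models import resnet18

    from sparseml.pytorch.optim.manager import ScheduledModifierManager

    # stand-in model and random data purely for illustration
    model = resnet18()
    train_loader = DataLoader(
        TensorDataset(torch.randn(8, 3, 224, 224), torch.randint(0, 1000, (8,))),
        batch_size=4,
    )

    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    manager = ScheduledModifierManager.from_yaml("recipe.yaml")  # hypothetical recipe path
    optimizer = manager.modify(model, optimizer, steps_per_epoch=len(train_loader))

    for epoch in range(int(manager.max_epochs)):  # max_epochs from the recipe (assumed)
        for inputs, labels in train_loader:
            optimizer.zero_grad()
            loss = F.cross_entropy(model(inputs), labels)
            loss = optimizer.loss_update(loss)  # optional; only needed if a modifier uses the loss
            loss.backward()
            optimizer.step()  # forwards to the wrapped optimizer and the manager's schedule

    manager.finalize(model)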
-
optimizer_post_step
(module: torch.nn.modules.module.Module, optimizer: torch.optim.optimizer.Optimizer, epoch: float, steps_per_epoch: int)[source]¶ Called after the optimizer step happens and weights have updated. Calls into the contained modifiers
- Parameters
module – module to modify
optimizer – optimizer to modify
epoch – current epoch and progress within the current epoch
steps_per_epoch – number of steps taken within each epoch (calculate batch number using this and epoch)
-
optimizer_pre_step
(module: torch.nn.modules.module.Module, optimizer: torch.optim.optimizer.Optimizer, epoch: float, steps_per_epoch: int)[source]¶ Called before the optimizer step happens (after backward has been called, before optimizer.step). Calls into the contained modifiers
- Parameters
module – module to modify
optimizer – optimizer to modify
epoch – current epoch and progress within the current epoch
steps_per_epoch – number of steps taken within each epoch (calculate batch number using this and epoch)
-
state_dict
() → Dict[str, Dict][source]¶ - Returns
Dictionary to store any state variables for this manager. Includes all modifiers nested under this manager as sub keys in the dict. Only modifiers that have a non-empty state dict are included.
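A hedged sketch of checkpointing the manager's modifier state next to the model weights with state_dict and restoring it with load_state_dict (the recipe path and checkpoint path are placeholders):

    import torch
    from torchvision.models import resnet18

    from sparseml.pytorch.optim.manager import ScheduledModifierManager

    model = resnet18()                                           # stand-in model
    manager = ScheduledModifierManager.from_yaml("recipe.yaml")  # hypothetical recipe path

    # save the modifier state next to the model weights
    torch.save(
        {"model": model.state_dict(), "manager": manager.state_dict()},
        "checkpoint.pth",
    )

    # later: restore both; with strict=True a missing or extra modifier raises an error
    checkpoint = torch.load("checkpoint.pth")
    model.load_state_dict(checkpoint["model"])
    manager.load_state_dict(checkpoint["manager"], strict=True)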
-
update
(module: torch.nn.modules.module.Module, optimizer: torch.optim.optimizer.Optimizer, epoch: float, steps_per_epoch: int, log_updates: bool = True)[source]¶ Handles updating the contained modifiers’ states, module, or optimizer. Only calls scheduled_update on each modifier if modifier.update_ready() returns True
- Parameters
module – module to modify
optimizer – optimizer to modify
epoch – current epoch and progress within the current epoch
steps_per_epoch – number of steps taken within each epoch (calculate batch number using this and epoch)
log_updates – True to log the updates for each modifier to the loggers, False to skip logging
sparseml.pytorch.optim.mask_creator_pruning module¶
Classes for defining sparsity masks based on model parameters.
NOTE: this file is in the process of being phased out in favor of the sparsification package. Once all references to mask utils in the optim package are migrated, this file will be deleted
-
class
sparseml.pytorch.optim.mask_creator_pruning.
BlockPruningMaskCreator
(block_shape: List[int], grouping_fn_name: str = 'mean')[source]¶ Bases:
sparseml.pytorch.optim.mask_creator_pruning.GroupedPruningMaskCreator
Structured sparsity mask creator that groups the input tensor into blocks of shape block_shape.
- Parameters
block_shape – The shape that blocks should take across the input and output channel dimensions. Should be a list of exactly two integers that divide the input tensors evenly on the channel dimensions; -1 for a dimension blocks across that entire dimension
grouping_fn_name – The name of the torch grouping function to reduce dimensions by
-
class
sparseml.pytorch.optim.mask_creator_pruning.
DimensionSparsityMaskCreator
(dim: Union[str, int, List[int]], grouping_fn_name: str = 'l2', tensor_group_idxs: Optional[List[List[int]]] = None)[source]¶ Bases:
sparseml.pytorch.optim.mask_creator_pruning.GroupedPruningMaskCreator
Structured sparsity mask creator that groups sparsity blocks by the given dimension(s)
- Parameters
dim – The index or list of indices of dimensions to group the mask by or the type of dims to prune ([‘channel’, ‘filter’])
grouping_fn_name – The name of the torch grouping function to reduce dimensions by. Default is ‘l2’
tensor_group_idxs – list of lists of input tensor idxs whose given dimensions should be scored together. If set, all idxs in the range of provided tensors must be included in exactly one group (tensors in their own group should be a list of length 1). If None, no tensor groups will be used
-
create_sparsity_masks
(tensors: List[torch.Tensor], sparsity: Union[float, List[float]], global_sparsity: bool = False) → List[torch.Tensor][source]¶ - Parameters
tensors – list of tensors to calculate masks from based on their contained values
sparsity – the desired sparsity to reach within the mask (decimal fraction of zeros); can also be a list where each element is a sparsity for a tensor in the same position in the tensor list. If global sparsity is enabled, all values of the sparsity list must be the same
global_sparsity – do not set to True; unsupported for DimensionSparsityMaskCreator
- Returns
list of masks (0.0 for values that are masked, 1.0 for values that are unmasked) calculated from the tensors such that the desired number of zeros matches the sparsity and all values mapped to the same group have the same value
-
group_tensor
(tensor: torch.Tensor) → torch.Tensor[source]¶ - Parameters
tensor – The tensor to transform
- Returns
The mean values of the tensor grouped by the dimension(s) in self._dim
-
set_tensor_group_idxs
(tensor_group_idxs: Optional[List[List[int]]])[source]¶ - Parameters
tensor_group_idxs – list of lists of input tensor idxs whose given dimensions should be scored together. If set, all idxs in the range of provided tensors must be included in exactly one group (tensors in their own group should be a list of length 1). If None, no tensor groups will be used
-
property
structure_type
the type of structure of pruned masks this mask creator produces; must be either ‘channel’ or ‘filter’
- Type
return
-
class
sparseml.pytorch.optim.mask_creator_pruning.
FourBlockMaskCreator
(grouping_fn_name: str = 'mean')[source]¶ Bases:
sparseml.pytorch.optim.mask_creator_pruning.GroupedPruningMaskCreator
Semi-structured sparsity mask creator that groups sparsity blocks in groups of four along the input-channel dimension (assumed to be dimension 1 in PyTorch)
Equivalent to BlockPruningMaskCreator([1, 4]) without restrictions on the number of dimensions or divisibility
- Parameters
grouping_fn_name – The name of the torch grouping function to reduce dimensions by
-
class
sparseml.pytorch.optim.mask_creator_pruning.
GroupedPruningMaskCreator
[source]¶ Bases:
sparseml.pytorch.optim.mask_creator_pruning.UnstructuredPruningMaskCreator
Abstract class for a sparsity mask creator that structures masks according to grouping functions. Subclasses should implement group_tensor and _map_mask_to_tensor
-
create_sparsity_masks
(tensors: List[torch.Tensor], sparsity: Union[float, List[float]], global_sparsity: bool = False) → List[torch.Tensor][source]¶ - Parameters
tensors – list of tensors to calculate masks from based on their contained values
sparsity – the desired sparsity to reach within the mask (decimal fraction of zeros) can also be a list where each element is a sparsity for a tensor in the same position in the tensor list. If global sparsity is enabled, all values of the sparsity list must be the same
global_sparsity – if True, sparsity masks will be created such that the average sparsity across all given tensors is the target sparsity with the lowest global values masked. If False, each tensor will be masked to the target sparsity ranking values within each individual tensor. Default is False
- Returns
list of masks (0.0 for values that are masked, 1.0 for values that are unmasked) calculated from the tensors such that the desired number of zeros matches the sparsity and all values mapped to the same group have the same value
-
create_sparsity_masks_from_tensor
(tensors: List[torch.Tensor]) → List[torch.Tensor][source]¶ - Parameters
tensors – list of tensors to calculate masks based on their values
- Returns
list of masks derived from the values of the tensors grouped by the group_tensor function.
-
create_sparsity_masks_from_threshold
(tensors: List[torch.Tensor], threshold: Union[float, torch.Tensor]) → List[torch.Tensor][source]¶ - Parameters
tensors – list of tensors to calculate masks from based on their contained values
threshold – a threshold of group_tensor values to determine cutoff for sparsification
- Returns
list of masks derived from the tensors and the grouped threshold
-
abstract
group_tensor
(tensor: torch.Tensor) → torch.Tensor[source]¶ - Parameters
tensor – The tensor to reduce in groups
- Returns
The grouped tensor
-
static
reduce_tensor
(tensor: torch.Tensor, dim: Union[int, List[int]], reduce_fn_name: str, keepdim: bool = True) → torch.Tensor[source]¶ - Parameters
tensor – the tensor to reduce
dim – dimension or list of dimension to reduce along
reduce_fn_name – function name to reduce the tensor with. Valid options are ‘l2’, ‘mean’, ‘max’, ‘min’
keepdim – preserves the reduced dimension(s) in the returned tensor shape as size 1. Default is True
- Returns
Tensor reduced along the given dimension(s)
-
-
class
sparseml.pytorch.optim.mask_creator_pruning.
PruningMaskCreator
[source]¶ Bases:
abc.ABC
Base abstract class for a sparsity mask creator. Subclasses should define all methods for creating masks
-
abstract
create_sparsity_masks
(tensors: List[torch.Tensor], sparsity: Union[float, List[float]], global_sparsity: bool = False) → List[torch.Tensor][source]¶ - Parameters
tensors – list of tensors to calculate masks from based on their contained values
sparsity – the desired sparsity to reach within the mask (decimal fraction of zeros) can also be a list where each element is a sparsity for a tensor in the same position in the tensor list. If global sparsity is enabled, all values of the sparsity list must be the same
global_sparsity – if True, sparsity masks will be created such that the average sparsity across all given tensors is the target sparsity with the lowest global values masked. If False, each tensor will be masked to the target sparsity ranking values within each individual tensor. Default is False
- Returns
list of masks (0.0 for values that are masked, 1.0 for values that are unmasked) calculated from the tensors such that the desired number of zeros matches the sparsity.
-
create_sparsity_masks_from_tensor
(tensors: List[torch.Tensor]) → List[torch.Tensor][source]¶ - Parameters
tensors – list of tensors to calculate masks from based on their values
- Returns
list of masks derived from each of the given tensors
-
abstract
create_sparsity_masks_from_threshold
(tensors: List[torch.Tensor], threshold: Union[float, torch.Tensor]) → List[torch.Tensor][source]¶ - Parameters
tensors – list of tensors to calculate masks from based on their contained values
threshold – a threshold to determine cutoff for sparsification
- Returns
list of masks derived from each of the given tensors and the threshold
-
-
class
sparseml.pytorch.optim.mask_creator_pruning.
UnstructuredPruningMaskCreator
[source]¶ Bases:
sparseml.pytorch.optim.mask_creator_pruning.PruningMaskCreator
Class for creating unstructured sparsity masks. Masks will be created using unstructured sparsity by pruning weights ranked by their value. Each mask will correspond to the given tensor.
-
create_sparsity_masks
(tensors: List[torch.Tensor], sparsity: Union[float, List[float]], global_sparsity: bool = False) → List[torch.Tensor][source]¶ - Parameters
tensors – list of tensors to calculate masks from based on their contained values
sparsity – the desired sparsity to reach within the mask (decimal fraction of zeros) can also be a list where each element is a sparsity for a tensor in the same position in the tensor list. If global sparsity is enabled, all values of the sparsity list must be the same
global_sparsity – if True, sparsity masks will be created such that the average sparsity across all given tensors is the target sparsity with the lowest global values masked. If False, each tensor will be masked to the target sparsity ranking values within each individual tensor. Default is False
- Returns
list of masks (0.0 for values that are masked, 1.0 for values that are unmasked) calculated from the tensors such that the desired number of zeros matches the sparsity. If there are more zeros than the desired sparsity, zeros will be randomly chosen to match the target sparsity
-
create_sparsity_masks_from_threshold
(tensors: List[torch.Tensor], threshold: Union[float, torch.Tensor]) → List[torch.Tensor][source]¶ - Parameters
tensors – list of tensors to calculate masks from based on their contained values
threshold – a threshold at or below which values are masked
- Returns
list of masks (0.0 for values that are masked, 1.0 for values that are unmasked) calculated from the tensors; values <= threshold are masked, all others are unmasked
-
-
sparseml.pytorch.optim.mask_creator_pruning.
load_mask_creator
(obj: Union[str, Iterable[int]]) → sparseml.pytorch.optim.mask_creator_pruning.PruningMaskCreator[source]¶ - Parameters
obj – Formatted string or block shape iterable specifying SparsityMaskCreator object to return
- Returns
SparsityMaskCreator object created from obj
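A small sketch of the two accepted input forms; treating 'unstructured' as one of the accepted format strings is an assumption here, and the block-shape form mirrors BlockPruningMaskCreator above:

    import torch

    from sparseml.pytorch.optim.mask_creator_pruning import load_mask_creator

    unstructured = load_mask_creator("unstructured")  # assumed format string
    four_block = load_mask_creator([1, 4])            # block shape iterable

    weight = torch.randn(64, 64)
    masks = unstructured.create_sparsity_masks([weight], sparsity=0.9)
    # roughly 10% of the entries remain unmasked (value 1.0)
    print(masks[0].mean())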
sparseml.pytorch.optim.mask_pruning module¶
Code related to applying a mask onto a parameter to impose kernel sparsity, aka model pruning
NOTE: this file is in the process of being phased out in favor of the sparsification package. Once all references to mask utils in the optim package are migrated, this file will be deleted
-
class
sparseml.pytorch.optim.mask_pruning.
ModuleParamPruningMask
(layers: List[torch.nn.modules.module.Module], param_names: Union[str, List[str]] = 'weight', store_init: bool = False, store_unmasked: bool = False, track_grad_mom: float = - 1.0, mask_creator: sparseml.pytorch.optim.mask_creator_pruning.PruningMaskCreator = unstructured, layer_names: Optional[List[str]] = None, global_sparsity: bool = False, score_type: str = 'magnitude')[source]¶ Bases:
object
Mask to apply kernel sparsity (model pruning) to a specific parameter in a layer
- Parameters
layers – the layers containing the parameters to mask
param_names – the names of the parameters to mask in each layer. If only one name is given, that name will be applied to all layers that this object masks. Default is weight
store_init – store the init weights in a separate variable that can be used and referenced later
store_unmasked – store the unmasked weights in a separate variable that can be used and referenced later
track_grad_mom – store the gradient updates to the parameter with a momentum variable; must be in the range [0.0, 1.0). If set to 0.0, only the most recent update is kept
mask_creator – object to define sparsity mask creation, default is unstructured mask
layer_names – the name of the layers the parameters to mask are located in
global_sparsity – set True to enable global pruning. If True, sparsity masks for a target sparsity will be created such that the average sparsity across all given layers is the target sparsity, with the lowest global values masked. If False, each layer will be masked to the target sparsity by ranking values within each individual tensor. Default is False
score_type – the method used to score parameters for masking, i.e. ‘magnitude’, ‘movement’. Default is ‘magnitude’
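A hedged sketch of masking two convolution weights to a fixed sparsity with the methods documented below; the model and layer choice are illustrative, and it assumes the enabled property is settable to attach the masking hooks:

    from torchvision.models import resnet18

    from sparseml.pytorch.optim.mask_pruning import ModuleParamPruningMask

    model = resnet18()
    layers = [model.conv1, model.layer1[0].conv1]  # layers chosen for illustration

    mask = ModuleParamPruningMask(layers, param_names="weight")
    mask.enabled = True                      # attach masking hooks (assumed settable)
    mask.set_param_masks_from_sparsity(0.8)  # mask the 80% smallest-magnitude weights
    mask.apply()                             # zero out the masked values in the parameters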
-
property
allow_reintroduction
¶ True if weight reintroduction is allowed
- Type
return
-
apply
(param_idx: Optional[int] = None)[source]¶ apply the current mask to the params tensor (zero out the desired values)
- Parameters
param_idx – index of the parameter to apply the mask to. If not set, masks will be applied to all parameters with available masks
-
disable_reintroduction
()[source]¶ if weight reintroduction is enabled (only during movement pruning), disables further weight reintroduction
-
property
enabled
¶ True if the parameter is currently being masked, False otherwise
- Type
return
-
property
global_sparsity
¶ True if global pruning is enabled, False otherwise
- Type
return
-
property
layer_names
¶ the names of the layers the parameter to mask is located in
- Type
return
-
property
layers
¶ the layers containing the parameters to mask
- Type
return
-
property
mask_creator
¶ SparsityMaskCreator object used to generate masks
- Type
return
-
property
names
¶ the full names of the sparsity masks in the following format: <LAYER>.<PARAM>.sparsity_mask
- Type
return
-
property
param_masks
¶ the current masks applied to each of the parameters
- Type
return
-
property
param_names
¶ the names of the parameters to mask in the layers
- Type
return
-
property
params_data
¶ the current tensors in each of the parameters
- Type
return
-
property
params_grad
¶ the current gradient values for each parameter
- Type
return
-
property
params_init
¶ the initial values of the parameters before being masked
- Type
return
-
property
params_unmasked
¶ the unmasked values of the parameters (stores the last unmasked value before masking)
- Type
return
-
pre_optim_step_update
()[source]¶ updates scores and buffers that depend on gradients. Should be called before Optimizer.step() to grab the latest gradients
-
pruning_end
(leave_enabled: bool)[source]¶ Performs any cleanup necessary for this pruning method. Disables weight reintroduction if enabled and applies masks
- Parameters
leave_enabled – if False, all pruning hooks will be destroyed. Default is True
-
reset
()[source]¶ resets the current stored tensors such that they will be on the same device and have the initial data
-
property
score_type
¶ the scoring method used to create masks (i.e. magnitude, movement)
- Type
return
-
set_param_data
(value: torch.Tensor, param_idx: int)[source]¶ - Parameters
value – the value to set as the current tensor for the parameter, if enabled the mask will be applied
param_idx – index of the parameter in this object to set the data of
-
set_param_masks
(masks: List[torch.Tensor])[source]¶ - Parameters
masks – the masks to set and apply as the current param tensors, if enabled mask is applied immediately
-
set_param_masks_from_abs_threshold
(threshold: Union[float, torch.Tensor]) → List[torch.Tensor][source]¶ Convenience function to set the parameter masks such that if abs(value) <= threshold, the value is masked to 0
- Parameters
threshold – the threshold at which all values will be masked to 0
-
set_param_masks_from_sparsity
(sparsity: Union[float, List[float]]) → List[torch.Tensor][source]¶ Convenience function to set the parameter masks such that each mask has a number of masked values whose fraction equals the given sparsity amount. Masks the absolute smallest values until the sparsity is reached.
- Parameters
sparsity – the decimal sparsity to set the param mask to; can also be a list where each element is a sparsity for a tensor in the same position in the tensor list. If global sparsity is enabled, all values of the sparsity list must be the same
-
set_param_masks_from_weights
() → List[torch.Tensor][source]¶ Convenience function to set the parameter masks such that the mask is 1 if a parameter value is non zero and 0 otherwise, unless otherwise defined by this object’s mask_creator
-
property
store_init
¶ store the init weights in a separate variable that can be used and referenced later
- Type
return
-
property
store_unmasked
¶ store the unmasked weights in a separate variable that can be used and referenced later
- Type
return
-
property
track_grad_mom
store the gradient updates to the parameter with a momentum variable; must be in the range [0.0, 1.0). If set to 0.0, only the most recent update is kept
- Type
return
sparseml.pytorch.optim.modifier module¶
sparseml.pytorch.optim.modifier_as module¶
sparseml.pytorch.optim.modifier_epoch module¶
sparseml.pytorch.optim.modifier_lr module¶
sparseml.pytorch.optim.modifier_params module¶
sparseml.pytorch.optim.modifier_pruning module¶
sparseml.pytorch.optim.modifier_quantization module¶
sparseml.pytorch.optim.modifier_regularizer module¶
sparseml.pytorch.optim.optimizer module¶
Optimizer wrapper for enforcing Modifiers on the training process of a Module.
-
class
sparseml.pytorch.optim.optimizer.
ScheduledOptimizer
(optimizer: torch.optim.optimizer.Optimizer, module: torch.nn.modules.module.Module, manager: sparseml.pytorch.optim.manager.ScheduledModifierManager, steps_per_epoch: int, loggers: Optional[List[sparseml.pytorch.utils.logger.BaseLogger]] = None, initialize_kwargs: Optional[Dict[str, Any]] = None)[source]¶ Bases:
torch.optim.optimizer.Optimizer
An optimizer wrapper to handle applying modifiers according to their schedule to both the passed in optimizer and the module.
Overrides the step() function so that this method can call before and after on the modifiers to apply appropriate modifications to both the optimizer and the module.
The epoch_start and epoch_end are based on how many steps have been taken along with the steps_per_epoch.
Lifecycle:
- training cycle
- zero_grad
- loss_update
- modifiers.loss_update
- step
- modifiers.update
- modifiers.optimizer_pre_step
- optimizer.step
- modifiers.optimizers_post_step
- Parameters
module – module to modify
optimizer – optimizer to modify
manager – the manager or list of managers used to apply modifications
steps_per_epoch – the number of steps or batches in each epoch; not strictly required and can be set to -1. Used to calculate decimals within the epoch; not using it can result in irregularities
loggers – logger manager to log important info to within the modifiers; ex: TensorBoard or the console
initialize_kwargs – key word arguments and values to be passed to the recipe manager initialize function
-
adjust_current_step
(epoch: int, step: int)[source]¶ Adjust the current step for the manager’s schedule to the given epoch and step.
- Parameters
epoch – the epoch to set the current global step to match
step – the step (batch) within the epoch to set the current global step to match
-
property
learning_rate
¶ convenience function to get the first learning rate for any of the param groups in the optimizer
- Type
return
-
loss_update
(loss: torch.Tensor) → torch.Tensor[source]¶ Optional call to update modifiers based on the calculated loss. Not needed unless one or more of the modifiers is using the loss to make a modification or is modifying the loss itself.
- Parameters
loss – the calculated loss after running a forward pass and loss_fn
- Returns
the modified loss tensor
-
property
manager
¶ The ScheduledModifierManager for this optimizer
- Type
return
-
step
(closure=None)[source]¶ Called to perform a step on the optimizer as normal. Updates the current epoch based on the step count. Calls into the modifiers before the step happens and again after the step happens.
- Parameters
closure – optional closure passed into the contained optimizer for the step
sparseml.pytorch.optim.sensitivity_as module¶
Sensitivity analysis implementations for increasing activation sparsity by using FATReLU
-
class
sparseml.pytorch.optim.sensitivity_as.
ASLayerTracker
(layer: torch.nn.modules.module.Module, track_input: bool = False, track_output: bool = False, input_func: Union[None, Callable] = None, output_func: Union[None, Callable] = None)[source]¶ Bases:
object
An implementation for tracking activation sparsity properties for a module.
- Parameters
layer – the module to track activation sparsity for
track_input – track the input sparsity for the module
track_output – track the output sparsity for the module
input_func – the function to call on the layer’s input; receives the input tensor
output_func – the function to call on the layer’s output; receives the output tensor
-
property
tracked_input
¶ the current tracked input results
- Type
return
-
property
tracked_output
¶ the current tracked output results
- Type
return
-
class
sparseml.pytorch.optim.sensitivity_as.
LayerBoostResults
(name: str, threshold: float, boosted_as: torch.Tensor, boosted_loss: sparseml.pytorch.utils.module.ModuleRunResults, baseline_as: torch.Tensor, baseline_loss: sparseml.pytorch.utils.module.ModuleRunResults)[source]¶ Bases:
object
Results for a specific threshold set in a FATReLU layer.
- Parameters
name – the name of the layer the results are for
threshold – the threshold used in the FATReLU layer
boosted_as – the measured activation sparsity after threshold is applied
boosted_loss – the measured loss after threshold is applied
baseline_as – the measured activation sparsity before threshold is applied
baseline_loss – the measured loss before threshold is applied
-
property
baseline_as
¶ the measured activation sparsity before threshold is applied
- Type
return
-
property
baseline_loss
¶ the measured loss before threshold is applied
- Type
return
-
property
boosted_as
¶ the measured activation sparsity after threshold is applied
- Type
return
-
property
boosted_loss
¶ the measured loss after threshold is applied
- Type
return
-
property
name
¶ the name of the layer the results are for
- Type
return
-
property
threshold
¶ the threshold used in the FATReLU layer
- Type
return
-
class
sparseml.pytorch.optim.sensitivity_as.
ModuleASOneShootBooster
(module: torch.nn.modules.module.Module, device: str, dataset: torch.utils.data.dataset.Dataset, batch_size: int, loss: sparseml.pytorch.utils.loss.LossWrapper, data_loader_kwargs: Dict)[source]¶ Bases:
object
Implementation class for boosting the activation sparsity in a given module using FATReLUs. Programmatically goes through and figures out the best thresholds to limit loss based on provided parameters.
- Parameters
module – the module to boost
device – the device to run the analysis on; ex [cpu, cuda, cuda:1]
dataset – the dataset used to evaluate the boosting on
batch_size – the batch size to run through the module in test mode
loss – the loss function to use for calculations
data_loader_kwargs – any keyword arguments to supply to the DataLoader constructor
-
run_layers
(layers: List[str], max_target_metric_loss: float, metric_key: str, metric_increases: bool, precision: float = 0.001) → Dict[str, sparseml.pytorch.optim.sensitivity_as.LayerBoostResults][source]¶ Run the booster for the specified layers.
- Parameters
layers – names of the layers to run boosting on
max_target_metric_loss – the max loss in the target metric that can happen while boosting
metric_key – the name of the metric to evaluate while boosting; ex: [__loss__, top1acc, top5acc]. Must exist in the LossWrapper
metric_increases – True if the metric increases for worse loss such as in a CrossEntropyLoss, False if the metric decreases for worse such as in accuracy
precision – the precision to check the results to. Larger values here will give less precise results but won’t take as long
- Returns
The results for the boosting
sparseml.pytorch.optim.sensitivity_lr module¶
Sensitivity analysis implementations for learning rate on Modules against loss funcs.
-
sparseml.pytorch.optim.sensitivity_lr.
default_exponential_check_lrs
(init_lr: float = 1e-06, final_lr: float = 0.5, lr_mult: float = 1.1) → Tuple[float, …][source]¶ Get the default learning rates to check between init_lr and final_lr.
- Parameters
init_lr – the initial learning rate in the returned list
final_lr – the final learning rate in the returned list
lr_mult – the multiplier increase for each step between init_lr and final_lr
- Returns
the list of created lrs that increase exponentially between init_lr and final_lr according to lr_mult
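Restating the documented behavior as a call (the defaults shown are the ones from the signature):

    from sparseml.pytorch.optim.sensitivity_lr import default_exponential_check_lrs

    lrs = default_exponential_check_lrs(init_lr=1e-06, final_lr=0.5, lr_mult=1.1)
    # each value is ~1.1x the previous: 1e-06, 1.1e-06, 1.21e-06, ..., ending at 0.5
    print(len(lrs), lrs[0], lrs[-1])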
-
sparseml.pytorch.optim.sensitivity_lr.
lr_loss_sensitivity
(module: torch.nn.modules.module.Module, data: torch.utils.data.dataloader.DataLoader, loss: Union[sparseml.pytorch.utils.loss.LossWrapper, Callable[[Any, Any], torch.Tensor]], optim: torch.optim.optimizer.Optimizer, device: str, steps_per_measurement: int, check_lrs: Union[List[float], Tuple[float, …]] = (1e-06, 1.1e-06, 1.21e-06, 1.3310000000000003e-06, 1.4641000000000003e-06, 1.6105100000000006e-06, 1.7715610000000007e-06, 1.948717100000001e-06, 2.1435888100000012e-06, 2.3579476910000015e-06, 2.5937424601000017e-06, 2.853116706110002e-06, 3.1384283767210024e-06, 3.452271214393103e-06, 3.7974983358324136e-06, 4.177248169415655e-06, 4.594972986357221e-06, 5.0544702849929435e-06, 5.559917313492238e-06, 6.115909044841462e-06, 6.727499949325609e-06, 7.40024994425817e-06, 8.140274938683989e-06, 8.954302432552388e-06, 9.849732675807628e-06, 1.0834705943388392e-05, 1.1918176537727232e-05, 1.3109994191499957e-05, 1.4420993610649954e-05, 1.586309297171495e-05, 1.7449402268886447e-05, 1.9194342495775094e-05, 2.1113776745352607e-05, 2.322515441988787e-05, 2.554766986187666e-05, 2.8102436848064327e-05, 3.091268053287076e-05, 3.4003948586157844e-05, 3.7404343444773634e-05, 4.1144777789251e-05, 4.52592555681761e-05, 4.978518112499371e-05, 5.4763699237493086e-05, 6.02400691612424e-05, 6.626407607736664e-05, 7.289048368510331e-05, 8.017953205361364e-05, 8.819748525897502e-05, 9.701723378487253e-05, 0.00010671895716335979, 0.00011739085287969578, 0.00012912993816766537, 0.00014204293198443192, 0.00015624722518287512, 0.00017187194770116264, 0.00018905914247127894, 0.00020796505671840686, 0.00022876156239024756, 0.00025163771862927233, 0.0002768014904921996, 0.0003044816395414196, 0.00033492980349556157, 0.00036842278384511775, 0.0004052650622296296, 0.0004457915684525926, 0.0004903707252978519, 0.0005394077978276372, 0.000593348577610401, 0.0006526834353714411, 0.0007179517789085853, 0.0007897469567994438, 0.0008687216524793883, 0.0009555938177273272, 0.00105115319950006, 0.001156268519450066, 0.0012718953713950728, 0.0013990849085345801, 0.0015389933993880383, 0.0016928927393268422, 0.0018621820132595267, 0.0020484002145854797, 0.0022532402360440277, 0.0024785642596484307, 0.002726420685613274, 0.0029990627541746015, 0.003298969029592062, 0.0036288659325512686, 0.003991752525806396, 0.0043909277783870364, 0.004830020556225741, 0.005313022611848316, 0.005844324873033148, 0.006428757360336463, 0.00707163309637011, 0.007778796406007121, 0.008556676046607835, 0.009412343651268619, 0.010353578016395481, 0.01138893581803503, 0.012527829399838533, 0.013780612339822387, 0.015158673573804626, 0.01667454093118509, 0.0183419950243036, 0.020176194526733963, 0.02219381397940736, 0.0244131953773481, 0.02685451491508291, 0.029539966406591206, 0.03249396304725033, 0.03574335935197537, 0.03931769528717291, 0.043249464815890204, 0.047574411297479226, 0.052331852427227155, 0.05756503766994987, 0.06332154143694486, 0.06965369558063936, 0.0766190651387033, 0.08428097165257363, 0.092709068817831, 0.10197997569961412, 0.11217797326957554, 0.1233957705965331, 0.13573534765618642, 0.14930888242180507, 0.1642397706639856, 0.18066374773038418, 0.19873012250342262, 0.2186031347537649, 0.2404634482291414, 0.2645097930520556, 0.29096077235726114, 0.3200568495929873, 0.3520625345522861, 0.38726878800751474, 0.4259956668082662, 0.4685952334890929, 0.5154547568380022, 0.5), loss_key: str = '__loss__', trainer_run_funcs: Optional[sparseml.pytorch.utils.module.ModuleRunFuncs] = None, trainer_loggers: 
Optional[List[sparseml.pytorch.utils.logger.BaseLogger]] = None, show_progress: bool = True) → sparseml.optim.sensitivity.LRLossSensitivityAnalysis[source]¶ Implementation for running a sensitivity analysis for learning rates on modules.
- Parameters
module – the module to run the learning rate sensitivity analysis over; it is expected to already be on the correct device
data – the data to run through the module for calculating the sensitivity analysis
loss – the loss function to use for the sensitivity analysis
optim – the optimizer to run the sensitivity analysis with
device – the device to run the analysis on; ex: cpu, cuda. The module must already be on that device; this is used to place the data on that same device.
steps_per_measurement – the number of batches to run through for the analysis at each LR
check_lrs – the learning rates to check for analysis (will sort them small to large before running)
loss_key – the key for the loss function to track in the returned dict
trainer_run_funcs – override functions for ModuleTrainer class
trainer_loggers – loggers to log data to while running the analysis
show_progress – track progress of the runs if True
- Returns
a list of tuples containing the analyzed learning rate at index 0 and the ModuleRunResults at index 1, where ModuleRunResults is a collection of all the batch results run through the module at that LR
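A hedged end-to-end sketch with a stand-in model and random data; wrapping the criterion in sparseml's LossWrapper is an assumption here, and the default check_lrs are used:

    import torch
    from torch.utils.data import DataLoader, TensorDataset
    from torchvision.models import resnet18

    from sparseml.pytorch.optim.sensitivity_lr import lr_loss_sensitivity
    from sparseml.pytorch.utils.loss import LossWrapper

    # stand-in model and data purely for illustration
    model = resnet18()
    data = DataLoader(
        TensorDataset(torch.randn(32, 3, 224, 224), torch.randint(0, 1000, (32,))),
        batch_size=8,
    )
    loss = LossWrapper(torch.nn.CrossEntropyLoss())  # assumed wrapper around a criterion
    optim = torch.optim.SGD(model.parameters(), lr=1e-06)

    analysis = lr_loss_sensitivity(
        model, data, loss, optim, device="cpu", steps_per_measurement=4
    )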
sparseml.pytorch.optim.sensitivity_pruning module¶
Sensitivity analysis implementations for kernel sparsity on Modules against loss funcs.
-
sparseml.pytorch.optim.sensitivity_pruning.
model_prunability_magnitude
(module: torch.nn.modules.module.Module)[source]¶ Calculate the approximate sensitivity for an overall model. The values are not scaled to any range, so they must be taken in context with other known models.
- Parameters
module – the model to calculate the sensitivity for
- Returns
the approximated sensitivity
-
sparseml.pytorch.optim.sensitivity_pruning.
pruning_loss_sens_magnitude
(module: torch.nn.modules.module.Module, sparsity_levels: Union[List[float], Tuple[float, …]] = (0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.2, 0.21, 0.22, 0.23, 0.24, 0.25, 0.26, 0.27, 0.28, 0.29, 0.3, 0.31, 0.32, 0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.4, 0.41, 0.42, 0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.5, 0.51, 0.52, 0.53, 0.54, 0.55, 0.56, 0.57, 0.58, 0.59, 0.6, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.7, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.8, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.9, 0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98, 0.99)) → sparseml.optim.sensitivity.PruningLossSensitivityAnalysis[source]¶ Approximated kernel sparsity (pruning) loss analysis for a given model. Returns the results for each prunable param (conv, linear) in the model.
- Parameters
module – the model to calculate the sparse sensitivity analysis for
sparsity_levels – the sparsity levels to calculate the loss for each param
- Returns
the analysis results for the model
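Since the magnitude approximation needs no data, a sketch reduces to a single call on a stand-in model:

    from torchvision.models import resnet18

    from sparseml.pytorch.optim.sensitivity_pruning import pruning_loss_sens_magnitude

    model = resnet18()
    analysis = pruning_loss_sens_magnitude(model)  # one result per prunable (conv/linear) param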
-
sparseml.pytorch.optim.sensitivity_pruning.
pruning_loss_sens_one_shot
(module: torch.nn.modules.module.Module, data: torch.utils.data.dataloader.DataLoader, loss: Union[sparseml.pytorch.utils.loss.LossWrapper, Callable[[Any, Any], torch.Tensor]], device: str, steps_per_measurement: int, sparsity_levels: List[int] = (0.0, 0.2, 0.4, 0.6, 0.7, 0.8, 0.85, 0.9, 0.95, 0.99), loss_key: str = '__loss__', tester_run_funcs: Optional[sparseml.pytorch.utils.module.ModuleRunFuncs] = None, tester_loggers: Optional[List[sparseml.pytorch.utils.logger.BaseLogger]] = None, show_progress: bool = True) → sparseml.optim.sensitivity.PruningLossSensitivityAnalysis[source]¶ Run a one shot sensitivity analysis for kernel sparsity. It does not retrain, and instead puts the model to eval mode. Moves layer by layer to calculate the sensitivity analysis for each and resets the previously run layers. Note, by default it caches the data. This means it is not parallel for data loading and the first run can take longer. Subsequent sparsity checks for layers and levels will be much faster.
- Parameters
module – the module to run the kernel sparsity sensitivity analysis over; all prunable layers will be extracted from it
data – the data to run through the module for calculating the sensitivity analysis
loss – the loss function to use for the sensitivity analysis
device – the device to run the analysis on; ex: cpu, cuda
steps_per_measurement – the number of samples or items to take for each measurement at each sparsity level
sparsity_levels – the sparsity levels to check for each layer to calculate sensitivity
loss_key – the key for the loss function to track in the returned dict
tester_run_funcs – override functions to use in the ModuleTester that runs
tester_loggers – loggers to log data to while running the analysis
show_progress – track progress of the runs if True
- Returns
the sensitivity results for every layer that is prunable
Module contents¶
Recalibration code for the PyTorch framework. Handles things like model pruning and increasing activation sparsity.