sparseml.pytorch.nn package
Submodules
sparseml.pytorch.nn.activations module
Implementations related to activations for neural networks in PyTorch
-
class sparseml.pytorch.nn.activations.Hardswish(num_channels: int = -1, inplace: bool = False)[source]
Bases: torch.nn.modules.module.Module
Hardswish layer implementation: 0 for x <= -3; x for x >= 3; x * (x + 3) / 6 otherwise.
More information can be found in the original paper.
- Parameters
num_channels – number of channels for the layer
inplace – True to run the operation in place in memory, False otherwise
-
forward(inp: torch.Tensor)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
-
training: bool
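A minimal usage sketch based on the signature above (the batch and spatial dimensions are illustrative assumptions):

import torch
from sparseml.pytorch.nn.activations import Hardswish

act = Hardswish(num_channels=64, inplace=False)
inp = torch.randn(8, 64, 16, 16)  # assumed NCHW input with 64 channels
out = act(inp)  # call the module itself so any registered hooks run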
-
class sparseml.pytorch.nn.activations.ReLU(num_channels: int = -1, inplace: bool = False)[source]
Bases: torch.nn.modules.activation.ReLU
ReLU wrapper that requires the number of channels for the layer to be passed in. Useful for activation sparsity work.
- Parameters
num_channels – number of channels for the layer
inplace – True to run the operation in place in memory, False otherwise
-
inplace: bool
-
class sparseml.pytorch.nn.activations.ReLU6(num_channels: int = -1, inplace: bool = False)[source]
Bases: torch.nn.modules.activation.ReLU6
ReLU6 wrapper that requires the number of channels for the layer to be passed in. Useful for activation sparsity work.
- Parameters
num_channels – number of channels for the layer
inplace – True to run the operation in place in memory, False otherwise
-
inplace: bool
-
max_val: float
-
min_val: float
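A short sketch of the wrappers (shapes and channel counts are illustrative assumptions); the math matches the stock torch.nn activations, with num_channels recorded for sparsity tooling:

import torch
from sparseml.pytorch.nn.activations import ReLU, ReLU6

relu = ReLU(num_channels=32)
relu6 = ReLU6(num_channels=32)
x = torch.randn(4, 32, 8, 8)
y = relu6(relu(x))  # same results as torch.nn.ReLU / torch.nn.ReLU6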
-
class sparseml.pytorch.nn.activations.Swish(num_channels: int = -1)[source]
Bases: torch.nn.modules.module.Module
Swish layer OOP implementation: x * sigmoid(x).
More information can be found in the original paper.
- Parameters
num_channels – number of channels for the layer
-
forward(inp: torch.Tensor)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
-
training: bool
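A minimal sketch (the input shape and channel count are illustrative assumptions):

import torch
from sparseml.pytorch.nn.activations import Swish

swish = Swish(num_channels=16)
x = torch.randn(2, 16, 8, 8)
out = swish(x)  # should equal x * torch.sigmoid(x) per the definition above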
-
sparseml.pytorch.nn.activations.create_activation(act_type: str, inplace: bool, num_channels: int, **kwargs) → torch.nn.modules.module.Module[source]
Create an activation function using the given parameters.
- Parameters
act_type – the type of activation to create; options: [relu, relu6, prelu, lrelu, swish, hardswish, silu]
inplace – True to create the activation as an inplace operation, False otherwise
num_channels – The number of channels to create the activation for
kwargs – Additional kwargs to pass to the activation constructor
- Returns
the created activation layer
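A sketch of building an activation by name (the chosen type and channel count are illustrative assumptions):

from sparseml.pytorch.nn.activations import create_activation

act = create_activation(act_type="hardswish", inplace=False, num_channels=128)
print(type(act).__name__)  # inspect which module class was constructed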
-
sparseml.pytorch.nn.activations.hard_swish(x_tens: torch.Tensor, inplace: bool = False)[source]
Hardswish functional implementation: 0 for x <= -3; x for x >= 3; x * (x + 3) / 6 otherwise.
More information can be found in the original paper.
- Parameters
x_tens – the input tensor to perform the hardswish op on
inplace – True to run the operation in place in memory, False otherwise
- Returns
0 for x <= -3, x for x >= 3, x * (x + 3) / 6 otherwise
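A quick numeric sketch of the piecewise definition (the expected values follow directly from the formula above):

import torch
from sparseml.pytorch.nn.activations import hard_swish

x = torch.tensor([-4.0, -1.0, 0.0, 1.0, 4.0])
y = hard_swish(x)  # roughly [0.0, -0.3333, 0.0, 0.6667, 4.0]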
-
sparseml.pytorch.nn.activations.is_activation(module: torch.nn.modules.module.Module) → bool[source]
- Parameters
module – the module to check whether it is a common activation function or not
- Returns
True if the module is an instance of a common activation function, False otherwise
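A sketch of the check; that the recognized set includes the stock torch.nn activations is an assumption:

import torch
from sparseml.pytorch.nn.activations import is_activation

print(is_activation(torch.nn.ReLU()))           # expected: True (a common activation)
print(is_activation(torch.nn.Conv2d(3, 8, 3)))  # expected: False (not an activation)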
-
sparseml.pytorch.nn.activations.replace_activation(module: torch.nn.modules.module.Module, name: str, act_type: str, inplace: bool = False, num_channels: Optional[int] = None, **kwargs) → torch.nn.modules.module.Module[source]
General function to replace the activation for a specific layer in a Module with a new one.
- Parameters
module – the module to replace the activation function in
name – the name of the layer to replace the activation for
act_type – the type of activation to replace with; options: [relu, relu6, prelu, lrelu, swish, silu]
inplace – True to create the activation as an inplace operation, False otherwise
num_channels – The number of channels to create the activation for
kwargs – Additional kwargs to pass to the activation constructor
- Returns
the created activation layer
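A sketch of swapping a single layer's activation; the layer name "1" assumes torch.nn.Sequential-style submodule naming:

import torch
from sparseml.pytorch.nn.activations import replace_activation

model = torch.nn.Sequential(torch.nn.Conv2d(3, 16, 3), torch.nn.ReLU())
# replace the ReLU stored under the name "1" with a hardswish
new_act = replace_activation(model, name="1", act_type="hardswish", num_channels=16)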
-
sparseml.pytorch.nn.activations.replace_activations(module: torch.nn.modules.module.Module, act_type: str, inplace: bool = False, num_channels: Optional[int] = None, **kwargs) → torch.nn.modules.module.Module[source]
General function to replace all activation functions in a Module with a new one.
- Parameters
module – the module to replace the activation function in
act_type – the type of activation to replace with; options: [relu, relu6, prelu, lrelu, swish, silu]
inplace – True to create the activations as inplace operations, False otherwise
num_channels – The number of channels to create the activation for
kwargs – Additional kwargs to pass to the activation constructor
- Returns
the updated module
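A sketch of a model-wide replacement (the toy model is an illustrative assumption):

import torch
from sparseml.pytorch.nn.activations import replace_activations

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, 3),
    torch.nn.ReLU(),
    torch.nn.Conv2d(16, 32, 3),
    torch.nn.ReLU(),
)
model = replace_activations(model, act_type="swish", inplace=False)  # every ReLU becomes a swish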
sparseml.pytorch.nn.fatrelu module
Implementations for the FATReLU (Forced Activation Threshold) activation function. Used to increase the activation sparsity of neural networks.
-
class sparseml.pytorch.nn.fatrelu.FATReLU(threshold: Union[float, List[float]] = 0.0, inplace: bool = False)[source]
Bases: torch.nn.modules.module.Module
Applies a FATReLU (forced activation threshold) over the input. Instead of setting all negative values to 0 as ReLU does, this sets all values < threshold to 0.
- Parameters
threshold – the threshold below which all values are set to 0. If a float, then f(x) = x if x >= threshold else 0. If a list, then f(x[:, chan]) = x[:, chan] if x[:, chan] >= threshold[chan] else 0. If an empty list, applies the per-channel behavior but dynamically initializes the thresholds to the number of channels.
inplace – True to perform the operation in place, False to create a new tensor
-
property channel_wise
- Returns
True if the FATReLU is applied per channel, False otherwise
-
property dynamic
- Returns
True if the layer is in dynamic mode (gathering the number of channels), False otherwise
-
extra_repr()[source]
Set the extra representation of the module.
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
-
forward(inp: torch.Tensor)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
-
get_threshold() → Union[float, List[float]][source]
- Returns
the current threshold being applied for the activation
-
load_state_dict(state_dict, strict=True)[source]
Copies parameters and buffers from state_dict into this module and its descendants. If strict is True, then the keys of state_dict must exactly match the keys returned by this module's state_dict() function.
- Parameters
state_dict (dict) – a dict containing parameters and persistent buffers.
strict (bool, optional) – whether to strictly enforce that the keys in state_dict match the keys returned by this module's state_dict() function. Default: True
- Returns
missing_keys is a list of str containing the missing keys
unexpected_keys is a list of str containing the unexpected keys
- Return type
NamedTuple with missing_keys and unexpected_keys fields
-
property num_channels
- Returns
The number of channels the FATReLU is acting on
-
set_threshold(threshold: Union[float, List[float]])[source]
- Parameters
threshold – the threshold value to set for the activation
-
training: bool
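A minimal sketch of the module with a scalar threshold (the input values are illustrative):

import torch
from sparseml.pytorch.nn.fatrelu import FATReLU

fat = FATReLU(threshold=0.1, inplace=False)
x = torch.tensor([-0.5, 0.05, 0.1, 0.5])
y = fat(x)  # values below the threshold are zeroed: [0.0, 0.0, 0.1, 0.5]
fat.set_threshold(0.2)      # e.g. raised later by a sparsity schedule
print(fat.get_threshold())  # 0.2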
-
sparseml.pytorch.nn.fatrelu.convert_relus_to_fat(module: torch.nn.modules.module.Module, **kwargs) → Dict[str, sparseml.pytorch.nn.fatrelu.FATReLU][source]
Replace all of the ReLUs in a module with FATReLU instances.
Note: this only works if the ReLUs are layers in the module; it will not work with torch.nn.functional calls.
- Parameters
module – the module to replace all ReLUs with FATReLU
kwargs – the kwargs to pass to the FATReLU constructor
- Returns
a dictionary mapping the names of the replaced layers to their new FATReLU instances
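A sketch over a toy model (the architecture and threshold value are illustrative assumptions):

import torch
from sparseml.pytorch.nn.fatrelu import convert_relus_to_fat

model = torch.nn.Sequential(torch.nn.Linear(128, 64), torch.nn.ReLU(), torch.nn.Linear(64, 10))
replaced = convert_relus_to_fat(model, threshold=0.05)  # kwargs are forwarded to FATReLU
print(list(replaced.keys()))  # names of the layers that were swapped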
-
sparseml.pytorch.nn.fatrelu.fat_exp_relu(tens: torch.Tensor, threshold: torch.Tensor, compression: torch.Tensor) → torch.Tensor[source]
Apply a piecewise separable, exponentially approximated FATReLU (forced activation threshold) function to a tensor: f(x, t, c) = 0 if x <= 0; x if x >= t; x * e^(c(x-t)) if x > 0 and x < t
Note: there is no option for inplace with this function
- Parameters
tens – the tensor to apply the exponential fat relu to
threshold – the threshold at which all values will be zero or approximated in the exponential region
compression – the compression or slope to use in the exponential region
- Returns
f(x, t, c) = 0 if x <= 0; = x if x >= t; = x * e^(c(x-t)) if x > 0 and x < t
-
sparseml.pytorch.nn.fatrelu.fat_pw_relu(tens: torch.Tensor, threshold: torch.Tensor, compression: torch.Tensor, inplace: bool) → torch.Tensor[source]
Apply a piecewise separable FATReLU (forced activation threshold) function to a tensor: f(x, t, c) = 0 if x <= (t - t/c); x if x >= t; c(x - (t - t/c)) if x > (t - t/c) and x < t
- Parameters
tens – the tensor to apply the piecewise fat relu to
threshold – the threshold at which all values will be zero or interpolated between threshold and 0
compression – the compression or slope to interpolate between 0 and the threshold with
inplace – False to create a new tensor, True to overwrite the current tensor's values
- Returns
f(x, t, c) = 0 if x <= (t - t/c); x if x >= t; c(x - (t - t/c)) if x > (t - t/c) and x < t
-
sparseml.pytorch.nn.fatrelu.fat_relu(tens: torch.Tensor, threshold: Union[torch.Tensor, float], inplace: bool) → torch.Tensor[source]
Apply a FATReLU (forced activation threshold) function to a tensor: f(x, t) = 0 if x < t; x if x >= t
- Parameters
tens – the tensor to apply the fat relu to
threshold – the threshold to apply. If not a single value, the dimension to broadcast across must be last in the tensor
inplace – False to create a new tensor, True to overwrite the current tensor’s values
- Returns
f(x, t) = 0 if x < t; x if x >= t
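A quick numeric sketch of the functional form (input and threshold values are illustrative):

import torch
from sparseml.pytorch.nn.fatrelu import fat_relu

x = torch.tensor([-1.0, 0.2, 0.5, 2.0])
y = fat_relu(x, threshold=0.5, inplace=False)  # [0.0, 0.0, 0.5, 2.0]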
-
sparseml.pytorch.nn.fatrelu.fat_sig_relu(tens: torch.Tensor, threshold: torch.Tensor, compression: torch.Tensor) → torch.Tensor[source]
Apply a sigmoid-approximated FATReLU (forced activation threshold) function to a tensor: f(x, t, c) = x / e^(c*(t-x))
Note: there is no option for inplace with this function.
- Parameters
tens – the tensor to apply the sigmoid fat relu to
threshold – the threshold at which all values will be zero or approximated in the sigmoid region
compression – the compression or slope to use in the sigmoid region
- Returns
f(x, t, c) = x / e^(c*(t-x))
-
sparseml.pytorch.nn.fatrelu.set_relu_to_fat(module: torch.nn.modules.module.Module, layer_name: str, **kwargs) → sparseml.pytorch.nn.fatrelu.FATReLU[source]
Replace a given layer in a module with a FATReLU instance.
- Parameters
module – the module to replace the given layer with a FATReLU implementation
layer_name – the name of the layer to replace with a FATReLU
kwargs – the kwargs to pass to the FATReLU constructor
- Returns
the created FATReLU instance
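A sketch targeting one layer; the layer name "1" assumes torch.nn.Sequential-style naming:

import torch
from sparseml.pytorch.nn.fatrelu import set_relu_to_fat

model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU())
fat = set_relu_to_fat(model, layer_name="1", threshold=0.1)  # kwargs are forwarded to FATReLU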
sparseml.pytorch.nn.se module
Implementations for Squeeze-Excite in PyTorch. More information can be found in the original paper.
-
class sparseml.pytorch.nn.se.SqueezeExcite(expanded_channels: int, squeezed_channels: int, act_type: str = 'relu')[source]
Bases: torch.nn.modules.module.Module
Standard implementation for SqueezeExcite in PyTorch
- Parameters
expanded_channels – the number of channels to expand to in the SE layer
squeezed_channels – the number of channels to squeeze down to in the SE layer
act_type – the activation type to use in the SE layer; options: [relu, relu6, prelu, lrelu, swish]
-
forward(inp: torch.Tensor)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
-
training: bool
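A minimal sketch, assuming an NCHW feature map whose channel count matches expanded_channels (all values illustrative):

import torch
from sparseml.pytorch.nn.se import SqueezeExcite

se = SqueezeExcite(expanded_channels=128, squeezed_channels=32, act_type="relu")
x = torch.randn(4, 128, 14, 14)
out = se(x)  # squeeze-excite result for the 128-channel feature map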
Module contents
Layers / operators for PyTorch models