sparseml.pytorch.nn package

Submodules

sparseml.pytorch.nn.activations module

Implementations related to activations for neural networks in PyTorch

class sparseml.pytorch.nn.activations.Hardswish(num_channels: int = -1, inplace: bool = False) [source]

Bases: torch.nn.modules.module.Module

Hardswish layer implementation:
0 for x <= -3
x for x >= 3
x * (x + 3) / 6 otherwise

More information can be found in the paper here.

Parameters
  • num_channels – number of channels for the layer

  • inplace – True to run the operation in place in memory, False otherwise

forward(inp: torch.Tensor) [source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for the forward pass needs to be defined within this function, the Module instance should be called instead of this method directly, since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
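
Example

A minimal usage sketch of the layer above (the tensor shape and channel count are illustrative assumptions):

import torch
from sparseml.pytorch.nn.activations import Hardswish

act = Hardswish(num_channels=64)
inp = torch.randn(8, 64, 32, 32)  # N x C x H x W, with C matching num_channels
out = act(inp)                    # element-wise hardswish, same shape as inp
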
class sparseml.pytorch.nn.activations.ReLU(num_channels: int = -1, inplace: bool = False) [source]

Bases: torch.nn.modules.activation.ReLU

ReLU wrapper to enforce that number of channels for the layer is passed in. Useful for activation sparsity work.

Parameters
  • num_channels – number of channels for the layer

  • inplace – True to run the operation in place in memory, False otherwise

inplace: bool

class sparseml.pytorch.nn.activations.ReLU6(num_channels: int = -1, inplace: bool = False) [source]

Bases: torch.nn.modules.activation.ReLU6

ReLU6 wrapper to enforce that number of channels for the layer is passed in. Useful for activation sparsity work.

Parameters
  • num_channels – number of channels for the layer

  • inplace – True to run the operation in place in memory, False otherwise

inplace: bool
max_val: float
min_val: float
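
Example

A brief sketch of the ReLU and ReLU6 wrappers above; they behave like torch.nn.ReLU / torch.nn.ReLU6 while recording the channel count for activation-sparsity tooling (shapes are illustrative):

import torch
from sparseml.pytorch.nn.activations import ReLU, ReLU6

relu = ReLU(num_channels=128, inplace=False)
relu6 = ReLU6(num_channels=128)
x = torch.randn(4, 128, 16, 16)
y = relu(x)    # same behavior as torch.nn.ReLU
y6 = relu6(x)  # same behavior as torch.nn.ReLU6 (clamps at 6)
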
class sparseml.pytorch.nn.activations.Swish(num_channels: int = -1) [source]

Bases: torch.nn.modules.module.Module

Swish layer OOP implementation: x * sigmoid(x). More information can be found in the paper here.

Parameters

num_channels – number of channels for the layer

forward(inp: torch.Tensor) [source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for the forward pass needs to be defined within this function, the Module instance should be called instead of this method directly, since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
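
Example

A minimal sketch of the Swish module above (shapes are illustrative):

import torch
from sparseml.pytorch.nn.activations import Swish

act = Swish(num_channels=32)
x = torch.randn(2, 32, 8, 8)
out = act(x)  # equivalent to x * torch.sigmoid(x)
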
sparseml.pytorch.nn.activations.create_activation(act_type: str, inplace: bool, num_channels: int, **kwargs) → torch.nn.modules.module.Module [source]

Create an activation function using the given parameters.

Parameters
  • act_type – the type of activation to create; options: [relu, relu6, prelu, lrelu, swish, hardswish, silu]

  • inplace – True to create the activation as an inplace operation, False otherwise

  • num_channels – The number of channels to create the activation for

  • kwargs – Additional kwargs to pass to the activation constructor

Returns

the created activation layer
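
Example

A sketch of building activations by name with the helper above (the chosen types and channel count are illustrative):

from sparseml.pytorch.nn.activations import create_activation

relu_act = create_activation("relu", inplace=True, num_channels=64)
swish_act = create_activation("swish", inplace=False, num_channels=64)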

sparseml.pytorch.nn.activations.hard_swish(x_tens: torch.Tensor, inplace: bool = False) [source]

Hardswish layer implementation:
0 for x <= -3
x for x >= 3
x * (x + 3) / 6 otherwise

More information can be found in the paper here.

Parameters
  • x_tens – the input tensor to perform the hardswish op on

  • inplace – True to run the operation in place in memory, False otherwise

Returns

0 for x <= -3, x for x >= 3, x * (x + 3) / 6 otherwise
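
Example

A quick functional sketch; the expected values follow the piecewise definition above:

import torch
from sparseml.pytorch.nn.activations import hard_swish

x = torch.tensor([-4.0, -3.0, 0.0, 3.0, 4.0])
out = hard_swish(x)  # expected: [0.0, 0.0, 0.0, 3.0, 4.0]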

sparseml.pytorch.nn.activations.is_activation(module: torch.nn.modules.module.Module) → bool [source]

Parameters

module – the module to check whether it is a common activation function or not

Returns

True if the module is an instance of a common activation function, False otherwise
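
Example

A minimal sketch of the check above (assuming standard torch.nn layers):

import torch.nn as nn
from sparseml.pytorch.nn.activations import is_activation

common = is_activation(nn.ReLU())             # expected True for a standard activation
not_act = is_activation(nn.Conv2d(3, 16, 3))  # expected False for a convolution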

sparseml.pytorch.nn.activations.replace_activation(module: torch.nn.modules.module.Module, name: str, act_type: str, inplace: bool = False, num_channels: Optional[int] = None, **kwargs) → torch.nn.modules.module.Module [source]

General function to replace the activation for a specific layer in a Module with a new one.

Parameters
  • module – the module to replace the activation function in

  • name – the name of the layer to replace the activation for

  • act_type – the type of activation to replace with; options: [relu, relu6, prelu, lrelu, swish, silu]

  • inplace – True to create the activation as an inplace operation, False otherwise

  • num_channels – The number of channels to create the activation for

  • kwargs – Additional kwargs to pass to the activation constructor

Returns

the created activation layer
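
Example

A sketch of replacing one named activation; the toy model and the layer name "act" are hypothetical:

import torch.nn as nn
from sparseml.pytorch.nn.activations import replace_activation

model = nn.Sequential()
model.add_module("conv", nn.Conv2d(3, 16, 3))
model.add_module("act", nn.ReLU())
new_act = replace_activation(model, name="act", act_type="swish")
# the "act" layer is swapped for a swish activation; new_act is the created layer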

sparseml.pytorch.nn.activations.replace_activations(module: torch.nn.modules.module.Module, act_type: str, inplace: bool = False, num_channels: Optional[int] = None, **kwargs) → torch.nn.modules.module.Module [source]

General function to replace all activation functions in a Module with a new one.

Parameters
  • module – the module to replace the activation function in

  • act_type – the type of activation to replace with; options: [relu, relu6, prelu, lrelu, swish, silu]

  • inplace – True to create the activation as an inplace operation, False otherwise

  • num_channels – The number of channels to create the activation for

  • kwargs – Additional kwargs to pass to the activation constructor

Returns

the updated module
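
Example

A sketch of replacing every activation in a module at once (the toy model is hypothetical):

import torch.nn as nn
from sparseml.pytorch.nn.activations import replace_activations

model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 32, 3), nn.ReLU())
model = replace_activations(model, act_type="swish")
# both ReLU layers are swapped for swish activations; the updated module is returned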

sparseml.pytorch.nn.activations.swish(x_tens: torch.Tensor) [source]

Swish layer functional implementation: x * sigmoid(x). More information can be found in the paper here.

Parameters

x_tens – the input tensor to perform the swish op on

Returns

the output of x_tens * sigmoid(x_tens)

sparseml.pytorch.nn.fatrelu module

Implementations for the FATReLU (Forced Activation Threshold) activation function. Used to increase the activation sparsity of neural networks.

class sparseml.pytorch.nn.fatrelu.FATReLU(threshold: Union[float, List[float]] = 0.0, inplace: bool = False) [source]

Bases: torch.nn.modules.module.Module

Applies a FAT ReLU (forced activation threshold) over the input. Instead of setting all negative values to 0 as ReLU does, this sets all values < threshold to 0.

Parameters
  • threshold – the threshold below which all values are set to 0. If a float, f(x) = x if x >= threshold else 0. If a list, f(x[:, chan]) = x[:, chan] if x[:, chan] >= threshold[chan] else 0. If an empty list, behaves like the list option but dynamically initializes the thresholds to the number of channels.

  • inplace – perform the operation inplace or create a new tensor

property channel_wise

Returns

True if the FATReLU is applied per channel, False otherwise

property dynamic

Returns

True if the layer is in dynamic mode (gathering the number of channels), False otherwise

extra_repr() [source]

Set the extra representation of the module

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

forward(inp: torch.Tensor) [source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for the forward pass needs to be defined within this function, the Module instance should be called instead of this method directly, since the former takes care of running the registered hooks while the latter silently ignores them.

get_threshold() → Union[float, List[float]] [source]

Returns

the current threshold being applied for the activation

load_state_dict(state_dict, strict=True) [source]

Copies parameters and buffers from state_dict into this module and its descendants. If strict is True, then the keys of state_dict must exactly match the keys returned by this module’s state_dict() function.

Parameters
  • state_dict (dict) – a dict containing parameters and persistent buffers.

  • strict (bool, optional) – whether to strictly enforce that the keys in state_dict match the keys returned by this module’s state_dict() function. Default: True

Returns

  • missing_keys is a list of str containing the missing keys

  • unexpected_keys is a list of str containing the unexpected keys

Return type

NamedTuple with missing_keys and unexpected_keys fields

property num_channels

Returns

The number of channels the FATReLU is acting on

set_threshold(threshold: Union[float, List[float]]) [source]

Parameters

threshold – the threshold value to set for the activation

training: bool
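
Example

A minimal sketch of FATReLU thresholding (the threshold and input values are illustrative):

import torch
from sparseml.pytorch.nn.fatrelu import FATReLU

act = FATReLU(threshold=0.5)
x = torch.tensor([-1.0, 0.25, 0.5, 2.0])
out = act(x)  # expected: [0.0, 0.0, 0.5, 2.0]; values < 0.5 are zeroed, values >= 0.5 pass through
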
sparseml.pytorch.nn.fatrelu.convert_relus_to_fat(module: torch.nn.modules.module.Module, **kwargs) → Dict[str, sparseml.pytorch.nn.fatrelu.FATReLU] [source]

Replace all of the ReLUs in a module with FATReLU instances.

Note: this only works if the ReLUs are layers (modules) in the module; it will not work with torch.nn.functional calls.

Parameters
  • module – the module to replace all ReLUs with FATReLU

  • kwargs – the kwargs to pass to the FATReLU constructor

Returns

a dictionary mapping the names of the replaced layers to the FATReLU instances that replaced them
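
Example

A sketch of converting all ReLU layers in a toy module; the model and the threshold kwarg (forwarded to the FATReLU constructor) are illustrative:

import torch.nn as nn
from sparseml.pytorch.nn.fatrelu import convert_relus_to_fat

model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 16), nn.ReLU())
replaced = convert_relus_to_fat(model, threshold=0.1)
# replaced maps the names of the swapped layers ("1" and "3" for this Sequential) to the new FATReLU instances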

sparseml.pytorch.nn.fatrelu.fat_exp_relu(tens: torch.Tensor, threshold: torch.Tensor, compression: torch.Tensor) → torch.Tensor [source]

Apply a piecewise, exponentially approximated FATReLU (forced activation threshold) function to a tensor: f(x, t, c) = 0 if x <= 0; = x if x >= t; = x * e^(c(x-t)) if x > 0 and x < t

Note: there is no option for inplace with this function

Parameters
  • tens – the tensor to apply the exponential fat relu to

  • threshold – the threshold at which all values will be zero or approximated in the exponential region

  • compression – the compression or slope to use in the exponential region

Returns

f(x, t, c) = 0 if x <= 0; = x if x >= t; = x * e^(c(x-t)) if x > 0 and x < t

sparseml.pytorch.nn.fatrelu.fat_pw_relu(tens: torch.Tensor, threshold: torch.Tensor, compression: torch.Tensor, inplace: bool) → torch.Tensor [source]

Apply a piecewise separable FATReLU function to a tensor (forced activation threshold): f(x, t, c) = 0 if x <= (t - t/c); x if x >= t; c(x - (t - t/c)) if x > (t - t/c) and x < t

Parameters
  • tens – the tensor to apply the piecewise fat relu to

  • threshold – the threshold at which all values will be zero or interpolated between threshold and 0

  • compression – the compression or slope to interpolate between 0 and the threshold with

  • inplace – False to create a new tensor, True to overwrite the current tensor’s values

Returns

f(x, t, c) = 0 if x <= (t - t/c); x if x >= t; c(x - (t - t/c)) if x > (t - t/c) and x < t

sparseml.pytorch.nn.fatrelu.fat_relu(tens: torch.Tensor, threshold: Union[torch.Tensor, float], inplace: bool) → torch.Tensor [source]

Apply a FATReLU function to a tensor (forced activation threshold): f(x, t) = 0 if x < t; x if x >= t

Parameters
  • tens – the tensor to apply the fat relu to

  • threshold – the threshold to apply. if not a single value then the dimension to broadcast across must be last in the tensor

  • inplace – False to create a new tensor, True to overwrite the current tensor’s values

Returns

f(x, t) = 0 if x < t; x if x >= t
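
Example

A functional sketch; the expected output follows f(x, t) above and matches a torch.where formulation (the threshold and values are illustrative):

import torch
from sparseml.pytorch.nn.fatrelu import fat_relu

x = torch.tensor([-1.0, 0.05, 0.2, 1.0])
out = fat_relu(x, threshold=0.1, inplace=False)
# expected: [0.0, 0.0, 0.2, 1.0], i.e. torch.where(x >= 0.1, x, torch.zeros_like(x))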

sparseml.pytorch.nn.fatrelu.fat_sig_relu(tens: torch.Tensor, threshold: torch.Tensor, compression: torch.Tensor) → torch.Tensor [source]

Apply a sigmoid-approximated FATReLU (forced activation threshold) function to a tensor: f(x, t, c) = x / e^(c*(t-x))

Note: there is no option for inplace with this function.

Parameters
  • tens – the tensor to apply the sigmoid fat relu to

  • threshold – the threshold at which all values will be zero or approximated in the sigmoid region

  • compression – the compression or slope to use in the sigmoid region

Returns

f(x, t, c) = x / e^(c*(t-x))

sparseml.pytorch.nn.fatrelu.set_relu_to_fat(module: torch.nn.modules.module.Module, layer_name: str, **kwargs) → sparseml.pytorch.nn.fatrelu.FATReLU [source]

Replace a given layer in a module with a FATReLU instance.

Parameters
  • module – the module in which to replace the given layer with a FATReLU implementation

  • layer_name – the name of the layer to replace with a FATReLU

  • kwargs – the kwargs to pass to the FATReLU constructor

Returns

the created FATReLU instance
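
Example

A sketch of replacing a single ReLU by name; the toy model, the layer name "1", and the threshold kwarg are illustrative:

import torch.nn as nn
from sparseml.pytorch.nn.fatrelu import set_relu_to_fat

model = nn.Sequential(nn.Linear(8, 8), nn.ReLU())
fat = set_relu_to_fat(model, layer_name="1", threshold=0.05)
# the ReLU at name "1" is replaced; fat is the created FATReLU instance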

sparseml.pytorch.nn.se module

Implementations for Squeeze Excite in PyTorch. More information can be found in the paper here.

class sparseml.pytorch.nn.se.SqueezeExcite(expanded_channels: int, squeezed_channels: int, act_type: str = 'relu') [source]

Bases: torch.nn.modules.module.Module

Standard implementation for SqueezeExcite in PyTorch

Parameters
  • expanded_channels – the number of channels to expand to in the SE layer

  • squeezed_channels – the number of channels to squeeze down to in the SE layer

  • act_type – the activation type to use in the SE layer; options: [relu, relu6, prelu, lrelu, swish]

forward(inp: torch.Tensor) [source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for the forward pass needs to be defined within this function, the Module instance should be called instead of this method directly, since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
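
Example

A usage sketch for the SE block above (channel counts and input shape are illustrative assumptions):

import torch
from sparseml.pytorch.nn.se import SqueezeExcite

se = SqueezeExcite(expanded_channels=256, squeezed_channels=64, act_type="relu")
x = torch.randn(2, 256, 14, 14)
out = se(x)  # forward pass over a feature map with expanded_channels channels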

Module contents

Layers / operators for PyTorch models