All SparseML Sparsification APIs are designed to work with recipes. These files encode the instructions needed for modifying the model and/or training process as a list of modifiers. Example modifiers can be anything from setting the learning rate for the optimizer to gradual magnitude pruning. Recipes are written in YAML and stored either as YAML files or as Markdown files using YAML front matter. The rest of the SparseML system parses the recipe files into a native format for the desired framework and applies the modifications to the model and training pipeline.
In a recipe, modifiers must be written in a list that includes "modifiers" in its name.
The easiest ways to get or create recipes are by either using the pre-configured recipes in SparseZoo or using Sparsify's automatic creation. Especially for users performing sparse transfer learning from our pre-sparsified models in the SparseZoo, we highly recommend using the pre-made transfer learning recipes found on SparseZoo. However, power users may be inclined to create their own recipes to enable more fine-grained control or to add custom modifiers when sparsifying a new model from scratch.
A sample recipe for pruning a model generally looks like the following:
```yaml
version: 0.1.0
modifiers:
    - !EpochRangeModifier
        start_epoch: 0.0
        end_epoch: 70.0

    - !LearningRateModifier
        start_epoch: 0
        end_epoch: -1.0
        update_frequency: -1.0
        init_lr: 0.005
        lr_class: MultiStepLR
        lr_kwargs: {'milestones': [43, 60], 'gamma': 0.1}

    - !GMPruningModifier
        start_epoch: 0
        end_epoch: 40
        update_frequency: 1.0
        init_sparsity: 0.05
        final_sparsity: 0.85
        mask_type: unstructured
        params: ['sections.0.0.conv1.weight', 'sections.0.0.conv2.weight', 'sections.0.0.conv3.weight']
```
Recipes can contain multiple modifiers, each modifying a portion of the training process in a different way. In general, each modifier has a start and end epoch for when it should be active: a modifier starts at `start_epoch` and runs until `end_epoch` (note that it does not run through `end_epoch`). Additionally, all epoch values support decimal values so that modifiers can start anywhere within an epoch. For example, `start_epoch: 2.5` will start halfway through epoch 2.
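Before looking at the individual modifiers, here is a minimal sketch of how a recipe is typically loaded and applied to a PyTorch training loop. It assumes the `sparseml.pytorch.optim.ScheduledModifierManager` API; the model, recipe path, and step count below are placeholders, so check your installed SparseML version for the exact import paths and signatures.

```python
import torch
from sparseml.pytorch.optim import ScheduledModifierManager

# Placeholder model and optimizer; substitute your own training setup.
model = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.ReLU(), torch.nn.Linear(16, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.005)
steps_per_epoch = 100  # normally len(train_loader)

# Parse the recipe and wrap the optimizer so the modifiers run on schedule.
manager = ScheduledModifierManager.from_yaml("recipe.yaml")
optimizer = manager.modify(model, optimizer, steps_per_epoch=steps_per_epoch)

# ... run the usual training loop (forward, loss, backward, optimizer.step()) ...

manager.finalize(model)  # clean up hooks once training is complete
```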
The most commonly used modifiers are enumerated as subsections below.
The `EpochRangeModifier` controls the range of epochs used for training a model. Each supported ML framework has an implementation that makes it easy to retrieve this number of epochs. Note that this is not a hard rule: if other modifiers have a larger `end_epoch` or a smaller `start_epoch`, those values will be used instead.
The only parameters that can be controlled for `EpochRangeModifier` are `start_epoch` and `end_epoch`. Both parameters are required:

- `start_epoch` indicates the start range for the epoch (0 indexed).
- `end_epoch` indicates the end range for the epoch.

For example:
```yaml
    - !EpochRangeModifier
        start_epoch: 0.0
        end_epoch: 25.0
```
The pruning modifiers handle pruning the specified layer(s) in a given model.
The `ConstantPruningModifier` enforces the sparsity structure and level for layers in a model that have already been pruned. The modifier is generally used for transfer learning from an already pruned model or to enforce sparsity while quantizing. The weights remain trainable in this setup; however, the sparsity is unchanged.
The required parameter is:

- `params` indicates the parameters in the model to prune. This can be set to a string containing `__ALL__` to prune all parameters, a list to specify the targeted parameters, or regex patterns prefixed by 're:' of parameter name patterns to match. For example: `['blocks.1.conv']` for PyTorch and `['mnist_net/blocks/conv0/conv']` for TensorFlow. Regex can also be used to match all conv params: `['re:.*conv']` for PyTorch and `['re:.*/conv']` for TensorFlow.

For example:

```yaml
    - !ConstantPruningModifier
        params: __ALL__
```
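Conceptually, constant pruning amounts to capturing the existing zero mask for the selected parameters and re-applying it after every weight update so that gradient steps cannot reintroduce the pruned connections. The plain-PyTorch sketch below only illustrates that idea; it is not the modifier's actual implementation, and the helper names are made up for the example.

```python
import torch

def capture_masks(module: torch.nn.Module, param_names):
    # Record the current zero pattern of the listed parameters.
    return {
        name: (param.data != 0).float()
        for name, param in module.named_parameters()
        if name in param_names
    }

def reapply_masks(module: torch.nn.Module, masks):
    # Zero out the same positions again, e.g. right after optimizer.step().
    with torch.no_grad():
        for name, param in module.named_parameters():
            if name in masks:
                param.mul_(masks[name])
```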
The `GMPruningModifier` prunes the parameter(s) in a model to a target sparsity (percentage of 0s for a layer's parameter/variable) using gradual magnitude pruning. This is done gradually from an initial to a final sparsity (`init_sparsity`, `final_sparsity`) over a range of epochs (`start_epoch`, `end_epoch`), updated at a specific interval defined by `update_frequency`.
For example, using the settings `start_epoch: 0`, `end_epoch: 5`, `update_frequency: 1`, `init_sparsity: 0.05`, and `final_sparsity: 0.8`, the modifier will set the sparsity of the specified parameter(s) to 5% at epoch 0, increase it once per epoch, and reach the final sparsity of 80% by the start of epoch 5.
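To make that schedule concrete, the sketch below computes the target sparsity at each update step for those settings. It assumes cubic interpolation between the initial and final sparsity, which is a common default; treat the exact interpolation curve as an assumption and the printed numbers as illustrative only.

```python
def gmp_sparsity(epoch, start_epoch=0.0, end_epoch=5.0,
                 init_sparsity=0.05, final_sparsity=0.8):
    # Target sparsity at a given epoch under an assumed cubic ramp.
    if epoch <= start_epoch:
        return init_sparsity
    if epoch >= end_epoch:
        return final_sparsity
    progress = (epoch - start_epoch) / (end_epoch - start_epoch)
    # Cubic interpolation: prunes aggressively early, then flattens out.
    return final_sparsity + (init_sparsity - final_sparsity) * (1.0 - progress) ** 3

# update_frequency: 1 -> one pruning step per epoch
for epoch in range(6):
    print(epoch, round(gmp_sparsity(epoch), 3))
# 0 -> 0.05, 1 -> 0.416, 2 -> 0.638, 3 -> 0.752, 4 -> 0.794, 5 -> 0.8
```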
The required parameters are:

- `params` indicates the parameters in the model to prune. This can be set to a string containing `__ALL__` to prune all parameters, a list to specify the targeted parameters, or regex patterns prefixed by 're:' of parameter name patterns to match. For example: `['blocks.1.conv']` for PyTorch and `['mnist_net/blocks/conv0/conv']` for TensorFlow. Regex can also be used to match all conv params: `['re:.*conv']` for PyTorch and `['re:.*/conv']` for TensorFlow.
- `init_sparsity` is the decimal value for the initial sparsity at which to start pruning. At `start_epoch`, the sparsity for the parameter/variable is set to this value. Generally, this is kept at 0.05 (5%).
- `final_sparsity` is the decimal value for the final sparsity at which to end pruning. By the start of `end_epoch`, the sparsity for the parameter/variable is set to this value. Generally, this is kept in a range from 0.6 to 0.95, depending on the model and layer. Anything less than 0.4 is not useful for performance.
- `start_epoch` sets the epoch at which to start pruning (0 indexed). This supports floating point values to enable starting pruning between epochs.
- `end_epoch` sets the epoch before which to stop pruning. This supports floating point values to enable stopping pruning between epochs.
- `update_frequency` is the number of epochs or fraction of an epoch between each pruning step. It supports floating point values to enable updating inside of epochs. Generally, this is set to update once per epoch (`1.0`). However, if the loss for the model recovers quickly, it should be set to a smaller value, for example `0.5` to update once every half epoch (twice per epoch).

For example:
```yaml
    - !GMPruningModifier
        params: ['blocks.1.conv']
        init_sparsity: 0.05
        final_sparsity: 0.8
        start_epoch: 5.0
        end_epoch: 20.0
        update_frequency: 1.0
```
The `QuantizationModifier` sets the model to run with quantization-aware training (QAT). QAT emulates the precision loss of int8 quantization during training so that the weights can be learned to limit any accuracy loss from quantization. Once the `QuantizationModifier` is enabled, it cannot be disabled (there is no `end_epoch`). Quantization zero points are set to be asymmetric for activations and symmetric for weights. Currently, quantization modifiers are available only in PyTorch.
Notes:

- To export a QAT-trained model as a fully quantized model, use the script scripts/pytorch/model_quantize_qat_export.py or the function neuralmagicML.pytorch.quantization.quantize_qat_export.
- To preserve sparsity while quantizing a pruned model, the pruned layers should be covered by a `ConstantPruningModifier` or have already used a `GMPruningModifier` with `leave_enabled` set to True.

The required parameter is:

- `start_epoch` sets the epoch at which to start QAT. This supports floating-point values to enable starting quantization between epochs.

For example:
```yaml
    - !QuantizationModifier
        start_epoch: 0.0
```
The learning rate modifiers set the learning rate (LR) for an optimizer during training. If you are using an Adam optimizer, these are generally not useful. If you are using a standard stochastic gradient descent optimizer, they give a convenient way to control the LR.
The `SetLearningRateModifier` sets the LR for the optimizer to a specific value at a specific point in the training process.

Required parameters are:

- `start_epoch` is the epoch in the training process at which to set the `learning_rate` value for the optimizer. This supports floating point values to enable setting the LR between epochs.
- `learning_rate` is the floating-point value to set as the LR for the optimizer at `start_epoch`.

For example:
```yaml
    - !SetLearningRateModifier
        start_epoch: 5.0
        learning_rate: 0.1
```
The `LearningRateModifier` sets schedules for controlling the LR for an optimizer during training. If you are using an Adam optimizer, these are generally not useful. If you are using a standard stochastic gradient descent optimizer, they give a convenient way to control the LR.

Provided schedules from which to choose are:

- `ExponentialLR` multiplies the LR by a `gamma` value every epoch. To use this, `lr_kwargs` should be set to a dictionary containing `gamma`. For example: `{'gamma': 0.9}`
- `StepLR` multiplies the LR by a `gamma` value after a certain epoch period defined by `step_size`. To use this, `lr_kwargs` must be set to a dictionary containing `gamma` and `step_size`. For example: `{'gamma': 0.9, 'step_size': 2.0}`
- `MultiStepLR` multiplies the LR by a `gamma` value at specific epoch points defined by `milestones`. To use this, `lr_kwargs` must be set to a dictionary containing `gamma` and `milestones`. For example: `{'gamma': 0.9, 'milestones': [2.0, 5.5, 10.0]}`
Required parameters are:

- `start_epoch` sets the epoch at which to start modifying the LR (0 indexed). This supports floating point values to enable starting between epochs.
- `end_epoch` sets the epoch before which to stop modifying the LR. This supports floating point values to enable stopping between epochs.
- `lr_class` is the LR class to use, one of [`ExponentialLR`, `StepLR`, `MultiStepLR`].
- `lr_kwargs` is the dictionary of named arguments for the `lr_class`.
- `init_lr` [Optional] is the initial LR to set at `start_epoch` and to use for creating the schedules. If not given, the optimizer's current LR will be used at startup.

For example:
```yaml
    - !LearningRateModifier
        start_epoch: 0.0
        end_epoch: 25.0
        lr_class: MultiStepLR
        lr_kwargs:
            gamma: 0.9
            milestones: [2.0, 5.5, 10.0]
        init_lr: 0.1
```
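For intuition about what the `MultiStepLR` settings above produce, the sketch below hand-computes the LR per epoch: the LR starts at `init_lr` and is multiplied by `gamma` once for each milestone that has been passed. This is only an illustration of the schedule semantics described above, not a call into the library.

```python
def multistep_lr(epoch, init_lr=0.1, gamma=0.9, milestones=(2.0, 5.5, 10.0)):
    # Multiply init_lr by gamma once for every milestone already reached.
    passed = sum(1 for m in milestones if epoch >= m)
    return init_lr * gamma ** passed

for epoch in range(12):
    print(epoch, round(multistep_lr(epoch), 4))
# epochs 0-1: 0.1, epochs 2-5: 0.09, epochs 6-9: 0.081, epoch 10 onward: 0.0729
```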
The `TrainableParamsModifier` controls which parameters are marked as trainable for the current optimizer. This is generally useful during transfer learning to easily mark which parameters should or should not be frozen/trained.
The required parameter is:

- `params` indicates the names of parameters to mark as trainable or not. This can be set to a string containing `__ALL__` to mark all parameters, a list to specify the targeted parameters, or regex patterns prefixed by 're:' of parameter name patterns to match. For example: `['blocks.1.conv']` for PyTorch and `['mnist_net/blocks/conv0/conv']` for TensorFlow. Regex can also be used to match all conv params: `['re:.*conv']` for PyTorch and `['re:.*/conv']` for TensorFlow.

For example:

```yaml
    - !TrainableParamsModifier
        params: __ALL__
```
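In PyTorch terms, marking parameters as trainable or frozen corresponds to toggling `requires_grad`. The sketch below is a conceptual illustration of resolving a `params` value like the ones above, including the `__ALL__` string and 're:' regex prefixes; it is not the modifier's actual implementation, and the helper name is made up.

```python
import re
import torch

def set_trainable(module: torch.nn.Module, params, trainable=True):
    # Toggle requires_grad for parameters selected by name, '__ALL__', or 're:' patterns.
    for name, param in module.named_parameters():
        if params == "__ALL__":
            matched = True
        else:
            matched = any(
                re.match(p[len("re:"):], name) if p.startswith("re:") else p == name
                for p in params
            )
        if matched:
            param.requires_grad = trainable

# Example usage: freeze everything, then unfreeze conv parameters (hypothetical names).
# set_trainable(model, "__ALL__", trainable=False)
# set_trainable(model, ["re:.*conv"], trainable=True)
```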
The `SetWeightDecayModifier` sets the weight decay (L2 penalty) for the optimizer to a specific value at a specific point in the training process.
Required parameters are:

- `start_epoch` is the epoch in the training process at which to set the `weight_decay` value for the optimizer. This supports floating point values to enable setting the weight decay between epochs.
- `weight_decay` is the floating point value to set as the weight decay for the optimizer at `start_epoch`.

For example:
```yaml
    - !SetWeightDecayModifier
        start_epoch: 5.0
        weight_decay: 0.0
```