Optimize

After you analyze your model, you are ready to optimize it. Applying model optimizations (such as pruning) requires retraining the model. During model optimization, you will use Sparsify to apply the latest techniques to make your model run faster. Your goals are to:

  • Create and edit an automatic model optimization configuration using one of the Sparsify techniques:

    • Pruning

    • Quantization (future)

    • Sparse transfer learning (future)

  • Optionally, benchmark the model to get measured (rather than estimated) values.

Running Sparsify for Model Optimization

There are two ways to use model optimization settings:

  • Custom—Pruning, Quantization (future), or Sparse transfer learning (future)

  • Presets (future)

    (Model optimization settings)

    Note: If you already ran optimization for your model, you will not see these options. Instead, the Optimization screen will be displayed.

To run Sparsify for optimization:

  1. Select your optimization choice and click Next.

  2. To help Sparsify set up a pruning profile, indicate how you trained your model. Select:

    • Optimizer used.

    • Number of epochs used for your original training. This value must be greater than 1 and is generally 20 or greater.

    • Learning rate range that you used for training. The Initial LR should be less than 1 and the Final LR must be less than the Initial LR (these rules are captured in the validation sketch after these steps).

    For example:

    (Initial LR and Final LR)

  3. Click APPLY.

    Note: If you need to change these settings in the future, you can do so in Settings. However, changing these values later will not change the modifiers on the Optimization screen.
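
As a rough illustration, the input rules above can be captured in a small validation check. This is a minimal sketch; the names (num_epochs, init_lr, final_lr) are illustrative and not part of the Sparsify API:

```python
# Illustrative only: these names are not part of the Sparsify API.
def validate_training_settings(num_epochs: int, init_lr: float, final_lr: float) -> None:
    """Check the rules Sparsify applies to the pruning-profile inputs."""
    if num_epochs <= 1:
        raise ValueError("Number of epochs must be greater than 1 (generally 20 or more).")
    if init_lr >= 1:
        raise ValueError("Initial LR should be less than 1.")
    if final_lr >= init_lr:
        raise ValueError("Final LR must be less than the Initial LR.")

validate_training_settings(num_epochs=53, init_lr=0.1, final_lr=0.001)
```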

The Optimization screen is displayed. For example:

(Optimization screen)

The Optimization screen consists of three sections:

  • Pruning Modifier

  • Learning Rate Modifier (displayed only if you entered the learning rates)

  • Training Summary

Pruning Modifier

The Pruning Modifier represents the sparsity profile, which shows all of the modifications that will be made during the training process to make the model faster. For example:

(Pruning modifier)

The graph provides a visual of the modifier—the sparsity across all the layers in the model. This gives you a sense of what is changing. Tooltips provide additional information at each layer.

Values on the left of this section show the modifier results—what is expected to happen during optimization. You want to see as much speedup as possible. Below the values, notice that you can scroll through the following to see different types of information:

  • Summary

    (Summary)

    Recovery Confidence is the estimated chance of recovering the original accuracy of your model after pruning. It compares the sparsity values with the optimal values for the loss use case. This value should be as close to 1 as possible (or greater than 1).

  • Performance

    (Performance)

  • FLOPS

    (FLOPS)

  • Params

    (Params)

Pruning settings and the active epoch range are displayed to the right of the graph.

(Params)

The average sparsity is displayed. You can use the slider to redistribute the sparsity across all layers to determine what will provide the best performance and the best chance for recovery. As you move the sparsity slider, the measurement values and the graph update instantly: the estimated speedup, time, and recovery confidence all change, giving you a relative sense of how much better the profile is. Sparsify redistributes the average sparsity across the model so that layers that affect the loss less and performance more are pruned to a higher level than others.
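
This redistribution idea can be sketched roughly as follows. This is not Sparsify's actual algorithm; the per-layer benefit scores are hypothetical stand-ins for how little a layer affects loss and how much pruning it helps performance:

```python
# A minimal sketch of sparsity redistribution; NOT Sparsify's actual algorithm.
# "benefit" is a hypothetical per-layer score: higher means pruning the layer
# hurts loss less and helps performance more.

def redistribute_sparsity(benefit: list[float], avg_sparsity: float,
                          max_sparsity: float = 0.95) -> list[float]:
    """Assign per-layer sparsity proportional to benefit, keeping the mean
    near avg_sparsity and capping each layer at max_sparsity."""
    mean_benefit = sum(benefit) / len(benefit)
    raw = [avg_sparsity * b / mean_benefit for b in benefit]
    return [min(s, max_sparsity) for s in raw]

layer_sparsity = redistribute_sparsity(benefit=[0.5, 1.0, 1.5, 2.0], avg_sparsity=0.8)
print(layer_sparsity)  # higher-benefit layers end up more sparse
```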

Click the (Adjustment icon) icon to see additional pruning settings. You can filter the layers by adjusting the minimum sparsity, minimum speedup, and/or minimum recovery. In addition, you can establish the optimization balance between performance and recovery.

(Pruning settings)
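
As a rough illustration, filtering with those thresholds amounts to a simple predicate over per-layer estimates. The layer records and field names below are hypothetical:

```python
# Hypothetical per-layer estimates; the field names are illustrative only.
layers = [
    {"name": "conv1", "sparsity": 0.60, "speedup": 1.2, "recovery": 0.9},
    {"name": "conv2", "sparsity": 0.85, "speedup": 2.5, "recovery": 1.1},
    {"name": "fc",    "sparsity": 0.90, "speedup": 3.0, "recovery": 0.7},
]

def filter_layers(layers, min_sparsity=0.0, min_speedup=1.0, min_recovery=0.0):
    """Keep only layers that meet all three minimums, as in the pruning settings."""
    return [l for l in layers
            if l["sparsity"] >= min_sparsity
            and l["speedup"] >= min_speedup
            and l["recovery"] >= min_recovery]

print(filter_layers(layers, min_sparsity=0.7, min_speedup=2.0, min_recovery=0.8))
```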

The active epoch range indicates when pruning will be active during training. The specified range affects accuracy and loss. If the range starts at 1 and ends at 31, pruning runs from epoch 1 through epoch 31. The update frequency indicates how often pruning steps are taken during training. For example, if the update frequency is set to 1, pruning takes place once per epoch; if it is set to 0.5, pruning takes place twice per epoch.

(Pruning settings)
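
The interplay between the active range and the update frequency can be worked out directly. A minimal sketch using the values from the example above:

```python
# Epochs at which pruning steps occur, given an active range and update frequency.
def pruning_step_epochs(start_epoch: float, end_epoch: float,
                        update_frequency: float) -> list[float]:
    steps = []
    epoch = start_epoch
    while epoch <= end_epoch:
        steps.append(round(epoch, 4))
        epoch += update_frequency
    return steps

print(pruning_step_epochs(1, 31, 1.0))  # one pruning step per epoch: 1, 2, ..., 31
print(pruning_step_epochs(1, 3, 0.5))   # twice per epoch: 1, 1.5, 2, 2.5, 3
```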

Pruning Editor

Click (Editor icon) to access the Pruning Editor.

(Pruning editor)

The table below the graph lists information for all layers. You can open each layer for more detail and adjust the optimization on a layer-by-layer basis. Click the > icon to display more detail for a layer. For example:

(Layer graphs)

Learning Rate Modifier

This section is displayed only if you entered the learning rates on the Model Optimization Settings dialog.

When you apply model optimizations (such as pruning), retraining the model is required. The learning rate modifier indicates how the system will control the learning rate during that retraining. The graph represents the set learning rate schedule: at each learning rate step, increasingly fine-grained details of the model are learned.

(Learning rate modifier)

These values apply to the learning rate range when retraining the model. The initial learning rate value corresponds to the starting epoch and the final learning rate corresponds to the ending epoch.

(Learning rate modifier)
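
As an illustration of how a schedule can map the initial and final learning rates onto the retraining range, here is a step-decay sketch (not necessarily Sparsify's exact schedule):

```python
# A step-decay sketch mapping the Initial LR down to the Final LR over the
# retraining epoch range; it illustrates the idea that the rate steps down so
# increasingly fine-grained details are learned.

def step_lr(epoch: int, start_epoch: int, end_epoch: int,
            init_lr: float, final_lr: float, num_steps: int = 4) -> float:
    """Return the learning rate for a given epoch, stepping down num_steps times."""
    span = (end_epoch - start_epoch) / num_steps
    step = min(int((epoch - start_epoch) / span), num_steps)
    gamma = (final_lr / init_lr) ** (1 / num_steps)  # per-step decay factor
    return init_lr * gamma ** step

for epoch in (0, 13, 26, 40, 53):
    print(epoch, round(step_lr(epoch, 0, 53, 0.1, 0.001), 5))
```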

Learning Rate Editor

The recovery confidence is closely tied to the learning rate range. You can adjust learning rate information and save multiple schedules.

Click (Editor icon) to access the Learning Rate Editor. For example:

(Learning rate editor)

(Remove icon) (to the right of a schedule) removes the schedule.

(+ icon) (at the top right of the learning rate schedules table) adds a learning rate schedule.

(Save icon) saves the schedule.

(X icon) removes the current settings.

Training Summary

The training summary lays out all of the modifiers that are running and when those modifiers are active while you are training. In the following example, the pruning modifier is active during the pruning stage from epochs 1 to 31, and the learning rate modifier is active throughout the training (epochs 0 to 53) during the stabilization, pruning, and fine-tuning stages.

(Pruning and LR Modifiers)
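
Conceptually, the summary answers the question of which modifiers are active at a given epoch. A minimal sketch using the ranges from the example above:

```python
# Modifier active ranges from the example: pruning epochs 1-31, LR epochs 0-53.
modifiers = {
    "pruning":       (1, 31),
    "learning_rate": (0, 53),
}

def active_modifiers(epoch: float) -> list[str]:
    """Return the modifiers whose [start, end] range contains the epoch."""
    return [name for name, (start, end) in modifiers.items() if start <= epoch <= end]

print(active_modifiers(0))   # ['learning_rate'] (stabilization stage)
print(active_modifiers(15))  # ['pruning', 'learning_rate'] (pruning stage)
print(active_modifiers(45))  # ['learning_rate'] (fine-tuning stage)
```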

Exporting

When you are satisfied with the optimization settings, click the EXPORT button to export a configuration file and integrate code into your training, as described in the Integrate section.
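
For context, a training integration typically loads the exported configuration and applies its modifiers during the training loop. The sketch below assumes the exported file is a SparseML recipe named config.yml and uses a toy PyTorch model; see the Integrate section for the authoritative steps:

```python
# A minimal sketch of applying an exported config in a PyTorch training loop
# via SparseML's ScheduledModifierManager. The file name, toy model, and data
# are assumptions; see the Integrate section for the authoritative steps.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from sparseml.pytorch.optim import ScheduledModifierManager

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))  # toy model
train_loader = DataLoader(
    TensorDataset(torch.randn(64, 16), torch.randint(0, 2, (64,))), batch_size=8
)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

manager = ScheduledModifierManager.from_yaml("config.yml")  # the exported file
optimizer = manager.modify(model, optimizer, steps_per_epoch=len(train_loader))

for epoch in range(int(manager.max_epochs)):
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()
        optimizer.step()  # modifiers (pruning, LR) are applied on schedule

manager.finalize(model)
```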


Next steps…

Continue by Benchmarking. Otherwise, you are ready to Integrate.