Sparsify 0.2

Easy-to-use UI for automatically sparsifying neural networks and creating sparsification recipes for better inference performance and a smaller footprint

Overview

Sparsify is an easy-to-use UI tool that simplifies the deep learning model optimization process to rapidly achieve the best combination of size, speed, and accuracy. Sparsify sparsifies and benchmarks models informed by industry research insights for ML practitioners, including ML engineers and operators, who need to deploy performant deep learning models fast and at scale. Sparsify shows visual performance potential for your model, including a sliding scale between performance and recovery, ultimately speeding up the model sparsification process from weeks to minutes.

This repository contains the package to locally launch Sparsify where you can create projects to load and sparsify your deep learning models. At the end, you can export sparsification recipes to integrate with your training workflow.

Sparsification

Sparsification is the process of taking a trained deep learning model and removing redundant information from the overprecise and over-parameterized network resulting in a faster and smaller model. Techniques for sparsification are all encompassing including everything from inducing sparsity using pruning and quantization to enabling naturally occurring sparsity using activation sparsity or winograd/FFT . When implemented correctly, these techniques result in significantly more performant and smaller models with limited to no effect on the baseline metrics. For example, pruning plus quantization can give noticeable improvements in performance while recovering to nearly the same baseline accuracy.

The Deep Sparse product suite builds on top of sparsification enabling you to easily apply the techniques to your datasets and models using recipe-driven approaches. Recipes encode the directions for how to sparsify a model into a simple, easily editable format. - Download a sparsification recipe and sparsified model from the SparseZoo . - Alternatively, create a recipe for your model using Sparsify . - Apply your recipe with only a few lines of code using SparseML . - Finally, for GPU-level performance on CPUs, deploy your sparse-quantized model with the DeepSparse Engine .

Full Deep Sparse product flow: