DeepSparse
Neural Magic's DeepSparse is a CPU runtime that takes advantage of sparsity within neural networks to reduce compute.
DeepSparse supports various LLM, computer vision, and NLP models and integrates with popular deep learning libraries, such as Hugging Face and Ultralytics, so you can load and deploy sparse models with ONNX. ONNX gives you the flexibility to serve your model in a framework-agnostic environment. Coupled with SparseML, our optimization library for pruning and quantizing models, DeepSparse delivers exceptional inference performance on CPU hardware.
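For example, a sparse ONNX model can be served through the DeepSparse Pipeline API. The snippet below is a minimal sketch, assuming the deepsparse Python package is installed; the SparseZoo stub shown is illustrative, and you can substitute any local ONNX model path or stub for your task.

```python
from deepsparse import Pipeline

# Illustrative SparseZoo stub for a pruned and quantized sentiment-analysis model;
# replace with your own ONNX model path or a stub for your task.
model_stub = "zoo:nlp/sentiment_analysis/obert-base/pytorch/huggingface/sst2/pruned90_quant-none"

# Create a task-specific pipeline backed by the DeepSparse runtime.
pipeline = Pipeline.create(task="sentiment-analysis", model_path=model_stub)

# Run inference on CPU.
print(pipeline(["DeepSparse runs sparse models efficiently on CPUs."]))
```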
Prerequisites
Review deployment, training, and software requirements to confirm DeepSparse is compatible with your use case.
Editions and Licenses
DeepSparse is available in two editions and is licensed accordingly for end users:
- DeepSparse Community is licensed under the Neural Magic DeepSparse Community License. See DeepSparse Community Installation for further installation options.
- DeepSparse Enterprise requires a Trial License or can be fully licensed for production, commercial applications. To learn more about DeepSparse Enterprise pricing, contact our Sales team to discuss your use case further for a custom quote. See DeepSparse Enterprise Installation for further installation options, including license activation.
Features
Deployment Options
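DeepSparse can be used through several interfaces, including a low-level Engine API, task-level Pipelines, and an HTTP Server. As a minimal sketch of the Engine path, assuming the deepsparse package is installed and that model.onnx is a placeholder path to a local ONNX model:

```python
from deepsparse import compile_model
from deepsparse.utils import generate_random_inputs

# Placeholder path; point this at a sparse ONNX model, e.g. one exported with SparseML.
onnx_filepath = "model.onnx"
batch_size = 1

# Compile the ONNX model for the DeepSparse engine and run one batch
# of randomly generated sample inputs.
engine = compile_model(onnx_filepath, batch_size=batch_size)
inputs = generate_random_inputs(onnx_filepath, batch_size)
outputs = engine.run(inputs)
print([out.shape for out in outputs])
```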
GitHub
Some source code, example files, and scripts included in the DeepSparse GitHub repository are licensed under the Apache License Version 2.0 as noted.