DeepSparse
Neural Magic's DeepSparse is a CPU runtime that takes advantage of sparsity within neural networks to reduce compute.
DeepSparse supports various LLM, computer vision, and NLP models and integrates with popular deep learning libraries, such as Hugging Face and Ultralytics, so you can load and deploy sparse models with ONNX. ONNX gives you the flexibility to serve your model in a framework-agnostic environment. Coupled with SparseML, our optimization library for pruning and quantizing models, DeepSparse delivers exceptional inference performance on CPU hardware.
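For example, a sparse ONNX model can be served through the DeepSparse Pipeline API. The snippet below is a minimal sketch, assuming the deepsparse Python package is installed; the SparseZoo stub shown is illustrative, and you can substitute any local ONNX model path or stub for your task.

```python
from deepsparse import Pipeline

# Illustrative SparseZoo stub for a pruned and quantized sentiment-analysis model;
# replace with your own ONNX model path or a stub for your task.
model_stub = "zoo:nlp/sentiment_analysis/obert-base/pytorch/huggingface/sst2/pruned90_quant-none"

# Create a task-specific pipeline backed by the DeepSparse runtime.
pipeline = Pipeline.create(task="sentiment-analysis", model_path=model_stub)

# Run inference on CPU.
print(pipeline(["DeepSparse runs sparse models efficiently on CPUs."]))
```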
Prerequisites
Review deployment, training, and software requirements to confirm DeepSparse is compatible with your use case.
Editions and Licenses
DeepSparse is available in two editions and is licensed accordingly for end users:
- DeepSparse Community is licensed under the Neural Magic DeepSparse Community License. See DeepSparse Community Installation for further installation options.
- DeepSparse Enterprise requires a Trial License or can be fully licensed for production, commercial applications. To learn more about DeepSparse Enterprise pricing, contact our Sales team to discuss your use case further for a custom quote. See DeepSparse Enterprise Installation for further installation options, including license activation.
Features
Deployment Options
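DeepSparse can be used through several interfaces, including a low-level Engine API, task-level Pipelines, and an HTTP Server. As a minimal sketch of the Engine path, assuming the deepsparse package is installed and that model.onnx is a placeholder path to a local ONNX model:

```python
from deepsparse import compile_model
from deepsparse.utils import generate_random_inputs

# Placeholder path; point this at a sparse ONNX model, e.g. one exported with SparseML.
onnx_filepath = "model.onnx"
batch_size = 1

# Compile the ONNX model for the DeepSparse engine and run one batch
# of randomly generated sample inputs.
engine = compile_model(onnx_filepath, batch_size=batch_size)
inputs = generate_random_inputs(onnx_filepath, batch_size)
outputs = engine.run(inputs)
print([out.shape for out in outputs])
```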
GitHub
Some source code, example files, and scripts included in the DeepSparse GitHub repository are licensed under the Apache License Version 2.0 as noted.