Use a Model

DeepSparse supports fast inference on CPUs for sparse and dense models. For sparse models in particular, it achieves GPU-level performance in many use cases.

Around the engine, the DeepSparse package includes various utilities to simplify benchmarking performance and model deployment. For instance:

  • Trained models are passed in the open ONNX file format, enabling easy exporting from common packages like PyTorch, Keras, and TensorFlow.
  • Benchmarking latency and throughput is available via a single CLI call, with various arguments for testing different scenarios.
  • Pipeline utilities wrap model execution with input pre-processing and output post-processing, simplifying deployment and adding functionality such as multi-stream execution, bucketing, and dynamic input shapes.

Use Case Examples

The examples below walk through common use cases, showing how to test and benchmark ONNX models with DeepSparse.

Other Use Cases

More documentation, models, use cases, and examples are continually being added. If you don't see one you're interested in, search the DeepSparse GitHub repo, SparseML GitHub repo, or SparseZoo website. Or, ask in the Neural Magic Slack.