DeepSparse supports fast inference on CPUs for sparse and dense models. For sparse models in particular, it achieves GPU-level performance in many use cases.
Around the engine, the DeepSparse package includes various utilities that simplify performance benchmarking and model deployment.
The examples below walk through using DeepSparse to test and benchmark ONNX models in integrated use cases.
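As a rough illustration of what benchmarking an inference call involves, here is a minimal, library-agnostic sketch of a latency harness. It does not use the DeepSparse API: `benchmark`, `run_inference`, and `fake_inference` are hypothetical names introduced here, and the workload is a placeholder standing in for a real engine call on an ONNX model.

```python
import statistics
import time
from typing import Callable, Dict, List


def benchmark(
    run_inference: Callable[[], object],
    warmup: int = 5,
    iterations: int = 50,
) -> Dict[str, float]:
    """Time repeated calls to run_inference and summarize latency in ms."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")

    # Warm-up calls are executed but not timed, so one-time setup costs
    # (caches, lazy initialization) do not skew the measurements.
    for _ in range(warmup):
        run_inference()

    latencies_ms: List[float] = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_inference()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    mean_ms = statistics.mean(latencies_ms)
    return {
        "iterations": float(iterations),
        "mean_ms": mean_ms,
        "median_ms": statistics.median(latencies_ms),
        "max_ms": max(latencies_ms),
        # Calls per second implied by the mean latency.
        "items_per_second": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
    }


def fake_inference() -> int:
    # Hypothetical stand-in workload; replace with a real engine call.
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    print(benchmark(fake_inference))
```

To measure a real model, one would pass a zero-argument callable that wraps the engine's inference call on a fixed input batch. Reporting the median alongside the mean helps when a few slow iterations distort the average.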
More documentation, models, use cases, and examples are continually being added. If you don't see what you're looking for, search the DeepSparse GitHub repo, SparseML GitHub repo, or SparseZoo website, or ask in the Neural Magic Slack.