Neural Magic LogoNeural Magic Logo
Products
menu-icon
Products
DeepSparse EngineSparseMLSparseZoo
Get Started
Deploy a Model
CV Object Detection

Deploy an Object Detection Model

This page walks through an example of deploying an object detection model with DeepSparse Server.

The DeepSparse Server is a server wrapper around Pipelines, including the object detection pipeline. As such, the server provides and HTTP interface that accepts images and image files as inputs and outputs the labeled predictions. With all of this built on top of the DeepSparse Engine, the simplicity of servable pipelines is combined with GPU-class performance on CPUs for sparse models.

Install Requirements

This example requires DeepSparse Server+YOLO Install.

Start the Server

Before starting the server, the model must be set up in the format expected for DeepSparse Pipelines. See an example of how to setup Pipelines in the Try a Model section.

Once the Pipelines are set up, the deepsparse.server command launches a server with the model at --model_path inside. The model_path can either be a SparseZoo stub or a path to a local model.onnx file.

The command below shows how to start up the DeepSparse Server for a sparsified YOLOv5l model trained on the COCO dataset from the SparseZoo. The output confirms the server was started on port :5543 with a /docs route for general info and a /predict/from_files route for inference.

$deepsparse.server \
--task "yolo" \
--model_path "zoo:cv/detection/yolov5-l/pytorch/ultralytics/coco/pruned_quant-aggressive_95"
>deepsparse.server.main INFO created FastAPI app for inference serving
>deepsparse.server.main INFO created general routes, visit `/docs` to view available
>DeepSparse Engine, Copyright 2021-present / Neuralmagic, Inc. version: 1.1.0 COMMUNITY EDITION (a436ca67) (release) (optimized) (system=avx512_vnni, binary=avx512)
>deepsparse.server.main INFO created route /predict
>deepsparse.server.main INFO created route /predict/from_files
>INFO:uvicorn.error:Started server process [31382]
>INFO:uvicorn.error:Waiting for application startup.
>INFO:uvicorn.error:Application startup complete.
>INFO:uvicorn.error:Uvicorn running on http://0.0.0.0:5543 (Press CTRL+C to quit)

View the Request Specs

As noted in the startup command, a /docs route was created; it contains OpenAPI specs and definitions for the expected inputs and responses. Visiting the http://localhost:5543/docs in a browser shows the available routes on the server. The important one for object detection is the /predict/from_files POST route which takes the form of a standard files argument. The files argument enables uploading one or more image files for object detection processing.

Make a Request

With the expected input payload and method type defined, any HTTP request package can be used to make the request.

First, a CURL request is made to download a sample image for use with the sample request.

wget -O basilica.jpg https://raw.githubusercontent.com/neuralmagic/deepsparse/main/src/deepsparse/yolo/sample_images/basilica.jpg

Next, for simplicity and generality, the Python requests package is used to make a POST method request to the /predict/from_files pathway on localhost:5543 with the downloaded file. The predicted outputs can then be printed out or used in a later pipeline.

1import requests
2import json
3
4resp = requests.post(
5 url="http://localhost:5543/predict/from_files",
6 files=[('request', open('basilica.jpg', 'rb'))]
7)
8print(resp.text)
>{"predictions":[[[175.6397705078125,487.64117431640625,346.1619873046875,616.2821655273438,0.8640249371528625,3.0],...

After that request completes, the server will also log the request as the following:

>INFO:uvicorn.error:Uvicorn running on http://0.0.0.0:5543 (Press CTRL+C to quit)
>INFO: 127.0.0.1:50604 - "POST /predict/from_files HTTP/1.1" 200 OK
Deploy a Text Classification Model
NLP Question Answering