Note: The current Docs site is outdated. Neural Magic's 1.7 release, slated for January 2024, will include a Docs refresh. Meanwhile, please consult our GitHub repositories for the content: DeepSparse, SparseML, SparseZoo.
DeepSparse Logging

This page explains how to use DeepSparse Logging to monitor your deployment.

There are many types of monitoring tasks that you may want to perform to confirm your production system is working correctly. The difficulty of the tasks varies from relatively easy (simple system performance analysis) to challenging (assessing the accuracy of the system in the wild by manually labeling the input data distribution post-factum). Examples include:

  • System performance: what is the latency/throughput of a query?
  • Data quality: is there an issue getting data to my model?
  • Data distribution shift: does the input data distribution deviate over time to the point where the model stops delivering reliable predictions?
  • Model accuracy: what is the percentage of correct predictions that a model achieves?

DeepSparse Logging is designed to provide maximum flexibility for you to extract whatever data is needed from a production inference pipeline into the logging system of your choice.


This page requires the DeepSparse Server Install.


DeepSparse Logging provides access to two types of metrics.

System Logging Metrics

System Logging gives you access to granular performance metrics for quick and efficient diagnosis of system health.

There is one group of System Logging Metrics currently available: Inference Latency. For each inference request, DeepSparse Server logs the following:

  1. Pre-processing Time - seconds in the pre-processing step
  2. Engine Time - seconds in the engine forward pass step
  3. Post-processing Time - seconds in the post-processing step
  4. Total Time - seconds for the end-to-end response (sum of the prior three)
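
To make the relationship between the four latency values concrete, here is a minimal sketch that times three stand-in steps and sums them. The step functions and the metric key names are hypothetical and chosen only for illustration; this is not how DeepSparse Server measures latency internally.

```python
import time


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Hypothetical stand-ins for the three pipeline steps.
def pre_process(raw):
    return [float(x) for x in raw]


def engine_forward(inputs):
    return [x * 2.0 for x in inputs]


def post_process(outputs):
    return max(outputs)


def run_with_latency(raw):
    """Run the three steps and collect per-step and total latency in seconds."""
    inputs, pre_s = timed(pre_process, raw)
    outputs, engine_s = timed(engine_forward, inputs)
    result, post_s = timed(post_process, outputs)
    latency = {
        "pre_process_time": pre_s,
        "engine_time": engine_s,
        "post_process_time": post_s,
    }
    # Total time is defined as the sum of the three steps.
    latency["total_time"] = pre_s + engine_s + post_s
    return result, latency


result, latency = run_with_latency([1, 2, 3])
print(result, latency)
```

Comparing the three components per request is usually the quickest way to tell whether a slowdown comes from the model itself or from the surrounding pre- and post-processing.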

Data Logging Metrics

Data Logging gives you access to data at each stage of an inference pipeline. This makes it easier to inspect the data, understand its properties, and detect edge cases and possible data drift.

There are four stages in the inference pipeline where Data Logging can occur:

  • pipeline_inputs: raw input passed to the inference pipeline by the user
  • engine_inputs: pre-processed tensors passed to the engine for the forward pass
  • engine_outputs: result of the engine forward pass (e.g., the raw logits)
  • pipeline_outputs: final output returned to the pipeline caller

At each stage, you can specify functions to be applied to the data before logging. Example functions include the identity function (for logging the raw input/output) or the mean function (e.g., for monitoring the mean pixel value of an image).

There are three types of functions that can be applied to target data at each stage:

  • Built-in functions: pre-written functions provided by DeepSparse (see list on GitHub).
  • Framework functions: functions from torch or numpy.
  • Custom functions: custom user-provided functions.
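
The three kinds of functions can be pictured with a small standard-library sketch. Everything here is a stand-in: `fraction_zeros` is re-implemented for illustration and is not DeepSparse's built-in, `statistics.fmean` plays the role of a framework function such as `np.mean`, and `brightest_pixel` is a made-up custom function.

```python
from statistics import fmean

# A tiny 2x2 single-channel "image" as nested lists (stand-in for a tensor).
image = [[0, 128], [255, 0]]
pixels = [p for row in image for p in row]


def identity(x):
    """Log the raw value unchanged."""
    return x


def fraction_zeros(values):
    """Stand-in written for this sketch, not DeepSparse's built-in."""
    return sum(1 for v in values if v == 0) / len(values)


def brightest_pixel(values):
    """Example of a custom, user-provided function."""
    return max(values)


functions = {
    "identity": identity,              # raw data
    "mean": fmean,                     # stands in for a framework function
    "fraction_zeros": fraction_zeros,  # stands in for a built-in function
    "brightest_pixel": brightest_pixel,
}

# Apply every function to the same target data, as a logger would receive it.
logged = {name: fn(pixels) for name, fn in functions.items()}
print(logged)
```

Each function takes the target data as its only argument and returns the value that is handed to the logger.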


The YAML-based Server Config file is used to configure both System and Data Logging.

  • System Logging is enabled by default. If no logger is specified, Python Logger is used.
  • Data Logging is disabled by default. The config allows you to specify what data to log.

See the Server documentation for more details on the Server Config file.

Logging YAML Syntax

There are two key elements that should be added to the Server Config to set up logging.

First is loggers. This element configures the loggers that are used by the Server. Each element is a dictionary of the form {logger_name: {arg_1: arg_value}}.

Second is data_logging. This element identifies which data should be logged for an endpoint and how. It is a dictionary of the form {identifier: [log_config]}.

  • identifier specifies the stages where logging should occur. It can either be a pipeline stage (see stages above) or stage.property if the data type at a particular stage has a property. If the data type at a stage is a dictionary or list, you can access its elements via slicing, indexing, or dict access, for example stage[0][:,:,0]['key3'].

  • log_config specifies which function to apply, which logger(s) to use, and how often to log. It is a dictionary of the form {func: name, frequency: freq, target_loggers: [logger_names]}.
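
As a rough illustration of how an identifier such as `pipeline_inputs.images[0]` selects a piece of data, the sketch below walks an identifier string one access at a time. The `resolve` helper is hypothetical and is not part of DeepSparse. It handles only property access, integer indexing, and single-quoted dict keys, and leaves out slicing.

```python
import re
from types import SimpleNamespace

# One access step: .property, [integer index], or ['dict key'].
_TOKEN = re.compile(r"\.(\w+)|\[(\d+)\]|\['([^']+)'\]")


def resolve(identifier, stages):
    """Look up the data an identifier points at (hypothetical helper)."""
    head = re.match(r"\w+", identifier)
    if head is None:
        raise ValueError(f"identifier must start with a stage name: {identifier!r}")
    value = stages[head.group(0)]
    pos = head.end()
    while pos < len(identifier):
        tok = _TOKEN.match(identifier, pos)
        if tok is None:
            raise ValueError(f"unsupported access in {identifier!r} at position {pos}")
        attr, index, key = tok.groups()
        if attr is not None:
            value = getattr(value, attr)
        elif index is not None:
            value = value[int(index)]
        else:
            value = value[key]
        pos = tok.end()
    return value


# Made-up stage data for the demonstration.
stages = {
    "pipeline_inputs": SimpleNamespace(images=[[1, 2], [3, 4]]),
    "engine_outputs": [{"logits": [0.1, 0.9]}],
}
first_image = resolve("pipeline_inputs.images[0]", stages)
logits = resolve("engine_outputs[0]['logits']", stages)
print(first_image, logits)
```

The function named in a log_config would then be applied to whatever value the identifier resolves to.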

Tangible Example

Here's an example for an image classification server:

# example-config.yaml
loggers:
  python:       # logs to stdout
  prometheus:   # logs to prometheus on port 6100
    port: 6100

endpoints:
  - task: image_classification
    route: /image_classification/predict
    model: zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none
    data_logging:
      pipeline_inputs.images:         # applies to the images (of the form stage.property)
        - func: np.shape              # framework function
          frequency: 1
          target_loggers:
            - python

      pipeline_inputs.images[0]:      # applies to the first image (of the form stage.property[idx])
        - func: mean_pixels_per_channel   # built-in function
          frequency: 2
          target_loggers:
            - python
        - func: fraction_zeros        # built-in function
          frequency: 2
          target_loggers:
            - prometheus

      engine_inputs:                  # applies to the engine_inputs data (of the form stage)
        - func: np.shape              # framework function
          frequency: 1
          target_loggers:
            - python

This configuration does the following:

  • System logging is enabled by default and logs to both stdout and Prometheus
  • Logs the shape of the input batch provided by the user to stdout
  • Logs the mean pixel value per channel of the first image in the batch to stdout and the fraction of zero-valued pixels to Prometheus (both with frequency 2)
  • Logs the shape of the input passed to the engine to stdout
  • No data logging occurs at any other pipeline stage
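
The frequency field controls how often a function's result is logged. The sketch below assumes that a frequency of N means "log on every N-th inference call"; both that reading and the `FrequencyFilter` class are assumptions made for illustration, so check the Server documentation for the exact behavior.

```python
class FrequencyFilter:
    """Decide whether to log on a given call (assumed every-N-th semantics)."""

    def __init__(self, frequency: int):
        if frequency < 1:
            raise ValueError("frequency must be a positive integer")
        self.frequency = frequency
        self.calls = 0

    def should_log(self) -> bool:
        # Count the call, then log only when the count is a multiple of N.
        self.calls += 1
        return self.calls % self.frequency == 0


every_second = FrequencyFilter(frequency=2)
decisions = [every_second.should_log() for _ in range(6)]
print(decisions)
```

A higher frequency value reduces logging overhead for expensive functions at the cost of coarser monitoring.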


DeepSparse Logging includes options to log to Standard Output and to Prometheus out of the box as well as the ability to create a Custom Logger.

Python Logger

Python Logger logs data to Standard Output. It is useful for debugging and inspecting an inference pipeline. It accepts no arguments and is configured with the following:

loggers:
  python:

Prometheus Logger

DeepSparse is integrated with Prometheus, enabling you to easily instrument your model service. The Prometheus Logger accepts some optional arguments and is configured as follows:

loggers:
  prometheus:
    port: 6100
    text_log_save_frequency: 10              # optional
    text_log_save_dir: text/log/save/dir     # optional
    text_log_file_name: text_log_file_name   # optional

There are four types of metrics in Prometheus (Counter, Gauge, Summary, and Histogram). DeepSparse uses Summary under the hood, so make sure the data you are logging to Prometheus is an Int or a Float.
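
Because only numeric values fit a Summary, it can help to check what a logging function returns before routing it to Prometheus. The helper below is a hypothetical pre-check written for this page, not a DeepSparse API. Rejecting booleans is a design choice of the sketch.

```python
from numbers import Real


def is_summary_compatible(value) -> bool:
    """True for real numbers (ints, floats); bools are rejected on purpose."""
    return isinstance(value, Real) and not isinstance(value, bool)


# Made-up examples of values a logging function might return.
candidates = {
    "fraction_zeros": 0.25,
    "sequence_length": 7,
    "input_shape": (1, 3, 224, 224),
    "label": "goldfish",
}
compatible = {name: is_summary_compatible(v) for name, v in candidates.items()}
print(compatible)
```

This matches the example config above, where the results of np.shape are sent only to the python logger while scalar values such as fraction_zeros go to Prometheus.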

Custom Logger

If you need a custom logger, you can create a class that inherits from the BaseLogger and implements the log method. The log method is called at each pipeline stage and should handle exposing the metric to the Logger.

from deepsparse.loggers import BaseLogger
from typing import Any, Optional

class CustomLogger(BaseLogger):
    def log(self, identifier: str, value: Any, category: Optional[str] = None):
        """
        :param identifier: The name of the item that is being logged.
            By default, in the simplest case, that would be a string in the form
            of "<pipeline_name>/<logging_target>"
            e.g. "image_classification/pipeline_inputs"
        :param value: The item that is logged along with the identifier
        :param category: The metric category that the log belongs to.
            By default, we recommend sticking to our internal convention
            established in the MetricsCategories enum.
        """
        print("Logging from a custom logger")
        print(identifier)
        print(value)

Once a custom logger is implemented, it can be referenced from a config file:

# server-config-with-custom-logger.yaml
loggers:
  custom_logger:
    path: example_custom_logger.py:CustomLogger
    # arg_1: your_arg_1

endpoints:
  - task: sentiment_analysis
    route: /sentiment_analysis/predict
    model: zoo:nlp/sentiment_analysis/bert-base/pytorch/huggingface/sst2/12layer_pruned80_quant-none-vnni
    name: sentiment_analysis_pipeline
    data_logging:
      pipeline_inputs:
        - func: identity
          frequency: 1
          target_loggers:
            - custom_logger

Download the following for an example of a custom logger:

wget https://raw.githubusercontent.com/neuralmagic/docs/rs/embedding-extraction-feature/files-for-examples/user-guide/deepsparse/logging/example_custom_logger.py
wget https://raw.githubusercontent.com/neuralmagic/docs/rs/embedding-extraction-feature/files-for-examples/user-guide/deepsparse/logging/server-config-with-custom-logger.yaml

Launch the server:

deepsparse.server --config-file server-config-with-custom-logger.yaml

Submit a request:

import requests

# Set this to your server's address followed by the /sentiment_analysis/predict route.
url = ""
obj = {"sequences": "Snorlax loves my Tesla!"}
resp = requests.post(url=url, json=obj)

You should see data printed to the Server's standard output.

See our Prometheus logger implementation for inspiration on implementing a logger.
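
As one more illustration of the log method contract, here is a sketch of a logger that keeps the most recent values per identifier in memory. To keep it runnable without DeepSparse installed, it defines a stand-in `BaseLogger`; in a real deployment you would import `BaseLogger` from `deepsparse.loggers` instead. `InMemoryLogger` and its `max_items` argument are hypothetical names.

```python
from typing import Any, Dict, List, Optional


class BaseLogger:
    """Stand-in for deepsparse.loggers.BaseLogger, used only in this sketch."""

    def log(self, identifier: str, value: Any, category: Optional[str] = None):
        raise NotImplementedError


class InMemoryLogger(BaseLogger):
    """Keep the last `max_items` logged values for each identifier."""

    def __init__(self, max_items: int = 100):
        if max_items < 1:
            raise ValueError("max_items must be at least 1")
        self.max_items = max_items
        self.records: Dict[str, List[Any]] = {}

    def log(self, identifier: str, value: Any, category: Optional[str] = None):
        history = self.records.setdefault(identifier, [])
        history.append(value)
        # Drop the oldest entry once the per-identifier cap is exceeded.
        if len(history) > self.max_items:
            history.pop(0)


logger = InMemoryLogger(max_items=2)
for text in ["first", "second", "third"]:
    logger.log("sentiment_analysis_pipeline/pipeline_inputs", text)
print(logger.records)
```

A constructor argument such as `max_items` would presumably be supplied through the logger's entry in the config file, in the same place as the commented-out `arg_1` in the example above.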


DeepSparse Logging is currently supported for usage with DeepSparse Server.

Server Usage

The Server startup CLI command accepts a YAML configuration file (which contains both logging-specific and general configuration details) via the --config-file argument.

Data Logging is configured at the endpoint level. The configuration file below creates a Server with two endpoints (one for image classification and one for sentiment analysis):

# server-config.yaml
loggers:
  python:
  prometheus:
    port: 6100

endpoints:
  - task: image_classification
    route: /image_classification/predict
    model: zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none
    name: image_classification_pipeline
    data_logging:
      pipeline_inputs.images:
        - func: np.shape
          frequency: 1
          target_loggers:
            - python

      pipeline_inputs.images[0]:
        - func: max_pixels_per_channel
          frequency: 1
          target_loggers:
            - python
        - func: mean_pixels_per_channel
          frequency: 1
          target_loggers:
            - python
        - func: fraction_zeros
          frequency: 1
          target_loggers:
            - prometheus

      pipeline_outputs.scores[0]:
        - func: identity
          frequency: 1
          target_loggers:
            - prometheus

  - task: sentiment_analysis
    route: /sentiment_analysis/predict
    model: zoo:nlp/sentiment_analysis/bert-base/pytorch/huggingface/sst2/12layer_pruned80_quant-none-vnni
    name: sentiment_analysis_pipeline
    data_logging:
      engine_inputs:
        - func: example_custom_fn.py:sequence_length
          frequency: 1
          target_loggers:
            - python
            - prometheus

      pipeline_outputs.scores[0]:
        - func: identity
          frequency: 1
          target_loggers:
            - python
            - prometheus

Custom Data Logging Function

The example above included a custom function for computing sequence lengths. Custom Functions should be defined in a local Python file. They should accept one argument and return a single output.

The example_custom_fn.py file could look like the following:

import numpy as np
from typing import List

# Engine inputs to transformers is 3 lists of np.arrays representing
# the encoded input, the attention mask, and token types.
# Each of the np.arrays is of shape (batch, max_seq_len), so
# engine_inputs[0][0] gives the encodings of the first item in the batch.
# The number of non-zeros in this slice is the sequence length.
def sequence_length(engine_inputs: List[np.ndarray]):
    return np.count_nonzero(engine_inputs[0][0])
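
To see what the sequence_length function above computes, the following sketch repeats the same logic with plain Python lists so it can be checked by hand. The token ids are made up, and the sketch assumes that padded positions are encoded as 0, which is what the comments in the example rely on.

```python
from typing import List


def sequence_length_plain(engine_inputs: List[List[List[int]]]) -> int:
    """Count the non-zero token ids of the first item in the batch."""
    first_item_ids = engine_inputs[0][0]
    return sum(1 for token_id in first_item_ids if token_id != 0)


# One batch item padded with zeros to a max_seq_len of 8 (made-up values).
input_ids = [[101, 7592, 2088, 102, 0, 0, 0, 0]]
attention_mask = [[1, 1, 1, 1, 0, 0, 0, 0]]
token_type_ids = [[0, 0, 0, 0, 0, 0, 0, 0]]
engine_inputs = [input_ids, attention_mask, token_type_ids]

length = sequence_length_plain(engine_inputs)
print(length)  # 4 non-padding tokens
```

Like any custom function, it accepts one argument (the data at the chosen stage) and returns a single value, here an integer that also suits the Prometheus logger.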

Launching the Server and Logging Metrics

Download the server-config.yaml, example_custom_fn.py, and goldfish.jpg for the demo.

wget https://raw.githubusercontent.com/neuralmagic/docs/rs/embedding-extraction-feature/files-for-examples/user-guide/deepsparse/logging/server-config.yaml
wget https://raw.githubusercontent.com/neuralmagic/docs/rs/embedding-extraction-feature/files-for-examples/user-guide/deepsparse/logging/example_custom_fn.py
wget https://raw.githubusercontent.com/neuralmagic/docs/rs/embedding-extraction-feature/files-for-examples/user-guide/deepsparse/logging/goldfish.jpg

Launch the Server with the following:

deepsparse.server --config-file server-config.yaml

Submit a request to the image classification endpoint.

import requests

# Set this to your server's address followed by the /image_classification/predict route.
url = ""
paths = ["goldfish.jpg"]
files = [("request", open(img, 'rb')) for img in paths]
resp = requests.post(url=url, files=files)

Submit a request to the sentiment analysis endpoint with the following:

import requests

# Set this to your server's address followed by the /sentiment_analysis/predict route.
url = ""
obj = {"sequences": "Snorlax loves my Tesla!"}
resp = requests.post(url=url, json=obj)

You should see the metrics logged to the Server's standard output and to Prometheus (visit http://localhost:6100 to quickly inspect the exposed metrics).
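
If you prefer to inspect the exposed metrics from a script rather than a browser, a minimal sketch along these lines may help. It assumes the endpoint serves the standard Prometheus text exposition format without timestamps. `parse_metrics` and `fetch_metrics` are hypothetical helpers, and the metric names in the sample are invented.

```python
from typing import Dict
from urllib.request import urlopen


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse 'name value' sample lines, skipping comments and blank lines."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated field on the line.
        name, _, value = line.rpartition(" ")
        metrics[name] = float(value)
    return metrics


def fetch_metrics(url: str = "http://localhost:6100") -> Dict[str, float]:
    """Download and parse the metrics page (requires a running Server)."""
    with urlopen(url, timeout=5) as response:
        return parse_metrics(response.read().decode("utf-8"))


# Invented sample in the Prometheus text format, used instead of a live request.
sample = """\
# HELP example_metric A made-up metric for this sketch
# TYPE example_metric summary
example_metric_count 3.0
example_metric_sum 1.5
"""
parsed = parse_metrics(sample)
print(parsed)
```

With the Server running, calling `fetch_metrics()` returns the same kind of dictionary for the real metrics.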
