This page explains how to use DeepSparse Logging to monitor your deployment.
There are many monitoring tasks you may want to perform to confirm that your production system is working correctly. They range from relatively easy (simple system performance analysis) to challenging (assessing the accuracy of the system in the wild by manually labeling the input data distribution after the fact).
DeepSparse Logging is designed to provide maximum flexibility for you to extract whatever data is needed from a production inference pipeline into the logging system of your choice.
This page requires the DeepSparse Server Install.
DeepSparse Logging provides access to two types of metrics.
System Logging gives you access to granular performance metrics for quick and efficient diagnosis of system health.
There is one group of System Logging metrics currently available: Inference Latency, which DeepSparse Server logs for each inference request.
Data Logging gives you access to data at each stage of an inference pipeline. This facilitates inspecting the data, understanding its properties, and detecting edge cases and possible data drift.
There are four stages in the inference pipeline where Data Logging can occur:
- `pipeline_inputs`: raw input passed to the inference pipeline by the user
- `engine_inputs`: pre-processed tensors passed to the engine for the forward pass
- `engine_outputs`: result of the engine forward pass (e.g., the raw logits)
- `pipeline_outputs`: final output returned to the pipeline caller

At each stage, you can specify functions to be applied to the data before logging. Example functions include the identity function (for logging the raw input/output) or the mean function (e.g., for monitoring the mean pixel value of an image).
There are three types of functions that can be applied to target data at each stage: built-in functions, framework functions (from `torch` or `numpy`), and custom functions.
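For instance, a custom function is just a plain Python callable that takes the data at a stage and returns a loggable value. The sketch below is illustrative only (the function name and input shape are assumptions, not part of the DeepSparse API); it computes a mean pixel value in the spirit of the built-in functions:

```python
import numpy as np

# Hypothetical custom metric: mean pixel value of an image batch.
# Assumes images arrive as a numpy array of shape (batch, H, W, C).
def mean_pixel(images: np.ndarray) -> float:
    return float(np.mean(images))

images = np.full((1, 2, 2, 3), 128.0)
print(mean_pixel(images))  # 128.0
```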
The YAML-based Server Config file is used to configure both System and Data Logging.
See the Server documentation for more details on the Server config file.
There are two key elements that should be added to the Server Config to set up logging.
First is `loggers`. This element configures the loggers that are used by the Server. Each element is a dictionary of the form `{logger_name: {arg_1: arg_value}}`.
Second is `data_logging`. This element identifies which data should be logged for an endpoint and how. It is a dictionary of the form `{identifier: [log_config]}`.
`identifier` specifies the stage where logging should occur. It can be either a pipeline `stage` (see stages above) or `stage.property` if the data type at a particular stage has a property. If the data type at a stage is a dictionary or list, you can access elements via slicing, indexing, or dict access, for example `stage[0][:,:,0]['key3']`.
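The indexing in an identifier mirrors ordinary Python indexing. As a rough illustration (the data below is made up, not from a real pipeline), an identifier suffix like `[0][:, :, 0]` selects the first channel-slice of the first item at that stage:

```python
import numpy as np

# Made-up stand-in for the data at some pipeline stage:
# a list containing one (2, 2, 3) array.
stage = [np.arange(12).reshape(2, 2, 3)]

# Equivalent of the identifier suffix "[0][:, :, 0]":
selected = stage[0][:, :, 0]
print(selected.tolist())  # [[0, 3], [6, 9]]
```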
`log_config` specifies which function to apply, which logger(s) to use, and how often to log. It is a dictionary of the form `{func: name, frequency: freq, target_loggers: [logger_names]}`.
Here's an example for an image classification server:
```yaml
# example-config.yaml
loggers:
  python: # logs to stdout
  prometheus: # logs to prometheus on port 6100
    port: 6100

endpoints:
  - task: image_classification
    route: /image_classification/predict
    model: zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none
    data_logging:
      pipeline_inputs.images: # applies to the images (of the form stage.property)
        - func: np.shape # framework function
          frequency: 1
          target_loggers:
            - python

      pipeline_inputs.images[0]: # applies to the first image (of the form stage.property[idx])
        - func: mean_pixels_per_channel # built-in function
          frequency: 2
          target_loggers:
            - python
        - func: fraction_zeros # built-in function
          frequency: 2
          target_loggers:
            - prometheus

      engine_inputs: # applies to the engine_inputs data (of the form stage)
        - func: np.shape # framework function
          frequency: 1
          target_loggers:
            - python
```
This configuration logs the following at each respective stage of the pipeline:

- `pipeline_inputs.images`: the shape of the input image batch, logged to stdout on every request
- `pipeline_inputs.images[0]`: the mean pixel value per channel of the first image (to stdout) and the fraction of zero-valued pixels (to Prometheus), each on every other request
- `engine_inputs`: the shape of the pre-processed tensors, logged to stdout on every request
DeepSparse Logging includes options to log to Standard Output and to Prometheus out of the box as well as the ability to create a Custom Logger.
Python Logger logs data to Standard Output. It is useful for debugging and inspecting an inference pipeline. It accepts no arguments and is configured with the following:
```yaml
loggers:
  python:
```
DeepSparse is integrated with Prometheus, enabling you to easily instrument your model service. The Prometheus Logger accepts some optional arguments and is configured as follows:
```yaml
loggers:
  prometheus:
    port: 6100
    text_log_save_frequency: 10 # optional
    text_log_save_dir: text/log/save/dir # optional
    text_log_file_name: text_log_file_name # optional
```
There are four types of metrics in Prometheus (Counter, Gauge, Summary, and Histogram). DeepSparse uses Summary under the hood, so make sure the data you are logging to Prometheus is an Int or a Float.
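Since a Summary only observes numeric samples, a quick guard like the following (a hypothetical helper, not part of DeepSparse) shows which values are safe to send to the Prometheus Logger:

```python
def is_prometheus_loggable(value) -> bool:
    # Summaries observe numeric samples, so only ints and floats qualify
    # (bool is excluded even though it subclasses int).
    return isinstance(value, (int, float)) and not isinstance(value, bool)

print(is_prometheus_loggable(0.93))   # True
print(is_prometheus_loggable("cat"))  # False
```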
If you need a custom logger, you can create a class that inherits from `BaseLogger` and implements the `log` method. The `log` method is called at each pipeline stage and should handle exposing the metric to the logger.
```python
from typing import Any, Optional

from deepsparse.loggers import BaseLogger

class CustomLogger(BaseLogger):
    def log(self, identifier: str, value: Any, category: Optional[str] = None):
        """
        :param identifier: The name of the item that is being logged.
            By default, in the simplest case, that would be a string in the form
            of "<pipeline_name>/<logging_target>",
            e.g. "image_classification/pipeline_inputs"
        :param value: The item that is logged along with the identifier
        :param category: The metric category that the log belongs to.
            By default, we recommend sticking to our internal convention
            established in the MetricsCategories enum.
        """
        print("Logging from a custom logger")
        print(identifier)
        print(value)
```
Once a custom logger is implemented, it can be referenced from a config file:
```yaml
# server-config-with-custom-logger.yaml
loggers:
  custom_logger:
    path: example_custom_logger.py:CustomLogger
    # arg_1: your_arg_1

endpoints:
  - task: sentiment_analysis
    route: /sentiment_analysis/predict
    model: zoo:nlp/sentiment_analysis/bert-base/pytorch/huggingface/sst2/12layer_pruned80_quant-none-vnni
    name: sentiment_analysis_pipeline
    data_logging:
      pipeline_inputs:
        - func: identity
          frequency: 1
          target_loggers:
            - custom_logger
```
Download the following for an example of a custom logger:
```bash
wget https://raw.githubusercontent.com/neuralmagic/docs/rs/embedding-extraction-feature/files-for-examples/user-guide/deepsparse/logging/example_custom_logger.py
wget https://raw.githubusercontent.com/neuralmagic/docs/rs/embedding-extraction-feature/files-for-examples/user-guide/deepsparse/logging/server-config-with-custom-logger.yaml
```
Launch the server:
```bash
deepsparse.server --config-file server-config-with-custom-logger.yaml
```
Submit a request:
```python
import requests

url = "http://0.0.0.0:5543/sentiment_analysis/predict"
obj = {"sequences": "Snorlax loves my Tesla!"}
resp = requests.post(url=url, json=obj)
print(resp.text)
```
You should see data printed to the Server's standard output.
See our Prometheus logger implementation for inspiration on implementing a logger.
DeepSparse Logging is currently supported for use with DeepSparse Server.
The Server startup CLI command accepts a YAML configuration file (which contains both logging-specific and general configuration details) via the `--config-file` argument.
Data Logging is configured at the endpoint level. The configuration file below creates a Server with two endpoints (one for image classification and one for sentiment analysis):
```yaml
# server-config.yaml
loggers:
  python:
  prometheus:
    port: 6100

endpoints:
  - task: image_classification
    route: /image_classification/predict
    model: zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none
    name: image_classification_pipeline
    data_logging:
      pipeline_inputs.images:
        - func: np.shape
          frequency: 1
          target_loggers:
            - python

      pipeline_inputs.images[0]:
        - func: max_pixels_per_channel
          frequency: 1
          target_loggers:
            - python
        - func: mean_pixels_per_channel
          frequency: 1
          target_loggers:
            - python
        - func: fraction_zeros
          frequency: 1
          target_loggers:
            - prometheus

      pipeline_outputs.scores[0]:
        - func: identity
          frequency: 1
          target_loggers:
            - prometheus

  - task: sentiment_analysis
    route: /sentiment_analysis/predict
    model: zoo:nlp/sentiment_analysis/bert-base/pytorch/huggingface/sst2/12layer_pruned80_quant-none-vnni
    name: sentiment_analysis_pipeline
    data_logging:
      engine_inputs:
        - func: example_custom_fn.py:sequence_length
          frequency: 1
          target_loggers:
            - python
            - prometheus

      pipeline_outputs.scores[0]:
        - func: identity
          frequency: 1
          target_loggers:
            - python
            - prometheus
```
The example above uses a custom function for computing sequence lengths. Custom functions should be defined in a local Python file. They should accept one argument and return a single output.
The `example_custom_fn.py` file could look like the following:
```python
from typing import List

import numpy as np

# Engine inputs to transformers is 3 lists of np.arrays representing
# the encoded input, the attention mask, and token types.
# Each of the np.arrays is of shape (batch, max_seq_len), so
# engine_inputs[0][0] gives the encodings of the first item in the batch.
# The number of non-zeros in this slice is the sequence length.
def sequence_length(engine_inputs: List[np.ndarray]):
    return np.count_nonzero(engine_inputs[0][0])
```
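You can sanity-check the function locally on dummy engine inputs. The shapes below are assumptions for illustration (batch size 1, `max_seq_len` of 128), not values dictated by DeepSparse:

```python
import numpy as np

def sequence_length(engine_inputs):
    # number of non-zero token ids in the first batch item
    return np.count_nonzero(engine_inputs[0][0])

ids = np.zeros((1, 128), dtype=np.int64)
ids[0, :7] = 101  # pretend the first 7 positions hold real token ids
dummy_inputs = [ids, np.ones((1, 128)), np.zeros((1, 128))]
print(sequence_length(dummy_inputs))  # 7
```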
Download `server-config.yaml`, `example_custom_fn.py`, and `goldfish.jpg` for the demo.
```bash
wget https://raw.githubusercontent.com/neuralmagic/docs/rs/embedding-extraction-feature/files-for-examples/user-guide/deepsparse/logging/server-config.yaml
wget https://raw.githubusercontent.com/neuralmagic/docs/rs/embedding-extraction-feature/files-for-examples/user-guide/deepsparse/logging/example_custom_fn.py
wget https://raw.githubusercontent.com/neuralmagic/docs/rs/embedding-extraction-feature/files-for-examples/user-guide/deepsparse/logging/goldfish.jpg
```
Launch the Server with the following:
```bash
deepsparse.server --config-file server-config.yaml
```
Submit a request to the image classification endpoint.
```python
import requests

url = "http://0.0.0.0:5543/image_classification/predict/from_files"
paths = ["goldfish.jpg"]
files = [("request", open(img, "rb")) for img in paths]
resp = requests.post(url=url, files=files)
print(resp.text)
```
Submit a request to the sentiment analysis endpoint with the following:
```python
import requests

url = "http://0.0.0.0:5543/sentiment_analysis/predict"
obj = {"sequences": "Snorlax loves my Tesla!"}
resp = requests.post(url=url, json=obj)
print(resp.text)
```
You should see the metrics logged to the Server's standard output and to Prometheus (visit `http://localhost:6100` to quickly inspect the exposed metrics).
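Prometheus serves its metrics in a plain-text exposition format, so a quick script can pull and inspect them. The parser below is a minimal sketch only (it ignores labels, skips comment lines, and the metric names in the sample are made up):

```python
def parse_prometheus_text(text: str) -> dict:
    # Keep only simple "name value" sample lines; skip comments/blank lines.
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        try:
            metrics[name] = float(value)
        except ValueError:
            pass  # ignore lines that are not simple numeric samples
    return metrics

# Example exposition text of the kind a Summary might produce:
sample = """# HELP example_metric An example
# TYPE example_metric summary
example_metric_count 3
example_metric_sum 1.5
"""
print(parse_prometheus_text(sample))
```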