The machine learning community includes a vast array of terminology that can have variations in meaning depending on context. This glossary is not intended as a comprehensive list, but rather a clarification of terms you may encounter with Neural Magic and machine learning.

AutoML Automated Machine Learning. Platform that aims to reduce or eliminate the need for skilled data scientists to build ML and deep learning models. Google AutoML, for example, is a suite of cloud-based ML products.
AVX2 Advanced Vector Extensions 2. Instruction set used for applications on an Intel CPU.
AVX-512 Advanced Vector Extensions. Instruction set on Intel CPUs that impacts compute, storage, and network instructions. AVX-512 yields higher performance for demanding computational tasks.
Cascade Lake Chips Intel CPU chips up to 28 cores that are improved for machine learning and added VNNI instructions. Cascade Lake Chips support FP16 and INT8 floating point operations.
Convolutional Neural Network (CNN) Artificial neural network used in image recognition and object detection as well as processing that is specifically designed to process pixel data.
Deep Learning (DL) Subset of machine learning in which artificial neural networks (algorithms inspired by the human brain) learn from large amounts of data.
Deep Learning Frameworks Interface, library, or tool that allows one to build deep learning models more easily and quickly without getting into details of underlying algorithms.
DLRM Open-source Deep Learning Recommendation Model from Facebook.
Fully Connected Network Network in which every node in a layer (except the input and output layer) is connected to every node in the previous layer and following layer.
Image Classification Supervised learning problem to define a set of target classes (objects to identify in images) and train a model to recognize them using labeled example photos.
Image Segmentation In computer vision, the process of partitioning a digital image into multiple segments to simplify and/or change the representation of an image into something more meaningful and easier to analyze.
Inference Process of using a trained machine learning algorithm to make predictions (done by machine learning engineers).
MobileNets A family of mobile-first computer vision models for TensorFlow, designed to effectively maximize accuracy while being mindful of the restricted resources for an on-device or embedded application.
Model pipelines In machine learning deployment, multiple models chained together to achieve business goals (such as a detection model to select regions from an image for a later visual search model).
Model serving In machine learning deployment, makes serving of models less expensive and faster to run by better using resources on the machine.
Multilayer Perceptron (MLP) A feedforward artificial neural network (ANN) model, composed of more than one perceptron, that maps sets of input data onto a set of appropriate outputs.
Neural Network System of hardware and/or software patterned after the operation of neurons in the human brain.
Object Detection Categorization of an image based on the number of objects in the image and/or the location of the objects.
ONNX Open Neural Network Exchange. Open-source inference engine that is a performance-focused complete scoring engine for ONNX models.
Quantization The process of approximating a neural network that uses floating-point numbers by a neural network of low bit width numbers. Quantization dramatically reduces the memory requirement and computational cost of using neural networks.
Recommendations Categorization of an image based on relevant suggestions. This class of machine learning algorithms finds similarity between different images.
ResNet Image classification model that is structurally dense.
Sparsification A model optimization technique used to improve performance by reducing the number of nonperformance critical elements, vectors, and matrices.
SSD Single Shot Detector. Convolutional neural network (CNN) algorithm for object detection that provides better balance between swiftness and precision. SSD runs CNN on an input image only one time and computes a feature map.
Structured pruning A method for compressing a neural network. Structured pruning alternates between removing channel connections and fine-tuning to reduce overall width of the network. Structured pruning severely limits the maximum sparsity that can be imposed on a network when compared with unstructured pruning.
Tensor The input to a convolutional layer. Tensor is a 3 or 4 dimensional representation of a 2D image.
Training The process of feeding an ML algorithm with data to help identify and learn good values for all attributes involved.
U-Net Fully convolutional network that does image segmentation (originally designed for medical image segmentation). The U-Net goal is to predict each pixel class.
Unstructured pruning A method for compressing a neural network. Unstructured pruning removes individual weight connections from a trained network. Software like Neural Magic's DeepSparse Engine runs these pruned networks faster.
VNNI Vector Neural Network Instructions. New versions of Intel's CPU chips are optimized with VNNI, making them faster and more efficient for certain types of machine learning applications.
YOLO You Only Look Once. Open-source type of CNN method of object detection that can recognize objects in images and videos swiftly.