

The machine learning community includes a vast array of terminology that can have variations in meaning depending on context. This glossary is not intended as a comprehensive list, but rather a clarification of terms you may encounter with Neural Magic and machine learning.

AutoML: Automated Machine Learning. Techniques and tools that automate the building of ML and deep learning models, aiming to reduce or eliminate the need for skilled data scientists. Google AutoML, for example, is a suite of cloud-based ML products.
AVX2: Advanced Vector Extensions 2. 256-bit SIMD (single instruction, multiple data) extension to the x86 instruction set, available on Intel and AMD CPUs.
AVX-512: Advanced Vector Extensions 512. 512-bit SIMD extensions to the x86 instruction set, available on many Intel CPUs. AVX-512 yields higher performance for demanding computational tasks.
Cascade Lake Chips: Intel CPU chips with up to 28 cores that are improved for machine learning through added VNNI instructions, which accelerate INT8 integer operations.
Convolutional Neural Network (CNN): Artificial neural network specifically designed to process pixel data, used in image recognition and object detection.
Deep Learning (DL): Subset of machine learning in which artificial neural networks (algorithms inspired by the human brain) learn from large amounts of data.
Deep Learning Frameworks: Interface, library, or tool that allows one to build deep learning models more easily and quickly without getting into details of underlying algorithms.
DLRM: Open-source Deep Learning Recommendation Model from Facebook.
Fully Connected Network: Network in which every node in a layer (except the input and output layer) is connected to every node in the previous layer and following layer.
Image Classification: Supervised learning problem to define a set of target classes (objects to identify in images) and train a model to recognize them using labeled example photos.
Image Segmentation: In computer vision, the process of partitioning a digital image into multiple segments to simplify and/or change the representation of an image into something more meaningful and easier to analyze.
Inference: Process of using a trained machine learning model to make predictions on new data (in deployment, typically handled by machine learning engineers).
MobileNets: A family of mobile-first computer vision models for TensorFlow, designed to effectively maximize accuracy while being mindful of the restricted resources for an on-device or embedded application.
Model pipelines: In machine learning deployment, multiple models chained together to achieve business goals (such as a detection model to select regions from an image for a later visual search model).
Model serving: In machine learning deployment, hosting trained models so that applications can request predictions from them. Efficient serving makes models less expensive and faster to run by better using resources on the machine.
Multilayer Perceptron (MLP): A feedforward artificial neural network (ANN) model, composed of more than one perceptron, that maps sets of input data onto a set of appropriate outputs.
Neural Network: System of hardware and/or software patterned after the operation of neurons in the human brain.
Object Detection: Identification of the objects in an image, including the class and location of each object.
ONNX: Open Neural Network Exchange. Open-source format for representing machine learning models so that they can be moved between frameworks and tools. ONNX Runtime is an open-source, performance-focused inference (scoring) engine for ONNX models.
Quantization: The process of approximating a neural network that uses floating-point numbers by a neural network of low bit width numbers. Quantization dramatically reduces the memory requirement and computational cost of using neural networks.
Recommendations: Class of machine learning algorithms that produces relevant suggestions by finding similarities between items (for example, between different images).
ResNet: Residual Network. Family of deep convolutional neural networks for image classification that uses skip (residual) connections between layers.
Sparsification: A model optimization technique used to improve performance by removing or reducing elements, vectors, and matrices that are not critical to the model's accuracy.
SSD: Single Shot Detector. Convolutional neural network (CNN) algorithm for object detection that aims to balance speed and accuracy. SSD runs the CNN on an input image only one time and computes feature maps from which it predicts object classes and bounding boxes.
Structured pruning: A method for compressing a neural network. Structured pruning alternates between removing channel connections and fine-tuning to reduce overall width of the network. Structured pruning severely limits the maximum sparsity that can be imposed on a network when compared with unstructured pruning.
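Tensor: Multidimensional array of numbers. In a CNN, the input to a convolutional layer is typically a 3- or 4-dimensional tensor representing a 2D image (for example, channels, height, and width, plus an optional batch dimension).
Training: The process of feeding an ML algorithm with data so that it learns good values for its parameters (such as weights and biases).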
U-Net: Fully convolutional network that does image segmentation (originally designed for medical image segmentation). The U-Net goal is to predict each pixel class.
Unstructured pruning: A method for compressing a neural network. Unstructured pruning removes individual weight connections from a trained network. Software like Neural Magic's DeepSparse runs these pruned networks faster.
VNNI: Vector Neural Network Instructions. New versions of Intel's CPU chips are optimized with VNNI, making them faster and more efficient for certain types of machine learning applications.
YOLO: You Only Look Once. Open-source family of CNN-based object detection methods that recognize objects in images and videos quickly by processing each image in a single pass.
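
The Quantization and Unstructured pruning entries above can be made concrete with a small sketch in plain Python. It is illustrative only and does not use the SparseML or DeepSparse APIs. The function names (`magnitude_prune`, `quantize_int8`, `dequantize`) are hypothetical, and two choices are assumptions that the glossary does not specify: weights are pruned by smallest magnitude, and quantization is symmetric INT8 with a single scale for the whole weight list.

```python
import random


def magnitude_prune(weights, sparsity):
    """Unstructured pruning sketch: zero out the smallest-magnitude weights.

    Assumption: magnitude is used as the pruning criterion.
    """
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights):
    """Symmetric quantization sketch: map floats to integers in [-127, 127].

    Assumption: one scale factor is shared by all weights.
    """
    max_abs = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floating-point values from the integers."""
    return [v * scale for v in q]


random.seed(0)
weights = [random.uniform(-1.0, 1.0) for _ in range(16)]

pruned = magnitude_prune(weights, sparsity=0.75)  # 12 of 16 weights zeroed
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)

# Rounding error is bounded by half of one quantization step.
max_error = max(abs(a - b) for a, b in zip(pruned, restored))
print("nonzero weights:", sum(1 for w in pruned if w != 0.0))
print("max quantization error:", max_error)
```

Pruned weights stay exactly zero after quantization because zero maps to the integer 0, so the two techniques can be combined. A structured-pruning variant would remove whole channels rather than individual weights.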