sparseml.pytorch.datasets.detection package
Submodules
sparseml.pytorch.datasets.detection.coco module
-
class sparseml.pytorch.datasets.detection.coco.CocoDetectionDataset(root: str = '~/.cache/nm_datasets/coco-detection', train: bool = False, rand_trans: bool = False, download: bool = True, year: str = '2017', image_size: int = 300, preprocessing_type: Optional[str] = None, default_boxes: Optional[sparseml.pytorch.utils.ssd_helpers.DefaultBoxes] = None)[source]
Bases: object
Wrapper for the COCO detection dataset that applies standard transforms for input to detection models. Returns the processed image along with a tuple of its bounding boxes in ltrb format and a label for each box.
If a DefaultBoxes object is provided, the boxes and labels are encoded with that object into a tensor of offsets to the default boxes and labels for those boxes, and a three-item tuple of the encoded boxes, the encoded labels, and their original values is returned.
- Parameters
root – The root folder to find the dataset at, if not found will download here if download=True
train – True if this is for the training distribution, False for the validation
rand_trans – True to apply RandomCrop and RandomHorizontalFlip to the data, False otherwise
download – True to download the dataset, False otherwise.
year – The dataset year, supports years 2014, 2015, and 2017.
image_size – the size of the image to output from the dataset
preprocessing_type – Type of standard pre-processing to perform. Options are ‘yolo’, ‘ssd’, or None. None defaults to just image normalization with no extra processing of bounding boxes.
default_boxes – DefaultBoxes object used to encode bounding boxes and labels for model loss computation for SSD models. Only used when preprocessing_type='ssd'. The default object represents the default boxes used in the standard SSD 300 implementation.
-
property default_boxes
- Returns
The DefaultBoxes object used to encode this dataset's bounding boxes
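For reference, ltrb means (left, top, right, bottom) pixel coordinates. A minimal, self-contained sketch (not part of the sparseml API) of converting COCO's native [x, y, width, height] annotation boxes into ltrb:

```python
def coco_xywh_to_ltrb(box):
    """Convert a COCO-style [x, y, width, height] box to
    [left, top, right, bottom] pixel coordinates."""
    x, y, w, h = box
    return [x, y, x + w, y + h]

# A 100x50 box whose top-left corner is at (10, 20)
print(coco_xywh_to_ltrb([10, 20, 100, 50]))  # [10, 20, 110, 70]
```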
-
sparseml.pytorch.datasets.detection.coco.coco_2017_yolo(root: str = '~/.cache/nm_datasets/coco-detection', train: bool = False, rand_trans: bool = False, download: bool = True, year: str = '2017', image_size: int = 640, preprocessing_type: str = 'yolo')[source]
Wrapper for the COCO detection dataset with Dataset Registry values properly created for a Yolo model trained on 80 classes.
- Parameters
root – The root folder to find the dataset at, if not found will download here if download=True
train – True if this is for the training distribution, False for the validation
rand_trans – True to apply RandomCrop and RandomHorizontalFlip to the data, False otherwise
download – True to download the dataset, False otherwise.
year – The dataset year. The only valid option is 2017, which is the default.
image_size – the size of the image to output from the dataset
preprocessing_type – Type of standard pre-processing to perform. The only valid option is 'yolo', which is the default.
sparseml.pytorch.datasets.detection.helpers module
Helper classes and functions for PyTorch detection data loaders
-
class sparseml.pytorch.datasets.detection.helpers.AnnotatedImageTransforms(transforms: List)[source]
Bases: object
Class for chaining transforms that take two parameters (images and annotations for object detection).
- Parameters
transforms – List of transformations that take an image and annotation as their parameters.
-
property transforms
- Returns
a list of the transforms performed by this object
-
sparseml.pytorch.datasets.detection.helpers.bounding_box_and_labels_to_yolo_fmt(annotations)[source]
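The Yolo annotation layout this function targets is not spelled out above; a common convention is one (class, x_center, y_center, width, height) row per box, with coordinates normalized to the image size. A hypothetical sketch of that conversion from a normalized ltrb box (names and layout are assumptions, not the sparseml implementation):

```python
def ltrb_to_yolo_row(box, label):
    """Convert a normalized [left, top, right, bottom] box and its class
    label into an assumed Yolo-style (label, x_center, y_center, w, h) row."""
    left, top, right, bottom = box
    return (
        label,
        (left + right) / 2,  # box center x
        (top + bottom) / 2,  # box center y
        right - left,        # box width
        bottom - top,        # box height
    )

print(ltrb_to_yolo_row([0.25, 0.25, 0.75, 0.75], 3))  # (3, 0.5, 0.5, 0.5, 0.5)
```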
-
sparseml.pytorch.datasets.detection.helpers.random_horizontal_flip_image_and_annotations(image: PIL.Image.Image, annotations: Tuple[torch.Tensor, torch.Tensor], p: float = 0.5) → Tuple[PIL.Image.Image, Tuple[torch.Tensor, torch.Tensor]][source]
Performs a horizontal flip on the given image and bounding boxes with probability p.
- Parameters
image – the image to randomly flip
annotations – a tuple of bounding boxes and their labels for this image
p – the probability to flip with. Default is 0.5
- Returns
A tuple of the randomly flipped image and annotations
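Flipping the image horizontally means mirroring each box's horizontal edges; with coordinates normalized to [0, 1], the new left edge is 1 minus the old right edge. A small sketch of just the box arithmetic (the sparseml function additionally flips the PIL image and draws the coin flip itself):

```python
def hflip_ltrb_box(box):
    """Mirror a normalized [left, top, right, bottom] box across the
    vertical center line: new_left = 1 - right, new_right = 1 - left."""
    left, top, right, bottom = box
    return [1.0 - right, top, 1.0 - left, bottom]

print(hflip_ltrb_box([0.1, 0.2, 0.5, 0.8]))  # [0.5, 0.2, 0.9, 0.8]
```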
-
sparseml.pytorch.datasets.detection.helpers.ssd_collate_fn(batch: List[Any]) → Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor, List[Tuple[torch.Tensor, torch.Tensor]]]][source]
Collate function to be used for creating a DataLoader with values transformed by encode_annotation_bounding_boxes.
- Parameters
batch – a batch of data points transformed by encode_annotation_bounding_boxes
- Returns
the batch stacked as tensors for all values except for the original annotations
-
sparseml.pytorch.datasets.detection.helpers.ssd_random_crop_image_and_annotations(image: PIL.Image.Image, annotations: Tuple[torch.Tensor, torch.Tensor]) → Tuple[PIL.Image.Image, Tuple[torch.Tensor, torch.Tensor]][source]
Wraps sparseml.pytorch.utils.ssd_random_crop to work in the AnnotatedImageTransforms pipeline.
- Parameters
image – the image to crop
annotations – a tuple of bounding boxes and their labels for this image
- Returns
A tuple of the cropped image and annotations
-
sparseml.pytorch.datasets.detection.helpers.yolo_collate_fn(batch: List[Any]) → Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor, List[Tuple[torch.Tensor, torch.Tensor]]]][source]
Collate function to be used for creating a DataLoader with values for Yolo model input.
- Parameters
batch – a batch of data points and annotations transformed by bounding_box_and_labels_to_yolo_fmt
- Returns
the batch stacked as tensors for all values except for the original annotations
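Because each image carries a different number of boxes, a Yolo-style collate typically tags every annotation row with the index of its sample so rows from the whole batch can be concatenated into one flat target array. A plain-list sketch of that pattern (the tensor layout sparseml actually emits may differ):

```python
def yolo_collate_sketch(batch):
    """Collate (image, rows) samples: gather images into a list and prepend
    each annotation row with its sample index before flattening."""
    images, targets = [], []
    for sample_idx, (image, rows) in enumerate(batch):
        images.append(image)
        for row in rows:
            targets.append([sample_idx, *row])
    return images, targets

batch = [
    ("img0", [(3, 0.5, 0.5, 0.5, 0.5)]),                           # one box
    ("img1", [(1, 0.5, 0.5, 0.2, 0.2), (2, 0.1, 0.1, 0.1, 0.1)]),  # two boxes
]
images, targets = yolo_collate_sketch(batch)
# targets holds 3 rows; the leading value of each row is its sample index
```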
sparseml.pytorch.datasets.detection.voc module
VOC dataset implementations for the object detection field in computer vision.
-
class sparseml.pytorch.datasets.detection.voc.VOCDetectionDataset(root: str = '~/.cache/nm_datasets/voc-detection', train: bool = True, rand_trans: bool = False, download: bool = True, year: str = '2012', image_size: int = 300, preprocessing_type: Optional[str] = None, default_boxes: Optional[sparseml.pytorch.utils.ssd_helpers.DefaultBoxes] = None)[source]
Bases: torchvision.datasets.voc.VOCDetection
Wrapper for the VOC detection dataset that applies standard transforms for input to detection models. Returns the processed image along with a tuple of its bounding boxes in ltrb format and a label for each box.
If a DefaultBoxes object is provided, the boxes and labels are encoded with that object into a tensor of offsets to the default boxes and labels for those boxes, and a three-item tuple of the encoded boxes, the encoded labels, and their original values is returned.
- Parameters
root – The root folder to find the dataset at, if not found will download here if download=True
train – True if this is for the training distribution, False for the validation
rand_trans – True to apply RandomCrop and RandomHorizontalFlip to the data, False otherwise
download – True to download the dataset, False otherwise. The base implementation does not support leaving this as False, even if the dataset has already been downloaded.
image_size – the size of the image to output from the dataset
preprocessing_type – Type of standard pre-processing to perform. Options are ‘yolo’, ‘ssd’, or None. None defaults to just image normalization with no extra processing of bounding boxes.
default_boxes – DefaultBoxes object used to encode bounding boxes and labels for model loss computation for SSD models. Only used when preprocessing_type='ssd'. The default object represents the default boxes used in the standard SSD 300 implementation.
-
property default_boxes
- Returns
The DefaultBoxes object used to encode this dataset's bounding boxes
-
class sparseml.pytorch.datasets.detection.voc.VOCSegmentationDataset(root: str = '~/.cache/nm_datasets/voc-segmentation', train: bool = True, rand_trans: bool = False, download: bool = True, year: str = '2012', image_size: int = 300)[source]
Bases: torchvision.datasets.voc.VOCSegmentation
Wrapper for the VOC Segmentation dataset to apply standard transforms.
- Parameters
root – The root folder to find the dataset at, if not found will download here if download=True
train – True if this is for the training distribution, False for the validation
rand_trans – True to apply RandomCrop and RandomHorizontalFlip to the data, False otherwise
download – True to download the dataset, False otherwise.
year – The dataset year, supports years 2007 to 2012.
image_size – the size of the image to output from the dataset
Module contents
Datasets related to the object detection field in computer vision