High quality, fast, modular reference implementation of SSD in PyTorch 1.0

This repository implements SSD (Single Shot MultiBox Detector). The implementation is heavily influenced by the projects ssd.pytorch, pytorch-ssd and maskrcnn-benchmark. This repository aims to be the code base for research based on SSD.

Highlights

PyTorch 1.0
GPU/CPU NMS
Multi-GPU training and inference
Modular
Visualization (TensorBoard support)

Installation

Requirements

Python3
PyTorch 1.0
yacs
GCC >= 4.9
OpenCV

Build

# build nms
cd ext
python build.py build_ext develop
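
For readers who want to see what non-maximum suppression computes, here is a minimal pure-Python sketch of the greedy algorithm. It is an illustration only and not the code of the compiled extension in ext; the function names `iou` and `nms`, the (x1, y1, x2, y2) box format, and the 0.45 threshold are assumptions chosen for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.45):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical example data: the second box overlaps the first heavily.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)
```

In this example the second box is suppressed because its overlap with the higher-scoring first box exceeds the threshold, while the third box does not overlap anything and is kept.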

Train

Setting Up Datasets

Pascal VOC

For Pascal VOC dataset, make the folder structure like this:

VOC_ROOT
|__ VOC2007
    |_ JPEGImages
    |_ Annotations
    |_ ImageSets
    |_ SegmentationClass
|__ VOC2012
    |_ JPEGImages
    |_ Annotations
    |_ ImageSets
    |_ SegmentationClass
|__ ...

VOC_ROOT defaults to the datasets folder in the current project. You can either symlink your data into that folder or export VOC_ROOT="/path/to/voc_root".
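
The lookup described above, together with a quick check of the folder layout, can be sketched as follows. The helper names `resolve_voc_root` and `missing_voc_dirs` are hypothetical and not part of this repository, and the sketch assumes the default is a `datasets` directory relative to the working directory.

```python
import os

# Sub-directories listed in the layout above.
VOC_SUBDIRS = ("JPEGImages", "Annotations", "ImageSets", "SegmentationClass")


def resolve_voc_root(default="datasets"):
    """Use the VOC_ROOT environment variable if it is set, else the default folder."""
    return os.environ.get("VOC_ROOT", default)


def missing_voc_dirs(voc_root, years=("VOC2007", "VOC2012")):
    """Return the expected directories that do not exist under voc_root."""
    expected = [os.path.join(voc_root, y, s) for y in years for s in VOC_SUBDIRS]
    return [p for p in expected if not os.path.isdir(p)]


if __name__ == "__main__":
    root = resolve_voc_root()
    for path in missing_voc_dirs(root):
        print("missing:", path)
```

Running the script before training prints every directory from the layout that is not in place, which makes a wrong VOC_ROOT easy to spot.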

COCO

For COCO dataset, make the folder structure like this:

COCO_ROOT
|__ annotations
    |_ instances_valminusminival2014.json
    |_ instances_minival2014.json
    |_ instances_train2014.json
    |_ instances_val2014.json
    |_ ...
|__ train2014
    |_ <im-1-name>.jpg
    |_ ...
    |_ <im-N-name>.jpg
|__ val2014
    |_ <im-1-name>.jpg
    |_ ...
    |_ <im-N-name>.jpg
|__ ...

COCO_ROOT defaults to the datasets folder in the current project. You can either symlink your data into that folder or export COCO_ROOT="/path/to/coco_root".
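
To confirm that an annotation file is where the layout expects it and is readable, a small standard-library check like the one below may help. The function name `summarize_coco_annotations` is hypothetical and not part of this repository; the sketch relies only on the top-level `images`, `annotations`, and `categories` keys of the COCO instances format.

```python
import json
import os


def summarize_coco_annotations(ann_path):
    """Count images, annotations, and categories in a COCO instances file."""
    with open(ann_path) as f:
        data = json.load(f)
    keys = ("images", "annotations", "categories")
    return {key: len(data.get(key, [])) for key in keys}


if __name__ == "__main__":
    # COCO_ROOT falls back to the datasets folder, as described above.
    coco_root = os.environ.get("COCO_ROOT", "datasets")
    ann_path = os.path.join(coco_root, "annotations", "instances_minival2014.json")
    if os.path.isfile(ann_path):
        print(summarize_coco_annotations(ann_path))
    else:
        print("annotation file not found:", ann_path)
```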

Single GPU training

# for example, train SSD300:
python train_ssd.py --config-file configs/ssd300_voc0712.yaml --vgg vgg16_reducedfc.pth

Multi-GPU training

# for example, train SSD300 with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS train_ssd.py --config-file configs/ssd300_voc0712.yaml --vgg vgg16_reducedfc.pth

The configuration files that I provide assume training on a single GPU. When the number of GPUs changes, the hyper-parameters (lr, max_iter, …) are also changed, following the paper Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour. The pre-trained VGG weights can be downloaded here: s3.amazonaws.com/amdegroot-m… .
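
The linear scaling rule from that paper multiplies the learning rate by k when the minibatch size is multiplied by k; keeping the number of training epochs fixed then divides the iteration count by k. The arithmetic can be sketched as below, assuming each GPU keeps the single-GPU batch size. The function name `scale_hyperparameters` and the example values are illustrative and not taken from this repository's configs, and the repository's own adjustment may differ in detail.

```python
def scale_hyperparameters(base_lr, base_max_iter, num_gpus):
    """Scale single-GPU settings for num_gpus under the linear scaling rule."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    # The total batch grows by num_gpus, so the learning rate grows by the same factor.
    lr = base_lr * num_gpus
    # Each iteration sees num_gpus times more images, so fewer iterations are needed.
    max_iter = base_max_iter // num_gpus
    return lr, max_iter


# Hypothetical single-GPU values, scaled to 4 GPUs.
lr, max_iter = scale_hyperparameters(base_lr=1e-3, base_max_iter=120000, num_gpus=4)
print(lr, max_iter)
```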

Demo

Running prediction on the images in a folder is simple:

python demo.py --config-file configs/ssd300_voc0712.yaml --weights path/to/trained/weights.pth --images_dir demo

The predicted images, with boxes, scores and label names drawn on them, will then be saved to the demo/result folder. Currently, I provide weights trained with ssd300_voc0712.yaml here: ssd300_voc0712_mAP77.83.pth (100 MB)

Performance

Original Paper:

	VOC2007 test
SSD300*	77.2
SSD512*	79.8

Our Implementation:

	VOC2007 test
SSD300*	77.8
SSD512*	–

Details:

VOC2007 test

SSD300*

mAP: 0.7783
aeroplane: 0.8252
bicycle: 0.8445
bird: 0.7597
boat: 0.7102
bottle: 0.5275
bus: 0.8643
car:
cat: 0.8741
chair: 0.6179
cow: 0.8279
diningtable: 0.7862
dog: 0.8519
horse: 0.8630
motorbike: 0.8515
person: 0.8024
pottedplant: 0.5079
sheep: 0.7685
sofa: 0.7926
train: 0.8704
tvmonitor: 0.7554

SSD512*

–