Introduction to the MS COCO dataset

The MS COCO dataset is, alongside PASCAL VOC, another well-known large-scale dataset in the field of object detection. COCO stands for Common Objects in COntext and was built by Microsoft. It covers detection, segmentation, keypoints and other tasks. Its main sponsors currently include Microsoft, Facebook, Google and other large companies.

Compared with PASCAL VOC, images in MS COCO contain more objects, the objects are smaller, and the backgrounds are more complex, so the task on this dataset is harder. For current detection algorithms, results on MS COCO have become the de facto standard for measuring model quality.

MS COCO contains a total of 91 stuff categories and 80 object categories; the latter are the ones commonly referred to when talking about COCO categories. The figure below compares the categories and data volume with PASCAL VOC.

Evaluation criteria

MS COCO’s evaluation criteria are more stringent than PASCAL VOC’s. Unlike PASCAL VOC, which uses mAP, MS COCO’s primary evaluation metric is AP.

As can be seen from the above picture:

  • In MS COCO, AP is averaged over 10 IoU thresholds (from 0.5 to 0.95 in steps of 0.05) and over all 80 object categories
  • Separate metrics are defined for three object sizes (small, medium, large)
  • In addition to AP, AR (Average Recall) is also reported; it is computed in a similar way to AP (see the sketch below)
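
These metrics can be reproduced with the official pycocotools package (introduced in the COCO API section below). A minimal sketch, assuming you already have a ground-truth annotation file and a detection result file in COCO format (both file names here are placeholders):

from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# ground-truth annotations and detection results (placeholder paths)
cocoGt = COCO('annotations/instances_val2017.json')
cocoDt = cocoGt.loadRes('detections_val2017.json')

# 'bbox' evaluates box AP; 'segm' and 'keypoints' are also supported
cocoEval = COCOeval(cocoGt, cocoDt, iouType='bbox')
cocoEval.evaluate()
cocoEval.accumulate()
cocoEval.summarize()   # prints AP/AR over the 10 IoU thresholds and 3 object sizes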

Annotation format

Unlike PASCAL VOC, where each image corresponds to an XML file, MS COCO writes all images and their corresponding bbox information into a single JSON file. The annotation JSON file looks like this:

{
    "info": info,
    "images": [image],
    "annotations": [annotation],
    "licenses": [license],
    "categories": [category]
}

info{
    "year": int,
    "version": str,
    "description": str,
    "contributor": str,
    "url": str,
    "date_created": datetime,
}

image{
    "id": int,
    "width": int,
    "height": int,
    "file_name": str,
    "license": int,
    "flickr_url": str,
    "coco_url": str,
    "date_captured": datetime,
}

license{
    "id": int,
    "name": str,
    "url": str,
}

category{  
    "id": int,  
    "name": str,  
    "supercategory": str,  
    "keypoints": [str],  
    "skeleton": [edge]  
} 

annotation{
    "id": int,    
    "image_id": int,
    "category_id": int,
    "segmentation": RLE or [polygon],
    "area": float,
    "bbox": [x,y,width,height],
    "iscrowd": 0 or 1,
}

Here is an explanation of a few key fields:

  • info: records basic information about the dataset
  • licenses: copyright information, i.e. the source of the images
  • images: information about each image, including file name, width, height, id, etc.
  • categories: category information; id starts from 1 (0 is reserved for the background), and each category also has a supercategory, e.g. for dog and cat the supercategory is animal
  • annotations: the bbox annotations of the object instances in the training (or test) set

There are three annotation types in the MS COCO dataset: object instances, object keypoints, and image captions. The info, licenses, and images sections are shared and identical across the different JSON files, while annotations and categories differ.
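
Since the annotation file is plain JSON, it can be inspected with the standard json module. A minimal sketch, assuming the 2017 instances annotation file has been downloaded (the path is a placeholder):

import json
from collections import Counter

# placeholder path to the instances annotation file
with open('annotations/instances_val2017.json') as f:
    data = json.load(f)

print(data.keys())               # info, licenses, images, annotations, categories
print(len(data['images']))       # number of images
print(len(data['annotations']))  # number of bbox annotations

# count annotations per category name
id2name = {c['id']: c['name'] for c in data['categories']}
counts = Counter(id2name[a['category_id']] for a in data['annotations'])
print(counts.most_common(5))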

Data set download

There are two main releases, 2014 and 2017. The 2014 data is used for object detection, captioning and keypoint detection, while the 2017 release builds on it and adds the stuff and panoptic segmentation tasks.

The 2017 images are split into train2017, val2017 and test2017.
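
At the time of writing, the image archives and annotation files can be downloaded from the official site at cocodataset.org/#download (for example the val2017 images and the train/val 2017 instances annotations).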

Annotation tool

Labelme is recommended; it can be installed directly with pip install labelme. It is also easy to use, not too different from labelImg.

Converting between PASCAL VOC and MS COCO

You can use the following open-source tool: github.com/veraposeido…
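
The core of any such conversion is the bbox format: PASCAL VOC stores corner coordinates (xmin, ymin, xmax, ymax), while COCO stores (x, y, width, height). A minimal sketch of this step (the helper name is just for illustration, not part of the tool above):

def voc_bbox_to_coco(xmin, ymin, xmax, ymax):
    # VOC uses corner coordinates; COCO uses top-left corner plus width/height
    return [xmin, ymin, xmax - xmin, ymax - ymin]

print(voc_bbox_to_coco(48, 240, 195, 371))   # [48, 240, 147, 131]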

COCO API

MS COCO provides an API for working with the dataset. The address is github.com/cocodataset… , and it offers interfaces for Python, Lua, and MATLAB. The Python API is used as an example below.

First, install it:

git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
make
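
As an aside, in many environments the Python API can also be installed directly from PyPI with pip install pycocotools, which avoids the manual make step.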

Here is an official example of how to use it

from pycocotools.coco import COCO
import numpy as np
import skimage.io as io
import matplotlib.pyplot as plt
import pylab

pylab.rcParams['figure.figsize'] = (8.0, 10.0)

dataDir = '..'
dataType = 'val2017'
annFile = '{}/annotations/instances_{}.json'.format(dataDir, dataType)

# build the COCO object from the instances annotation JSON file
coco = COCO(annFile)

# display COCO categories and supercategories
cats = coco.loadCats(coco.getCatIds())
nms = [cat['name'] for cat in cats]
print('COCO categories: \n{}\n'.format(' '.join(nms)))

nms = set([cat['supercategory'] for cat in cats])
print('COCO supercategories: \n{}'.format(' '.join(nms)))

# get all images containing the given categories, then select one at random
catIds = coco.getCatIds(catNms=['person', 'dog', 'skateboard'])
imgIds = coco.getImgIds(catIds=catIds)
imgIds = coco.getImgIds(imgIds=[324158])
img = coco.loadImgs(imgIds[np.random.randint(0, len(imgIds))])[0]

# load the image from local disk ...
# I = io.imread('%s/images/%s/%s' % (dataDir, dataType, img['file_name']))
# ... or load it directly from its url
I = io.imread(img['coco_url'])
plt.axis('off')
plt.imshow(I)
plt.show()
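
The official demo then goes on to draw the instance annotations for that image on top of it. A short sketch of that step, reusing the coco, img, catIds and I objects from above:

# load and display the instance annotations for the selected image
plt.imshow(I)
plt.axis('off')
annIds = coco.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)
anns = coco.loadAnns(annIds)
coco.showAnns(anns)
plt.show()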

After the code runs, the image is displayed.

References

  • cocodataset.org/#home
  • arxiv.org/abs/1405.03…
  • github.com/wkentaro/la…
  • github.com/veraposeido…
  • github.com/cocodataset…