

Hi, I’m Jizhi Vision. This post walks through, in detail, how to annotate and process segmentation data with Labelme.

Image segmentation is a common computer vision task, covering instance segmentation, semantic segmentation, panoptic segmentation, and so on. Before data can be fed to a segmentation model, it needs to be annotated. As we all know, data quality has a great influence on how well a deep learning model ultimately performs, so the importance of data annotation is self-evident.

So let’s start.

1. Install Labelme

Whether you’re running Windows or Linux, you can install it like this:

# First, install Anaconda, which I won't cover here

# Install pyqt5
pip install -i https://pypi.douban.com/simple pyqt5

# Install labelme
pip install -i https://pypi.douban.com/simple labelme

# Open labelme (pip puts it on the PATH, so no ./ prefix is needed)
labelme

After you annotate an image in the Labelme GUI and save, a JSON file is generated for that image; it records each label name together with the polygon points that make up its mask, something like this:
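A trimmed sketch of such a file (the field values here are illustrative, not from a real annotation):

{
    "version": "3.16.2",
    "flags": {},
    "shapes": [
        {
            "label": "dog",
            "points": [[12.0, 34.0], [56.0, 34.0], [56.0, 78.0]],
            "shape_type": "polygon"
        }
    ],
    "imagePath": "image.jpg",
    "imageData": "...base64-encoded image...",
    "imageHeight": 480,
    "imageWidth": 640
}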

2. Built-in JSON to dataset

2.1 Single-image JSON to dataset

Direct execution:

labelme_json_to_dataset xxx.json

It then generates:

  • img.png: the original image;

  • label.png: the mask image;

  • label_viz.png: the mask visualization drawn over the original image;

  • info.yaml and label_names.txt: the label information.
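To sanity-check these outputs, a minimal sketch (assuming the file names above; label.png is a paletted PNG whose pixel values are class indices, not colors):

import numpy as np
import PIL.Image

# Pixel values in label.png are class indices
lbl = np.array(PIL.Image.open('label.png'))
print(lbl.shape)        # (H, W)
print(np.unique(lbl))   # e.g. [0 1] for background plus one label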

2.2 Batch JSON to dataset

Go to the directory containing cli/json_to_dataset.py and create a new script alongside it:

cd cli
touch json_to_datasetP.py
vim json_to_datasetP.py

Add the following:

import argparse
import json
import os
import os.path as osp
import warnings
 
import PIL.Image
import yaml
 
from labelme import utils
import base64
 
def main():
    warnings.warn("This script is aimed to demonstrate how to convert the\n"
                  "JSON file to a single image dataset, and not to handle\n"
                  "multiple JSON files to generate a real-use dataset.")
    parser = argparse.ArgumentParser()
    parser.add_argument('json_file')
    parser.add_argument('-o', '--out', default=None)
    args = parser.parse_args()
 
    json_file = args.json_file
    if args.out is None:
        out_dir = osp.basename(json_file).replace('.', '_')
        out_dir = osp.join(osp.dirname(json_file), out_dir)
    else:
        out_dir = args.out
    if not osp.exists(out_dir):
        os.mkdir(out_dir)
 
    count = os.listdir(json_file)   # here json_file is a directory containing the JSON files
    for i in range(0, len(count)):
        path = os.path.join(json_file, count[i])
        if os.path.isfile(path):
            data = json.load(open(path))
            
            if data['imageData']:
                imageData = data['imageData']
            else:
                imagePath = os.path.join(os.path.dirname(path), data['imagePath'])
                with open(imagePath, 'rb') as f:
                    imageData = f.read()
                    imageData = base64.b64encode(imageData).decode('utf-8')
            img = utils.img_b64_to_arr(imageData)
            label_name_to_value = {'_background_': 0}
            for shape in data['shapes']:
                label_name = shape['label']
                if label_name in label_name_to_value:
                    label_value = label_name_to_value[label_name]
                else:
                    label_value = len(label_name_to_value)
                    label_name_to_value[label_name] = label_value
            
            # label_values must be dense
            label_values, label_names = [], []
            for ln, lv in sorted(label_name_to_value.items(), key=lambda x: x[1]):
                label_values.append(lv)
                label_names.append(ln)
            assert label_values == list(range(len(label_values)))
            
            lbl = utils.shapes_to_label(img.shape, data['shapes'], label_name_to_value)
            
            captions = ['{}: {}'.format(lv, ln)
                for ln, lv in label_name_to_value.items()]
            lbl_viz = utils.draw_label(lbl, img, captions)
            
            out_dir = osp.basename(count[i]).replace('.', '_')
            out_dir = osp.join(osp.dirname(path), out_dir)   # put the output folder next to its JSON
            if not osp.exists(out_dir):
                os.mkdir(out_dir)
 
            PIL.Image.fromarray(img).save(osp.join(out_dir, 'img.png'))
            #PIL.Image.fromarray(lbl).save(osp.join(out_dir, 'label.png'))
            utils.lblsave(osp.join(out_dir, 'label.png'), lbl)
            PIL.Image.fromarray(lbl_viz).save(osp.join(out_dir, 'label_viz.png'))
 
            with open(osp.join(out_dir, 'label_names.txt'), 'w') as f:
                for lbl_name in label_names:
                    f.write(lbl_name + '\n')
 
            warnings.warn('info.yaml is being replaced by label_names.txt')
            info = dict(label_names=label_names)
            with open(osp.join(out_dir, 'info.yaml'), 'w') as f:
                yaml.safe_dump(info, f, default_flow_style=False)
 
            print('Saved to: %s' % out_dir)

if __name__ == '__main__':
    main()

Then run the batch conversion:

python path/cli/json_to_datasetP.py path/JPEGImages
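Assuming path/JPEGImages holds a.json and b.json (hypothetical names), the result would look roughly like this, with one folder per JSON file next to its source:

path/JPEGImages/
├── a.json
├── a_json/
│   ├── img.png
│   ├── label.png
│   ├── label_viz.png
│   ├── label_names.txt
│   └── info.yaml
├── b.json
└── b_json/
    └── ...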

If you hit this error:

lbl_viz = utils.draw_label(lbl, img, captions)
AttributeError: module 'labelme.utils' has no attribute 'draw_label'

Solution: downgrade labelme to version 3.16.2, since newer releases removed draw_label. In the labelme environment, run:

pip install labelme==3.16.2

and the correct version will be installed automatically.

3. Another way to make segmentation labels

Sometimes you want a different kind of label: from the original image, you produce a label image in which each pixel value is the class index (for example, background = 0, circle = 1).

This label is an 8-bit single-channel image, so the method supports up to 256 classes.

Data sets can be created using the following script:

import cv2
import numpy as np
import json
import os

# Class indices: 0 = Background, 1 = Dog, 2 = Cat, 3 = Fish
category_types = ["Background", "Dog", "Cat", "Fish"]

# Get the original image size (assumes all images share it)
img = cv2.imread("image.bmp")
h, w = img.shape[:2]

for root, dirs, files in os.walk("data/Annotations"):
    for file in files:
        mask = np.zeros([h, w, 1], np.uint8)    # blank single-channel mask, same size as the original

        print(file[:-5])

        jsonPath = "data/Annotations/"
        with open(jsonPath + file, "r") as f:
            label = json.load(f)

        shapes = label["shapes"]
        for shape in shapes:
            category = shape["label"]
            points = shape["points"]
            # Fill each polygon with its class index
            points_array = np.array(points, dtype=np.int32)
            mask = cv2.fillPoly(mask, [points_array], category_types.index(category))

        imgPath = "data/masks/"
        cv2.imwrite(imgPath + file[:-5] + ".png", mask)

This example uses four categories: Background, Dog, Cat, and Fish.
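To double-check a generated mask, a minimal sketch (the file name is hypothetical; raw class indices look nearly black when viewed directly):

import cv2
import numpy as np

# Hypothetical output of the script above
mask = cv2.imread("data/masks/example.png", cv2.IMREAD_GRAYSCALE)
print(np.unique(mask))   # e.g. [0 1 3] -> Background, Dog, Fish

# Scale the indices so the classes are visible by eye (3 * 60 = 180 < 255)
cv2.imwrite("example_viz.png", mask * 60)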

That's it for this post. I have shared Labelme's workflow for annotating and processing segmentation data above, and I hope it can be of some help to your study.

