"This is day 27 of my participation in the First Challenge 2022. For more details, see: First Challenge 2022."

Data acquisition and processing (mainly CV tasks)

  • Course objectives
  1. Data acquisition
  2. Data processing and annotation
  3. Data preprocessing methods
  4. Model training and evaluation

1. Acquisition of data sets

In general, our data comes from various competition platforms. The first source is the datasets on AI Studio; most classic datasets can be found on platforms such as Baidu AI Studio, Kaggle, Tianchi, iFLYTEK and others (search by keyword to find the dataset you need), or on GitHub. There are also some smaller platforms you can explore yourself. Typically these datasets are intended for academic use, and some may require an application before the download link is provided.

1.1 Popular and interesting Kaggle datasets

House Prices - Advanced Regression Techniques

Dogs vs. Cats (image classification)

Titanic: Machine Learning from Disaster (predict passenger survival and get familiar with the basics of machine learning)

1.2 Tianchi

Barley Remote Sensing Dataset (remote sensing image segmentation)

Yale Face Database (object detection task: face detection)

1.3 DataFountain

Flower classification dataset (image classification)

1.4 Official websites of other commonly used datasets

iFLYTEK official website

COCO data set

1.5 Overview of the complete process

1.5.1 Complete process of image processing

    1. Image data acquisition
    2. Image data cleaning

—- Get a preliminary understanding of the data and screen out unsuitable images

    3. Image data annotation
    4. Data preprocessing

—- Standardization

Centering = de-meaning (zero-mean normalization)

What it does: center each dimension so that its mean is zero

Why: it speeds up convergence and works better with some activation functions

Normalization = dividing by the standard deviation

What it does: scale each dimension so that its values fall roughly within [-1, 1]

Why: it improves convergence efficiency, unifies the influence that inputs with different ranges have on model learning, and maps the inputs into the range where the activation function has an effective gradient (a short sketch follows below)
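As a concrete illustration of centering and normalization, here is a minimal NumPy sketch on a made-up feature matrix (purely illustrative values):

import numpy as np

# A made-up feature matrix: 4 samples, 3 features with very different ranges
X = np.array([[1.0, 200.0, 0.5],
              [2.0, 220.0, 0.3],
              [3.0, 180.0, 0.9],
              [4.0, 240.0, 0.1]])

# Centering (de-meaning): shift each feature so that its mean becomes zero
X_centered = X - X.mean(axis=0)

# Normalization: divide by each feature's standard deviation,
# so features with different ranges influence learning comparably
X_standardized = X_centered / X.std(axis=0)

print(X_standardized.mean(axis=0))  # approximately 0 for every feature
print(X_standardized.std(axis=0))   # approximately 1 for every feature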

    5. Data preparation (training + test stages)

—- Split into training, validation and test sets

    6. Image data augmentation (training phase)

—- Common CV data augmentation methods (a short sketch follows the list):

· Random rotation

· Random horizontal or vertical flip

· Scaling

· Cropping

· Translation

· Adjusting brightness, contrast, saturation, hue, etc.

· Noise injection

· Generating augmented samples with a generative adversarial network (GAN), etc.
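A minimal sketch of a few of these augmentations using PIL and NumPy (the image path is a placeholder); the PaddleX transforms used later in this article provide ready-made versions of the same operations:

import numpy as np
from PIL import Image, ImageEnhance

img = Image.open('example.jpg')  # placeholder path

# Random rotation within +/- 15 degrees
rotated = img.rotate(np.random.uniform(-15, 15))

# Random horizontal flip with probability 0.5
flipped = img.transpose(Image.FLIP_LEFT_RIGHT) if np.random.rand() < 0.5 else img

# Random brightness adjustment (a factor of 1.0 keeps the original brightness)
brightened = ImageEnhance.Brightness(img).enhance(np.random.uniform(0.8, 1.2))

# Noise injection: add Gaussian noise to the pixel array
arr = np.asarray(img).astype(np.float32)
noisy = Image.fromarray(np.clip(arr + np.random.normal(0, 10, arr.shape), 0, 255).astype(np.uint8))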

1.5.2 Complete process for pure (non-image) data processing

  • Data preprocessing and feature engineering

  • 1. Perceive the data

—- Get a preliminary understanding of the data

—- Number of records and features, and the feature names

—- Sample some records and look at descriptive statistics of the numerical features

—- Feature types

—- Combine relevant domain knowledge to fuse related data and features (a short sketch follows)
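A minimal pandas sketch of this perception step (the CSV path is a placeholder):

import pandas as pd

df = pd.read_csv('data.csv')   # placeholder dataset

print(df.shape)       # number of records and number of features
print(df.columns)     # feature names
print(df.dtypes)      # feature types
print(df.sample(5))   # sample a few records
print(df.describe())  # descriptive statistics of the numerical features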

  • 2. Clean the data

—- Convert data types

—- Process missing data

—- Process outlier data (a short sketch follows)
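A corresponding cleaning sketch, again with pandas; the 'age' and 'label' columns are hypothetical:

import pandas as pd

df = pd.read_csv('data.csv')                       # placeholder dataset

# Convert the data type of a column
df['age'] = df['age'].astype('float64')

# Process missing data: fill numeric gaps with the median, drop rows missing the label
df['age'] = df['age'].fillna(df['age'].median())
df = df.dropna(subset=['label'])

# Process outliers: clip values outside the 1st-99th percentile range
low, high = df['age'].quantile([0.01, 0.99])
df['age'] = df['age'].clip(low, high)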

  • 3. Feature transformation

—- Feature numericalization

—- Feature binarization

—- One-hot encoding

—- Feature discretization

—- Standardization / scaling (a short sketch of these transformations follows):

Range (min-max) scaling

Standardization (z-score)

Normalization
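A minimal sketch of these transformations with scikit-learn on made-up features (scikit-learn is an assumption here; the rest of this article uses the Paddle ecosystem):

import numpy as np
from sklearn.preprocessing import (Binarizer, KBinsDiscretizer, MinMaxScaler,
                                   OneHotEncoder, StandardScaler)

X_num = np.array([[1.0], [5.0], [10.0], [20.0]])           # made-up numeric feature
X_cat = np.array([['red'], ['green'], ['blue'], ['red']])  # made-up categorical feature

onehot = OneHotEncoder().fit_transform(X_cat).toarray()                    # one-hot encoding
binary = Binarizer(threshold=5.0).fit_transform(X_num)                     # feature binarization
bins = KBinsDiscretizer(n_bins=3, encode='ordinal').fit_transform(X_num)   # discretization
minmax = MinMaxScaler().fit_transform(X_num)                               # range (min-max) scaling
zscore = StandardScaler().fit_transform(X_num)                             # standardization (z-score)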

  • 4. Feature selection

—- Wrapper methods

Sequential feature selection

Exhaustive feature selection

Recursive feature selection

—- Filter methods

—- Embedded methods (a short sketch comparing the three families follows)
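A minimal scikit-learn sketch contrasting the three families on a toy dataset (again, scikit-learn is only used here for illustration):

from sklearn.datasets import load_iris
from sklearn.feature_selection import RFE, SelectFromModel, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Wrapper method: recursive feature elimination wrapped around an estimator
wrapper = RFE(LogisticRegression(max_iter=1000), n_features_to_select=2).fit(X, y)

# Filter method: score each feature independently (here with an ANOVA F-test)
filt = SelectKBest(f_classif, k=2).fit(X, y)

# Embedded method: use the weights learned by an L1-regularized model
embedded = SelectFromModel(LogisticRegression(penalty='l1', solver='liblinear')).fit(X, y)

print(wrapper.support_, filt.get_support(), embedded.get_support())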

  • 5. Feature extraction

—- Unsupervised feature extraction

Principal component analysis

Factor analysis

—- Supervised feature extraction (e.g. linear discriminant analysis; a short sketch follows)
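A minimal sketch of unsupervised versus supervised feature extraction, also with scikit-learn:

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA, FactorAnalysis
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

# Unsupervised: principal component analysis and factor analysis ignore the labels
X_pca = PCA(n_components=2).fit_transform(X)
X_fa = FactorAnalysis(n_components=2).fit_transform(X)

# Supervised: linear discriminant analysis uses the labels to find discriminative directions
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)

print(X_pca.shape, X_fa.shape, X_lda.shape)  # (150, 2) each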

Extended knowledge:

The Pearson correlation coefficient is a statistic used to reflect the degree of similarity between two variables. In machine learning it can be used to measure the similarity between a feature and the target category, so as to determine whether an extracted feature is positively correlated, negatively correlated or uncorrelated with the category. The Pearson coefficient ranges over [-1, 1]: a negative value means negative correlation, a positive value means positive correlation, and the larger the absolute value, the stronger the positive/negative correlation. If the data has no duplicate values and the two variables are perfectly monotonically related, the Spearman correlation coefficient is +1 or -1. The correlation coefficient is 0 when the two variables are independent, but the converse does not hold.

In pandas, use the corr() function (make sure the two columns have the same number of rows).

The formula is as follows:


$$\rho_{X, Y}=\frac{\operatorname{cov}(X, Y)}{\sigma_{X} \sigma_{Y}}=\frac{E\left(\left(X-\mu_{X}\right)\left(Y-\mu_{Y}\right)\right)}{\sigma_{X} \sigma_{Y}}=\frac{E(X Y)-E(X) E(Y)}{\sqrt{E\left(X^{2}\right)-E^{2}(X)} \sqrt{E\left(Y^{2}\right)-E^{2}(Y)}}$$

The correlation coefficient is only defined when the standard deviation of both variables is nonzero. The Pearson correlation coefficient is applicable when:

(1) There is a linear relationship between the two variables, and both are continuous data.

(2) The overall distribution of the two variables is normal, or a unimodal distribution close to normal.

(3) The observed values of the two variables are paired, and each pair of observed values is independent of each other.
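As an illustration of the corr() usage mentioned above, a minimal pandas sketch with made-up columns:

import pandas as pd

df = pd.DataFrame({
    'feature': [1.0, 2.0, 3.0, 4.0, 5.0],
    'label':   [1.1, 1.9, 3.2, 3.8, 5.1],
})

# Pearson correlation between a feature and the label (both columns must have the same length)
print(df['feature'].corr(df['label']))                     # Pearson is the default method
print(df['feature'].corr(df['label'], method='spearman'))  # Spearman rank correlation
print(df.corr())                                           # full correlation matrix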

2. Data processing

2.1 Processing official data into VOC or COCO format

2.1.1 COCO 2017 dataset description

The COCO dataset is a dataset created and collected by Microsoft for detection + segmentation + localization + captioning. The author uses the 2017 version of the COCO dataset, which contains about 25 gigabytes of images and about 600 megabytes of annotation files. The COCO dataset has 80 object classes, listed below:

['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush']

There are also 12 supercategories:

['appliance', 'food', 'indoor', 'accessory', 'electronic', 'furniture', 'vehicle', 'sports', 'animal', 'kitchen', 'person', 'outdoor']

Introduction to VOC and COCO

PASCAL stands for Pattern Analysis, Statistical Modelling and Computational Learning. The PASCAL VOC Challenge is a benchmark for the classification, recognition and detection of visual objects, providing a standard image annotation dataset and a standard evaluation system for detection algorithms and learning performance. Every year since 2005 the group has provided labeled images for a series of categories, and challengers design all kinds of sophisticated algorithms to classify them based solely on analysis of the image content, ultimately being evaluated by accuracy, recall and efficiency.

MS COCO stands for Microsoft Common Objects in Context, derived from the Microsoft COCO dataset funded by Microsoft in 2014. Like the ImageNet competition, it is regarded as one of the most closely watched and authoritative competitions in computer vision.

Here is a demonstration of the use of annotations.

COCO format, folder path style:

COCO/
├── annotations/                          # annotation files
│   ├── instances_train2017.json          # object instances - training set annotations
│   ├── instances_val2017.json            # object instances - validation set annotations
│   ├── person_keypoints_train2017.json   # object keypoints - training set annotations
│   ├── person_keypoints_val2017.json     # object keypoints - validation set annotations
│   ├── captions_train2017.json           # image captions - training set annotations
│   └── captions_val2017.json             # image captions - validation set annotations
├── train2017/                            # training images
└── val2017/                              # validation images

VOC format, folder path style:

VOCData/
├── Annotations/       # XML annotation files
├── ImageSets/
│   └── Main/          # train/val split lists
├── JPEGImages/        # image files
└── label_list.txt     # list of class names

2.1.2 Annotation format of Object Keypoint

{
"info": info,
"licenses": [license],
"images": [image],
"annotations": [annotation],
"categories": [category]
}

Among them, the info, licenses and images structures/types are the same across the different JSON files and their definitions are shared (object instances, object keypoints, image captions). What is not shared are the annotation and category structures, which differ between the different types of JSON files. The new keypoints field is an array of length 3×k, where k is the total number of keypoints defined for the category. Each keypoint is a triple: the first and second elements are the x and y coordinates, and the third element is a flag v, where v=0 means the keypoint is not labeled (in this case x=y=v=0), v=1 means the keypoint is labeled but not visible, and v=2 means the keypoint is labeled and visible. num_keypoints indicates the number of labeled keypoints (v>0) for this target; smaller targets may have no labeled keypoints.

annotation{ "keypoints": [x1,y1,v1,...] , "num_keypoints": int, "id": int, "image_id": int, "category_id": int, "segmentation": RLE or [polygon], "area": Float, "bbox": [x,y,width,height], "iscrowd": 0 or 1,} Example: {"segmentation": [[125.12, 539.69, 140.94, 522.43, 100.67, 496.54, 84.85, 469.21, 73.35, 450.52, 104.99, 342.65, 168.27, 290.88, 179.78, 288189.84, 286. 56191.28, 260.67, 202.79, 240.54, 221.48, 237.66, 248.81, 243.42, 257.44, 256.36, 253.12, 262.11, 253.12, 275.06, 299.15, 233.35, 329.3 5207.46, 355.24, 206.02, 363.87, 206.02, 365.3, 210.34, 373.93, 221.84, 363.87, 226.16, 363.87, 237.66, 350.92, 237.66, 332.22, 234.79, 314.97, 249.17, 271.82, 313.89, 253.12, 326.83, 227.24, 352.72, 214.29, 357.03, 212.85, 372.85, 208.54, 395.87, 228.67, 414.56, 245.93, 4 21.75, 266.07, 424.63, 276.13, 437.57, 266.07, 450.52, 284.76, 464.9, 286.2, 479.28, 291.96, 489.35, 310.65, 512.36, 284.76, 549.75, 244. 49522.43, 215.73, 546.88, 199.91, 558.38, 204.22, 565.57, 189.84, 568.45, 184.09, 575.64, 172.58, 578.52, 145.26, 567.01, 117.93, 551.1 ], "num_keypoints": 0, "iscrowd": 0, "keypoints": 0 [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,142,309,1,177,320,2,191,398,2,237,317,2,233,426,2,306,233,2,92,452,2,123,468,2,0,0,0,251, 469,2,0,0,0,162,551,2], "image_id" : 425226, "bbox" : [73.35, 206.02, 300.58, 372.5], "category_id" : 1, "id" : 183126},Copy the code

2.1.3 categories field

Finally, for each category structure, two additional fields are added compared with the category in object instances. keypoints is an array of length k containing the name of each keypoint. skeleton defines the connectivity between keypoints (e.g. a person's left wrist is connected to the left elbow, but the left wrist and right wrist are not connected). Currently, only the person category in COCO is labeled with keypoints.

{"id": int, "name": STR, "supercategory": STR, "keypoints": [STR], "skeleton": [edge]} Example: {"supercategory": "person", "id": 1, "name": "person", "keypoints": ["nose","left_eye","right_eye","left_ear","right_ear","left_shoulder","right_shoulder","left_elbow","right_elbow","left_ wrist","right_wrist","left_hip","right_hip","left_knee","right_knee","left_ankle","right_ankle"], "skeleton": [[16, 14], [14, 12], [17], [15, 13], [12, 13], [6, 12], [7, 13], [6, 7], [6], [7, 9], [8, 10], [9, 11], [2, 3], [1, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5, 7]]}Copy the code

2.1.4 Data set statistics

Looking at the human keypoint annotations and the distribution of the number of labeled keypoints per person: the 11-15 range is the most common, with nearly 70,000 instances, followed by 6-10 with more than 40,000, then 16-17, 2-5, ...

# Pull PaddleDetection
! git clone https://github.com.cnpmjs.org/PaddlePaddle/PaddleDetection

# Move it to the persistent storage layer
! mv PaddleDetection/ work/

# Install the required dependencies
! pip install -r work/PaddleDetection/requirements.txt
# Install the packages needed for the conversion
! pip install pycocotools
! pip install scikit-image

# COCO annotations
! unzip -oq /home/aistudio/data/data97273/annotations_trainval2017.zip -d ./
# Full validation set
! unzip -oq /home/aistudio/data/data97273/val2017.zip -d ./
# Full training set
! unzip -oq /home/aistudio/data/data97273/train2017.zip -d ./

# Create directories for the converted images and XML files
! mkdir -p VOCData/images/
! mkdir -p VOCData/Annotations/
! mkdir COCOData/

# Process the object detection dataset
! python ProcessData.py

# Create an empty label txt file if it does not exist
! touch VOCData/label_list.txt

# Move it into the dataset folder
! mv VOCData work/PaddleDetection/dataset/

%cd work/PaddleDetection/
Generate the label_list.txt label file and split the annotation files into train_val.txt and val.txt:
import os
import shutil
import skimage.io as io

from tqdm import tqdm
from random import shuffle

dataset = 'dataset/VOCData/'
train_txt = os.path.join(dataset, 'train_val.txt')
val_txt = os.path.join(dataset, 'val.txt')
lbl_txt = os.path.join(dataset, 'label_list.txt')

classes = [
        "person"
    ]

with open(lbl_txt, 'w') as f:
    for l in classes:
        f.write(l+'\n')

xml_base = 'Annotations'
img_base = 'images'

xmls = [v for v in os.listdir(os.path.join(dataset, xml_base)) if v.endswith('.xml')]
shuffle(xmls)

split = int(0.85 * len(xmls)) # Split into training and validation sets

with open(train_txt, 'w') as f:
    for x in tqdm(xmls[:split]):
        m = x[:-4] +'.jpg'
        xml_path = os.path.join(xml_base, x)
        img_path = os.path.join(img_base, m)
        f.write('{} {}\n'.format(img_path, xml_path))
    
with open(val_txt, 'w') as f:
    for x in tqdm(xmls[split:]):
        m = x[:-4] +'.jpg'
        xml_path = os.path.join(xml_base, x)
        img_path = os.path.join(img_base, m)
        f.write('{} {}\n'.format(img_path, xml_path))
%cd /home/aistudio/
! mv val2017/ COCOData/
! mv train2017/ COCOData/
! mv annotations/ COCOData/
! mv -f COCOData/ data/
! mv -f work/PaddleDetection/dataset/VOCData data/

2.2 Training with a custom dataset

2.2.1 Common Annotation Tools

For an image classification task, we only need to assign each image to its category. For detection and segmentation tasks, the most popular annotation tools at the moment are LabelImg and Labelme, used for annotating detection and segmentation tasks respectively.

Github address:

labelimg

labelme

PPOCRLabel

! mkdir work/PaddleDetection/dataset/MaskVOCData

2.2.2 Make VOC-format and COCO-format datasets and split them

# Unpack your own dataset
! unzip -oq /home/aistudio/data/data101583/facemask.zip -d work/PaddleDetection/dataset/MaskVOCData
Copy the code
# Install paddlex
! pip install paddlex
# Split the VOC dataset
! paddlex --split_dataset --format VOC --dataset_dir work/PaddleDetection/dataset/MaskVOCData/ --val_value 0.15 --test_value 0.05
%cd work/PaddleDetection/
# Make COCO dataset
# Extract all image names (without suffix) from the JPEGImages directory
import pandas as pd 
import os


filelist = os.listdir("dataset/MaskVOCData/JPEGImages")
train_name = []

for file_name in filelist:
    name, point, end = file_name.partition('.')
    train_name.append(name)

df = pd.DataFrame(train_name) 
df.head(8)

df.to_csv('./train_all.txt', sep='\t', index=None,header=None) 
! mkdir -p dataset/MaskVOCData/ImageSets/Main
! mv train_all.txt dataset/MaskVOCData/ImageSets
! mv dataset/MaskVOCData/labels.txt dataset/MaskVOCData/label_list.txt
! cp dataset/MaskVOCData/label_list.txt dataset/MaskVOCData/ImageSets/
# Back up the VOC dataset
! cp -r dataset/MaskVOCData /home/aistudio/
! python tools/x2coco.py \
    --dataset_type voc \
    --voc_anno_dir dataset/MaskVOCData/Annotations \
    --voc_anno_list dataset/MaskVOCData/ImageSets/train_all.txt \
    --voc_label_list dataset/MaskVOCData/ImageSets/label_list.txt \
    --voc_out_name ./dataset/annotations.json

If you encounter "AssertionError: Label is not in label2id", it means that a label in your annotations does not match the labels in the label file; make sure there are no stray spaces in the labels when annotating.
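If you do hit this error, a small hedged sketch like the following (paths follow the MaskVOCData layout used above) can strip stray whitespace from label_list.txt and compare it with the labels actually used in the VOC XML files:

import os
import xml.etree.ElementTree as ET

dataset = 'dataset/MaskVOCData'

# Strip stray spaces and blank lines from the label file
lbl_path = os.path.join(dataset, 'ImageSets/label_list.txt')
with open(lbl_path) as f:
    labels = [l.strip() for l in f if l.strip()]
with open(lbl_path, 'w') as f:
    f.write('\n'.join(labels) + '\n')

# Collect every label that actually appears in the annotations
used = set()
ann_dir = os.path.join(dataset, 'Annotations')
for name in os.listdir(ann_dir):
    if name.endswith('.xml'):
        for obj in ET.parse(os.path.join(ann_dir, name)).findall('object'):
            used.add(obj.find('name').text.strip())

print('In label file but unused:', set(labels) - used)
print('Used but missing from label file:', used - set(labels))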

! mv dataset/MaskVOCData dataset/MaskCOCOData
! mv ../../MaskVOCData dataset
! mkdir dataset/MaskCOCOData/annotations
! mv dataset/annotations.json dataset/MaskCOCOData/annotations
! rm dataset/MaskCOCOData/train_list.txt
! rm dataset/MaskCOCOData/val_list.txt
! rm dataset/MaskCOCOData/label_list.txt
! rm dataset/MaskCOCOData/test_list.txt
! rm -r dataset/MaskCOCOData/Annotations
! rm -r dataset/MaskCOCOData/ImageSets
# Split the COCO dataset
! paddlex --split_dataset --format COCO --dataset_dir dataset/MaskCOCOData/annotations --val_value 0.15 --test_value 0.05
Copy the code

3. Data processing methods

3.1 The nature of images

There are actually two common kinds of images: one is called a bitmap and the other is called a vector graphic. As shown below:

Bitmap features:

  • Defined by a grid of pixels (dots)

  • Large file size

  • Rich, lifelike color reproduction

Vector graphic features:

  • Defined by mathematical vectors (paths)

  • Does not blur when scaled up

  • Small file size

  • Limited color and detail expressiveness

%cd /home/aistudio/work/PaddleDetection
import paddle
import paddlex as pdx
import numpy as np
import paddle.nn as nn
import paddle.nn.functional as F
import PIL.Image as Image
import cv2 
import os

from random import shuffle
from paddlex.det import transforms as T
from PIL import Image, ImageFilter, ImageEnhance

import matplotlib.pyplot as plt  # plt is used to display images

path='dataset/MaskCOCOData/JPEGImages/maksssksksss195.png'
img = Image.open(path)
plt.imshow(img)          # Draw the image from the array data
plt.show()               # display image


# grayscale
img = np.array(Image.open(path).convert('L'), 'f')
plt.imshow(img,cmap="gray")          # Draw the image from the array data
plt.show()               # display image

# plt.imshow(img, cmap="gray") is used to display the grayscale image
# Original image
img = cv2.imread(path)
plt.subplot(221)
plt.imshow(img,cmap="gray")
# matplotlib displays the original image in RGB order
plt.imshow(cv2.cvtColor(img,cv2.COLOR_BGR2RGB)) 
plt.subplot(222)
# cv2 reads images in BGR order by default
plt.imshow(img)
plt.subplot(223)
# 32*32 thumbnail
plt.imshow(cv2.resize(img, (32, 32)))
# The targets in this image overlap heavily and are a bit blurry
path='dataset/MaskCOCOData/JPEGImages/maksssksksss443.png'
img = Image.open(path)
plt.imshow(img)        
plt.show()  


# sharpening
img = img.filter(ImageFilter.SHARPEN)
img = img.filter(ImageFilter.SHARPEN)
plt.imshow(img)        
plt.show()          

# Brightness conversion
bright_enhancer = ImageEnhance.Brightness(img)    # Pass in the adjustment coefficient brightness
img = bright_enhancer.enhance(1.6)
plt.imshow(img)        
plt.show() 

# Improve contrast
contrast_enhancer = ImageEnhance.Contrast(img)    # Pass in adjustment coefficient contrast
img = contrast_enhancer.enhance(1.9)
plt.imshow(img)        
plt.show() 

3.2 Why is data augmentation needed?

Many deep learning models are highly complex, and with only a small amount of data they easily overfit (roughly speaking, the model latches onto incidental characteristics of the training samples and is influenced by many irrelevant factors), and as a result they do not predict well on samples they have never seen.


def preprocess(dataType="train") :
    if dataType == "train":
        transform = T.Compose([
            T.MixupImage(mixup_epoch=10),   # Mixup operation on images, data enhancement operation during model training, currently only YOLOv3 model supports this transform
            # t.rayomexpand (), # random expand image
            # t. vandaldistort (brightness_range=1.2, brightness_prob=0.3), # Random pixel content transformations with a certain probability
            # t.crop (), # crop the image randomly
            # t.resizebyShort (), # resize the image according to the short edge of the image
            T.Resize(target_size=608, interp='RANDOM'),   # adjust the image size, [' on ', 'LINEAR', 'CUBIC', 'AREA', 'LANCZOS4', 'RANDOM']
            # t.randomHorizontalFlip (), # perform a random horizontal flip of the image with a certain probability
            T.Normalize()  # Standardize the image
            ])
        return transform
    else:
        transform = T.Compose([
            T.Resize(target_size=608, interp='CUBIC'), 
            T.Normalize()
            ])
        return transform


train_transforms = preprocess(dataType="train")
eval_transforms  = preprocess(dataType="eval")



# Define the datasets for training and validation
# API reference: https://paddlex.readthedocs.io/zh_CN/develop/data/format/detection.html?highlight=paddlex.det
train_dataset = pdx.datasets.VOCDetection(
    data_dir='./dataset/MaskVOCData',
    file_list='./dataset/MaskVOCData/train_list.txt',
    label_list='./dataset/MaskVOCData/label_list.txt',
    transforms=train_transforms,
    shuffle=True)
eval_dataset = pdx.datasets.VOCDetection(
    data_dir='./dataset/MaskVOCData',
    file_list='./dataset/MaskVOCData/val_list.txt',
    label_list='./dataset/MaskVOCData/label_list.txt',
    transforms=eval_transforms)

4. Model training and evaluation

import matplotlib
matplotlib.use('Agg') 
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
%matplotlib inline
import warnings
warnings.filterwarnings("ignore")

# num_classes: some models, such as faster_rcnn, need num_classes + 1
num_classes = len(train_dataset.labels)

model = pdx.det.PPYOLO(num_classes=num_classes)

model.train(
    num_epochs=70,
    train_dataset=train_dataset,
    train_batch_size=16,
    eval_dataset=eval_dataset,
    learning_rate=3e-5,
    warmup_steps=90,
    warmup_start_lr=0.0,
    save_interval_epochs=7,
    lr_decay_epochs=[42, 70],
    save_dir='output/PPYOLO',
    use_vdl=True)
! mkdir ./output
! unzip -oq /home/aistudio/data/data101583/PPYOLO_YES.zip -d ./output
! unzip -oq /home/aistudio/data/data101583/PPYOLO_ALL.zip -d ./output
! unzip -oq /home/aistudio/data/data101583/PPYOLO_NO.zip -d ./output
! mv -f output/home/aistudio/PPYOLO_ALL output
! mv -f output/home/aistudio/PPYOLO_YES output
! rm -r output/home/

Note: if training is interrupted, restart the environment and clear the cache left from before the interruption. Click restart, then re-run code blocks 7, 20, 31, 32 and 37, and continue from block 39 to resume training.

4.1 Comparative experiment

When all other parameters are the same, the mAP is 38.06 without any data augmentation:

When all other parameters are the same, the mAP is 41.9 with random expansion and random pixel distortion augmentation:

When all other parameters are the same, the mAP is 35.4 with random cropping, random horizontal flipping, resize-by-short-edge and Mixup augmentation:

The comparative experiments above show that the mAP can be improved slightly when data augmentation is added appropriately.

4.2 More on mAP

In the field of machine learning there are many metrics used to evaluate the performance of a model; among the common ones are FP, FN, TP, TN, precision, recall and accuracy.

mAP (mean Average Precision) is the mean of the per-category AP values, where AP is the area under the PR curve.

Let's first look at the IoU criterion:

TP, FP, FN, TN

The usual way to read these: the first letter (T/F) indicates whether the prediction is correct, and the second letter (P/N) indicates whether the prediction is positive or negative.

  • True Positive (TP): a detection with $\mathrm{IoU} > IoU_{\text{threshold}}$ (the IoU threshold is usually 0.5), i.e. a detection that correctly matches a ground-truth (GT) box, the "standard answer"

  • False Positive (FP): a detection with $\mathrm{IoU} \le IoU_{\text{threshold}}$, i.e. a detection that does not match any GT box

  • False Negative (FN): the number of GT boxes that are not detected

  • True Negative (TN): not used in mAP

Precision: $\text{Precision} = \frac{TP}{TP+FP} = \frac{TP}{\text{all detections}}$

Recall: $\text{Recall} = \frac{TP}{TP+FN} = \frac{TP}{\text{all ground truths}}$

Plotting the two against each other gives the P-R curve: precision P on the vertical (y) axis and recall R on the horizontal (x) axis, as shown in the figure below.

The AP value is the area under the PR curve, and mAP is the average of the AP values over all categories.
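To make these definitions concrete, here is a minimal plain-Python sketch on made-up boxes (a real evaluation additionally matches each GT box to at most one detection and sweeps a confidence threshold to trace out the PR curve):

def iou(box_a, box_b):
    # IoU of two boxes given as [x1, y1, x2, y2]
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

gt = [10, 10, 50, 50]                              # one ground-truth box
detections = [[12, 12, 48, 52], [60, 60, 90, 90]]  # two made-up detections

threshold = 0.5
tp = sum(iou(d, gt) > threshold for d in detections)  # detections matching a GT box
fp = len(detections) - tp                             # detections matching nothing
fn = 1 - tp                                           # GT boxes never detected (1 GT here)

precision = tp / (tp + fp)  # TP / all detections    -> 0.5
recall = tp / (tp + fn)     # TP / all ground truths -> 1.0
print(precision, recall)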

5. Model inference prediction

Load the model for inference, run prediction, and visualize the results with pdx.det.visualize. Visualization logs are saved to work/PaddleDetection/output/PPYOLO/vdl_log, and the rendered images are saved to work/PaddleDetection/output/PPYOLO/img.

#maksssksksss152.png maksssksksss105.png
model = pdx.load_model('output/PPYOLO_YES/best_model')

image_dir = '../../Test/'
images = os.listdir(image_dir)

for img in images:
    image_name = image_dir + img
    result = model.predict(image_name)
    pdx.det.visualize(image_name, result, threshold=0.3, save_dir='./output/PPYOLO_YES/img')
# Show the reasoning results of the model
path = ".. /.. /Test/maksssksksss152.png"
img = Image.open(path)
plt.imshow(img)          Draw an image from an array
plt.show()               # display image

path = 'output/PPYOLO_YES/img/visualize_maksssksksss152.png'
img = Image.open(path)
plt.imshow(img)          Draw an image from an array
plt.show()               # display image