[PP Smoking Video Analysis Early Warning System] Based on PP-PicoDet

I. PP smoking video analysis early warning system

Project address: [PP Video Analysis Early Warning System] Based on PP-PicoDet, aistudio.baidu.com/aistudio/pr…

Welcome to fork!

1. Function Introduction

The PP video analysis early warning system [smoking detection] triggers an alarm as soon as smoking is detected in the monitored area and reminds management personnel to deal with it. This effectively improves supervision efficiency and reduces potential safety hazards. The system is widely applicable to warehouses, parks, gas stations, kitchens, forests, shopping malls, and other no-smoking or fire- and explosion-sensitive places.

2. Project background

Smoking is harmful to health, and the hidden dangers it brings are a serious threat to the public's daily living environment. It is reported that smoking is responsible for more than 10% of fires nationwide each year. According to statistics from Shanghai, Beijing, Jiangsu, and other provinces and cities, fires caused by careless smoking account for 10-20% of the total, ranking third among all fire causes. Traditional smoking control relies mainly on smoke sensors, which raise an alarm when smoke is detected. However, management personnel cannot handle these alarms efficiently, collect evidence in time, or trace incidents afterwards, so no effective closed loop is formed. The result is poor smoking control, together with missed and false alarms.

3. Smoking surveillance

The PP video analysis early warning system [smoking detection] is built on the PaddlePaddle PP-PicoDet lightweight model series. It analyzes smoking behavior in the monitored area by combining the typical actions of a person who is smoking with recognition of the cigarette itself. When an abnormal situation is found, the system raises an alarm immediately and reminds managers to handle it in time; when connected to a broadcast system, it can also issue a voice reminder on site. This gives early warning before an incident, routine detection during it, and standardized management afterwards.
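The alarm decision described above can be pictured as a simple filter over the detections of one frame. The following is only a sketch under assumptions: the `(class_name, confidence)` tuples, the `should_alarm` name, and the 0.5 threshold are illustrative and are not taken from the project's code.

```python
def should_alarm(detections, threshold=0.5, target_class="smoke"):
    """Return True if any detection of the target class reaches the threshold.

    `detections` is a list of (class_name, confidence) tuples for one frame
    (a hypothetical format chosen for this sketch).
    """
    return any(name == target_class and score >= threshold
               for name, score in detections)


frame = [("smoke", 0.89), ("smoke", 0.31)]
print(should_alarm(frame))
```

A real deployment would likely also debounce the alarm over several frames; that is left out here to keep the sketch short.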

4. Application scenarios

The PP video analysis early warning system [smoking detection] is widely applicable to warehouses, parks, gas stations, kitchens, forests, shopping malls, subway stations, fire corridors, and other no-smoking or fire- and explosion-sensitive places.

A small cigarette butt can cause great harm. The PP video analysis early warning system [smoking detection] effectively improves supervision efficiency and reduces safety risks. It can be deployed widely on edge devices and is economical, stable, and practical.

II. Data processing

1. Decompress data

# Decompress the data
!unzip -qoa data/data94796/pp_smoke.zip

2. Divide data sets proportionally

import os
import random

# Generate train.txt and val.txt
random.seed(2020)
xml_dir = '/home/aistudio/Annotations'  # annotation (label) file directory
img_dir = '/home/aistudio/images'  # image file directory
path_list = list()
# Sort the listing so the seeded shuffle is reproducible across file systems
for img in sorted(os.listdir(img_dir)):
    img_path = os.path.join(img_dir, img)
    xml_path = os.path.join(xml_dir, img.replace('jpg', 'xml'))
    path_list.append((img_path, xml_path))
random.shuffle(path_list)
ratio = 0.9  # fraction of samples used for training
train_f = open('/home/aistudio/work/train.txt', 'w')  # generate the training list
val_f = open('/home/aistudio/work/val.txt', 'w')  # generate the validation list

for i, content in enumerate(path_list):
    img, xml = content
    text = img + ' ' + xml + '\n'
    if i < len(path_list) * ratio:
        train_f.write(text)
    else:
        val_f.write(text)
train_f.close()
val_f.close()

# Generate the label list file
label = ['smoke']  # set the categories you want to detect
with open('/home/aistudio/work/label_list.txt', 'w') as f:
    for text in label:
        f.write(text + '\n')
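The split rule used above can be wrapped in a small function so it is easy to check on toy data. This is a minimal sketch, not the project's code: `split_pairs` is a hypothetical helper name, and it reuses the 0.9 ratio and the seed 2020 from the script.

```python
import random


def split_pairs(pairs, ratio=0.9, seed=2020):
    """Shuffle (image, annotation) pairs and split them into train/val lists.

    Hypothetical helper: it applies the same rule as the script (index i goes
    to the training set while i < len * ratio) to an in-memory list.
    """
    rng = random.Random(seed)   # local RNG, so the global random state is untouched
    shuffled = list(pairs)      # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    limit = len(shuffled) * ratio
    train = [p for i, p in enumerate(shuffled) if i < limit]
    val = [p for i, p in enumerate(shuffled) if i >= limit]
    return train, val


pairs = [("images/{}.jpg".format(i), "Annotations/{}.xml".format(i)) for i in range(10)]
train, val = split_pairs(pairs)
print(len(train), len(val))
```

Because the generator is seeded, calling the function twice on the same input gives the same split, which is what makes the validation score comparable between runs.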

3. Data viewing

The source data format is VOC format and the storage format is as follows:

dataset/
├── Annotations
│   ├── xxx1.xml
│   ├── xxx2.xml
│   ├── xxx3.xml
│   ...
├── Images
│   ├── xxx1.jpg
│   ├── xxx2.jpg
│   ├── xxx3.jpg
│   ...
└── valid.txt
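To see what one VOC annotation file holds, the sketch below reads bounding boxes with Python's standard `xml.etree.ElementTree` module. The XML text is a hypothetical example written inline (the file name and coordinates are made up), and `read_voc_boxes` is an illustrative helper, not part of PaddleDetection.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation content, inlined so the sketch is self-contained.
SAMPLE_XML = """
<annotation>
  <filename>xxx1.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>smoke</name>
    <difficult>0</difficult>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>180</xmax><ymax>200</ymax></bndbox>
  </object>
</annotation>
"""


def read_voc_boxes(xml_text):
    """Return a list of (label, xmin, ymin, xmax, ymax) from VOC-style XML."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bndbox = obj.find("bndbox")
        # Coordinates are sometimes stored as floats, so parse via float first.
        coords = [int(float(bndbox.findtext(tag)))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((label, *coords))
    return boxes


print(read_voc_boxes(SAMPLE_XML))
```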

III. Environment preparation

1. Introduction to PP-PicoDet

PaddleDetection has introduced a new lightweight model series, PP-PicoDet, which performs very well on mobile devices and is the new SOTA among lightweight models. Full technical details can be found in our arXiv technical report.

The PP-PicoDet model has the following characteristics:

🌟 Higher mAP: the first model with fewer than 1M parameters to exceed 30 mAP(0.5:0.95) (at 416-pixel input).
🚀 Faster prediction: network inference reaches 150 FPS on an ARM CPU.
😊 Deployment friendly: supports the Paddle Lite, MNN, NCNN, and OpenVINO inference libraries, supports export to ONNX, and provides C++/Python/Android demos.
😍 Advanced algorithms: we have made innovations on top of existing SOTA algorithms, including ESNet, CSP-PAN, SimOTA, and so on.

2. Data format

Currently, PP-PicoDet supports the VOC and COCO dataset formats; choose whichever fits your data.

3. Baseline

Model	Input size	mAP(val) 0.5:0.95	mAP(val) 0.5	Params (M)	FLOPs (G)	Latency NCNN (ms)	Latency Lite (ms)	Download	Config
PicoDet-S	320 * 320	27.1	41.4	0.99	0.73	8.13	6.65	model \| log	config
PicoDet-S	416 * 416	30.7	45.8	0.99	1.24	12.37	9.82	model \| log	config
PicoDet-M	320 * 320	30.9	45.7	2.15	1.48	11.27	9.61	model \| log	config
PicoDet-M	416 * 416	34.8	50.5	2.15	2.50	17.39	15.88	model \| log	config
PicoDet-L	320 * 320	32.9	48.2	3.30	2.23	15.26	13.42	model \| log	config
PicoDet-L	416 * 416	36.6	52.5	3.30	3.76	23.36	21.85	model \| log	config
PicoDet-L	640 * 640	40.9	57.6	3.30	8.91	54.11	50.55	model \| log	config

More configurations

Model	Input size	mAP(val) 0.5:0.95	mAP(val) 0.5	Params (M)	FLOPs (G)	Latency NCNN (ms)	Latency Lite (ms)	Download	Config
PicoDet-Shufflenetv2 1x	416 * 416	30.0	44.6	1.17	1.53	15.06	10.63	model \| log	config
PicoDet-MobileNetv3-large 1x	416 * 416	35.6	52.0	3.55	2.80	20.71	17.88	model \| log	config
PicoDet-LCNet 1.5x	416 * 416	36.3	52.2	3.10	3.85	21.29	20.8	model \| log	config
PicoDet-LCNet 1.5x	640 * 640	40.6	57.4	3.10	–	–	–	model \| log	config
PicoDet-R18	640 * 640	40.7	57.2	11.10	–	–	–	model \| log	config

Notes:

Latency test: all of our models are tested on a Qualcomm Snapdragon 865 (4×A77 + 4×A55) with 4 threads and FP16 inference. In the tables above, NCNN marks results measured with the NCNN library and Lite marks results measured with Paddle Lite. The benchmark script comes from MobileDetBenchmark.
PicoDet is trained on COCO train2017 and validated on COCO val2017.
PicoDet is trained on 4 GPUs (PicoDet-L-640 is trained on 8 GPUs), and all models are trained with the published default configurations.

Baseline for other models

Model	Input size	mAP(val) 0.5:0.95	mAP(val) 0.5	Params (M)	FLOPs (G)	Latency NCNN (ms)
YOLOv3-Tiny	416 * 416	16.6	33.1	8.86	5.62	25.42
YOLOv4-Tiny	416 * 416	21.7	40.2	6.06	6.96	23.69
PP-YOLO-Tiny	320 * 320	20.6	–	1.08	0.58	6.75
PP-YOLO-Tiny	416 * 416	22.7	–	1.08	1.02	10.48
Nanodet-M	320 * 320	20.6	–	0.95	0.72	8.71
Nanodet-M	416 * 416	23.5	–	0.95	1.2	13.35
Nanodet-M 1.5x	416 * 416	26.8	–	2.08	2.42	15.83
YOLOX-Nano	416 * 416	25.8	–	0.91	1.08	19.23
YOLOX-Tiny	416 * 416	32.8	–	5.06	6.45	32.77
YOLOv5n	640 * 640	28.4	46.0	1.9	4.5	40.35
YOLOv5s	640 * 640	37.2	56.0	7.2	16.5	78.05
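The claim that PicoDet-S is the only model with fewer than 1M parameters to pass 30 mAP can be checked directly against the tables. The sketch below copies a handful of rows from the tables above (not the full list) and ranks the models that fit a parameter budget; `best_under_budget` is an illustrative helper name.

```python
# (model, input size, mAP 0.5:0.95, parameters in millions), copied from the tables above.
ROWS = [
    ("PicoDet-S", 320, 27.1, 0.99),
    ("PicoDet-S", 416, 30.7, 0.99),
    ("PicoDet-M", 416, 34.8, 2.15),
    ("PP-YOLO-Tiny", 416, 22.7, 1.08),
    ("Nanodet-M", 320, 20.6, 0.95),
    ("Nanodet-M", 416, 23.5, 0.95),
    ("YOLOX-Nano", 416, 25.8, 0.91),
]


def best_under_budget(rows, max_params_m=1.0):
    """Return the rows within the parameter budget, highest mAP first."""
    small = [row for row in rows if row[3] < max_params_m]
    return sorted(small, key=lambda row: row[2], reverse=True)


ranked = best_under_budget(ROWS)
for name, size, m_ap, params in ranked:
    print(name, size, m_ap, params)
```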

4. Installation

Environment requirements

PaddlePaddle >= 2.1.2
Python >= 3.5
PaddleSlim >= 2.1.1
PaddleLite >= 2.10
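Version strings such as 2.10 and 2.9 compare incorrectly as plain text, so a requirement check should compare them numerically. The sketch below is an illustration only: `parse_version` and `meets_minimum` are hypothetical helpers that handle plain dotted versions, and the installed versions would have to come from the packages themselves.

```python
import sys

# Minimum versions copied from the requirement list above.
MINIMUMS = {"PaddlePaddle": "2.1.2", "PaddleSlim": "2.1.1", "PaddleLite": "2.10"}


def parse_version(text):
    """Turn a plain dotted version such as '2.1.2' into the tuple (2, 1, 2)."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum):
    """Compare versions numerically, so that '2.10' counts as newer than '2.9'."""
    return parse_version(installed) >= parse_version(minimum)


print("Python OK:", sys.version_info >= (3, 5))
print("PaddleLite 2.10 vs 2.9:", meets_minimum("2.10", "2.9"))
```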

# Download the PaddleDetection source code
!git clone https://gitee.com/PaddlePaddle/PaddleDetection.git -b develop --depth 1

Cloning into 'PaddleDetection'...
remote: Enumerating objects: 1993, done.
remote: Counting objects: 100% (1993/1993), done.
remote: Compressing objects: 100% (1511/1511), done.
remote: Total 1993 (delta 689), reused 1118 (delta 415), pack-reused 0
Receiving objects: 100% (1993/1993), 175.80 MiB | 4.03 MiB/s, done.
Resolving deltas: 100% (689/689), done.
Checking connectivity... done.

# Install other dependencies
%cd ~/PaddleDetection
!pip install -U pip --user
!pip install -r requirements.txt
# Install paddledet
!python setup.py install
!pip install paddleslim==2.1.1

IV. Training

1. Model selection

The model must run on mobile devices while staying both fast and accurate, so we choose PP-PicoDet, the new lightweight model series from PaddleDetection. The model has the following characteristics:

Higher mAP: the first model with fewer than 1M parameters to exceed 30 mAP(0.5:0.95) (at 416-pixel input).
Faster prediction: network inference reaches 150 FPS on an ARM CPU.
Deployment friendly: supports the Paddle Lite, MNN, NCNN, and OpenVINO inference libraries, supports export to ONNX, and provides C++/Python/Android demos.
Advanced algorithms: we have made innovations on top of existing SOTA algorithms, including ESNet, CSP-PAN, SimOTA, and so on.

Here we select the PP-PicoDet training configuration for VOC-format datasets.

2. Modify the configuration

(1) Modify configs/datasets/voc.yml

metric: VOC
map_type: 11point
num_classes: 1

TrainDataset:
  !VOCDataSet
    dataset_dir: /home/aistudio/images
    anno_path: /home/aistudio/work/train.txt
    label_list: /home/aistudio/work/label_list.txt
    data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']

EvalDataset:
  !VOCDataSet
    dataset_dir: /home/aistudio/images
    anno_path: /home/aistudio/work/val.txt
    label_list: /home/aistudio/work/label_list.txt
    data_fields: ['image', 'gt_bbox', 'gt_class', 'difficult']

TestDataset:
  !ImageFolder
    anno_path: /home/aistudio/work/label_list.txt

The dataset configuration file sets the number of classes num_classes and, for the training, validation, and test sets, the image path image_dir, the annotation file path anno_path, and the dataset path dataset_dir.

(2) Modify configs/picodet/picodet_s_320_voc.yml

Pre-trained model: pretrain_weights

Training hyperparameters: epoch, batch_size, base_lr

For details, see the documentation on configuration file modification and description.
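Instead of editing the YAML file, the same fields can often be overridden on the command line with `-o key=value`, the mechanism the evaluation and export commands in this article use for `weights`. The sketch below only builds such a command as a list of arguments. The dotted key names (`TrainReader.batch_size`, `LearningRate.base_lr`) and all the values are assumptions for illustration; check them against the actual configuration file before use.

```python
def build_train_command(config, overrides):
    """Build the argument list for tools/train.py with -o key=value overrides."""
    cmd = ["python", "tools/train.py", "-c", config, "--eval"]
    if overrides:
        cmd.append("-o")
        cmd.extend("{}={}".format(key, value) for key, value in overrides.items())
    return cmd


# Hypothetical override values, chosen only to show the format.
overrides = {
    "epoch": 300,
    "TrainReader.batch_size": 48,
    "LearningRate.base_lr": 0.2,
}
command = build_train_command("configs/picodet/picodet_s_320_voc.yml", overrides)
print(" ".join(command))
```

Keeping the overrides in a dictionary makes it easy to log exactly which hyperparameters a given run used.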

# The configuration is complete; copy it into place
%cd ~
!cp voc.yml ~/PaddleDetection/configs/datasets/voc.yml

3. Model training

PaddleDetection provides single-GPU and multi-GPU training modes to meet different training needs. The commands are as follows:

# Train on a single GPU
%cd ~/PaddleDetection/
!export CUDA_VISIBLE_DEVICES=0  # not needed on Windows or Mac
!python tools/train.py -c configs/picodet/picodet_s_320_voc.yml --eval

# Multi-GPU training
# !export CUDA_VISIBLE_DEVICES=0,1,2,3
# !python -m paddle.distributed.launch --gpus 0,1,2,3 tools/train.py \
#     -c configs/picodet/picodet_l_640_coco.yml

Training log

[03/11 00:47:54] ppdet.engine INFO: Epoch: [290] [0/5] learning_rate: 0.001711 loss_vfl: 0.479042 loss_bbox: 0.254253 loss_dfl: 0.178100 loss: 0.905043 eta: 0:02:34 batch_cost: 2.7071 data_cost: 0.9844 ips: images/s
[03/11 00:48:10] ppdet.engine INFO: Epoch: [291] [0/5] learning_rate: 0.001386 loss_vfl: 0.477680 loss_bbox: 0.251249 loss_dfl: 0.178367 loss: 0.908022 eta: 0:02:18 batch_cost: 2.7419 data_cost: 1.0017 ips: images/s
[03/11 00:48:25] ppdet.engine INFO: Epoch: [292] [0/5] learning_rate: 0.001096 loss_vfl: 0.481945 loss_bbox: 0.250645 loss_dfl: 0.179363 loss: 0.913140 eta: 0:02:03 batch_cost: 2.6677 data_cost: 0.9474 ips: images/s
[03/11 00:48:41] ppdet.engine INFO: Epoch: [293] [0/5] learning_rate: 0.0039 loss_vfl: 0.484040 loss_bbox: 0.255441 loss_dfl: 0.178693 loss: 0.920747 eta: 0:01:47 batch_cost: 2.6138 data_cost: 0.8594 ips: images/s
[03/11 00:48:57] ppdet.engine INFO: Epoch: [294] [0/5] learning_rate: 0.000617 loss_vfl: 0.487185 loss_bbox: 0.259508 loss_dfl: 0.179363 loss: 0.927957 eta: 0:01:32 batch_cost: 2.7094 data_cost: 1.0264 ips: 47.2421 images/s
[03/11 00:49:15] ppdet.engine INFO: Epoch: [295] [0/5] learning_rate: 0.000428 loss_vfl: 0.487185 loss_bbox: 0.258693 loss_dfl: 0.178415 loss: 0.928867 eta: 0:01:17 batch_cost: 2.7961 data_cost: 1.2018 ips: images/s
[03/11 00:49:31] ppdet.engine INFO: Epoch: [296] [0/5] learning_rate: 0.000274 loss_vfl: 0.487185 loss_bbox: 0.258693 loss_dfl: 0.177018 loss: 0.924824 eta: 0:01:01 batch_cost: 2.8753 data_cost: 1.2545 ips: images/s
[03/11 00:49:48] ppdet.engine INFO: Epoch: [297] [0/5] learning_rate: 0.000154 loss_vfl: 0.484518 loss_bbox: 0.252453 loss_dfl: 0.176589 loss: 0.916203 eta: 0:00:46 batch_cost: 2.9412 data_cost: 1.4018 ips:
[03/11 00:50:05] ppdet.engine INFO: Epoch: [298] [0/5] learning_rate: 0.000069 loss_vfl: 0.480916 loss_bbox: 0.249677 loss_dfl: 0.175895 loss: 0.913992 eta: 0:00:30 batch_cost: 2.9504 data_cost: 1.4170 ips: 43.3835 images/s
[03/11 00:50:22] ppdet.engine INFO: Epoch: [299] [0/5] learning_rate: 0.000017 loss_vfl: 0.483217 loss_bbox: 0.249677 loss_dfl: 0.175399 loss: 0.913660 eta: 0:00:15 batch_cost: 2.9036 data_cost: 1.2665 ips: 44.0839 images/s
[03/11 00:50:29] ppdet.utils.checkpoint INFO: Save checkpoint: output/picodet_s_320_voc
[03/11 00:50:29] ppdet.engine INFO: Eval iter: 0
[03/11 00:50:33] ppdet.metrics.metrics INFO: Accumulating evaluatation results...
[03/11 00:50:33] ppdet.metrics INFO: mAP(0.50, 11point) = 85.92%
[03/11 00:50:33] ppdet.metrics INFO: Total sample number: 78, averge FPS: 19.29553460230837
[03/11 00:50:33] ppdet.engine INFO:
[03/11 00:50:34] ppdet.utils.checkpoint INFO: Save checkpoint: output/picodet_s_320_voc
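Training logs like the one above are easier to inspect once the numeric fields are pulled out, for example to plot the loss per epoch. The sketch below parses one log line with regular expressions. `parse_train_line` is a hypothetical helper, the sample line is taken from the last epoch of the log, and field names are lower-cased because their capitalization may vary in copied logs.

```python
import re

LINE = ("[03/11 00:50:22] ppdet.engine INFO: Epoch: [299] [0/5] "
        "learning_rate: 0.000017 loss_vfl: 0.483217 loss_bbox: 0.249677 "
        "loss_dfl: 0.175399 loss: 0.913660 eta: 0:00:15 batch_cost: 2.9036 "
        "data_cost: 1.2665 ips: 44.0839 images/s")

EPOCH_RE = re.compile(r"Epoch: \[(\d+)\]")
FIELD_RE = re.compile(r"(\w+): (\d+\.\d+)")   # matches "name: 0.123" style fields


def parse_train_line(line):
    """Return a dict of numeric fields for a training line, or None otherwise."""
    epoch = EPOCH_RE.search(line)
    if epoch is None:
        return None
    fields = {key.lower(): float(value) for key, value in FIELD_RE.findall(line)}
    fields["epoch"] = int(epoch.group(1))
    return fields


record = parse_train_line(LINE)
print(record["epoch"], record["loss"], record["learning_rate"])
```

The time-like `eta` field is skipped on purpose, since it does not match the decimal-number pattern.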

V. Model evaluation

!python -u tools/eval.py -c configs/picodet/picodet_s_320_voc.yml -o weights=output/picodet_s_320_voc/best_model.pdparams

W0311 00:52:53.690088 19588 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability:
W0311 00:52:53.741935 19588 device_context.cc:465] device:
[03/11 00:53:13] ppdet.utils.checkpoint INFO: Finish loading model weights: output/picodet_s_320_voc/best_model.pdparams
[03/11 00:53:17] ppdet.engine INFO: Eval iter: 0
[03/11 00:53:20] ppdet.metrics.metrics INFO: Accumulating evaluatation results...
[03/11 00:53:20] ppdet.metrics.metrics INFO: mAP(0.50, 11point) = 85.92%
[03/11 00:53:20] ppdet.metrics INFO: Total sample number: 78, averge FPS: 11.196457662129422


VI. Model prediction

1. Dynamic graph prediction

After tools/infer.py is executed, the prediction results are written to the output folder.

%cd ~/PaddleDetection
!python tools/infer.py -c configs/picodet/picodet_s_320_voc.yml -o weights=output/picodet_s_320_voc/best_model.pdparams --infer_img=/home/aistudio/smoke1.jpg

/home/aistudio/PaddleDetection
W0311 00:59:59.968083 20342 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.
W0311 00:59:59.973002 20342 device_context.cc:465] device: 0, cuDNN Version:
[03/11 01:00:03] ppdet.utils.checkpoint INFO: Finish loading model weights: output/picodet_s_320_voc/best_model.pdparams
[03/11 01:00:04] ppdet.engine INFO: Detection bbox results save in output/smoke1.jpg

(1) Results

Original image

Prediction result

2. Model export

The model files saved during training contain both the forward prediction and the back-propagation processes. Actual industrial deployment does not need back propagation, so the model must be exported into the format required for deployment. To export the model, run the following command:

!export CUDA_VISIBLE_DEVICES=0
%cd ~/PaddleDetection/
!python tools/export_model.py \
    -c configs/picodet/picodet_s_320_voc.yml \
    -o weights=output/picodet_s_320_voc/best_model.pdparams \
    --output_dir=inference_model

/home/aistudio/PaddleDetection
[03/11 01:08:30] ppdet.utils.checkpoint INFO: Finish loading model weights: output/picodet_s_320_voc/best_model.pdparams
[03/11 01:08:30] ppdet.engine INFO: Export inference config file to inference_model/picodet_s_320_voc/infer_cfg.yml
W0311 01:08:34.320701 20767 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: , Runtime API Version: 10.1
W0311 01:08:34.320773 20767 device_context.cc:465] device: 0, cuDNN Version:
[03/11 01:08:37] ppdet.engine INFO: Export model and saved in inference_model/picodet_s_320_voc

!tree ./inference_model/picodet_s_320_voc

./inference_model/picodet_s_320_voc
├── infer_cfg.yml
├── model.pdiparams
├── model.pdiparams.info
└── model.pdmodel

0 directories, 4 files

The inference model is exported to the inference_model/ directory and consists of model.pdmodel, model.pdiparams, model.pdiparams.info, and infer_cfg.yml. These hold the network structure, the model weights, the model weight names, and the model configuration (including data preprocessing parameters), respectively.
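Before deployment it can be useful to confirm that all four exported files are present. The following sketch uses only the standard library; `missing_export_files` is a hypothetical helper, and the demo runs on a temporary directory rather than the real export folder.

```python
import tempfile
from pathlib import Path

# File names taken from the export description above.
EXPECTED_FILES = ("model.pdmodel", "model.pdiparams",
                  "model.pdiparams.info", "infer_cfg.yml")


def missing_export_files(model_dir):
    """Return the expected export files that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (model_dir / name).is_file()]


# Demo on a temporary directory that contains only one of the four files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "model.pdmodel").touch()
    demo_missing = missing_export_files(tmp)
print(demo_missing)
```

For the export in this article, the call would be `missing_export_files("./inference_model/picodet_s_320_voc")`, and an empty list means the export is complete.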

3. Static graph prediction

Enter the following command in the terminal to run deployment inference. For details, see the documentation on Python inference deployment:

!export CUDA_VISIBLE_DEVICES=0
'''
    --model_dir: path of the exported model
    --image_file: the image to be tested
    --image_dir: the folder of images to be tested
    --device: the device to run on
    --output_dir: the root directory where visualized results are saved, default is output/
'''
!python deploy/python/infer.py \
    --model_dir=./inference_model/picodet_s_320_voc \
    --image_file=/home/aistudio/smoke1.jpg \
    --device=GPU

-----------  Running Arguments -----------
batch_size: 1
camera_id: -1
cpu_threads: 1
device: GPU
enable_mkldnn: False
image_dir: None
image_file: /home/aistudio/smoke1.jpg
model_dir: ./inference_model/picodet_s_320_voc
output_dir: output
reid_batch_size: 50
reid_model_dir: None
run_benchmark: False
run_mode: paddle
save_images: False
save_mot_txt_per_img: False
save_mot_txts: False
scaled: False
threshold: 0.5
trt_calib_mode: False
trt_max_shape: 1280
trt_min_shape: 1
trt_opt_shape: 640
use_dark: True
use_gpu: False
video_file: None
------------------------------------------
-----------  Model Configuration -----------
Model Arch: PicoDet
Transform Order:
--transform op: Resize
--transform op: NormalizeImage
--transform op: Permute
--transform op: PadStride
--------------------------------------------
class_id: 0, confidence: 0.8912, left_top: [2122.73, 1425.18], right_bottom: [3712.47, 1949.28]
save result to: output/smoke1.jpg
------------------ Inference Time Info ----------------------
total_time(ms): 3041.3, img_num: 1
average latency (ms): 3041.30, QPS: 0.328807
preprocess_time(ms): 3019.60, inference_time(ms): 16.80, postprocess_time(ms): 4.90

On a 4K image, the inference time is 16.80 ms, which is quite good.
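The timing figures in the log can be cross-checked with a few lines of arithmetic. This sketch assumes that the three values 3019.60, 16.80, and 4.90 are the preprocessing, inference, and postprocessing times in that order; the first label is an assumption, supported by the fact that the three values add up to the reported total.

```python
preprocess_ms = 3019.60   # assumed to be the preprocessing time
inference_ms = 16.80      # inference time reported in the log
postprocess_ms = 4.90     # postprocessing time reported in the log

total_ms = preprocess_ms + inference_ms + postprocess_ms
qps = 1000.0 / total_ms                   # images per second, end to end
inference_share = inference_ms / total_ms

print(round(total_ms, 2))                 # compare with total_time(ms) in the log
print(round(qps, 6))                      # compare with QPS in the log
print(round(inference_share * 100, 2))    # percent of the time spent in the network
```

Under this reading, the network itself accounts for well under 1% of the end-to-end latency for this single image, so the 16.80 ms figure describes the model, not the whole pipeline.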

VII. Summary

1. Comparison with PP-YOLOv2

The evaluation results are as follows:

mAP(0.50, 11point) = 85.92%
FPS: 11.196457662129422

Compared with the 86.74% achieved by PP-YOLOv2, there is a gap of 0.82 percentage points, which is probably due to insufficient data augmentation.

2. Optimization scheme

Pre-trained model: using a pre-trained model can effectively improve accuracy; PP-PicoDet provides models pre-trained on the COCO dataset.
Modify the loss: change the GIoU loss used in object detection to DIoU loss.
Modify the learning rate: adjust the learning rate; here it is halved.
Retrain with a lower learning rate: when the model stops improving, load the trained weights, reduce the learning rate to 1/10, and train again.
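To make the GIoU-to-DIoU change concrete, the sketch below computes both losses for a single pair of boxes in plain Python. It follows the standard published definitions (GIoU penalizes the empty area of the enclosing box, DIoU penalizes the normalized center distance) and is not PaddleDetection's implementation; `box_losses` is an illustrative name.

```python
def box_losses(box_a, box_b):
    """Return (iou, giou_loss, diou_loss) for two (x1, y1, x2, y2) boxes.

    Assumes both boxes have positive area.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection and union areas.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union

    # Smallest box that encloses both inputs.
    enc_w = max(ax2, bx2) - min(ax1, bx1)
    enc_h = max(ay2, by2) - min(ay1, by1)
    enc_area = enc_w * enc_h

    # GIoU: penalize the part of the enclosing box not covered by the union.
    giou_loss = 1.0 - iou + (enc_area - union) / enc_area

    # DIoU: penalize the squared center distance, normalized by the squared
    # diagonal of the enclosing box.
    center_dist2 = (((ax1 + ax2) - (bx1 + bx2)) / 2.0) ** 2 \
        + (((ay1 + ay2) - (by1 + by2)) / 2.0) ** 2
    diag2 = enc_w ** 2 + enc_h ** 2
    diou_loss = 1.0 - iou + center_dist2 / diag2
    return iou, giou_loss, diou_loss


iou, giou_loss, diou_loss = box_losses((0, 0, 2, 2), (1, 1, 3, 3))
print(iou, giou_loss, diou_loss)
```

Both losses fall to zero for identical boxes; they differ in how they penalize boxes that overlap poorly, which is why swapping one for the other can change convergence.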