
Preface

Previously we learned how to get started with PyTorch quickly and build a simple neural network, but a few small problems remain: we haven't yet built a classification network from a homemade data set. So later, when I have time, I'll show how to build a simple custom classification network demo based on LeNet. We'll also model the demo's structure on the YOLO project to "normalize" it ourselves.

However, our current task is to deploy an open-source deep learning project from GitHub, one that is based on PyTorch.

In this article we will briefly explain how to use YOLOv5, do some fun things with it, and then train our own model to implement some features of our own.

Environment preparation

Download the project

Open GitHub, download the project, unzip it, and open it in your PyTorch environment.

Very thoughtfully, the project ships with a dependency file (requirements.txt), so after opening the project we just need to type one command into the console. PyCharm gives hints for this as well, but if you want PyCharm to install the environment for you, you are advised to use a mirror or a proxy.
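Assuming the repo layout matches the official YOLOv5 project, that command is the usual pip one (the mirror flag is optional; the Tsinghua mirror is just one example):

pip install -r requirements.txt
# or, if downloads are slow, through a mirror:
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple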

Download weight files

If we don’t want to start from zero, we also need to download the weight files they’ve trained so we can get started quickly.

Download it and put it here

Where to get it

For the convenience of readers, I have uploaded the project to Baidu cloud disk myself. Help yourself if you need it.

Link: https://pan.baidu.com/s/1tXbtecPGki_QyyohrlRjig Extraction code: 6666

The project structure

Any machine learning or deep learning project really comes down to just a few steps. The project then provides an interface to load and use our trained models.

So this one is actually no exception.

The first “Hello World”

Let's look at how to use YOLOv5.

Let's start with the detect.py file. First notice the hyperparameter settings here; two of them matter most. One is weights, which sets our weight model and by default points at the project root directory. The other is our source, the resource file; opening that folder we find it holds sample pictures.

OK, so anyway, let's just run the file. The console prints the detection output, and we find the results under the runs folder. That completes our first Hello World.
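Concretely, running with the defaults is a one-liner; in the v5.0 layout the annotated images land under runs/detect/exp by default:

python detect.py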

Parameter Setting (Detect)

So let's take a look at how these parameters can be set.

parser = argparse.ArgumentParser()
parser.add_argument('--weights', nargs='+', type=str, default='yolov5s.pt', help='model.pt path(s)')
parser.add_argument('--source', type=str, default='data/images', help='source')  # file/folder, 0 for webcam
parser.add_argument('--img-size', type=int, default=640, help='inference size (pixels)')
parser.add_argument('--conf-thres', type=float, default=0.25, help='object confidence threshold')
parser.add_argument('--iou-thres', type=float, default=0.45, help='IOU threshold for NMS')
parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
parser.add_argument('--view-img', action='store_true', help='display results')
parser.add_argument('--save-txt', action='store_true', help='save results to *.txt')
parser.add_argument('--save-conf', action='store_true', help='save confidences in --save-txt labels')
parser.add_argument('--nosave', action='store_true', help='do not save images/videos')
parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --class 0, or --class 0 2 3')
parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS')
parser.add_argument('--augment', action='store_true', help='augmented inference')
parser.add_argument('--update', action='store_true', help='update all models')
parser.add_argument('--project', default='runs/detect', help='save results to project/name')
parser.add_argument('--name', default='exp', help='save results to project/name')
parser.add_argument('--exist-ok', action='store_true', help='existing project/name ok, do not increment')
opt = parser.parse_args()
print(opt)
check_requirements(exclude=('pycocotools', 'thop'))

Here we can refer to this article: cloud.tencent.com/developer/a… It has detailed explanations of the parameters.
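As a quick sketch of how these options combine on the command line (the paths are just the repo defaults, nothing special):

python detect.py --weights yolov5s.pt --source data/images --conf-thres 0.5 --view-img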

Real-time detection

Let's look at the effect first. I didn't do anything special; I just turned on my phone camera with an IP camera app.

Download the software

Start the server

Then record the IP address of your LAN

Let's say mine is the address the app shows.

Then open your detect file; you can either set the source directly in the parameter defaults, or pass it when running the file as a single Python script.

Then click Run.
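The idea is simply to point --source at the phone's video stream. The exact URL depends entirely on the app you installed; the address and path below are made up for illustration:

python detect.py --source http://192.168.1.100:8080/video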

But what we actually want to do is not that simple. And to be honest, from what I've worked with so far, YOLO is not a framework at all; it's a demo, a fully implemented residual neural network demo for object detection. So in order to use and rewrite it easily, we need to read its source code carefully, which makes modifying it later much smoother. I have had a brief look at this codebase. From an engineering point of view it is not complicated, much simpler than Spring. Of course, its difficulty is not engineering difficulty, but the underlying theory.

Training our own model

Then it's finally time to train a little gadget of our own.

But before you do that, you need to download an annotation tool. Of course, you can also use an online one, but that requires a VPN, so I won't introduce it here.

So we use LabelImg here.

Using LabelImg

The software itself is very simple to use; it's mainly the installation that is troublesome.

Download the source code first (I'll give the Baidu cloud disk link later). Since I'm in a conda environment, I just need to download the source code and unzip it.

Then go into the folder, install PyQt, and compile the resource file:

conda install pyqt=5
pyrcc5 -o resources.py resources.qrc

That's not enough; there's a high probability that something will still go wrong. You need to move the generated resources.py file into the libs folder, and finally run the program.
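Assuming the standard labelImg source layout, the final run is just:

python labelImg.py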

If everything is normal, you're done. Here is the software link: https://pan.baidu.com/s/14y-0vqU7u9JkDBaAw7Odxg Extraction code: 6666

Labeling

In this case, I’ll just play around with some pictures.

Making data sets

In fact, everything up to now was just preparation; what comes next is the slightly more complicated part, namely making the data set. The format LabelImg packages is actually VOC format. Of course, it can also be switched to YOLO format, but either way a conversion is still needed later.
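To see what the conversion is aiming at: by the standard conventions, VOC stores one XML file per image with absolute pixel corners (xmin, ymin, xmax, ymax), while YOLO stores one .txt per image with one line per box. A sample YOLO line (the numbers are illustrative):

# class_id x_center y_center width height, all normalized to 0..1
0 0.481 0.634 0.362 0.280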

So let’s take a look at what our original data set looks like

Let’s see what the final document looks like

I'll walk you through these scripts and processing operations in turn. The code here is copied from elsewhere; the goal is only to get a standard VOC data set and then feed it to YOLO. Since it's just a demo, I won't use too many images.

As for acquiring the pictures, you can just write a crawler for that. Make sure the image sizes are consistent here (this tripped me up later on).

Divide training files

In this case, we’ll mainly run this script.

# coding:utf-8

import os
import random
import argparse

parser = argparse.ArgumentParser()
# The XML annotations are stored under Annotations
parser.add_argument('--xml_path', default='Annotations', type=str, help='input xml label path')
# Data set partition; the output goes under ImageSets/Main
parser.add_argument('--txt_path', default='ImageSets/Main', type=str, help='output txt label path')
opt = parser.parse_args()

trainval_percent = 1.0  # Proportion of training set plus validation set; there is no test set partition
train_percent = 0.9     # Proportion of the training set; adjust as you like
xmlfilepath = opt.xml_path
txtsavepath = opt.txt_path
total_xml = os.listdir(xmlfilepath)
if not os.path.exists(txtsavepath):
    os.makedirs(txtsavepath)

num = len(total_xml)
list_index = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(list_index, tv)
train = random.sample(trainval, tr)

file_trainval = open(txtsavepath + '/trainval.txt', 'w')
file_test = open(txtsavepath + '/test.txt', 'w')
file_train = open(txtsavepath + '/train.txt', 'w')
file_val = open(txtsavepath + '/val.txt', 'w')

for i in list_index:
    name = total_xml[i][:-4] + '\n'  # strip the .xml extension
    if i in trainval:
        file_trainval.write(name)
        if i in train:
            file_train.write(name)
        else:
            file_val.write(name)
    else:
        file_test.write(name)

file_trainval.close()
file_train.close()
file_val.close()
file_test.close()
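Assuming you save this script as split_train_val.py inside the mydata folder (the file name is my own choice), running it looks like this:

python split_train_val.py --xml_path Annotations --txt_path ImageSets/Main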

After the run, the split files will appear.

Generating labels

Then we generate our labels.

Change the following paths to your own.

# -*- coding: utf-8 -*-
import xml.etree.ElementTree as ET
import os
from os import getcwd

sets = ['train', 'val', 'test']
classes = ["Karada-chan", "Meow!", "Girl"]  # Change to your own categories (they must match the names in the XML)
abs_path = os.getcwd()
print(abs_path)


def convert(size, box):
    # Convert VOC pixel corners (xmin, xmax, ymin, ymax) into normalized YOLO (x, y, w, h)
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return x, y, w, h


def convert_annotation(image_id):
    in_file = open(r'F:\projects\PythonProject\yolov5-5.0\mydata\Annotations\%s.xml' % (image_id), encoding='UTF-8')
    out_file = open(r'F:\projects\PythonProject\yolov5-5.0\mydata\labels\%s.txt' % (image_id), 'w')
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        # difficult = obj.find('Difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text),
             float(xmlbox.find('ymax').text))
        b1, b2, b3, b4 = b
        # Clip boxes that run past the image border
        if b2 > w:
            b2 = w
        if b4 > h:
            b4 = h
        b = (b1, b2, b3, b4)
        bb = convert((w, h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')


wd = getcwd()
for image_set in sets:
    if not os.path.exists(r'F:\projects\PythonProject\yolov5-5.0\mydata\labels'):
        os.makedirs(r'F:\projects\PythonProject\yolov5-5.0\mydata\labels')
    image_ids = open(r'F:\projects\PythonProject\yolov5-5.0\mydata\ImageSets\Main\%s.txt' % (image_set)).read().strip().split()

    if not os.path.exists(r'F:\projects\PythonProject\yolov5-5.0\mydata\dataSet_path'):
        os.makedirs(r'F:\projects\PythonProject\yolov5-5.0\mydata\dataSet_path')

    # This path does not need to be changed; it is a relative path
    list_file = open('dataSet_path/%s.txt' % (image_set), 'w')
    for image_id in image_ids:
        list_file.write(r'F:\projects\PythonProject\yolov5-5.0\mydata\images\%s.jpg' % (image_id) + '\n')
        convert_annotation(image_id)
    list_file.close()

After that, two folders appear

Here, one holds the real paths of our pictures, and the other holds the label text.
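Putting it all together, the mydata folder now looks roughly like this (the names come from the scripts in this article):

mydata/
├── Annotations/     # VOC XML files exported by LabelImg
├── images/          # the source pictures
├── ImageSets/Main/  # trainval/train/val/test split lists
├── labels/          # converted YOLO-format .txt labels
└── dataSet_path/    # txt files listing absolute image paths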

Clustering operations

This step mainly serves to set our target box sizes. There are two scripts here; their purpose is to compute the average sizes of the boxes we drew by hand across the data set, so that once we have a trained model, the boxes it draws are not too strange and stay within a proper range.

Auxiliary script

import numpy as np


def iou(box, clusters):
    """
    Calculates the Intersection over Union (IoU) between a box and k clusters.
    :param box: tuple or array, shifted to the origin (i.e. width and height)
    :param clusters: numpy array of shape (k, 2) where k is the number of clusters
    :return: numpy array of shape (k,) where k is the number of clusters
    """
    x = np.minimum(clusters[:, 0], box[0])
    y = np.minimum(clusters[:, 1], box[1])
    if np.count_nonzero(x == 0) > 0 or np.count_nonzero(y == 0) > 0:
        raise ValueError("Box has no area")  # If this errors, you can change this line to pass

    intersection = x * y
    box_area = box[0] * box[1]
    cluster_area = clusters[:, 0] * clusters[:, 1]

    iou_ = intersection / (box_area + cluster_area - intersection)

    return iou_


def avg_iou(boxes, clusters):
    """
    Calculates the average Intersection over Union (IoU) between a numpy array of boxes and k clusters.
    :param boxes: numpy array of shape (r, 2), where r is the number of rows
    :param clusters: numpy array of shape (k, 2) where k is the number of clusters
    :return: average IoU as a single float
    """
    return np.mean([np.max(iou(boxes[i], clusters)) for i in range(boxes.shape[0])])


def translate_boxes(boxes):
    """
    Translates all the boxes to the origin.
    :param boxes: numpy array of shape (r, 4)
    :return: numpy array of shape (r, 2)
    """
    new_boxes = boxes.copy()
    for row in range(new_boxes.shape[0]):
        new_boxes[row][2] = np.abs(new_boxes[row][2] - new_boxes[row][0])
        new_boxes[row][3] = np.abs(new_boxes[row][3] - new_boxes[row][1])
    return np.delete(new_boxes, [0, 1], axis=1)


def kmeans(boxes, k, dist=np.median):
    """
    Calculates k-means clustering with the Intersection over Union (IoU) metric.
    :param boxes: numpy array of shape (r, 2), where r is the number of rows
    :param k: number of clusters
    :param dist: distance function
    :return: numpy array of shape (k, 2)
    """
    rows = boxes.shape[0]

    distances = np.empty((rows, k))
    last_clusters = np.zeros((rows,))

    np.random.seed()

    # the Forgy method will fail if the whole array contains the same rows
    clusters = boxes[np.random.choice(rows, k, replace=False)]

    while True:
        for row in range(rows):
            distances[row] = 1 - iou(boxes[row], clusters)

        nearest_clusters = np.argmin(distances, axis=1)

        if (last_clusters == nearest_clusters).all():
            break

        for cluster in range(k):
            clusters[cluster] = dist(boxes[nearest_clusters == cluster], axis=0)

        last_clusters = nearest_clusters

    return clusters


if __name__ == '__main__':
    a = np.array([[1, 2, 3, 4], [5, 7, 6, 8]])
    print(translate_boxes(a))

This script doesn't have to be run; it's a utility module. Save it as kmeans.py, since the next script imports kmeans and avg_iou from it.

Scripts to run

Note that the paths need modifying here as well.

# -*- coding: utf-8 -*-
# Select the prior (anchor) boxes according to the label files

import os
import numpy as np
import xml.etree.cElementTree as et
from kmeans import kmeans, avg_iou

FILE_ROOT = r'F:\projects\PythonProject\yolov5-5.0\mydata/'  # root
ANNOTATION_ROOT = "Annotations"  # data set label folder path
ANNOTATION_PATH = FILE_ROOT + ANNOTATION_ROOT

ANCHORS_TXT_PATH = r'F:\projects\PythonProject\yolov5-5.0\mydata\anchors.txt'  # where the anchors file is saved

CLUSTERS = 9
CLASS_NAMES = ['Karada-chan', 'Meow!', 'Girl']  # category names (must match the labeling script)


def load_data(anno_dir, class_names):
    xml_names = os.listdir(anno_dir)
    boxes = []
    for xml_name in xml_names:
        xml_pth = os.path.join(anno_dir, xml_name)
        tree = et.parse(xml_pth)

        width = float(tree.findtext("./size/width"))
        height = float(tree.findtext("./size/height"))

        for obj in tree.findall("./object"):
            cls_name = obj.findtext("name")
            if cls_name in class_names:
                # normalize the box width/height by the image size
                xmin = float(obj.findtext("bndbox/xmin")) / width
                ymin = float(obj.findtext("bndbox/ymin")) / height
                xmax = float(obj.findtext("bndbox/xmax")) / width
                ymax = float(obj.findtext("bndbox/ymax")) / height

                box = [xmax - xmin, ymax - ymin]
                boxes.append(box)
            else:
                continue
    return np.array(boxes)


if __name__ == '__main__':

    anchors_txt = open(ANCHORS_TXT_PATH, "w")

    train_boxes = load_data(ANNOTATION_PATH, CLASS_NAMES)
    count = 1
    best_accuracy = 0
    best_anchors = []
    best_ratios = []

    for i in range(10):  # can be modified, but not too large, or it will take a long time
        anchors_tmp = []
        clusters = kmeans(train_boxes, k=CLUSTERS)
        idx = clusters[:, 0].argsort()
        clusters = clusters[idx]
        # print(clusters)

        for j in range(CLUSTERS):
            anchor = [round(clusters[j][0] * 640, 2), round(clusters[j][1] * 640, 2)]
            anchors_tmp.append(anchor)
            print(f"Anchors:{anchor}")

        temp_accuracy = avg_iou(train_boxes, clusters) * 100
        print("Train_Accuracy:{:.2f}%".format(temp_accuracy))

        ratios = np.around(clusters[:, 0] / clusters[:, 1], decimals=2).tolist()
        ratios.sort()
        print("Ratios:{}".format(ratios))
        print(20 * "*" + "{}".format(count) + 20 * "*")

        count += 1

        if temp_accuracy > best_accuracy:
            best_accuracy = temp_accuracy
            best_anchors = anchors_tmp
            best_ratios = ratios

    anchors_txt.write("Best Accuracy = " + str(round(best_accuracy, 2)) + '%' + "\r\n")
    anchors_txt.write("Best Anchors = " + str(best_anchors) + "\r\n")
    anchors_txt.write("Best Ratios = " + str(best_ratios))
    anchors_txt.close()

The anchors.txt file is then generated.
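Its contents follow the format written at the end of the script; the numbers here are placeholders rather than my real results:

Best Accuracy = 8x.xx%
Best Anchors = [[14.31, 21.69], [27.5, 40.12], ...]
Best Ratios = [0.55, 0.63, ...]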

Start training

First, open our models folder. There are four model configurations (yolov5s, yolov5m, yolov5l, yolov5x), each meaning a different trade-off between precision and training time. Our default is 5s, so I'm going to use 5s here.

Parameter Settings

First we need to create a data configuration file (for example data/mydata.yaml) in the data folder, or some other folder where you can find it.

train: F:\projects\PythonProject\yolov5-5.0\mydata\dataSet_path\train.txt
val: F:\projects\PythonProject\yolov5-5.0\mydata\dataSet_path\val.txt

# number of classes
nc: 3

# class names
names: ['Karada-chan', 'Meow!', 'Girl']

And then we modify our model configuration; since we use 5s, that means models/yolov5s.yaml. Note here in particular the modification of the anchors parameter: those values are exactly what we ran the clustering for.
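For reference, the top of models/yolov5s.yaml in the v5.0 release looks like the sketch below. The anchors shown are the repo defaults, which you replace with your clustered values; nc must be set to your own class count:

# parameters
nc: 3  # number of classes (changed from the default 80)
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple

# anchors
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32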

Start training

At this point, I can only say that there are quite a few pitfalls, so many that a lot of people get this far, the project doesn't run at all, and their mentality explodes.

Ok, let me just say a little bit about what it looks like if everything works.

First of all, we only need to pay attention to a few parameters here.

Then run it from the command line like this:

python train.py --weights weights/yolov5s.pt --cfg models/yolov5s.yaml --data data/mydata.yaml --epoch 200 --batch-size 4 --device 0

Error handling

Code error

First of all, the first one is this. I looked for a lot of answers on the Internet, but they were useless nonsense. Later, I carefully located the error point, which is here:

There is no SPPF

There are two solutions to this. One is to copy the SPPF class directly into the models/common.py file:


import warnings


class SPPF(nn.Module):
    # Spatial Pyramid Pooling - Fast (SPPF) layer for YOLOv5 by Glenn Jocher
    def __init__(self, c1, c2, k=5):  # equivalent to SPP(k=(5, 9, 13))
        super().__init__()
        c_ = c1 // 2  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c_ * 4, c2, 1, 1)
        self.m = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)

    def forward(self, x):
        x = self.cv1(x)
        with warnings.catch_warnings():
            warnings.simplefilter('ignore')  # suppress torch 1.9.0 max_pool2d() warning
            y1 = self.m(x)
            y2 = self.m(y1)
            return self.cv2(torch.cat([x, y1, y2, self.m(y2)], 1))

But, uh, actually, I found out that the one who killed me was myself: the weight file I downloaded was placed under the weights folder, and the path was wrong when it ran. It reported an error the first time it was run, and after that it didn't. So don't point at that file.

Of course, the first case is mainly due to YOLOv5 version problems, but I have encountered both cases anyway, so I mention both. If the first fix doesn't work, do the second; if that doesn't work, try the other one again.

Out-of-memory errors

I honestly can't explain this one. After the fixes above, let's run the code.

You think there is hope, and then: hehe!

A GTX 1650 with 4 GB of video memory, 16 GB of RAM, a Dell G5 gaming laptop. You say my configuration is too garbage?! Apparently not. I also looked up quite a lot of information here: either the machine isn't good enough, or the driver version is incompatible. Neither seemed likely in my case.

So what was the solution? You might not believe it: restarting the computer?! I don't even know what to say.

The normal case

Next is the normal situation. Here I typed the following command (of course, you can modify train.py directly instead, but then it's hard to change back later).

python train.py --weights weights/yolov5s.pt --cfg models/yolov5s.yaml --data data/mydata.yaml --epoch 200 --batch-size 4 --device 0

Then you'll see output like this. This is normal, and your computer's fan will spin up; if it doesn't, either your computer is configured extremely well, or there is something wrong with it. You'll be told the output path after the training.

And then open up our tensorboard

tensorboard --logdir=runs

When we train, a TensorBoard log file is written under the runs folder by default, and we can view it by opening TensorBoard after training.

Usage

Finally, it's time to actually use the trained model.

python detect.py --weights runs/train/exp29/weights/best.pt --source data/images

But in this case, the run flipped over because of the image sizes. I suddenly realized that my picture sizes were wrong!! So that run was useless; I should have first used my own little image processor to normalize the picture sizes.
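As a minimal sketch of such a preprocessor (a hypothetical helper of my own, not part of YOLOv5; it assumes Pillow is installed and should be run before labeling, since a plain resize distorts the boxes otherwise):

import os
from PIL import Image

SRC = r'F:\projects\PythonProject\yolov5-5.0\mydata\images'  # the images folder from earlier
SIZE = (640, 640)  # target size; note that a plain resize distorts the aspect ratio

for name in os.listdir(SRC):
    if name.lower().endswith(('.jpg', '.jpeg', '.png')):
        path = os.path.join(SRC, name)
        img = Image.open(path).convert('RGB')
        img.resize(SIZE).save(path)  # overwrite the picture in place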

But here’s how it works.