Abstract:

When we see a picture, our brains immediately recognize the object contained in the picture. For machines, identifying these objects takes a lot of time and training data. However, with the continuous progress of deep learning and hardware technology, object detection has become more simple and intuitive in the field of computer vision.


As shown below, the target detection system is able to identify different objects in an image with incredible accuracy.

Nowadays, object detection technology is developing rapidly in different industries. It can help driverless cars drive safely in traffic, detect violent behavior in crowded places, assist sports teams in analyzing and building reports, and ensure component quality in manufacturing, among other things. These are not just the tip of the iceberg in object detection.

This article introduces what target detection is and some approaches to solving problems in this area. Then, we’ll delve into how to build a target detection system in Python. After reading this article, you will be able to deal with different subjects on your own!

Before you read this article, you should have some basic knowledge of deep learning, as well as solving simple image processing problems.

What is object detection?

Before we start building the model, let’s try to understand what target detection is. Suppose we want to build a pedestrian detection system for a driverless car. The car has captured the following picture. How would you describe this picture?

This picture describes a scene where a car is near a square and some people are crossing the road in front of them. As the traffic signs on the road are not very clear, the pedestrian detection system must be able to accurately identify where the pedestrians are walking, so that our cars can avoid them.

So what does a car’s pedestrian detection system need to do? It creates a boundary box around the pedestrians so that the system can pinpoint the pedestrians in the image and then decide which way to go to avoid them.

The targets detected by the system are as follows:

1. Determine all objects in the image and their locations.

2. Filter out the target object.

Target detection

The method of

Now that we know what problem to solve, what are the methods for object detection?

Method one: Divide and conquer

The image is simply divided into four parts :(1) upper left corner

(2) Upper right corner

(3) The lower left corner

(4) Lower right corner

This is a good idea, but we want to build a more accurate system that recognizes the entire object (such as the person in this diagram).

Method two: increase the number of segmentation

The method one system works fine, so what else do we need to do? Increase the number of blocks we input into the system to improve the system, as shown below:

This is a good idea, but it also creates unnecessary problems. Of course, this is better than the first method, with the only drawback being the large number of bounding boxes. Therefore, we need a more reasonable method for target detection.

Method three: structured division

In order to create a more disciplined target detection system, our improvements are as follows:

Step 1: Divide the image into 10*10 grids, as shown below:

Step 2: Define the center of each block

Step 3: For each center, take three different combinations of blocks, i.e., different length-width ratios:

Let’s go ahead and see what else we can do to make the target detection model perform better.

Method four: improve efficiency

All three approaches are relatively easy to implement, but we want to build a more efficient system. I optimized the model based on method 3 to improve the performance of the model. I did two things in total:

1. Increase the number of grids: Increase the number of grids from 10 x 10 to 20 x 20.

2. Quickly increase the number of blocks with different aspect ratios from the previous 3: take out 9 blocks from the same center, namely 3 square blocks with different widths and 6 rectangular blocks with different aspect ratios, as shown in the picture below:

This approach has both advantages and disadvantages in that it allows us to detect objects at a much finer level, but on the other hand, we have to input a large number of blocks into the image classification model.

Therefore, we can selectively select some blocks rather than all of them as inputs to the model. For example, build an intermediate classifier to detect whether the image has a background or may contain an image. This greatly reduces the number of blocks the model needs to detect.

We can also make another optimization: reduce the prediction of “same object”. Let’s look at the output of method 3:

As the picture above shows, the bounding boxes predict the same person, so we can choose any one of them to predict. Therefore, for target detection, for the box with “same object”, we choose the boundary box that is most easily detected as a person.

So far, all of these optimization systems have predicted well. So, is there something missing? That’s right, deep learning!

Method 5: Use deep learning for feature selection and build an end-to-end approach

Deep learning has great application potential in target detection. So how do we apply deep learning to target detection? Here are a few ways to do it:

1. Input the original image into the neural network to reduce the dimensions, rather than the blocks of the original image.

2. Use neural network to detect the selected prediction block.

3. Use the reinforcement deep learning algorithm to make the prediction result as close as possible to the original boundary box, which will ensure that the algorithm can provide more accurate boundary box prediction.

Now, instead of training different neural networks to solve every problem in target detection, we train a single deep neural network model to solve all the problems. The advantage of this is that each piece in the neural network helps to optimize the other components of the neural network, and also helps us train the whole deep learning model.

The output is one of the best performing of all the above methods, and looks something like the image below. Finally, we’ll learn how to create the model in Python.

Tutorial: How to use it
Image AI

Library build a


Target detection



Now that we know what is the best method of target detection and problem solving, let’s build a target detection system of our own! Here, we build the object detection model using the Image AI library — a Machine learning Python library that supports cutting edge computer vision tasks.

It is fairly simple to run an object detection model to obtain a prediction. We don’t need a complex foot setup, let alone a GPU, to produce predictive results. We will use the above

Methods five and

Image AI library conducts target detection, and the code implementation is described below.

Note: in the build
Target detection

Before modeling, make sure you have it installed on your local computer


Python

the


Anaconda


Version!

Step 1: Lay out an Anaconda environment in Python 3.6.

Step 2: Activate the environment and install the necessary software packages.

Step 3: Install the Image AI library.

Step 4: Download the pre-trained model based on target detection – Retina Net, download
Retina Net

Pretraining model

.

Step 5: Copy the downloaded files to your working folder.

Step 6:

Download the pictures

And name it image.png.

Step 7: Run the following code on the Jupyter Notebook:

This will create an image file called Image new.png that contains the bounding box.

Step 8: Print the image, the code and output image are as follows:

In this way, we successfully build a pedestrian target detection model.

conclusion

In this article, we learned what target detection is and how to create a target detection model. Finally, we successfully constructed a pedestrian detection model with the Image AI library.

Understanding and Building an Object Detection Model from Scratch in Python

The original link

This article is the original content of the cloud habitat community, shall not be reproduced without permission.