Machine Heart original. Contributors: Si, Sun Qianqian.

“You can be sitting at home when the blame falls from the sky.” Recently, Gree’s Dong Mingzhu, known as “Miss Dong,” was “made to run a red light” in Ningbo.

“Red light stop, green light go, yellow light wait.”

Although people are taught to obey traffic rules from an early age, “Chinese-style road crossing” remains common and hard to stop.

To encourage civilized road crossing, many cities (Shenzhen, Tianjin, Putian, Korla in Xinjiang, Guangzhou…) have launched an “exposure platform for pedestrians running red lights.” As the name suggests, pedestrians who run red lights are shown on a big screen.

Recently, a photo circulating online showed that Dong Mingzhu’s picture had also appeared on Ningbo’s “exposure platform for pedestrians running red lights”! The display even showed where she had run the red light.

However, a closer look at the photo reveals that there was no one on the zebra crossing. The system had mistaken an advertisement featuring Dong Mingzhu on the side of a bus for a real person.

Search for “Dong Mingzhu” on Sina Weibo, and the entry “Dong Mingzhu ran a red light” is suggested automatically.

A netizen joked, “Next, I guess Bruce Lee will be shown running a red light too.”

How can this kind of “being made to run a red light” be kept from happening again? Machine Heart spoke to a number of computer vision companies to understand the technology behind red light exposure platforms and how it works.

The technology behind it and its problems

Judging from the exposure image, the system first predicts face bounding boxes with an object detection algorithm such as SSD or YOLO, crops the face image according to each box, and finally determines the identity of the red light runner by computing face image similarity. Although the process is not complicated, many problems remain unsolved. In this case, liveness detection turned out to be an important missing link.
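
The crop-and-compare steps described above can be sketched in a few lines. This is a minimal illustration, not the platform's actual code: the detector (SSD or YOLO) and the face embedding model are assumed to exist elsewhere, and the function names `crop`, `cosine_similarity`, and `identify` are hypothetical.

```python
import math

def crop(image, box):
    """Crop a region from an image stored as a list of pixel rows.

    box is (x1, y1, x2, y2) in pixel coordinates, as a detector would return.
    """
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def identify(probe, gallery):
    """1:N match: return (name, score) of the gallery entry closest to probe."""
    scored = ((name, cosine_similarity(probe, emb)) for name, emb in gallery.items())
    return max(scored, key=lambda pair: pair[1])
```

Note that `identify` always returns someone, even when nobody in the gallery is a good match, and it has no notion of whether the cropped face belongs to a living person.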

As for the main technology used by the exposure platform, Zhou Xiang, vice president of the Cloud science and Technology Research Institute, said the overall process can be divided into “synchronizing with the red light time, capturing faces, comparing in real time on the back end or front end, and publishing in real time or after manual review.” Synchronizing with the red light time is relatively simple, and face capture is also a common technology. However, because the system has to run in production, model size, stability, and response latency all need special design.
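
The first stage, synchronizing with the red light time, can be illustrated with a simple filter that keeps only captures taken during the red phase. This sketch assumes a fixed-time signal cycle that starts with red; a deployed system would more likely read the phase from the signal controller. The function names and parameters are hypothetical.

```python
def is_red(t, cycle_start, red_s, green_s, yellow_s):
    """Return True if time t (in seconds) falls in the red phase.

    Assumes a fixed cycle ordered red, green, yellow, beginning at cycle_start.
    """
    cycle = red_s + green_s + yellow_s
    return (t - cycle_start) % cycle < red_s

def captures_during_red(timestamps, cycle_start, red_s, green_s, yellow_s):
    """Keep only the capture timestamps that fall inside a red phase."""
    return [t for t in timestamps
            if is_red(t, cycle_start, red_s, green_s, yellow_s)]
```

Only captures that pass this filter would move on to face comparison and then to publication or manual review.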

Huang Rui, general manager of God orders in Wuhan, said that in this scenario a checkpoint camera captures the people crossing on red. “It is somewhat similar to a checkpoint camera that works with the traffic lights to catch motor vehicles running a red light, with one difference: one captures a person’s face and runs a 1:N match against a face database to verify identity; the other captures a car’s license plate and checks it against a vehicle database to find the vehicle’s real identity.” But even with the same approach applied to pedestrians and license plates, pedestrians, who vary far more in appearance, are much harder to capture.

Although the red-light exposure process is simple, and its core function can be built from pre-trained real-time detection models available on GitHub, many details still need optimizing to reduce or even avoid errors like “Dong Mingzhu runs a red light.”

Zhou Xiang said the main reasons for this kind of situation are: “(1) Miss Dong is not in the underlying face database, so the system automatically selects the most similar entry to display; (2) the algorithm has no liveness detection, and judging liveness at a distance is difficult; (3) treating an advertisement as a real person currently looks fairly normal, and it is a compromise made for real-world deployment.”
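
Reason (1) describes a closed-set match: the system always shows its nearest gallery entry. One common mitigation, sketched here as an assumption rather than as the platform's design, is open-set identification, which returns no identity when the best similarity falls below a threshold. The threshold value 0.6 and the function names are illustrative only.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def identify_open_set(probe, gallery, threshold=0.6):
    """Return (name, score), or (None, score) when no entry is similar enough.

    threshold is an illustrative value; a real one would be tuned on
    validation data for the specific embedding model.
    """
    best_name, best_score = None, -1.0
    for name, emb in gallery.items():
        score = cosine(probe, emb)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score
```

With a rejection threshold, a face that is not in the database yields no identity instead of the name of the most similar stranger.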

Wang Jianhui, CTO of Shenzhen Technology, takes a similar view: “This incident only shows that the platform’s face detection algorithm performs well, but the product design did not account for abnormal cases such as billboards. It also shows that the capture algorithm does only face detection, not liveness detection. The technical difficulty is how to do liveness detection without the subject’s cooperation.”

For pedestrians running red lights, the difficulty really does lie in non-cooperative liveness detection, says Robert Lorenz, a principal researcher at Pepper Technologies. “In this case, as long as a human face is captured during the red light, it is judged as running the red light, with no judgment of the person or the action. The difficulty is detecting the person and the action, that is, confirming that the captured face comes from a ‘living’ person who is actually running the red light.”

However, Huang said the incident was not a “miscarriage of justice,” and that similar incidents happen every day across the country. “Under the general requirements for capture cameras in the domestic market today, they only do dynamic image recognition, not liveness recognition.” In his view, a capture camera is not an access control system, and liveness judgment is not a hard requirement.

The solution

These are the more obvious, general problems. So what are the solutions? Could we detect not just the face but the whole body and its movement to see whether a person is really running a red light?

Zhou Xiang said: “Without considering semantic information, no solution at present can claim to work very well, because liveness detection algorithms and hardware are currently aimed mainly at close-range scenes, such as 3D structured light, infrared binocular cameras, and action-based, lip-based, or silent liveness detection. But restricting the size range of captured faces can ease the problem to some extent.”
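
Restricting the captured face size range amounts to a plain bounds check on each detection box. The sketch below assumes boxes in pixel coordinates; the bounds of 40 and 160 pixels are placeholders, since usable values depend on camera resolution, lens, and distance to the crossing.

```python
def within_size_range(box, min_side=40, max_side=160):
    """Return True if a face box (x1, y1, x2, y2) has a plausible pedestrian size.

    min_side and max_side are illustrative pixel bounds that would need to be
    calibrated per camera.
    """
    x1, y1, x2, y2 = box
    width, height = x2 - x1, y2 - y1
    return (min_side <= width <= max_side) and (min_side <= height <= max_side)

def filter_faces(boxes, min_side=40, max_side=160):
    """Drop face boxes that are too small or too large to be a pedestrian."""
    return [b for b in boxes if within_size_range(b, min_side, max_side)]
```

A large face printed on a bus advertisement would exceed `max_side` and be discarded before any identity comparison is attempted.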

In fact, even without semantic or environmental information, some hardware can help solve the problem. Huang Rui said: “At present, certain locations in some southern cities already have additional infrared cameras. But because capture camera scenes are usually far away, the infrared camera needed is large and the cost is relatively high.”

Taking semantic information into account works better, Lorenz said; adding human body detection could go some way toward solving this problem. But many factors must be considered, such as the size of the face and its proportion to the torso. Otherwise, an idol’s face printed on a pedestrian’s clothes or backpack could still be captured by mistake.
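
A hedged sketch of how body detection could be combined with the face box: accept a face only if it lies inside a detected person box, sits in the upper part of that box, and has a plausible height relative to the body. The ratio bounds and the top-quarter rule are illustrative assumptions, and the function names are hypothetical.

```python
def contains(outer, inner):
    """Return True if box inner lies entirely inside box outer (x1, y1, x2, y2)."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])

def plausible_pedestrian_face(face, persons, min_ratio=0.08, max_ratio=0.25,
                              top_fraction=0.25):
    """Check a face box against detected person boxes.

    The face must be inside some person box, its center must lie in the top
    top_fraction of that box, and face height / body height must fall within
    [min_ratio, max_ratio]. All three limits are illustrative assumptions.
    """
    face_h = face[3] - face[1]
    face_cy = (face[1] + face[3]) / 2
    for person in persons:
        body_h = person[3] - person[1]
        if body_h <= 0 or not contains(person, face):
            continue
        ratio = face_h / body_h
        in_head_zone = face_cy <= person[1] + top_fraction * body_h
        if min_ratio <= ratio <= max_ratio and in_head_zone:
            return True
    return False
```

The position test is what rejects a face printed on a shirt or backpack: it sits inside the person box but in the torso region rather than where a head would be.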

Adding semantic information, such as pose estimation or movement coordinates, does make it possible to determine whether the target is an advertisement or a human, Zhou said. “But this will increase the complexity of the model,” he said. “If we don’t switch to better hardware, the speed of the algorithm may drop to some extent, which is not necessary in this scenario.” Large-scale applications like this place strict demands on parameter count, computing power, and accuracy, so complex models are difficult to deploy in practice.

According to Robert Lorenz, the current solutions are as follows:

  • Add a check on the detected face size. A face on a printed advertisement is generally larger than a pedestrian’s face and occupies a larger pixel area. Setting the range of face sizes that can be captured as a pedestrian walks along the crossing from far to near largely avoids such cases, although face images on clothing and the like still carry some risk of being identified. This removes the problem from an engineering perspective rather than an algorithmic one.

  • Judge image quality. Real people and printed pictures differ noticeably in skin color, texture, and so on, and a trained model can tell them apart.

  • Add depth information. With a binocular camera or an RGB-D camera, 3D information can be obtained and 2D images such as printed objects can be filtered out. However, this scheme is too expensive and not suitable for the red-light scenario.

  • Use human skeleton key point detection, which describes human posture and predicts human behavior. By analyzing skeleton key points and key objects, the system can reconstruct a person’s pose and understand their actions. Peng Technology’s behavior detection technology can also detect abnormal behavior (with high-precision human key points, abnormal actions of a person in video or pictures can be detected). In the red-light scenario, for example, it can detect pedestrians who have accidentally fallen, are moving slowly, or use a wheelchair and, combined with traffic light control, genuinely make people’s lives easier.
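
The depth-information item in the list above can be illustrated with a flatness test: fit a plane to the 3D points in the face region and check how far the points deviate from it. This is only a sketch under the assumption that a depth camera supplies (x, y, z) samples in meters; the 5 mm tolerance and the function names are hypothetical, and real depth data at crossing distances would be much noisier.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c; returns (a, b, c)."""
    n = len(points)
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    syy = sum(p[1] * p[1] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    sxz = sum(p[0] * p[2] for p in points)
    syz = sum(p[1] * p[2] for p in points)
    a_mat = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    d = det3(a_mat)
    if abs(d) < 1e-12:
        raise ValueError("points are degenerate (for example, collinear)")
    coeffs = []
    for i in range(3):  # Cramer's rule: replace column i with the right-hand side
        m = [row[:] for row in a_mat]
        for r in range(3):
            m[r][i] = rhs[r]
        coeffs.append(det3(m) / d)
    return tuple(coeffs)

def rms_residual(points):
    """Root-mean-square distance in z between the points and their fitted plane."""
    a, b, c = fit_plane(points)
    sq = sum((z - (a * x + b * y + c)) ** 2 for x, y, z in points)
    return math.sqrt(sq / len(points))

def looks_flat(points, tol=0.005):
    """Return True if the region is planar within tol meters (likely a print)."""
    return rms_residual(points) < tol
```

A printed face, even on a tilted surface, fits a plane almost exactly, while a real face has relief such as the nose that leaves a measurable residual.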

“Being photographed by mistake is a small matter. Remember to travel in a civilized way; safety is the most important thing.” Dong Mingzhu herself responded to the incident on Sina Weibo at 10:15 am today.

In this editor’s opinion, preventing another case of “being made to run a red light” takes effort not only from technical solution providers but also from ourselves. Regulating our own behavior and crossing the road in a civilized way is the fundamental solution. Come on, repeat after me: “Red light stop, green light go, yellow light wait.”