
Abstract: We designed a face recognition algorithm based on MindSpore to recognize faces that are occluded by masks. The open-source code for the algorithm is already available on MindSpore.

MindSpore trains a masked face recognition model that knows who you are without removing your mask.

The goal of face recognition is to determine the identity of the person in a face image. As smart cities are built out, face recognition is applied in more and more scenarios, such as tracking suspects and locating lost elderly people and children. With the worldwide outbreak of the novel coronavirus, more and more people wear masks in public places to reduce the risk of infection.

Traditional face analysis methods mainly extract global features from the whole face. In the mask occlusion scenario, important parts of the face (such as the jaw) are covered by the mask, so global features extracted from the whole face carry a lot of noise. New solutions are therefore urgently needed for face analysis under mask occlusion. We designed a face recognition algorithm based on MindSpore to recognize faces that are occluded by masks. The open-source code for the algorithm is already available on MindSpore.

  • Paper:

Feifei Ding, Peixi Peng, Yangru Huang, Mengyue Geng and Yonghong Tian. Masked Face Recognition with Latent Part Detection. ACM Multimedia 2020.

Dl.acm.org/doi/pdf/10….

  • Open source Address:

Gitee.com/mindspore/m…

Algorithm framework

At present, there is no large-scale training data set of masked faces. To address the shortage of training data, we generate masked face images from an existing public face data set. The process is as follows:

(1) Use an existing facial key point detection algorithm (such as Dlib) to detect a number of key points on the unoccluded face image;

(2) Mark the key points of the region where the mask is worn (below the nose tip);

(3) Manually mark key points on sample pictures of masks (such as N95 masks), in one-to-one correspondence with the key points of the mask-wearing region on the face;

(4) Triangulate the mask sample picture according to its key points, dividing it into several small triangles;

(5) Using the key point mapping between the mask sample picture and the face picture, warp each small triangular region of the mask sample picture onto the face picture with an affine transformation, then smooth the result to generate the masked face picture.
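Step (5) relies on the fact that three point correspondences fix an affine transformation. The following is a minimal pure-Python sketch of that one step, not the authors' implementation: the function names `affine_from_triangles` and `apply_affine` and the sample coordinates are hypothetical, and triangulation, pixel warping and smoothing are left out.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def affine_from_triangles(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """Return the 2x3 affine matrix that maps the three src vertices onto dst."""
    (x0, y0), (x1, y1), (x2, y2) = src
    # Twice the signed area of the source triangle; zero means collinear vertices.
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    if abs(det) < 1e-12:
        raise ValueError("source triangle is degenerate")
    rows = []
    for k in range(2):  # k = 0 solves the x row, k = 1 solves the y row
        u0, u1, u2 = dst[0][k], dst[1][k], dst[2][k]
        # Solve u = a*x + b*y + c at the three vertices with Cramer's rule.
        a = ((u1 - u0) * (y2 - y0) - (u2 - u0) * (y1 - y0)) / det
        b = ((x1 - x0) * (u2 - u0) - (x2 - x0) * (u1 - u0)) / det
        c = u0 - a * x0 - b * y0
        rows.append([a, b, c])
    return rows


def apply_affine(matrix: Sequence[Sequence[float]], point: Point) -> Point:
    """Map one point of the mask picture into face-picture coordinates."""
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )


# Hypothetical key points: one triangle on the mask picture and its
# counterpart on the face picture.
mask_triangle = [(0, 0), (10, 0), (0, 10)]
face_triangle = [(100, 120), (140, 120), (100, 150)]
matrix = affine_from_triangles(mask_triangle, face_triangle)
mapped = apply_affine(matrix, (5, 5))  # (120.0, 135.0)
```

In practice, a library routine such as OpenCV's `cv2.getAffineTransform` computes the same kind of matrix from three point pairs, and `cv2.warpAffine` applies it to pixels.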

We generated 8 different styles of masks on the public face data set WebFace and mixed the resulting masked faces with the original unmasked face pictures as training data.

The algorithm framework is shown in the figure. Because much of the important information in a masked face is covered by the mask, we propose a two-branch network model based on potential region detection to extract more discriminative features. The local branch extracts local features from the potential region, and the global branch extracts global features from the original image.

We define the potential region as the face region not covered by the mask. It is denoted by [symbol missing in source], where [symbol missing in source] are the parameters to be learned. Inspired by the Spatial Transformer Network (STN), we transform the features in this region to the size of the original image with a constrained affine transformation. The formula is as follows:

[formula missing in source]

where the target box is defined as [formula missing in source], and H and W are the height and width of the original picture respectively. From this formula we can obtain the parameters of the affine transformation matrix [formula missing in source]. These parameters can be learned by the STN, which can therefore be used to detect the potential region. According to the correspondence between the coordinates of the potential region and those of the original image, bilinear interpolation is used to enlarge the region to the same size as the original image.
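The crop-and-enlarge step can be illustrated without any framework. The sketch below is an assumption-laden illustration, not the paper's formula: it takes the box as pixel coordinates `(x1, y1, x2, y2)`, uses a scale-and-translate mapping (no rotation or shear) from output pixels back to the box, and samples with bilinear interpolation. The names `bilinear_sample` and `crop_and_resize` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[float]]  # grayscale image stored as rows of pixel values
Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixel coordinates


def bilinear_sample(img: Image, x: float, y: float) -> float:
    """Interpolate the image value at a real-valued location (x, y)."""
    h, w = len(img), len(img[0])
    # Clamp to the image so that boxes touching the border stay valid.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def crop_and_resize(img: Image, box: Box) -> Image:
    """Resample the region inside `box` to the full H x W size of `img`."""
    h, w = len(img), len(img[0])
    bx1, by1, bx2, by2 = box
    # Scale-and-translate mapping: output pixel (i, j) reads from
    # (bx1 + sx * j, by1 + sy * i) in the input image.
    sx = (bx2 - bx1) / (w - 1) if w > 1 else 0.0
    sy = (by2 - by1) / (h - 1) if h > 1 else 0.0
    return [
        [bilinear_sample(img, bx1 + sx * j, by1 + sy * i) for j in range(w)]
        for i in range(h)
    ]


# A 4x4 image whose value encodes its position: 10 * row + column.
image = [[10.0 * i + j for j in range(4)] for i in range(4)]
upper_part = crop_and_resize(image, (0.0, 0.0, 3.0, 1.5))  # upper region only
```

Because the mapping has only scale and translation terms, the sampled region is always an axis-aligned box, which is what makes it usable as a region detector.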

The network model is a two-branch network. One branch extracts local features from the potential region, and the other extracts global features from the original image. The two branches share network parameters. A classification loss is used to optimize each branch. For the local branch, the LPD loss function is introduced as follows:

[formula missing in source]

where [symbol missing in source] is the ordinate (vertical coordinate) of the nose-tip key point. This loss regularizes how the network localizes the potential region, pushing it as far as possible into the region above the nose tip, because prior knowledge tells us that masks tend to cover the face below the nose tip. In the test phase, the global and local features are combined as the final feature representation.

Experimental results

We collected a real-world masked face data set, the PKU-Masked-Face Dataset, as the test set. It contains 10301 face images belonging to 1018 different identities. Most identities have at least five masked and five unmasked face images, taken from five views: frontal, left, right, looking down and looking up. The masked face images serve as the query set, and the normal face images serve as the gallery to be matched.

  • Test set links:

Pkuml.org/resources/p…
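The query-versus-gallery protocol can be sketched in a few lines. The text does not say how the global and local features are combined or which similarity measure is used, so this sketch assumes concatenation of L2-normalized features and cosine similarity. The function names and the toy feature vectors are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def fuse(global_feat: Sequence[float], local_feat: Sequence[float]) -> List[float]:
    """Assumed fusion rule: concatenate the two normalized features."""
    return l2_normalize(l2_normalize(global_feat) + l2_normalize(local_feat))


def rank_gallery(query: Sequence[float], gallery: Dict[str, Sequence[float]]) -> List[str]:
    """Return gallery identities sorted by cosine similarity, best match first."""
    q = l2_normalize(query)
    scores = {
        identity: sum(a * b for a, b in zip(q, l2_normalize(feat)))
        for identity, feat in gallery.items()
    }
    return sorted(scores, key=lambda identity: scores[identity], reverse=True)


# Toy 2-D features: a masked query face and two unmasked gallery faces.
query = fuse([1.0, 0.0], [0.0, 1.0])
gallery = {
    "id_a": fuse([0.9, 0.1], [0.1, 0.9]),
    "id_b": fuse([0.0, 1.0], [1.0, 0.0]),
}
ranking = rank_gallery(query, gallery)  # ["id_a", "id_b"]
```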

The experimental results of the MindSpore model are shown in the following table. Baseline is a ResNet50 model trained on the original WebFace data, MG is a ResNet50 model trained on the mask-augmented WebFace data, and LPD is our proposed potential region detection model.

MindSpore code implementation

Two branch network structure code:

global_out and partial_out are the global features extracted from the original image and the local features extracted from the potential region, respectively. The two branches share the same feature extraction backbone.
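The data flow of the two branches can be sketched as follows. This is a framework-free illustration, not the MindSpore listing: the class `TwoBranchSketch` and its `backbone`, `locate` and `crop` callables are hypothetical stand-ins for the real layers. Parameter sharing is expressed by calling the same `backbone` object in both branches.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]
Feature = List[float]
Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2); hypothetical layout


class TwoBranchSketch:
    """Hypothetical stand-in for the two-branch model; not MindSpore code."""

    def __init__(
        self,
        backbone: Callable[[Image], Feature],
        locate: Callable[[Image], Box],
        crop: Callable[[Image, Box], Image],
    ) -> None:
        self.backbone = backbone  # one object, used by both branches
        self.locate = locate      # predicts the potential region
        self.crop = crop          # cuts that region out of the image

    def __call__(self, image: Image) -> Tuple[Feature, Feature]:
        # Global branch: features of the whole image.
        global_out = self.backbone(image)
        # Local branch: the same backbone applied to the potential region.
        box = self.locate(image)
        partial_out = self.backbone(self.crop(image, box))
        return global_out, partial_out
```

Any callables with matching signatures can be plugged in, for example a mean-pixel "backbone" and a fixed upper-half box, to check the wiring before swapping in real network layers.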

Specific implementation of LPD module for potential region detection:

The input is the original picture, and the output is the boundary coordinates of the unoccluded face region.

  • Related training and inference code:

Gitee.com/mindspore/m…

The code is implemented with the MindSpore framework and runs on Ascend 910 hardware. The algorithm addresses face recognition under mask occlusion and significantly improves on the baseline model. As the experimental results above show, it reaches an industry-leading level.
