I. Background

Face recognition has been one of the most popular applications of computer vision in recent years, and many face recognition algorithms have appeared, such as DeepID, FaceNet and DeepFace. Face recognition is widely used in scenic spots, passenger transport, hotels, offices, construction sites, residential areas and other places, which makes people's lives much more convenient. It is also very active in the security field: by running face recognition on images collected by cameras, suspicious people can be found faster.

1:1 face verification usually does not need to worry much about speed, but in 1:N face recognition speed is sometimes critical. For example, suppose users want to quickly find out which celebrity appears in a picture, and the celebrity database in the back end holds millions or even tens of millions of entries. Comparing against them one by one makes it hard to return a result in a short time and consumes a lot of resources under high concurrency. Approximate vector search is therefore very important in large-scale face recognition scenarios.

II. Introduction to the ArcSoft SDK and Milvus

The ArcSoft face recognition SDK is an offline SDK that combines face detection, face tracking, face comparison, face search, face attribute analysis and IR/RGB liveness detection. It supports Windows, Linux, Android and other platforms, works offline in environments with no network, and can be deployed locally. It comes in two editions: a value-added edition and a free edition.

Milvus is an open-source vector similarity search engine. It provides Python, Java, Go, C++ and RESTful APIs, supports inserting, deleting and updating TB-scale vector data with near-real-time queries, and offers high flexibility, stability, reliability and query speed. Milvus integrates widely used vector index libraries such as Faiss, NMSLIB and Annoy, and exposes a set of simple, intuitive APIs that let you choose the index type to suit the scenario. Milvus can also filter on scalar data, further improving recall and search flexibility.

III. Development environment

In this article, the ArcSoft SDK is called from C++ and Milvus is used through its Python API. If you want to use the C++ version of the Milvus API, you need to compile it yourself.

Environment required for this code:

  1. ArcSoft face recognition SDK 4.0, value-added edition
  2. Milvus 1.0.0
  3. OpenCV 2.4.9
  4. VS 2013
  5. Python 3.6+ (pymilvus may fail to install on versions below 3.6)

IV. Using the ArcSoft face recognition SDK

The ArcSoft face recognition SDK is very simple to use. The general face recognition workflow is:

  1. Call ASFOnlineActivation to activate the SDK online. An activation file is generated after activation, so later runs do not need to activate again.
  2. Call ASFInitEngine to initialize the engine. Here you can select face detection mode or face tracking mode (face tracking is faster) and pass in other parameters.
  3. Call ASFDetectFaces to detect faces and get the bounding boxes of all faces in a frame.
  4. Call ASFFaceFeatureExtract to extract face features.
  5. Call ASFFaceFeatureCompare to compare the features of two faces and get their similarity.

  • Note that each time ASFDetectFaces, ASFFaceFeatureExtract or a similar interface is called, the result is saved to a fixed location that was determined when the engine was initialized. The returned structure is only a pointer to that location. In other words, the next call to ASFDetectFaces overwrites the result of the previous call, so copy the result out if you want to keep it. The advantage of this design is that the function never fails because it cannot allocate memory, and it cannot leak memory.
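
The effect of a fixed result buffer can be imitated in a few lines of Python. This is only an analogy under stated assumptions: FakeEngine and its detect method are hypothetical stand-ins, not part of the ArcSoft SDK. They reuse one bytearray in the way the note above describes the engine reusing its result memory, which shows why a result has to be copied before the next call.

```python
class FakeEngine:
    """Hypothetical stand-in for an engine that reuses one result buffer."""

    def __init__(self, size: int = 4) -> None:
        # Allocated once, like result memory fixed at engine initialization.
        self._buffer = bytearray(size)

    def detect(self, values: bytes) -> memoryview:
        # Overwrite the shared buffer in place and hand back a view of it.
        for i, value in enumerate(values):
            self._buffer[i] = value
        return memoryview(self._buffer)


engine = FakeEngine()

first_view = engine.detect(b"\x01\x02\x03\x04")
first_copy = bytes(first_view)        # explicit copy of the first result

engine.detect(b"\x09\x09\x09\x09")    # the next call overwrites the buffer

print(list(first_view))   # the view now shows the second result
print(list(first_copy))   # the copy still holds the first result
```

In the C++ SDK the same idea applies to the returned structures: copy the data they point to before calling the interface again.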

This is only the simplest face recognition workflow. The ArcSoft face recognition SDK also supports RGB liveness detection, IR liveness detection, mask detection, eye closure detection, occlusion detection, image quality detection and other functions. For more documentation, refer to the ArcSoft Documentation Center.

V. Setting up Milvus

The easiest way to install Milvus is through Docker. Milvus comes in both CPU and GPU versions; the CPU version is used as the example here. For details, refer to the official Milvus documentation: milvus.io/cn/docs/v1…

  1. Install CentOS or Ubuntu. I used the image from vault.centos.org/7.4.1708/is…

  2. Install Docker. The official installation script installs it automatically:

curl -fsSL https://get.docker.com | bash -s docker --mirror Aliyun

  3. Pull the Milvus image:

sudo docker pull milvusdb/milvus:1.0.0-cpu-d030521-1ea92e

  4. Download the Milvus configuration file:

mkdir -p /home/$USER/milvus/conf
cd /home/$USER/milvus/conf
wget https://raw.githubusercontent.com/milvus-io/milvus/v1.0.0/core/conf/demo/server_config.yaml

If wget cannot download the configuration file, you can instead create server_config.yaml in the /home/$USER/milvus/conf directory and paste the contents of the original server_config.yaml into it.

  5. Start the Milvus Docker container:

sudo docker run -d --name milvus_cpu_1.0.0 \
-p 19530:19530 \
-p 19121:19121 \
-v /home/$USER/milvus/db:/var/lib/milvus/db \
-v /home/$USER/milvus/conf:/var/lib/milvus/conf \
-v /home/$USER/milvus/logs:/var/lib/milvus/logs \
-v /home/$USER/milvus/wal:/var/lib/milvus/wal \
milvusdb/milvus:1.0.0-cpu-d030521-1ea92e

Use sudo docker ps to confirm that Milvus is running.

If Milvus is not running properly, view the error log with sudo docker logs milvus_cpu_1.0.0. If the CPU supports none of the SSE42, AVX, AVX2 or AVX512 instruction sets, Milvus may fail to start.

To install the GPU version, see Installing the GPU Version

VI. Fast retrieval implementation

Overview of the face recognition process

The basic workflow of the ArcSoft SDK was introduced above, and most current face recognition pipelines work in much the same way. Here is a brief description of the three general steps of face recognition:

1. Face detection

Given an image, find the positions of the faces in it. Some detectors also return facial key points, the face angle and other information used to align the face.

2. Feature extraction

The face region is cropped from the image and its features are extracted by a neural network. The resulting feature vector usually has 128 or 256 dimensions and is usually normalized.

3. Feature comparison

Compare the feature vectors extracted in the previous step by computing the distance between the two vectors; after simple processing, this distance gives the similarity of the two faces. Common measures include Euclidean distance and cosine similarity.
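
The two measures named in step 3 can be written out in plain Python. This is a minimal sketch for illustration; the function names and the two-dimensional sample vectors are made up here, while real face features have 128 or 256 dimensions.

```python
import math


def l2_distance(a, b):
    """Euclidean distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b: larger means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


a = [1.0, 0.0]
b = [0.6, 0.8]
print(l2_distance(a, b))
print(cosine_similarity(a, b))
```

For vectors that are already normalized to length 1, the cosine similarity reduces to the inner product.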

Rapid retrieval

The most common way to do 1:N face search is brute-force search: the query is compared with every face in the library to find the k most similar ones. The ArcSoft SDK provides ASFFaceFeatureCompare for comparing two face feature vectors. If the face library is too large, the search inevitably slows down and struggles in scenarios with strict real-time requirements. Vector similarity search algorithms can compute similarities over a large amount of data in a short time and find the closest matches. This article uses the ArcSoft face recognition SDK for face detection and feature extraction, and Milvus to search the extracted feature vectors.

How the ArcSoft SDK stores feature vectors
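
A brute-force 1:N search is easy to sketch in Python. The helper names and the tiny three-entry library below are illustrative assumptions; the inner product stands in for whatever pairwise comparison is used. Every query touches every entry, which is why the cost grows linearly with the size of the library.

```python
import heapq


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def brute_force_top_k(query, library, k):
    """library: dict mapping face id -> feature vector."""
    scored = ((inner_product(query, vec), face_id)
              for face_id, vec in library.items())
    # nlargest keeps the k highest scores while scanning every entry once.
    return [(face_id, score) for score, face_id in heapq.nlargest(k, scored)]


library = {
    1: [1.0, 0.0],
    2: [0.0, 1.0],
    3: [0.6, 0.8],
}
print(brute_force_top_k([1.0, 0.0], library, k=2))
```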

Let’s cut to the chase.

In the C++ version of the ArcSoft SDK, face features are stored in the ASF_FaceFeature structure.

typedef struct {
    MByte* feature;      // pointer to the feature data
    MInt32 featureSize;  // length of the feature data in bytes
} ASF_FaceFeature, *LPASF_FaceFeature;

Inspecting the memory that feature points to for several different faces reveals a pattern that anyone with a little machine learning background will recognize: apart from the first two values, the data are floating-point numbers, evidently a normalized feature vector.

Conclusion: feature points to an array of type float. The first 8 bytes are fixed and hold the float values 2004 and 78 (possibly used to distinguish SDK versions). The next 2048 bytes are 512 floating-point numbers. If the registerOrNot parameter of ASFFaceFeatureExtract is set to false, the first 256 of these 512 values are 0.

So when registerOrNot is set to false, the first 256 values are all 0 and can be ignored; we only need to copy the last 256 values.

// When registerOrNot is set to false, only the last 256 floats need to be copied
float data[256];
memcpy(data, f.feature + 8 + 1024, 1024);

At this point we have the face feature vector extracted by the ArcSoft SDK.
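
The same byte layout can be unpacked in Python with the struct module, for example when a raw feature blob has been saved to disk. This sketch assumes the layout described above (an 8-byte header followed by 512 little-endian 32-bit floats, of which only the last 256 are used); extract_vector and the synthetic blob are illustrative, not real SDK output.

```python
import struct

HEADER_BYTES = 8      # leading bytes skipped, as in the memcpy above
TOTAL_FLOATS = 512    # 2048 bytes of float32 data
USED_FLOATS = 256     # only the last 256 values are non-zero


def extract_vector(blob: bytes) -> list:
    """Return the last 256 floats of a feature blob."""
    offset = HEADER_BYTES + (TOTAL_FLOATS - USED_FLOATS) * 4   # 8 + 1024
    return list(struct.unpack_from("<%df" % USED_FLOATS, blob, offset))


# Synthetic blob for illustration only: zero header, 256 zeros, 256 halves.
fake_blob = bytes(HEADER_BYTES) + struct.pack(
    "<512f", *([0.0] * 256 + [0.5] * 256))

vector = extract_vector(fake_blob)
print(len(vector), vector[0])
```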

Extracting feature vectors in batches and inserting them into Milvus

Feature vectors are extracted from the face photos in the CelebA dataset and saved to files. If multiple threads are used, the per-thread output files can be merged into a single file with copy /b *.txt res.txt.

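#include "FaceEngine.h"
#include <string>
#include <atomic>
#include <fstream>
#include <iostream>
#include <functional>
#include <cstdio>
#include <cstring>
#include "TP.cpp"
#include <windows.h>

// FaceEngine.h (the author's wrapper) is assumed to pull in the OpenCV headers
using namespace std;
using namespace cv;

atomic_int n = 0;  // number of images processed so far, shared by all worker threads

// Extract features for images start..end and write them to <index>.txt
void task(int start, int end, int index)
{
    ofstream save("D:/Face/feature/" + to_string(index) + ".txt");
    // Make sure the SDK has been activated before calling
    FaceEngine x(ASF_DETECT_MODE_IMAGE, ASF_OP_0_ONLY, 1);
    char file[50] = { 0 };
    for (int i = start; i <= end; i++)
    {
        sprintf(file, "D:/Face/img_celeba/%06d.jpg", i);
        Mat img = imread(file);
        auto faces = x.DetectFace(img);
        if (faces.faceNum == 1)  // keep only images that contain exactly one face
        {
            auto face = x.GetSingleFace(faces, 0);
            auto f = x.GetFaceFeature(img, face);
            float data[256];
            // Skip the 8-byte header and the 256 zero floats, copy the last 256 floats
            memcpy(data, f.feature + 8 + 1024, 1024);
            save << i << "|";
            for (int u = 0; u < 256; u++)
                save << data[u] << "|";
            save << endl;
        }
        n++;
    }
}

int main()
{
    ThreadPool pool(2);
    pool.AddTask(bind(task, 1, 100000, 1));
    pool.AddTask(bind(task, 100001, 202599, 2));
    // Print progress once per second until every image has been processed
    while (n < 202599)
    {
        cout << "\r" << n << "\t" << n * 100 / 202599 << "\t";
        Sleep(1000);
    }
}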

To simplify use of the ArcSoft SDK, I wrote a simple wrapper around it, plus a simple thread pool implementation; both can be downloaded from the link at the end.

The feature vectors are now saved in text files; next, the 200,000 feature vectors are inserted into Milvus. (200,000 is a small data set, but my computer is low-end and extracting the feature vectors of 200,000 faces took nearly 2 hours, so I did not add more. Add more data if you can.) You can also compile the Milvus C++ SDK to extract and insert in one step. Install pymilvus first: pip3 install pymilvus==1.0.1

from milvus import Milvus, IndexType, MetricType, Status
import numpy as np

m = Milvus(host='IP', port='19530')  # replace 'IP' with the address of the Milvus server

# Create the collection
param = {
    'collection_name': 'face',
    'dimension': 256,
    'index_file_size': 256,
    'metric_type': MetricType.IP,  # inner product, matching the scores shown in the query result
}
print(m.create_collection(param))

num = 200000
step = 5000
now = 0

def GetBatch(data):
    """Parse the next `step` lines of data into a list of IDs and a list of vectors."""
    global now
    ids = np.zeros(step, dtype=np.int32)
    vects = np.zeros((step, 256), dtype=np.float32)
    for i in range(step):
        tmp = data[i + now].split("|")
        ids[i] = int(tmp[0])
        for u in range(256):
            vects[i][u] = float(tmp[u + 1])
    now += step
    return ids.tolist(), vects.tolist()

data = open("G:\\feature\\res.txt").readlines()
for i in range(int(num / step)):
    ids, vs = GetBatch(data)
    res = m.insert(collection_name='face', records=vs, ids=ids)
    print(i)

Note that the Milvus Python SDK expects plain lists on insert, so data created with numpy must be converted with tolist().
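
The file format written by the extraction program (one "id|v1|v2|…|" line per face) can also be parsed with the standard library alone, which yields plain lists directly. This is a sketch rather than the author's code: parse_line and batches are hypothetical helpers, and the sample uses 2 values per line instead of 256. Unlike a fixed loop count, the generator also returns a final partial batch when the number of lines is not a multiple of the batch size.

```python
def parse_line(line):
    """Parse one 'id|v1|v2|...|' line into (id, vector)."""
    fields = [f for f in line.strip().split("|") if f]
    return int(fields[0]), [float(f) for f in fields[1:]]


def batches(lines, step):
    """Yield (ids, vectors) lists holding at most `step` rows each."""
    for start in range(0, len(lines), step):
        rows = [parse_line(line)
                for line in lines[start:start + step] if line.strip()]
        if rows:
            ids, vectors = zip(*rows)
            yield list(ids), [list(v) for v in vectors]


sample = ["14|0.1|0.2|\n", "15|0.3|0.4|\n", "16|0.5|0.6|\n"]
for ids, vectors in batches(sample, step=2):
    print(ids, vectors)
```

Each (ids, vectors) pair has the shape that the insert call in the script above takes as ids and records.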

By default, the FLAT index (brute-force search) is used after insertion. Brute-force search is the slowest, but its recall rate is 100%. With a large amount of data, other indexes can be built to speed up retrieval. The index types commonly used for CPU queries, and more besides, are described in the official Milvus documentation.

Create index:

# ivf_param holds the index-building parameters; IVF_FLAT is the index type
ivf_param = {'nlist': 16384}
print(m.create_index('face', IndexType.IVF_FLAT, ivf_param))

Querying
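
The idea behind an IVF index can be shown with a toy example. This is a simplified illustration, not how Milvus implements IVF_FLAT: the centroids are fixed by hand instead of being trained by clustering, and build_ivf and ivf_search are hypothetical helpers. The number of buckets plays the role of nlist, and nprobe is the number of buckets scanned per query.

```python
def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def build_ivf(vectors, centroids):
    """Assign every vector id to the bucket of its closest centroid."""
    buckets = {c: [] for c in range(len(centroids))}
    for vec_id, vec in vectors.items():
        best = max(range(len(centroids)),
                   key=lambda c: inner_product(vec, centroids[c]))
        buckets[best].append(vec_id)
    return buckets


def ivf_search(query, vectors, centroids, buckets, nprobe, top_k):
    """Scan only the nprobe buckets whose centroids are closest to the query."""
    order = sorted(range(len(centroids)),
                   key=lambda c: inner_product(query, centroids[c]),
                   reverse=True)
    candidates = [vec_id for c in order[:nprobe] for vec_id in buckets[c]]
    candidates.sort(key=lambda vec_id: inner_product(query, vectors[vec_id]),
                    reverse=True)
    return candidates[:top_k]


vectors = {1: [1.0, 0.0], 2: [0.8, 0.6], 3: [0.0, 1.0], 4: [-1.0, 0.0]}
centroids = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]   # fixed by hand, not trained
buckets = build_ivf(vectors, centroids)

print(ivf_search([0.6, 0.8], vectors, centroids, buckets, nprobe=1, top_k=2))
print(ivf_search([0.6, 0.8], vectors, centroids, buckets, nprobe=3, top_k=2))
```

With nprobe=1 only one bucket is scanned and the true best match (id 2) is missed; with nprobe equal to the number of buckets the result matches brute-force search. That is the speed versus recall trade-off that nprobe controls.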

# Reuses m and GetBatch from the insertion script; with now = 0 this loads the first batch
data = open("G:\\feature\\res.txt").readlines()
ids, vs = GetBatch(data)
idx = int(input("index:"))
# Print the ID of the face at position idx in this batch
print("id:", ids[idx])  # e.g. 14 corresponds to 000014.jpg
search_param = {'nprobe': 16}
res = m.search(collection_name='face', query_records=[vs[idx]], top_k=3, params=search_param)
print(res)

Querying with a face from the first batch gives this result:

index:12
id: 14
(Status(code=0, message='Search vectors successfully!'), [[(id:14, distance:1.0000004768371582), (id:39306, distance:0.8084499835968018), (id:109420, distance:0.776871919631958)]])

The three most similar IDs found by Milvus are 14, 39306 and 109420 (the ID is the file name number, so 14 is 000014.jpg). ID 14 is the query image itself, so its inner product is 1; the inner products for 39306 and 109420 are 0.8084 and 0.7769. The pictures corresponding to these three IDs are:

It can be seen that the top 3 faces do belong to the same person. A threshold on the computed similarity can be used to decide whether two faces are the same person. A threshold of about 0.55 to 0.6 works as a starting point; test on your own data to find a more suitable value if necessary.
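
Applying such a threshold to a search result takes only a few lines. In this sketch, same_person_matches is a hypothetical helper, the (id, score) pairs are copied by hand from the result shown above, and 0.6 is simply the upper end of the suggested range.

```python
SAME_PERSON_THRESHOLD = 0.6   # upper end of the 0.55 to 0.6 range suggested above


def same_person_matches(hits, threshold=SAME_PERSON_THRESHOLD):
    """hits: list of (face_id, inner_product) pairs from a search result."""
    return [face_id for face_id, score in hits if score >= threshold]


hits = [(14, 1.0000004768371582),
        (39306, 0.8084499835968018),
        (109420, 0.776871919631958)]
print(same_person_matches(hits))
```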

VII. Performance

Using the ASFFaceFeatureCompare interface of the ArcSoft SDK, searching 200,000 faces in a single thread takes 156 ms. Milvus (running in a virtual machine) takes 168 ms to search 200,000 faces with the default FLAT index, and 70 ms with an IVF_FLAT index and nprobe set to 16.

In high-concurrency scenarios, the GPU version of Milvus can greatly reduce search time, and the search parameters can be tuned to reach the desired recall rate. For low concurrency and small data volumes, however, the ASFFaceFeatureCompare interface is recommended.

VIII. Supplement

Milvus offers two commonly used similarity measures:

  • Euclidean distance (L2)

  • Inner product (IP)

When the vectors are normalized, the two measures are equivalent: they rank results in the same order. The face features extracted by the ArcSoft SDK are normalized, so either measure can be used.
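
The equivalence follows from the identity ||a - b||^2 = 2 - 2(a · b), which holds whenever a and b both have length 1: a larger inner product always means a smaller Euclidean distance. The check below is a small numeric illustration with made-up two-dimensional vectors, not data from the SDK.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


query = normalize([3.0, 4.0])
library = {
    "a": normalize([1.0, 0.0]),
    "b": normalize([1.0, 1.0]),
    "c": normalize([0.0, 1.0]),
}

for vec in library.values():
    # For unit vectors: ||q - v||^2 == 2 - 2 * <q, v>
    assert math.isclose(squared_l2(query, vec),
                        2 - 2 * inner_product(query, vec), abs_tol=1e-12)

# Ranking by descending inner product equals ranking by ascending distance.
by_ip = sorted(library, key=lambda k: inner_product(query, library[k]), reverse=True)
by_l2 = sorted(library, key=lambda k: squared_l2(query, library[k]))
print(by_ip, by_l2)
```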

All the code has been uploaded to GitHub: github.com/Memory2414/…

If you do not want to configure the OpenCV environment yourself, you can also download a preconfigured ArcSoft SDK and OpenCV environment (VS2013) here; extraction code: ATKW.

To learn more about face recognition products, visit the ArcSoft Visual Open Platform.