Extra! Extra! Face recognition can finally run in the browser! This article introduces “face-api.js”, a JavaScript module built on top of the “tensorflow.js” core. It implements three convolutional neural network (CNN) architectures, for face detection, face recognition, and facial landmark detection.

  • Face-api.js:github.com/justadudewh…
  • TensorFlow.js:github.com/tensorflow/…

As usual, we’ll look at a simple code example that will allow you to get started with the API in just a few lines of code. Let’s get started!

We already have “face-recognition.js”, now another package of the same kind?

If you’ve read “Node.js + face-recognition.js: Simple and Robust Face Recognition using Deep Learning”, another article about face recognition in Node.js, you’ll know that the author previously put together a similar package, “face-recognition.js”, which brought face recognition to Node.js.

Initially, the author did not anticipate how much demand there would be for face recognition packages in the JavaScript community. To many, face-recognition.js seemed like a good, free, open-source alternative to the paid face recognition services offered by companies like Microsoft or Amazon. But the author was asked many times: is it possible to run the complete face recognition pipeline in the browser?

Thanks to “tensorflow.js”, this vision has finally become reality! The author managed to implement a similar tool on top of the “tensorflow.js” core that gets almost the same results as “face-recognition.js”, but runs in the browser! Best of all, the tool doesn’t require any external dependencies, which makes it very easy to use. It can also be GPU-accelerated, with the relevant operations running on WebGL.

This is enough to convince me that the JavaScript community needs such a package written for the browser environment! Imagine what kind of applications you could build with it.

 

How to use deep learning to solve face recognition problems

If you want to get hands-on as quickly as possible, you can skip this section and jump straight to the code walkthrough. But to better understand the approach used to implement face recognition in face-api.js, I strongly encourage you to read on, since I get asked about it all the time.

Put simply, what we actually want is to identify a person given an image of their face. To do this, we provide one (or more) images of each person’s face, tagged with the person’s name, as reference data. We then compare an input image against this reference data and find the reference image most similar to the input. If the two images are similar enough, we output the person’s name; otherwise, we output “unknown”.

That does sound like a good idea! There are, however, two problems with the scheme. First, what if we had an image that showed multiple people, and we needed to identify all of them? Second, we need to build a similarity measure to compare two face images.
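To make this scheme concrete, here is a minimal sketch of the matching loop in plain JavaScript. Everything here is illustrative: the similarity function is passed in as a parameter precisely because we have not defined one yet, which is exactly the second problem.

// Minimal sketch of the naive recognition scheme described above.
// `similarity` is a parameter because we have no real measure yet.
function identify(inputFace, referenceData, similarity, threshold) {
  let best = { name: 'unknown', score: -Infinity }
  for (const { name, face } of referenceData) {
    // Compare the input against every labeled reference face.
    const score = similarity(inputFace, face)
    if (score > best.score) best = { name, score }
  }
  // Only accept the best match if it is similar enough.
  return best.score >= threshold ? best.name : 'unknown'
}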

 

Face detection

The answer to the first problem is face detection. In short, we first locate all the faces in the input image. For face detection, “face-api.js” implements an SSD (Single Shot Multibox Detector), which is essentially a CNN based on MobileNetV1 with some face bounding box prediction layers stacked on top of the network.

The network returns the bounding box of each face, together with a score for each box: the probability that the box contains a face. These scores are used to filter the bounding boxes, since an image may not contain any faces at all. Note that face detection should be performed even when the image contains only one person, in order to retrieve the bounding box.
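As a rough illustration of that filtering step (the detection shape below is an assumption made for the sketch, not the library’s actual return type):

// Illustrative only: a hand-made list of detections, each with a
// bounding box and a score (the probability that the box is a face).
const detections = [
  { box: { x: 10, y: 20, width: 120, height: 120 }, score: 0.98 },
  { box: { x: 300, y: 40, width: 110, height: 115 }, score: 0.12 }
]

// Discard boxes the network is not confident about.
const minConfidence = 0.8
const faces = detections.filter(d => d.score > minConfidence)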

 

Face feature point detection and face alignment

In the previous section, we solved the first problem! However, I want to point out that before we extract a face-centered image from each bounding box and pass it to the face recognition network, we need to align the bounding boxes, because this makes face recognition much more accurate!

To achieve this, “face-api.js” implements a simple CNN that returns the 68 facial landmark points of a given face image:

From the landmark positions, the bounding box can be centered on the face. Below you can compare the face detection result (left) with the aligned face image (right):
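If you are curious what “centering” means geometrically, here is an illustrative sketch (not face-api.js internals) that re-centers a bounding box on the centroid of the landmark points:

// Illustrative geometry only: compute the centroid of the landmark
// points and shift the box so the face sits in its center.
function centerBoxOnLandmarks(box, points) {
  const cx = points.reduce((sum, p) => sum + p.x, 0) / points.length
  const cy = points.reduce((sum, p) => sum + p.y, 0) / points.length
  return {
    x: cx - box.width / 2,
    y: cy - box.height / 2,
    width: box.width,
    height: box.height
  }
}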

 

Face recognition

We can now feed the extracted and aligned face images into the face recognition network, which is based on a ResNet-34-like architecture, essentially matching the one used by dlib (github.com/davisking/d…). The network has been trained to learn a mapping from face features to a face descriptor (a feature vector with 128 values), a process commonly referred to as face embedding.

Now back to our original problem of comparing two face images: we take the face descriptor of each extracted face image and compare it with the face descriptors of the reference data. More precisely, we compute the Euclidean distance between two face descriptors and decide whether the two faces are similar based on a threshold (0.6 is a good threshold for 150×150 face images). Euclidean distance works surprisingly well, but of course you can use any kind of classifier. The GIF below visualizes comparing two face images by Euclidean distance:
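If you’d like to see the math, the Euclidean distance between two 128-value descriptors is straightforward to compute by hand; the sketch below mirrors what the library’s faceapi.euclideanDistance helper computes:

// Euclidean distance between two descriptors (arrays of 128 numbers):
// the square root of the sum of squared component differences.
function euclideanDistance(desc1, desc2) {
  let sum = 0
  for (let i = 0; i < desc1.length; i++) {
    const diff = desc1[i] - desc2[i]
    sum += diff * diff
  }
  return Math.sqrt(sum)
}

// Same person if the descriptors are close enough in embedding space.
const isSamePerson = (d1, d2) => euclideanDistance(d1, d2) < 0.6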

Now that we have a basic understanding of how face recognition works, let’s start writing a code example.

 

It’s time to start programming!

In this short example, we will see step by step how to run face recognition to identify multiple people in an input image like the one below:

 

Import the script

First, grab the latest build from dist/face-api.js (github.com/justadudewh…), or the minified version from dist/face-api.min.js, and import the script:

<script src="face-api.js"></script>

If you use the npm package manager, you can run the following command:

npm i face-api.js

 

Loading model data

Depending on your application’s requirements, you can load only the specific models you need, but to run a complete end-to-end example we need to load the face detection, face landmark detection, and face recognition models. The model files are available in the code repository: github.com/justadudewh… .

The model weights have been quantized, reducing the file size by 75% compared to the original models, so your client only loads the minimum data it needs. In addition, the weights are split into chunks of at most 4 MB, which allows the browser to cache the files so they only have to be loaded once.

The model files can simply be served as static assets in your web application, or you can host them elsewhere and load them by specifying a path or a URL to the files. Suppose you store them, along with your other assets, under public/models:

const MODEL_URL = '/models'

await faceapi.loadModels(MODEL_URL)

Or, if you just want to load specific models:

const MODEL_URL = '/models'

await faceapi.loadFaceDetectionModel(MODEL_URL)
await faceapi.loadFaceLandmarkModel(MODEL_URL)
await faceapi.loadFaceRecognitionModel(MODEL_URL)

 

Getting a full description of all faces from an input image

The neural networks accept an HTML image, canvas, or video element, or a tensor, as input. To detect all faces in the input image whose bounding box score is greater than minConfidence, we can use this simple one-liner:

const minConfidence = 0.8

const fullFaceDescriptions = await faceapi.allFaces(input, minConfidence)

A full face description contains the detection result (bounding box + score), the facial landmarks, and the computed descriptor. As you can see, “faceapi.allFaces” does all the work discussed in the previous sections under the hood. However, you can also obtain the face locations and landmarks manually. If that’s your goal, there are several examples in the GitHub repo.
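As a rough sketch, the manual pipeline might look something like the following. Note that these function names are my assumption based on early versions of the library and may not match the current API; check the repo’s examples for the real functions:

// Assumed API, may differ from the actual library:
// locate the faces, then the landmarks, then compute the descriptor.
const detections = await faceapi.locateFaces(input, minConfidence)
const landmarks = await faceapi.detectLandmarks(faceImage)
const descriptor = await faceapi.computeFaceDescriptor(alignedFaceImage)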

Note that the positions of the bounding boxes and landmarks are relative to the original image/media size. If the displayed image size does not match the original size, you can simply resize them:

const resized = fullFaceDescriptions.map(fd => fd.forSize(width, height))

We can visualize the detection results by drawing the bounding boxes onto a canvas:

fullFaceDescriptions.forEach((fd, i) => {
  faceapi.drawDetection(canvas, fd.detection, { withScore: true })
})

 

The facial landmarks can be displayed as follows:

fullFaceDescriptions.forEach((fd, i) => {
  faceapi.drawLandmarks(canvas, fd.landmarks, { drawLines: true })
})

Typically, I overlay an absolutely positioned canvas with the same width and height on top of the img element (see the examples on GitHub for more information).
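A minimal sketch of that overlay setup might look like this (the element IDs and layout here are my own assumptions, not from the library):

// Assume <img id="inputImg"> and <canvas id="overlay"> are in the page.
const img = document.getElementById('inputImg')
const canvas = document.getElementById('overlay')

// Match the canvas size to the displayed image and stack it on top.
canvas.width = img.width
canvas.height = img.height
canvas.style.position = 'absolute'
canvas.style.left = img.offsetLeft + 'px'
canvas.style.top = img.offsetTop + 'px'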

 

Face recognition

Now that we know how to retrieve the positions and descriptors of all the faces in a given image, we’ll take some images that each show a single person and compute their face descriptors. These descriptors will serve as our reference data.

Assuming we have some sample images available, we first fetch the images from a URL and then create HTML image elements from their data buffers using “faceapi.bufferToImage”:

// fetch images from url as blobs
const blobs = await Promise.all(
  ['sheldon.png', 'raj.png', 'leonard.png', 'howard.png'].map(
    async uri => (await fetch(uri)).blob()
  )
)

// convert blobs (buffers) to HTMLImage elements
const images = await Promise.all(
  blobs.map(blob => faceapi.bufferToImage(blob))
)

Next, for each image, we locate the subject’s face and compute the face descriptor, just as we did earlier with the input image:

const refDescriptions = await Promise.all(images.map(
  async img => (await faceapi.allFaces(img))[0]
))

const refDescriptors = refDescriptions.map(fd => fd.descriptor)

Now all that remains is to iterate over the face descriptors of our input image and, for each one, find the reference descriptor with the smallest distance:

const sortAsc = (a, b) => a.distance - b.distance

const labels = ['sheldon', 'raj', 'leonard', 'howard']

const results = fullFaceDescriptions.map(fd => {
  const bestMatch = refDescriptors.map((refDesc, i) => ({
    label: labels[i],
    distance: faceapi.euclideanDistance(fd.descriptor, refDesc)
  })).sort(sortAsc)[0]

  return {
    detection: fd.detection,
    label: bestMatch.label,
    distance: bestMatch.distance
  }
})

As mentioned earlier, we use the Euclidean distance as a similarity measure here, which turns out to work very well. We end up with the best match for every face detected in the input image.

Finally, we can draw the bounding boxes together with their labels onto a canvas to display the results:

// 0.6 is a good distance threshold value to judge
// whether the descriptors match or not
const maxDistance = 0.6

results.forEach(result => {
  faceapi.drawDetection(canvas, result.detection, { withScore: false })

  const text = `${result.distance < maxDistance ? result.label : 'unknown'} (${result.distance})`
  const { x, y, height: boxHeight } = result.detection.getBox()
  faceapi.drawText(canvas.getContext('2d'), x, y + boxHeight, text)
})

At this point, I hope you have a good first impression of how to use this API. I also encourage you to look at the other examples in the repository linked in this article. Have fun with the package!

Author: Heart of Machine. Link: juejin.im/post/5b4d80… . The copyright belongs to the author. For commercial reprints, please contact the author for authorization; for non-commercial reprints, please credit the source.