As web browsers become more powerful, so does the complexity of websites and web applications. Operations that required supercomputers a few decades ago now run on smartphones, and face detection is one of them.

The ability to detect and analyze faces is very useful because it allows us to add intelligent features: automatically blurring faces (like Google Maps), panning and zooming camera feeds to focus on people (like Microsoft Teams), verifying passports, applying silly filters (like Instagram and Snapchat), and more. But first, we need to find the face!

Face-api.js is a library that lets developers use face detection in their applications without requiring a machine learning background.

The code for this tutorial is available on GitHub: github.com/sitepoint-e…

Face detection with machine learning

Detecting objects such as faces is quite complex. Think about it: we could perhaps write a program that scans pixels looking for eyes, a nose, and a mouth. It can be done, but making it fully reliable is practically impossible because there are so many factors to account for: lighting conditions, facial hair, the huge variety of shapes and colors, makeup, angles, face masks, and much more.

Neural networks, however, excel at such problems and can generalize to most, if not all, of these conditions. We can use the popular JavaScript machine learning library TensorFlow.js to create, train, and use neural networks in a browser. Yet even with an off-the-shelf, pre-trained model, we would still need a decent understanding of how to feed data to TensorFlow and how to interpret its output.

This is where face-api.js comes in: it wraps all of this in an intuitive API. We can pass an img, canvas, or video DOM element, and the library returns one result or a set of results. Face-api.js can detect faces, but it can also estimate various things about them, as listed below (a short sketch combining these calls follows the list).

  • Face detection: getting the boundaries of one or more faces. This is useful for determining where a face is in an image and how big it is.
  • Face landmark detection: getting the position and shape of the eyebrows, eyes, nose, mouth and lips, and chin. This can be used to determine the orientation of a face, or to project shapes onto specific spots, such as a moustache between the nose and lips.
  • Face recognition: determining who is in the picture.
  • Face expression detection: getting the expression from a person's face.
  • Age and gender detection: getting the age and gender from a face. Note that for "gender" it classifies a face as feminine or masculine, which doesn't necessarily reveal the person's gender.
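
To give a feel for the API, here's a minimal sketch (not from the original article) that chains several of these detections in one call. It assumes the corresponding model files have been copied into a /models folder on your server, and that image is any img, canvas, or video DOM element:

// Load every model that the chained calls below rely on ('/models' is an assumed path).
await Promise.all([
  faceapi.nets.ssdMobilenetv1.loadFromUri('/models'),
  faceapi.nets.faceLandmark68Net.loadFromUri('/models'),
  faceapi.nets.faceExpressionNet.loadFromUri('/models'),
  faceapi.nets.ageGenderNet.loadFromUri('/models'),
]);

// Detect all faces and enrich each result with landmarks, expressions, age, and gender.
// `image` is any img, canvas, or video DOM element.
const results = await faceapi
  .detectAllFaces(image)
  .withFaceLandmarks()
  .withFaceExpressions()
  .withAgeAndGender();

// Each result then exposes detection.box, landmarks, expressions, age, and gender.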

Before you use these things outside of your experiments, be aware that AI is good at amplifying biases. Gender classification works well for cisgender people, but it fails to detect the gender of my non-binary friends. It identifies white people most of the time, but often fails to detect people of color.

Be very thoughtful when using this technology and test it thoroughly with different test groups.

Installation

We can install face-api.js via NPM:

npm install face-api.js
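
With the npm package and a bundler, the library is typically pulled in with a standard ES module import (assuming a bundler setup, not shown in the original article):

import * as faceapi from 'face-api.js';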

However, to skip setting up build tools, I'll include the UMD bundle via unpkg.org:

/* globals faceapi */
import 'https://unpkg.com/face-api.js/dist/face-api.min.js';

After that, we need to download the correct pre-trained models from the library's repository.

Determine what you want to know about faces and use the Available Models section of the documentation to work out which models you need. Some features can use multiple models. In that case, we have to trade off bandwidth and performance against accuracy. Compare the file sizes of the available models and pick whichever best fits your project.

Not sure which models you need for your use case? You can return to this step later: if we use the API without loading a required model, the library throws an error stating which model it expects.
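
For example, if we settle on the two "tiny" models, a loading sketch could look like the following (an illustration only, assuming the downloaded weight files were copied into a /models folder):

// Load only the models the chosen features need; '/models' is an assumed path.
await Promise.all([
  faceapi.nets.tinyFaceDetector.loadFromUri('/models'),
  faceapi.nets.faceLandmark68TinyNet.loadFromUri('/models'),
]);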

We are now ready to use the face-api.js API.

The examples

Let’s build something!

For the following examples, I'll use this function to load a random image from Unsplash Source:

function loadRandomImage() {
  const image = new Image();

  image.crossOrigin = true;

  return new Promise((resolve, reject) => {
    image.addEventListener('error', (error) => reject(error));
    image.addEventListener('load', () => resolve(image));
    image.src = 'https://source.unsplash.com/512x512/?face,friends';
  });
}

Cropping an image

You can find the code for this demo in the accompanying GitHub repo: github.com/sitepoint-e…

First, we select and load the model. To crop an image, we only need to know the bounding box of a face, so face detection is enough. Two models can do this: the SSD MobileNet V1 model (just under 6 MB) and the Tiny Face Detector model (under 200 KB). Let's say accuracy is less important here, because users can still crop manually, and let's assume visitors use this feature on a slow network connection. Since the focus is on bandwidth and performance, we'll choose the smaller Tiny Face Detector model.

After downloading the model, we can load it:

await faceapi.nets.tinyFaceDetector.loadFromUri('/models');

We can now load the image and pass it to face-api.js. faceapi.detectAllFaces uses the SSD MobileNet V1 model by default, so we have to explicitly pass new faceapi.TinyFaceDetectorOptions() to force it to use the Tiny Face Detector model.

const image = await loadRandomImage();
const faces = await faceapi.detectAllFaces(image, new faceapi.TinyFaceDetectorOptions());

The faces variable now contains an array of results. Each result has a box and a score property. The score indicates how confident the neural network is that the result is indeed a face. The box object contains the coordinates of the face. We could select the first result (or use faceapi.detectSingleFace()), but if a user submits a group photo, we want to see all of the people in the cropped image. To do that, we can compute a custom bounding box:

const box = {
  // Set the boundaries to their inverse infinity, so any number is larger/smaller
  bottom: -Infinity,
  left: Infinity,
  right: -Infinity,
  top: Infinity,

  // Given the boundaries, we can compute the width and height
  get height() {
    return this.bottom - this.top;
  },

  get width() {
    return this.right - this.left;
  },
};

// Update the box boundaries with each detected face
for (const face of faces) {
  box.bottom = Math.max(box.bottom, face.box.bottom);
  box.left = Math.min(box.left, face.box.left);
  box.right = Math.max(box.right, face.box.right);
  box.top = Math.min(box.top, face.box.top);
}

Finally, we can create a canvas and display the results:

const canvas = document.createElement('canvas');
const context = canvas.getContext('2d');

canvas.height = box.height;
canvas.width = box.width;

context.drawImage(
  image,
  box.left,
  box.top,
  box.width,
  box.height,
  0,
  0,
  canvas.width,
  canvas.height
);
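
To actually see or export the cropped result (a step not shown in the article), the canvas can be attached to the page or converted to a data URL, for example:

// Assumed usage: display the crop, or turn it into a data URL for download/upload.
document.body.append(canvas);
const croppedDataUrl = canvas.toDataURL('image/png');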

Placing emojis

You can find the code for this demo in the accompanying GitHub repo: github.com/sitepoint-e…

Why not have some fun? We can make a filter that places a mouth emoji (👄) over every eye. To find the eye landmarks, we need another model. This time we do care about accuracy, so we'll use the SSD MobileNet V1 model and the 68-point facial landmark detection model.

Again, we need to load the models and the image first:

await faceapi.nets.faceLandmark68Net.loadFromUri('/models');
await faceapi.nets.ssdMobilenetv1.loadFromUri('/models');

const image = await loadRandomImage();

To get the landmark data, we have to chain the withFaceLandmarks() call onto detectAllFaces():

const faces = await faceapi
  .detectAllFaces(image)
  .withFaceLandmarks();

Like last time, faces contains a list of results. In addition to the position of the face, each result also contains a raw list of landmark points. To get the right points for each facial feature, we need to slice up this list. Because the number of points is fixed, I chose to hard-code the indices:

for (const face of faces) {
  const features = {
    jaw: face.landmarks.positions.slice(0, 17),
    eyebrowLeft: face.landmarks.positions.slice(17, 22),
    eyebrowRight: face.landmarks.positions.slice(22, 27),
    noseBridge: face.landmarks.positions.slice(27, 31),
    nose: face.landmarks.positions.slice(31, 36),
    eyeLeft: face.landmarks.positions.slice(36, 42),
    eyeRight: face.landmarks.positions.slice(42, 48),
    lipOuter: face.landmarks.positions.slice(48, 60),
    lipInner: face.landmarks.positions.slice(60),
  };

  // ...
}

Now we can finally have some fun. There are many options, but let’s cover our eyes with a mouth emoji (👄).

First, we have to decide where to place the emoji and how big it should be. To do that, let's write a helper function that creates a box from an arbitrary set of points, with all the information we need:

function getBoxFromPoints(points) {
  const box = {
    bottom: -Infinity,
    left: Infinity,
    right: -Infinity,
    top: Infinity,

    get center() {
      return {
        x: this.left + this.width / 2,
        y: this.top + this.height / 2,
      };
    },

    get height() {
      return this.bottom - this.top;
    },

    get width() {
      return this.right - this.left;
    },
  };

  for (const point of points) {
    box.left = Math.min(box.left, point.x);
    box.right = Math.max(box.right, point.x);

    box.bottom = Math.max(box.bottom, point.y);
    box.top = Math.min(box.top, point.y);
  }

  return box;
}

Now we can start drawing emojis on the picture. Since we have to do this for both eyes, we can put features.eyeLeft and features.eyeRight in an array and iterate over it, running the same code for each eye. All that's left to do is draw the emoji on a canvas!
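
Note that the loop below runs inside the per-face loop from the previous snippet (so features is in scope), and it assumes a canvas with the original photo already drawn onto it. A minimal setup sketch for that canvas (not shown in the original code), created before the loops:

// Assumed setup: a canvas the size of the photo, with the photo drawn onto it.
const canvas = document.createElement('canvas');
const context = canvas.getContext('2d');

canvas.width = image.width;
canvas.height = image.height;

context.drawImage(image, 0, 0);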

for (const eye of [features.eyeLeft, features.eyeRight]) {
  const eyeBox = getBoxFromPoints(eye);
  const fontSize = 6 * eyeBox.height;

  context.font = `${fontSize}px/${fontSize}px serif`;
  context.textAlign = 'center';
  context.textBaseline = 'bottom';

  context.fillStyle = '#000';
  context.fillText('👄', eyeBox.center.x, eyeBox.center.y + 0.6 * fontSize);
}

Note that I use some magic numbers to tweak the font size and the exact text position. Since emojis are Unicode and typography on the web is weird (at least to me), I just adjusted the numbers until they looked about right. A more robust alternative is to use an image as an overlay.
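
If you'd rather use an image overlay, a rough sketch could look like this, where overlay is a hypothetical, pre-loaded HTMLImageElement (for example a lips PNG) and the scale factor is arbitrary:

// Assumed alternative: draw a pre-loaded PNG over the eye instead of emoji text.
const scale = 3; // hypothetical scale factor relative to the eye box
context.drawImage(
  overlay,
  eyeBox.center.x - (eyeBox.width * scale) / 2,
  eyeBox.center.y - (eyeBox.height * scale) / 2,
  eyeBox.width * scale,
  eyeBox.height * scale
);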

Conclusion

Face-api.js is a great library that makes face detection and recognition very easy. No knowledge of machine learning or neural networks is required. I love tools that are easy to use, and this is definitely one of them.

In my experience, face detection and recognition on the web comes with a performance cost. We have to choose between bandwidth and performance on one side and accuracy on the other. The smaller models are certainly less accurate and will miss faces under some of the conditions I mentioned earlier, such as poor lighting or a face covered by a mask.

Microsoft Azure, Google Cloud, and probably other vendors offer face detection in the cloud. Because we avoid downloading large models, cloud-based detection avoids heavy page loads, tends to be more accurate since the models are improved frequently, and may even be faster thanks to optimized hardware. If you need high accuracy, you may want to look into a plan you're comfortable with.

I definitely recommend face-api.js for side projects, experiments, and maybe even an MVP.


Translated from www.sitepoint.com/face-api-js… by Tim Severien