The Shape Detection API has been available for some time. Its main purpose is to give the front end a directly usable interface for shape detection, covering barcode, face, and text detection. This article gives a brief introduction and a general walkthrough of front-end face detection. (It does not cover the underlying algorithms.)



1 Background and scenarios

Face detection is an old topic and is widely used in many industries, such as finance, security, e-commerce, smartphones, and entertainment/photo applications. The technologies involved keep evolving; here are a few of the main approaches:

A. Feature-based face detection

For example, OpenCV has a built-in Haar classifier based on the Viola-Jones object detection framework (in fact, most classifiers are based on learning). You just need to load the corresponding configuration file (haarcascade_frontalface_alt.xml) and call detectObject to complete the detection; other features (such as nose, mouth, etc.) are supported in the same way. See the sketch after this list.

B. Learning-based face detection, which likewise extracts local features from the image with an operator and then uses classification, statistics, regression and similar methods to arrive at a more accurate and faster classifier.
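
To illustrate approach A above, here is a minimal sketch of the Haar-cascade flow using the node-opencv binding that also appears later in this article; the image path and the cascade file location are placeholders, not values from the original post:

var cv = require('opencv');

// Load an image and run the Haar cascade over it.
// Each detection is a plain rectangle: { x, y, width, height }.
cv.readImage('./photo.jpg', function (err, im) {
  if (err) throw err;
  im.detectObject('./haarcascade_frontalface_alt.xml', {}, function (error, faces) {
    if (error) throw error;
    console.log('Detected ' + faces.length + ' face(s)', faces);
  });
});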


2 A look at existing approaches

2.1 Back-end processing

The front end transmits resources to the back end over the network, and the back end processes the images or video streams to be detected in a unified way. This places certain demands on the back-end architecture, and network latency usually makes real-time interaction for users impossible.

2.2 Client Processing

Thanks to OpenCV's cross-language and cross-platform advantages, a native client can also provide face detection at low development cost and expose it to the Web container via JsBridge or similar means. However, once outside that container, standalone pages lose this capability. Until one day…

2.3 Open Service

At some point, cloud computing took off and computing got cheaper. Major vendors (such as Aliyun and Face++) rolled out face detection services one after another, and even bundled a whole range of value-added services: face recognition, liveness detection, document OCR, face comparison, and so on.



Although client-side SDKs and front-end/back-end APIs already exist, I would still like to talk about a pure front-end solution.


3 What does this era bring

Front-end face recognition is admittedly still in its slash-and-burn early days, but the infrastructure is taking shape, and I hope the following introduction can offer some inspiration.

3.1 Shape Detection API

As the computing power of client hardware gradually improves, browsers are being granted more and more capabilities. Image processing demands a lot of computing resources, and the browser can in fact take on some of the image detection work itself; hence the Shape Detection API.


The following simple examples show the basic usage. Before trying to run the code, make sure the experimental feature is enabled in your version of Chrome, and note that the API is subject to the same-origin policy:

chrome://flags/#enable-experimental-web-platform-features
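
A quick capability check before constructing a detector avoids hard failures on browsers where the flag is off. A minimal sketch; the three constructor names are the ones used in the examples below:

['BarcodeDetector', 'FaceDetector', 'TextDetector'].forEach(function (name) {
  if (!(name in window)) {
    console.warn(name + ' is not available here; enable the experimental flag first.');
  }
});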


Barcode Detection (Chrome 56+)

var barcodeDetector = new BarcodeDetector();
barcodeDetector.detect(image)
  .then(barcodes => {
    barcodes.forEach(barcode => console.log(barcode.rawValue))
  })
  .catch(err => console.error(err));


Face Detection (Chrome 56+)

var faceDetector = new FaceDetector();
faceDetector.detect(image)
  .then(faces => faces.forEach(face => console.log(face)))
  .catch(err => console.error(err));


Text Detection (Chrome 58+)

var textDetector = new TextDetector();
textDetector.detect(image)
  .then(boundingBoxes => {
    for (let box of boundingBoxes) {
      speechSynthesis.speak(new SpeechSynthesisUtterance(box.rawValue));
    }
  })
  .catch(err => console.error(err));

3.2 Face detection in images

Face detection in an image is relatively simple: pass an image element to the API to run recognition, then draw onto a canvas to display the detection results.


Core code:

var image = document.querySelector('#image');
var canvas = document.querySelector('#canvas');
var ctx = canvas.getContext("2d");
var scale = 1;

image.onload = function () {
  ctx.drawImage(image, 0, 0, image.width, image.height, 0, 0, canvas.width, canvas.height);
  scale = canvas.width / image.width;
};

function detect() {
  if (window.FaceDetector == undefined) {
    console.error('Face Detection not supported');
    return;
  }
  var faceDetector = new FaceDetector();
  console.time('detect');
  return faceDetector.detect(image)
    .then(faces => {
      console.log(faces);
      // Draw the detected faces on the <canvas>.
      var ctx = canvas.getContext("2d");
      ctx.lineWidth = 2;
      ctx.strokeStyle = "red";
      for (var i = 0; i < faces.length; i++) {
        var item = faces[i].boundingBox;
        ctx.rect(Math.floor(item.x * scale), Math.floor(item.y * scale), Math.floor(item.width * scale), Math.floor(item.height * scale));
        ctx.stroke();
      }
      console.timeEnd('detect');
    })
    .catch((e) => console.error("Boo, Face Detection failed: " + e));
}

Result:


3.3 Face detection in videos

Face detection in video is not much different from images. getUserMedia opens the camera to obtain the video/microphone stream; detecting and rendering each video frame gives face detection in video.

The core code is as follows:

navigator.mediaDevices.getUserMedia({
  video: true,    // audio: true
})
  .then(function (mediaStream) {
    video.src = window.URL.createObjectURL(mediaStream);
    video.onloadedmetadata = function (e) {
      // Do something with the video here.
    };
  })
  .catch(function (error) {
    console.log(error.name);
  });

setInterval(function () {
  ctx.clearRect(0, 0, canvas.width, canvas.height);
  ctx.drawImage(video, 0, 0);
  image.src = canvas.toDataURL('image/png');
  image.onload = function () { detect(); };
}, 60);
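
As a side note, newer Chrome versions prefer attaching the stream via video.srcObject instead of createObjectURL, and the detector can in principle be fed the video element directly instead of going through a data URL. A minimal alternative sketch under those assumptions, reusing the video, canvas and ctx variables from above:

navigator.mediaDevices.getUserMedia({ video: true })
  .then(function (mediaStream) {
    video.srcObject = mediaStream;      // replaces the deprecated createObjectURL(stream)
    return video.play();
  })
  .then(function () {
    var faceDetector = new FaceDetector();
    (function loop() {
      faceDetector.detect(video)        // assumes the detector accepts a video element as source
        .then(function (faces) {
          ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
          faces.forEach(function (face) {
            var box = face.boundingBox;
            ctx.strokeRect(box.x, box.y, box.width, box.height);
          });
        })
        .catch(function (e) { console.error(e); })
        .then(function () { requestAnimationFrame(loop); });
    })();
  })
  .catch(function (error) { console.log(error.name); });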

Result:


3.4 Back when there were no APIs

In fact, many solutions have existed for a long, long time. Due to hardware limitations, the lack of hardware acceleration and other constraints, they were never widely put into production.

a. tracking.js

tracking.js is a JavaScript image-processing library that brings a rich set of computer-vision algorithms and techniques to the browser. It can perform colour tracking, face detection and more; a usage sketch follows after this list.

b. jquery.facedetection

jquery.facedetection is a jQuery/Zepto face detection plug-in based on the cross-platform CCV image classifier and detector.
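
For a taste of the tracking.js API mentioned above, here is a minimal face detection sketch based on the library's documented usage; it assumes tracking-min.js and the face classifier data file face-min.js are already loaded on the page, and that #video is a <video> element:

// { camera: true } asks tracking.js to open the webcam and feed frames to the tracker.
var tracker = new tracking.ObjectTracker('face');
tracker.setInitialScale(4);
tracker.setStepSize(2);
tracker.setEdgesDensity(0.1);

tracking.track('#video', tracker, { camera: true });

tracker.on('track', function (event) {
  event.data.forEach(function (rect) {
    // rect.x, rect.y, rect.width, rect.height describe one detected face.
    console.log(rect);
  });
});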


3.5 Node.js & OpenCV

The node-opencv module has been around for a few years. It works well with OpenCV v2.4.x, although it is not yet fully compatible with v3.x and offers a limited set of APIs. The arrival of N-API may bring more surprises.

Imagine an Electron or node-webkit container: could we achieve real-time face detection by running a local WebSocket service? A sketch of the idea in code follows.


Back-end processing logic

import cv from 'opencv';

const detectConfigFile = './node_modules/opencv/data/haarcascade_frontalface_alt2.xml';

// camera properties
const camWidth = 320;
const camHeight = 240;
const camFps = 10;
const camInterval = 1000 / camFps;

// face detection properties
const rectColor = [0, 255, 0];
const rectThickness = 2;

// initialize camera
const camera = new cv.VideoCapture(0);
camera.setWidth(camWidth);
camera.setHeight(camHeight);

const frameHandler = (err, im) => {
  return new Promise((resolve, reject) => {
    if (err) {
      return reject(err);
    }
    im.detectObject(detectConfigFile, {}, (error, faces) => {
      if (error) {
        return reject(error);
      }
      let face;
      for (let i = 0; i < faces.length; i++) {
        face = faces[i];
        im.rectangle([face.x, face.y], [face.width, face.height], rectColor, rectThickness);
      }
      return resolve(im);
    });
  });
};

module.exports = function (socket) {
  const frameSocketHandler = (err, im) => {
    return frameHandler(err, im)
      .then((img) => {
        socket.emit('frame', {
          buffer: img.toBuffer(),
        });
      });
  };
  const handler = () => {
    camera.read(frameSocketHandler);
  };
  setInterval(handler, camInterval);
};


Front-end receiving code

socket.on('frame', function (data) {

  var uint8Arr = new Uint8Array(data.buffer);
  var str = String.fromCharCode.apply(null, uint8Arr);
  var base64String = btoa(str);

  img.onload = function () {
    ctx.drawImage(this, 0, 0, canvas.width, canvas.height);
  };
  img.src = 'data:image/png;base64,' + base64String;
});

4.1 Future development

There is no doubt that these cutting-edge technologies will see broader application and support on the front end. Image work in the browser will also move from traditional image processing towards learning combined with image processing, thanks to infrastructure (hardware, browsers, tools, libraries, etc.) that keeps getting stronger and more complete, including but not limited to:

• getUserMedia / Canvas => image/video manipulation

• Shape Detection API => image detection

• Web Workers => parallel computing capability (see the sketch after this list)

• ConvNetJS => deep learning framework

• TensorFlow (deeplearn.js) => JS support
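
On the Web Workers point, the Shape Detection spec also exposes the detectors to worker scopes, so heavy detection can be kept off the main thread. A minimal sketch under that assumption; detect-worker.js is a hypothetical file name:

// main thread
const worker = new Worker('detect-worker.js');
createImageBitmap(document.querySelector('#image')).then((bitmap) => {
  worker.postMessage({ bitmap }, [bitmap]);   // transfer the bitmap instead of copying it
});
worker.onmessage = (e) => console.log('faces from worker:', e.data.faces);

// detect-worker.js
self.onmessage = async (e) => {
  const detector = new FaceDetector();
  const faces = await detector.detect(e.data.bitmap);
  // Send back plain objects so the result is trivially cloneable.
  self.postMessage({
    faces: faces.map((f) => ({
      x: f.boundingBox.x,
      y: f.boundingBox.y,
      width: f.boundingBox.width,
      height: f.boundingBox.height,
    })),
  });
};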


4.2 Not actually that optimistic

4.2.1 Accuracy

The recognition rate for frontal faces (including multiple faces) is relatively high, but detection is not ideal for profile faces or when parts of the face are occluded.


4.2.2 Processing speed

Take the image face detection example above: it takes 300 ms+ (which in fact cannot keep up with real-time processing of high-resolution video), roughly three times the ~100 ms the same detection takes when calling OpenCV directly.


4.2.3 Features

There is still a lot missing: it does not support eyewear detection, gender or age estimation, face recognition, ethnicity, smile or blur detection, and other services that mainstream providers offer.


4.3 A few more words

A. The source code for all the examples in this article is available; Fork/Star are welcome:

https://github.com/x-cold/face-detection-browser

https://github.com/x-cold/face-detection-nodejs

B. There is not yet data to back up how face detection adapts to different scenes or how long detection takes; PASCAL VOC and the samples provided by AT&T will be introduced for small-scale tests.


5 References

1. Face recognition technology summary (1): Face Detection & Alignment: http://blog.jobbole.com/85783/

2. Real-person authentication technology in Alibaba's live-streaming risk control: https://xianzhi.aliyun.com/forum/mobile/read/635.html

3. What can the front end do in the age of artificial intelligence?: https://yq.aliyun.com/articles/153198

4. ConvNetJS, Deep Learning in your browser: http://cs.stanford.edu/people/karpathy/convnetjs/

5. Face detection using Shape Detection API: https://paul.kinlan.me/face-detection/