• A Picture is Worth A Thousand Words, Faces, and Barcodes — The Shape Detection API
  • By Thomas Steiner
  • The Nuggets translation Project
  • Permanent link to this article: github.com/xitu/gold-m…
  • Translator: jerryOnlyZRJ
  • Proofreader: Park-ma, Haiyang-tju

Note: We are currently working on the specification for this API as part of the capabilities project, and we will keep this article updated as the API moves from design to implementation.

What is the Shape Detection API?

With APIs such as navigator.mediaDevices.getUserMedia and the new photo picker in Chrome for Android, it has become fairly easy to capture images or live video data from a device camera, or to upload local images. Until now, this dynamic image data, along with the static images on a page, has been a black box that we could not look into, even though images may contain interesting features such as faces, barcodes, and text.

In the past, developers who wanted to extract these features on the client side, for example to build a QR code reader, had to rely on external JavaScript libraries. This is expensive from a performance perspective and increases the overall page weight. Meanwhile, operating systems such as Android, iOS, and macOS, as well as the hardware chips in their camera modules, usually already ship high-performance, highly optimized feature detectors, for example the Android FaceDetector or the iOS CIDetector.

The Shape Detection API exposes these native implementations through a set of JavaScript interfaces. Currently, the API supports face detection through the FaceDetector interface, barcode detection through the BarcodeDetector interface, and text detection (optical character recognition, OCR) through the TextDetector interface.

Note: Although text detection is an interesting area, it is not yet considered stable enough across computing platforms or character sets to be standardized, which is why text detection has been moved to a separate informative specification.

Read more about it

Suggested use cases for the Shape Detection API

As mentioned above, the Shape Detection API currently supports the detection of faces, barcodes, and text. The following list contains example use cases for all three features:

  • Face detection

    • Online social networks or photo-sharing sites often ask users to tag people in images. Highlighting the boundaries of all detected faces can make this task much easier.
    • Content sites can crop images dynamically based on the faces they detect rather than relying on other heuristics, or highlight detected faces with Ken Burns-style panning and zooming effects.
    • Multimedia messaging sites can let their users overlay fun objects such as sunglasses or moustaches on detected face landmarks.
  • Barcode detection

    • Web applications that can read QR codes can enable many interesting use cases, such as online payments, web navigation, or using barcodes to share social connections in messaging applications.
    • Shopping apps can let their users scan the EAN or UPC barcodes of items in a physical store to compare prices online.
    • Airports can set up web kiosks where passengers scan the Aztec codes on their boarding passes to see personalized information about their flights.
  • Text detection

    • Online social networking sites can improve the accessibility of user-generated image content by adding detected text as img[alt] attribute values when no other description is provided.
    • Content sites can use text detection to avoid placing headings on top of hero images that already contain text.
    • Web applications can use text detection to translate text, for example restaurant menus.
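
As one illustration of the face-based cropping use case listed above, here is a minimal sketch rather than code from the original article. It assumes each detection result exposes a boundingBox with x, y, width, and height, as the Shape Detection API specification describes. The helper name unionBoundingBox and the sample values are hypothetical.

```javascript
// Hypothetical helper: compute the smallest rectangle that contains every
// detected face, which a content site could use as a crop region.
function unionBoundingBox(detections) {
  if (detections.length === 0) return null; // nothing detected, no crop hint
  let left = Infinity, top = Infinity, right = -Infinity, bottom = -Infinity;
  for (const { boundingBox: box } of detections) {
    left = Math.min(left, box.x);
    top = Math.min(top, box.y);
    right = Math.max(right, box.x + box.width);
    bottom = Math.max(bottom, box.y + box.height);
  }
  return { x: left, y: top, width: right - left, height: bottom - top };
}

// Sample data shaped like FaceDetector results (the values are made up).
const sampleFaces = [
  { boundingBox: { x: 10, y: 20, width: 30, height: 40 } },
  { boundingBox: { x: 50, y: 10, width: 20, height: 20 } },
];
const cropRegion = unionBoundingBox(sampleFaces);
console.log(cropRegion); // { x: 10, y: 10, width: 60, height: 50 }
```

The same helper would work for barcode or text results, since they carry a boundingBox as well.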

Current status

Step                                              Status
1. Create an explainer                            Complete
2. Create an initial draft of the specification   In progress
3. Gather feedback and iterate on the design      In progress
4. Run an origin trial                            In progress
5. Launch                                         Not started

How to use the Shape Detection API

The interfaces exposed by the three detectors, FaceDetector, BarcodeDetector, and TextDetector, are very similar: each provides a single asynchronous method, detect, which takes an ImageBitmapSource as input (that is, a CanvasImageSource, a Blob (w3c.github.io/FileAPI/#df…), or an ImageData).

For FaceDetector and BarcodeDetector, optional parameters can be passed to the detector's constructor to give hints to the underlying native detector.

Tip: If your ImageBitmapSource has an effective script origin that differs from the document's effective script origin, calls to detect will fail with a DOMException named SecurityError. If the image's origin supports CORS, you can use the crossorigin attribute to request CORS access.
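
The tip above can be sketched as follows. This is an illustrative helper, not part of the original article: the name loadCorsImage and the injectable ImageCtor parameter are assumptions made so that the snippet can also run outside a browser.

```javascript
// Load an image with CORS access requested so that detect() is allowed to
// read its pixels. ImageCtor is injectable only to keep the sketch testable;
// in a page it defaults to the global Image constructor.
function loadCorsImage(url, ImageCtor = globalThis.Image) {
  return new Promise((resolve, reject) => {
    const img = new ImageCtor();
    // Set crossOrigin before src, otherwise the request is made without CORS.
    img.crossOrigin = 'anonymous';
    img.onload = () => resolve(img);
    img.onerror = () => reject(new Error(`Could not load ${url}`));
    img.src = url;
  });
}
```

In a page, the resolved image could then be passed to a detector's detect method. This only helps if the image server actually sends a suitable Access-Control-Allow-Origin header.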

Working with the FaceDetector

// `image` is any ImageBitmapSource, for example an <img> element.
const faceDetector = new FaceDetector({
  // (Optional) Hint to try and limit the amount of detected faces
  // on the scene to this maximum number.
  maxDetectedFaces: 5,
  // (Optional) Hint to try and prioritize speed over accuracy
  // by, e.g., operating on a reduced scale or looking for large features.
  fastMode: false
});
try {
  const faces = await faceDetector.detect(image);
  faces.forEach(face => console.log(face));
} catch (e) {
  console.error('Face detection failed:', e);
}

Working with the BarcodeDetector

const barcodeDetector = new BarcodeDetector({
  // (Optional) A series of barcode formats to search for.
  // Not all formats may be supported on all platforms.
  formats: [
    'aztec', 'code_128', 'code_39', 'code_93', 'codabar',
    'data_matrix', 'ean_13', 'ean_8', 'itf', 'pdf417',
    'qr_code', 'upc_a', 'upc_e'
  ]
});
try {
  const barcodes = await barcodeDetector.detect(image);
  barcodes.forEach(barcode => console.log(barcode));
} catch (e) {
  console.error('Barcode detection failed:', e);
}
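
Because not every format is supported on every platform, it can help to ask the platform first. The sketch below assumes the static BarcodeDetector.getSupportedFormats() method defined in newer versions of the specification, which may not exist in older implementations. The helper name pickSupportedFormats and its injectable DetectorClass parameter are hypothetical.

```javascript
// Keep only the formats the current platform reports as supported.
// DetectorClass is injectable for testing; in a browser it defaults to the
// global BarcodeDetector.
async function pickSupportedFormats(wanted, DetectorClass = globalThis.BarcodeDetector) {
  if (!DetectorClass || typeof DetectorClass.getSupportedFormats !== 'function') {
    return []; // API missing, or too old to report its formats
  }
  const supported = await DetectorClass.getSupportedFormats();
  return wanted.filter(format => supported.includes(format));
}
```

A page could pass the returned list as the formats option of the BarcodeDetector constructor, and skip native detection entirely when the list is empty.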

Working with the TextDetector

const textDetector = new TextDetector();
try {
  const texts = await textDetector.detect(image);
  texts.forEach(text => console.log(text));
} catch (e) {
  console.error('Text detection failed:', e);
}

Feature detection

Checking for the existence of the constructors alone is not enough to feature-detect the Shape Detection API: Chrome on Linux and Chrome OS currently exposes the detector interfaces, but they are known not to work (bug). As a temporary measure, we recommend performing feature detection as follows before using these APIs:

// Resolves to false when FaceDetector is missing or reports NotSupportedError.
const supported = await (async () => 'FaceDetector' in window &&
    await new FaceDetector().detect(document.createElement('canvas'))
    .then(_ => true)
    .catch(e => e.name === 'NotSupportedError' ? false : true))();

Best practices

All detectors work asynchronously, that is, they don’t block the main thread 🎉. So don’t rely on real-time detection; instead, give the detector some time to complete its work.
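
One way to give a detector time to finish is to avoid starting a new detection while the previous one is still running. The wrapper below is a minimal sketch of that idea. The name skipWhileBusy is hypothetical, and dropping frames (rather than queueing them) is a design choice of this example, not something the API prescribes.

```javascript
// Wrap an async detect function so that a new detection only starts once the
// previous one has finished; calls made in the meantime resolve to null.
function skipWhileBusy(detectFn) {
  let busy = false;
  return async function (source) {
    if (busy) return null; // previous detection still running, drop this frame
    busy = true;
    try {
      return await detectFn(source);
    } finally {
      busy = false;
    }
  };
}
```

In a page this could wrap a call such as source => faceDetector.detect(source) and be invoked once per video frame; frames that arrive while a detection is in flight are simply ignored.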

If you’re a fan of Web Workers (and who isn’t?), the good news is that the detector interfaces are exposed there as well. Detection results are also serializable, so they can be passed from the worker back to the main thread via postMessage. Here’s a demo that shows this in action.
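
The worker side of such a setup might look like the sketch below. It is not the code of the demo mentioned above: the message shape { image }, the reply shape { ok, results }, and the function name handleDetectMessage are assumptions made for illustration.

```javascript
// Worker-side handler: run detection on the received image and post the
// results back. `detector` and `post` are parameters so that the logic can
// be exercised outside a worker.
async function handleDetectMessage(data, detector, post) {
  try {
    const results = await detector.detect(data.image);
    post({ ok: true, results });
  } catch (e) {
    post({ ok: false, error: e.name });
  }
}

// Wire the handler up only when actually running inside a worker that
// exposes FaceDetector.
if (typeof WorkerGlobalScope !== 'undefined' && 'FaceDetector' in self) {
  const detector = new FaceDetector();
  self.onmessage = event =>
    handleDetectMessage(event.data, detector, message => self.postMessage(message));
}
```

On the main thread, a page could create an ImageBitmap with createImageBitmap, send it with worker.postMessage({ image: bitmap }, [bitmap]) so that it is transferred rather than copied, and read the results in the worker's onmessage handler.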

Not all platform implementations support all features, so be sure to check support carefully and treat the API as a progressive enhancement. For example, some platforms may support face detection itself but not face landmark detection (eyes, nose, mouth, and so on), or may recognize the presence and location of text but not the actual text content.
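
A defensive check for optional data could look like this minimal sketch. It assumes, following the specification, that a detected face may carry a landmarks array whose entries have a type such as 'eye', 'nose', or 'mouth', and that the array may be missing or null on platforms without landmark support. The helper name landmarkTypes is hypothetical.

```javascript
// Return the distinct landmark types present on a detected face, or an empty
// array when the platform did not report any landmarks.
function landmarkTypes(face) {
  const landmarks = face.landmarks || [];
  return [...new Set(landmarks.map(landmark => landmark.type))];
}
```

A sticker feature, for example, could fall back to positioning relative to the face's boundingBox when the returned array is empty.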

Tip: This API is an optimization and is not guaranteed to be available on the platform for every user. Developers are expected to combine it with their own image recognition code and to take advantage of the native optimization when it is available.
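
Combining the native detector with a page's own recognition code could be sketched as follows. The function name detectBarcodes is hypothetical, and fallbackDetect stands in for whatever JavaScript library the page already ships; neither comes from the original article.

```javascript
// Prefer the native detector and fall back to the page's own recognizer.
// NativeDetector is injectable for testing; in a browser it defaults to the
// global BarcodeDetector.
async function detectBarcodes(image, fallbackDetect, NativeDetector = globalThis.BarcodeDetector) {
  if (typeof NativeDetector === 'function') {
    try {
      return await new NativeDetector().detect(image);
    } catch (e) {
      // Exposed but not working on this platform; use the fallback instead.
    }
  }
  return fallbackDetect(image);
}
```

Callers get results either way, and the native path is used only when it exists and actually works.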


We need your help to ensure that the Shape Detection API meets your needs and that we don’t miss any key scenarios.

Does the current design of the Shape Detection API meet your needs? If not, please file an issue in the Shape Detection API repo with as much detail as possible.

We would also like to know how you plan to use the Shape Detection API:

  • Do you have an idea for a use case, or a situation where you would use it?
  • Do you plan to use it?
  • Do you like it and want to show your support?

Share your thoughts in the Shape Detection API discussion on WICG Discourse.
