
Welcome to my GitHub

A categorized summary of all of Xinchen's originals (with supporting source code) is here: github.com/zq2599/blog…

Overview of this article

  • This is the 14th article in the JavaCV Camera in Action series. As the title says, today's feature is to detect whether the person in front of the camera is wearing a mask, and show the detection result in the preview window in real time, as shown in the picture below:

  • The overall processing flow is shown below. The key to mask detection is to submit the image to the Baidu AI Open Platform, then, according to the results the platform returns, mark the position of each face and whether it is wearing a mask in the local preview window:

A heads-up on a known problem

  • A typical problem with relying on a cloud platform for your processing is limited throughput
  • First, if you register on the Baidu AI Open Platform as an individual, free interface calls are limited to two per second (ten per second for an enterprise account)
  • Second, in my tests a single face-detection call takes more than 300ms
  • As a result you can actually process at most two frames per second, and in the preview window this looks like a slideshow (anything below 15 frames per second is noticeably laggy)
  • Therefore this article is only suitable for a basic feature demo and cannot serve as a solution for real-world scenarios
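Given the quota above, it helps to throttle requests client-side so calls are not wasted on rejections. Below is a minimal sketch of such a helper; the class and its API are my own invention for illustration, not part of the original project:

```java
// Hypothetical helper (not from the simple-grab-push project): enforces a
// quota of N requests per second by computing how long the caller must
// sleep before issuing the next request.
public class QpsThrottle {
    private final long minIntervalMs;
    // Time (ms) at which the most recently scheduled request fires
    private long lastCallMs;

    public QpsThrottle(int maxQps) {
        this.minIntervalMs = 1000L / maxQps;
        this.lastCallMs = -minIntervalMs; // allow the first call immediately
    }

    // Given the current time in ms, returns the delay to wait before
    // calling the API, and records the call as scheduled at that moment.
    public synchronized long requiredDelayMs(long nowMs) {
        long delay = Math.max(0, lastCallMs + minIntervalMs - nowMs);
        lastCallMs = nowMs + delay;
        return delay;
    }
}
```

With maxQps = 2 the helper spaces requests at least 500ms apart, matching the roughly-two-frames-per-second ceiling described above.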

About the Baidu AI Open Platform

  • To use the Baidu AI Open Platform services, you need to complete some registration and application steps first; please refer to "The simplest face detection (calling the free Baidu AI Open Platform interface)"
  • Assuming you have completed registration and application on the Baidu AI Open Platform, you should now have an access_token available and can start coding
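For context, the access_token is obtained from Baidu's OAuth endpoint using the API key and secret key of your console application (the referenced article walks through the full flow). A sketch of building that request URL, with placeholder parameter names of my own choosing:

```java
// Sketch only: builds the Baidu OAuth URL from which the access_token is
// fetched via a simple GET; apiKey/secretKey come from your Baidu console app.
public class TokenUrlBuilder {
    static final String TOKEN_URL_TEMPLATE =
            "https://aip.baidubce.com/oauth/2.0/token"
            + "?grant_type=client_credentials&client_id=%s&client_secret=%s";

    public static String build(String apiKey, String secretKey) {
        return String.format(TOKEN_URL_TEMPLATE, apiKey, secretKey);
    }
}
```

A GET to the resulting URL returns a JSON body whose access_token field is the value used throughout the rest of this article.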

Coding: add dependency libraries

  • This article continues with the simple-grab-push project created in "JavaCV Camera in Action: Basics"
  • First, add the okhttp and jackson-databind dependencies to pom.xml, used for network requests and JSON parsing respectively:
<dependency>
    <groupId>com.squareup.okhttp3</groupId>
    <artifactId>okhttp</artifactId>
    <version>3.10.0</version>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.11.0</version>
</dependency>

Coding: encapsulate requests to and responses from the Baidu AI Open Platform

  • The next step is to develop a service class that encapsulates all the code related to the Baidu AI Open Platform
  • First, define the request object for the web request, FaceDetectRequest.java:
@Data
public class FaceDetectRequest {
    // Image data (total size must be less than 10MB); how the image is uploaded is determined by image_type
    String image;

    // Image type
    // BASE64: the Base64 value of the image; the encoded image must not exceed 2MB
    // URL: the URL of the image (downloading may take too long due to network conditions)
    // FACE_TOKEN: the unique identifier of a face image; when the face detection interface is invoked,
    // each face image is assigned a unique FACE_TOKEN, and detecting the same image repeatedly yields the same FACE_TOKEN
    @JsonProperty("image_type")
    String imageType;

    // Which fields to return, from: age, expression, face_shape, gender, glasses, landmark, landmark150,
    // quality, eye_status, emotion, face_type, mask, spoofing
    // By default only face_token, the face frame, probability, and rotation angle are returned
    @JsonProperty("face_field")
    String faceField;

    // The maximum number of faces to process; the default is 1, i.e. only the first face according to
    // the face sort type is detected (by default the face with the largest area); the maximum is 120
    @JsonProperty("max_face_num")
    int maxFaceNum;

    // The type of face image
    // LIVE: a portrait taken by a phone or camera, or taken from the Internet
    // IDCARD: the portrait photo from the embedded chip of a second-generation ID card
    // WATERMARK: a small watermarked ID photo, such as a police-network photo
    // CERT: a photo from an ID card, employee badge, passport, student card, etc.
    // The default is LIVE
    @JsonProperty("face_type")
    String faceType;

    // Liveness control: faces that fail the liveness check are filtered out of the result
    // NONE: no control is performed
    // LOW: low liveness requirement (high pass rate, low attack rejection rate)
    // NORMAL: normal liveness requirement (balanced pass rate and attack rejection rate)
    // HIGH: high liveness requirement (high attack rejection rate, low pass rate)
    // The default is NONE
    @JsonProperty("liveness_control")
    String livenessControl;

    // Face detection sort type
    // 0: detected faces are ordered by face area, from large to small
    // 1: detected faces are ordered by distance from the center of the picture, from near to far
    // The default is 0
    @JsonProperty("face_sort_type")
    int faceSortType;
}
  • Next, define the web response object, FaceDetectResponse.java:
@Data
@ToString
public class FaceDetectResponse implements Serializable {
    // Return code
    @JsonProperty("error_code")
    String errorCode;
    // Description of the return code
    @JsonProperty("error_msg")
    String errorMsg;
    // The payload of the response
    Result result;

    @Data
    public static class Result {
        // The number of faces
        @JsonProperty("face_num")
        private int faceNum;
        // Information about each face
        @JsonProperty("face_list")
        List<Face> faceList;

        /**
         * @author willzhao
         * @version 1.0
         * @description A detected face object
         * @date 2022/1/1 16:03
         */
        @Data
        public static class Face {
            // Position of the face
            Location location;
            // Confidence that this is a face
            @JsonProperty("face_probability")
            double face_probability;
            // Mask information for the face
            Mask mask;

            /**
             * @author willzhao
             * @version 1.0
             * @description The position of the face in the picture
             * @date 2022/1/1 16:04
             */
            @Data
            public static class Location {
                double left;
                double top;
                double width;
                double height;
                double rotation;
            }

            /**
             * @author willzhao
             * @version 1.0
             * @description Mask object
             * @date 2022/1/1
             */
            @Data
            public static class Mask {
                int type;
                double probability;
            }
        }
    }
}
  • Finally, the service class BaiduCloudService.java, which contains all the logic for requests to and responses from the Baidu AI Open Platform: build the request object from the Base64 string of the image, send the POST request (the path points to the face-detection service), and after receiving the response deserialize it into a FaceDetectResponse object with Jackson:
public class BaiduCloudService {

    OkHttpClient client = new OkHttpClient();

    static final MediaType JSON = MediaType.parse("application/json; charset=utf-8");

    static final String URL_TEMPLATE = "https://aip.baidubce.com/rest/2.0/face/v3/detect?access_token=%s";

    String token;

    ObjectMapper mapper = new ObjectMapper();

    public BaiduCloudService(String token) {
        this.token = token;

        // Important: if the response JSON contains fields that the class does not declare, this setting keeps deserialization from failing
        mapper.disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);
    }
    
    /**
     * Detect the specified image
     * @param imageBase64 Base64 string of the image
     * @return the detection response
     */
    public FaceDetectResponse detect(String imageBase64) {
        // Request object
        FaceDetectRequest faceDetectRequest = new FaceDetectRequest();
        faceDetectRequest.setImageType("BASE64");
        faceDetectRequest.setFaceField("mask");
        faceDetectRequest.setMaxFaceNum(6);
        faceDetectRequest.setFaceType("LIVE");
        faceDetectRequest.setLivenessControl("NONE");
        faceDetectRequest.setFaceSortType(0);
        faceDetectRequest.setImage(imageBase64);

        FaceDetectResponse faceDetectResponse = null;

        try {
            // Serialize the request object into a string with Jackson
            String jsonContent = mapper.writeValueAsString(faceDetectRequest);

            // Build the JSON request body and the POST request
            RequestBody requestBody = RequestBody.create(JSON, jsonContent);
            Request request = new Request
                    .Builder()
                    .url(String.format(URL_TEMPLATE, token))
                    .post(requestBody)
                    .build();
            Response response = client.newCall(request).execute();
            String rawRlt = response.body().string();
            faceDetectResponse = mapper.readValue(rawRlt, FaceDetectResponse.class);
        } catch (IOException ioException) {
            ioException.printStackTrace();
        }

        return faceDetectResponse;
    }
}
  • After the service class is written, it’s up to the main program to string the logic together

Implementation of the DetectService interface

  • Readers who follow the "JavaCV Camera in Action" series should be familiar with the DetectService interface. To implement the capture frame > process frame > output result flow in a unified style across the many hands-on articles of this series, we defined the DetectService interface; each frame-processing business (face detection, age detection, gender detection, etc.) implements it according to its own characteristics
  • Let’s review the DetectService interface:
public interface DetectService {

    /**
     * Build a Mat of the same size as the one passed in, used to store the grayscale image for later detection
     * @param src the Mat of the original image
     * @return a Mat of the same size for the grayscale image
     */
    static Mat buildGrayImage(Mat src) {
        return new Mat(src.rows(), src.cols(), CV_8UC1);
    }

    /**
     * Detect the image and mark the detection results with rectangles on the original image
     * @param classifier the classifier
     * @param converter converter between Frame and Mat
     * @param rawFrame the raw video frame
     * @param grabbedImage the Mat corresponding to the raw video frame
     * @param grayImage the Mat that stores the grayscale image
     * @return a video frame annotated with the recognition results
     */
    static Frame detect(CascadeClassifier classifier, OpenCVFrameConverter.ToMat converter, Frame rawFrame, Mat grabbedImage, Mat grayImage) {

        // The current image is converted to grayscale image
        cvtColor(grabbedImage, grayImage, CV_BGR2GRAY);

        // The container to store the test results
        RectVector objects = new RectVector();

        // Start the test
        classifier.detectMultiScale(grayImage, objects);

        // Total number of test results
        long total = objects.size();

        // If no result is detected, the original frame is returned
        if (total<1) {
            return rawFrame;
        }

        // If there is a test result, construct a rectangle based on the result data and draw it on the original image
        for (long i = 0; i < total; i++) {
            Rect r = objects.get(i);
            int x = r.x(), y = r.y(), w = r.width(), h = r.height();
            rectangle(grabbedImage, new Point(x, y), new Point(x + w, y + h), Scalar.RED, 1, CV_AA, 0);
        }

        // Release the detection result resource
        objects.close();

        // Convert the labeled image into a frame, and return
        return converter.convert(grabbedImage);
    }

    /**
     * Initialization operations, such as downloading a model
     * @throws Exception
     */
    void init() throws Exception;

    /**
     * Take the original frame, run detection, and draw the bounding boxes
     * @param frame the original frame
     * @return the annotated frame
     */
    Frame convert(Frame frame);

    /**
     * Release resources
     */
    void releaseOutputResource();
}
  • Now look at this article's implementation of the DetectService interface, BaiduCloudDetectService.java; there are several caveats, discussed afterwards:
@Slf4j
public class BaiduCloudDetectService implements DetectService {

    /**
     * The Mat object for each frame of the original image
     */
    private Mat grabbedImage = null;

    /**
     * Baidu Cloud token
     */
    private String token;

    /**
     * Base64 string of the image
     */
    private String base64Str;

    /**
     * Baidu Cloud service
     */
    private BaiduCloudService baiduCloudService;

    private OpenCVFrameConverter.ToMat openCVConverter = new OpenCVFrameConverter.ToMat();

    private Java2DFrameConverter java2DConverter = new Java2DFrameConverter();

    private OpenCVFrameConverter.ToMat converter = new OpenCVFrameConverter.ToMat();

    private BASE64Encoder encoder = new BASE64Encoder();

    /**
     * Constructor, which accepts the Baidu Cloud token
     * @param token the access_token
     */
    public BaiduCloudDetectService(String token) {
        this.token = token;
    }

    /**
     * Initialization: instantiate the Baidu Cloud service object
     * @throws Exception
     */
    @Override
    public void init() throws Exception {
        baiduCloudService = new BaiduCloudService(token);
    }

    @Override
    public Frame convert(Frame frame) {
        // Convert the original frame to a Base64 string
        base64Str = frame2Base64(frame);

        // Record the start time of the request
        long startTime = System.currentTimeMillis();

        // Submit to Baidu Cloud for face and mask detection
        FaceDetectResponse faceDetectResponse = baiduCloudService.detect(base64Str);

        // If the test fails, return ahead of time
        if (null == faceDetectResponse
         || null == faceDetectResponse.getErrorCode()
         || !"0".equals(faceDetectResponse.getErrorCode())) {
            String desc = "";
            if (null != faceDetectResponse) {
                desc = String.format(", error code [%s], error message [%s]", faceDetectResponse.getErrorCode(), faceDetectResponse.getErrorMsg());
            }

            log.error("Face detection failed{}", desc);

            // Return early
            return frame;
        }

        log.info("Detection time [{}]ms, result: {}", (System.currentTimeMillis()-startTime), faceDetectResponse);

        // If no detection result is available, return the original frame
        if (null==faceDetectResponse.getResult()
        || null==faceDetectResponse.getResult().getFaceList()) {
            log.info("No faces detected.");
            return frame;
        }

        // Retrieve the Baidu Cloud detection results; they are processed one by one below
        List<FaceDetectResponse.Result.Face> list = faceDetectResponse.getResult().getFaceList();
        FaceDetectResponse.Result.Face face;
        FaceDetectResponse.Result.Face.Location location;
        String desc;
        Scalar color;
        int pos_x;
        int pos_y;

        // If there is a test result, construct a rectangle based on the result data and draw it on the original image
        for (int i = 0; i < list.size(); i++) {
            face = list.get(i);

            // The position of each face
            location = face.getLocation();

            int x = (int)location.getLeft();
            int y = (int)location.getTop();
            int w = (int)location.getWidth();
            int h = (int)location.getHeight();

            // Mask field type equals 1 for mask and 0 for no mask
            if (1==face.getMask().getType()) {
                desc = "Mask";
                color = Scalar.GREEN;
            } else {
                desc = "No mask";
                color = Scalar.RED;
            }

            // Frame the face on the picture
            rectangle(grabbedImage, new Point(x, y), new Point(x + w, y + h), color, 1, CV_AA, 0);

            // The x coordinate of the face annotation
            pos_x = Math.max(x - 10, 0);
            // The y coordinate of the face annotation
            pos_y = Math.max(y - 10, 0);

            // Mark the face and indicate whether to wear a mask
             putText(grabbedImage, desc, new Point(pos_x, pos_y), FONT_HERSHEY_PLAIN, 1.5, color);
        }

        // Convert the labeled image into a frame, and return
        return converter.convert(grabbedImage);
    }

    /**
     * Release the face-recognition resources before the program ends
     */
    @Override
    public void releaseOutputResource() {
        if (null != grabbedImage) {
            grabbedImage.release();
        }
    }

    private String frame2Base64(Frame frame) {
        grabbedImage = converter.convert(frame);
        BufferedImage bufferedImage = java2DConverter.convert(openCVConverter.convert(grabbedImage));
        ByteArrayOutputStream bStream = new ByteArrayOutputStream();
        try {
            ImageIO.write(bufferedImage, "png", bStream);
        } catch (IOException e) {
            throw new RuntimeException("BugImg read failed :"+e.getMessage(),e);
        }

        return encoder.encode(bStream.toByteArray());
    }
}
  • The code above has the following caveats:
  1. The BaiduCloudDetectService class is mainly a consumer of the BaiduCloudService class written earlier
  2. In the convert method, the frame instance is converted into a Base64 string, which is submitted to the Baidu AI Open Platform for face detection
  3. The detection result from the Baidu AI Open Platform may contain multiple faces, and each is processed in turn: take the position of each face and draw a rectangle there on the original image, then annotate the face according to whether it wears a mask; masked faces get green annotations (including the rectangle), unmasked faces get red ones
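One more aside: the frame2Base64 method above relies on the JDK-internal sun.misc.BASE64Encoder, which was removed in JDK 9 and inserts line breaks into its output. The same image-to-Base64 step can be sketched with the standard java.util.Base64 instead (the helper class name is my own):

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Base64;

public class ImageBase64 {
    // Encodes a BufferedImage as PNG bytes and returns their Base64 string,
    // the format expected by the face-detect API when image_type is BASE64
    public static String toBase64Png(BufferedImage image) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(image, "png", out);
        return Base64.getEncoder().encodeToString(out.toByteArray());
    }
}
```

Unlike BASE64Encoder, Base64.getEncoder() produces a single unbroken string, and it works on any JDK from 8 onward.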

The main program

  • Finally, the main program; following the usual routine of the "JavaCV Camera in Action" series, let's look at the framework defined for the main program's service class
  • "JavaCV Camera in Action: Basics" already prepared the parent class AbstractCameraApplication in the simple-grab-push project, so we keep using that project and create a subclass that implements the abstract methods
  • Before coding, review the basic structure of the parent class, shown in the figure below: the bold entries are methods defined by the parent class, and the red blocks are the abstract methods a subclass must implement; since our goal is a local preview window, we implement the three red methods accordingly:

  • Create a new file PreviewCameraWithBaiduCloud.java; it is a subclass of AbstractCameraApplication, and its code is very simple; it is explained below in order
  • First we define a member variable of type CanvasFrame, previewCanvas, which is the local window used to display video frames:
protected CanvasFrame previewCanvas;
  • Then declare the detection service as a member variable:
    /**
     * The detection service interface
     */
    private DetectService detectService;
  • The PreviewCameraWithBaiduCloud constructor accepts a DetectService instance:
    /**
     * Different detection tools can be passed in through the constructor
     * @param detectService the detection service
     */
    public PreviewCameraWithBaiduCloud(DetectService detectService) {
        this.detectService = detectService;
    }
  • Initialization includes instantiating and configuring previewCanvas, plus the initialization of the detection service:
    @Override
    protected void initOutput() throws Exception {
        previewCanvas = new CanvasFrame("Camera Preview", CanvasFrame.getDefaultGamma() / grabber.getGamma());
        previewCanvas.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        previewCanvas.setAlwaysOnTop(true);

        // Check the initialization of the service
        detectService.init();
    }
  • Next comes the output method, which holds this article's core logic: hand each frame to detectService.convert for detection and annotation, then display the returned frame in the preview window:
    @Override
    protected void output(Frame frame) {
        // The original frame is first handed to the detection service, which involves object detection, and then the detection result is marked on the original image.
        // Then convert to frame return
        Frame detectedFrame = detectService.convert(frame);
        // The frame displayed in the preview window is the frame marked with the detection result
        previewCanvas.showImage(detectedFrame);
    }
  • Finally, after the video-processing loop ends and before the program exits, close the local window first, then release the detection-service resources:
    @Override
    protected void releaseOutputResource() {
        if (null != previewCanvas) {
            previewCanvas.dispose();
        }

        // The detection service also releases its resources
        detectService.releaseOutputResource();
    }
  • Since each frame already takes so long to process, there is no need for any extra interval between frames:
    @Override
    protected int getInterval() {
        return 0;
    }
  • Note that the token value is the access_token obtained from the Baidu AI Open Platform as described earlier:
    public static void main(String[] args) {
        String token = "21.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.xxxxxxx.xxxxxxxxxx.xxxxxx-xxxxxxxx";
        new PreviewCameraWithBaiduCloud(new BaiduCloudDetectService(token)).action(1000);
    }
  • At this point the code is complete and we can verify it with the camera; meanwhile our extra is waiting in the cold wind for his free boxed lunch

Validation

  • Run the main method of PreviewCameraWithBaiduCloud and ask our volunteer extra to step in front of the camera, first without a mask; as you can see, his face is marked with red text and a red rectangle:

  • Have the extra put on a mask and appear in front of the camera again; this time the mask is detected, shown with a green label and rectangle:

  • In practice, at no more than two frames per second, the preview window is a complete slideshow, which is pretty miserable…
  • This post used two photos of the extra, so he collected two boxed lunches, which pained Xinchen greatly…
  • With that, the mask-detection feature based on JavaCV and the Baidu AI Open Platform is complete; I hope you keep following the "JavaCV Camera in Action" series, where more exciting hands-on articles are on the way

Welcome to follow me on Juejin: programmer Xinchen

On the road of learning, you are not alone; Xinchen's originals will keep you company all the way…