Preface

In this era of “user experience is king”, how to keep products and services as stable as possible while delivering a good user experience is a problem that developers must think about and urgently solve. Umeng+, a global data provider that has been serving developers for 11 years, launched the Mobile Application Performance Challenge 2021. The expert review panel was made up of the president and the technical director of Umeng+, the vice president of CSDN, the CTO of SegmentFault (Sifou), and mobile engineers from Alibaba's Fliggy (Feizhu) and Xianyu teams, among others. My entry was fortunate enough to win first place, and here I would like to share what it covered. You are welcome to discuss it in the comments section.

Face recognition conference system performance optimization

I. Project background

1. Introduction

The company is an Internet of Things company that planned to develop an all-in-one face recognition device for meeting attendance and check-in. To make the product competitive, it bet on price and customization: cheap, easy-to-use hardware plus the “cool” software pages that were talked about every day. The software R&D department had to deliver the companion Android tablet application. Its main business functions are face recognition (face capture, comparison and recognition, face database management), a conference module, attendance check-in, and a customized interaction module.

Face recognition interaction diagram

After talking with the hardware product manager, we were given a prototype and a product list to support software development and testing. The main board is an RK3288 with a quad-core 1.8 GHz CPU, 2 GB of memory, and 8 GB of storage, running Android 5.1. The screen is a 15.6-inch, 1920 x 1080, 10-point capacitive touch screen.

Schematic diagram of RK3288 mainboard

Core parameters of the RK3288 mainboard

The framework chosen was React Native + tracking.js. To control costs, we did not integrate any of the commercial Android face recognition SDKs on the market. Based on past experience, we used tracking.js instead of OpenCV to capture faces and performed face comparison on the server side (which set the stage for the mistakes that followed). After a period of overtime, I delivered the v0.1-alpha version of the face recognition meeting attendance system. Procurement was relatively slow, so the first round of testing on the real device only started once the development board arrived. After that round the team members were at a loss: the app ran fine on every other device, but not on this board. Once hardware causes had been ruled out, I took the lead in optimizing the system's performance.

2. Problems and challenges

1. Problems

1. Face recognition is not smooth: the overlay drawn on the face is out of sync with the person, the delay is obvious, and the picture stutters when people move in and out of the frame frequently.
2. The device is powered over PoE and becomes hot after the software has been running for a long time.
3. The app crashes occasionally, and no useful exception logs can be captured.

2. Challenges

1. We did not use native Android development (the team members are mostly unfamiliar with it) or a commercial Android face recognition SDK (because of cost control), which limits the performance ceiling.
2. The hardware performance is low. The RK3288 with its Mali-T764 GPU was regarded as a legendary chip in 2014 and praised as the strongest ARM processor made in China, but that was 6 or 7 years ago, and we are using the basic version. The face recognition board also has to run some other related software, so the performance and stability requirements are relatively high.
3. ANRs and crashes occur intermittently, the error reports are vague, and the problem cannot be located.
4. All the team members are front-end developers with little experience in app optimization and debugging.

II. Steps to solve the problem

1. Review the design

Advice: do not fixate on the problem itself at first. For performance issues in particular, that is a mistake, and a common one: many developers use clever tricks to paper over flaws in the system's design (product design, architecture design, prototyping, interaction design, UI design, and so on). If the system misses its targets by a wide margin in operation or in testing, the first step should always be to go back to the system design. Inexperienced programmers dive straight into the bugs; experienced programmers follow their own method to understand the problem, locate it, analyze it, solve it, and verify the fix. A qualified architect or technical team leader must learn to “pull yourself up by the hair”, that is, to step up a level and look at the whole picture.

Many systems that need optimization do not have a purely technical problem; the cause may be unreasonable product design, redundant architecture, counterintuitive interaction, or an over-layered UI. Because of system complexity, team communication costs, later requirement changes, and scenarios being refined over time, such problems are often hard to expose early in a project. So the first step in optimizing a software system's performance is to review whether the earlier design decisions were reasonable. In our case, even after sorting, streamlining, and removing the unreasonable parts, the so-called differentiated design was still too complicated in animation and interaction for an app on an industrial tablet.

2. Data analysis

Face detection acceptance criteria of this project:

Package size: ~100 MB
Minimum face detection size: 50 px * 50 px
Recognized face angle: yaw ≤ ±30°, pitch ≤ ±30°
Detection speed: 100 ms (720p*)
Tracking speed: 50 ms (720p*)
Face detection time: < 200 ms
Face database retrieval speed: < 100 ms
Detection + recognition process time: < 500 ms

(The app's other performance indicators are not described in detail here.)

One element of engineering is measuring discrete data against a set standard. If an optimization has no quantifiable performance criteria, its success is left to the judgment of the developer or the leader. So it is not only testers who need to know these criteria; developers also need to learn to use testing tools to locate problems and validate data. Time to get started: connect to the Android board over the network with adb and install the APK.
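To make the criteria above checkable by a script rather than by eye, here is a minimal sketch in JavaScript. Only the three time thresholds come from the list above; the key names, the `checkMeasurements` helper, and the sample numbers are illustrative assumptions, not part of the project's actual test tooling.

```javascript
// Thresholds taken from the acceptance criteria above (all in milliseconds).
// The key names are hypothetical.
const criteria = {
  detectionMs: 200,          // face detection time: < 200 ms
  retrievalMs: 100,          // face database retrieval: < 100 ms
  detectAndRecognizeMs: 500, // detection + recognition: < 500 ms
};

// Returns the names of the metrics that are missing or not strictly
// below their threshold.
function checkMeasurements(measured, limits = criteria) {
  return Object.keys(limits).filter(
    (key) => !(typeof measured[key] === "number" && measured[key] < limits[key])
  );
}

// Made-up sample measurement, for illustration only.
const sample = { detectionMs: 180, retrievalMs: 120, detectAndRecognizeMs: 450 };
console.log(checkMeasurements(sample)); // retrievalMs fails in this sample
```

A check like this can run after every test round, so that "did the optimization work" is answered by the same numbers each time.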

1. Render mode analysis

Open Android's Developer options, turn on Profile GPU rendering and Debug GPU overdraw, and pick out the pages with excessive rendering pressure.

Schematic diagram of GPU rendering mode analysis

Render color description

Overdraw: optimizations related to overdraw should take the return on investment into account, because overly fine-grained optimization yields little overall. In this project, only the red overdraw areas (areas drawn over 4 times or more) were optimized.

2. Analyze power consumption

Because the device runs hot while the software is running, power consumption needs to be analyzed. Battery statistics are collected by a system component that runs continuously, so the statistics must be reset before a report is collected.

1. Restart the adb server to avoid conflicts and dirty data: run adb kill-server, then restart it with adb devices or adb start-server.
2. Reset battery data collection:

adb shell dumpsys batterystats --enable full-wake-history
adb shell dumpsys batterystats --reset

Normally we would unplug the charger and disconnect USB (the device charges while connected), because charging greatly affects the validity of the statistics. However, since our device is powered over PoE, we treat this as a special case and use the data only to help spot abnormal points. Because the board runs Android 5.1, the report is exported as a TXT file with an adb command.

TXT reports are too large to read by eye, so they are generally used together with Battery Historian. (Note: Battery Historian requires Android 5.0 (API 21) or above. If you are “lucky” enough to still be using an Android 4.4 industrial panel, you can skip this step.)

Sample Battery Historian

3. Thread activity and CPU analysis

There are plenty of thread activity and CPU analysis tools, but Android Studio already ships with one. (The RN Android build is still packaged with Android Studio anyway; packaging from VS Code has too many pitfalls.) Use it to look for outliers.

Android Studio CPU Profiler Sample

4. Data summary

The data shows that the CPU is overloaded and that tracking blocks the process. We had always believed that the page lag was caused entirely by heavy rendering pressure (rendering is the bottleneck of the whole RN framework), but the report data shows otherwise. During face recognition the GPU is not fully loaded, and only part of the interface rendering is handled by the GPU. The sequence is as follows: tracking blocks and the first lag occurs; then the canvas key points are positioned and drawn one by one and the server interface is called; when the returned data arrives, the information card is rendered and its animation runs, which causes a second, slighter lag (RN rendering lag). As a result, performance fluctuates like a sine wave, with lag appearing and disappearing in turn. Decisions get made by gut feeling because front-end colleagues generally have little exposure to logging and data analysis tools.

3. Locate the fault

There are many ways to locate a problem, such as bisection (commenting out half of the code at a time, or rolling back half of the changes at a time), breakpoint debugging, and log analysis. All of them help locate the problem quickly. Combining the data analysis above with the key classes reported by the tools, the culprits are clear: the information card animation, the canvas special effects, and the face recognition functions.
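Binary rollback can be pictured as a bisection over an ordered list of builds. The sketch below is a generic illustration, not a script the team actually used: `findFirstBad`, the build names, and the `isBad` predicate (which in practice would mean installing a build and running the test) are all hypothetical. It assumes the first build is good, the last build is bad, and every build after the first bad one is also bad.

```javascript
// Finds the index of the first bad build, assuming builds[0] is good,
// builds[builds.length - 1] is bad, and a bad build is never followed
// by a good one.
function findFirstBad(builds, isBad) {
  let good = 0;                // highest index known to be good
  let bad = builds.length - 1; // lowest index known to be bad
  while (bad - good > 1) {
    const mid = Math.floor((good + bad) / 2);
    if (isBad(builds[mid])) {
      bad = mid;
    } else {
      good = mid;
    }
  }
  return bad;
}

// Hypothetical history: the regression arrived with build "b5".
const builds = ["b1", "b2", "b3", "b4", "b5", "b6", "b7", "b8"];
let checks = 0;
const firstBad = findFirstBad(builds, (build) => {
  checks += 1; // each call stands for one manual install-and-test round
  return Number(build.slice(1)) >= 5;
});
console.log(builds[firstBad], "found after", checks, "checks");
```

The point of the technique is the check count: each round halves the remaining range, which matters when one check means flashing a board and waiting for a crash.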

4. Analyze the problem

Original implementation: import all the related JS files and create multiple tracking.ObjectTracker instances to detect the face, eye, and mouth regions, then draw the face key points on a canvas and capture the faces. tracking.js performs its calculations on the CPU, which is less efficient than the GPU for image matrix operations. With the data to back the decision, we decided to replace the face recognition layer with face-api.js and to tune it together with RN.

Built on the TensorFlow.js core, face-api.js implements three kinds of convolutional neural network architectures, for face detection, face recognition, and face landmark detection. It includes a very lightweight, fast, and accurate 68-point face landmark detector and supports several models, the smallest of which is only about 80 KB. It also supports GPU acceleration and can run on WebGL. Its core face detector is an SSD (Single Shot MultiBox Detector), which is essentially a convolutional neural network (CNN) based on MobileNetV1 with some bounding box prediction layers added on top of the network.

face-api.js face landmark detector

After confirming the replacement, I optimized React Native's thread scheduling. A simplified picture of the parts involved:

JS Thread: React and other JavaScript code run on this thread.
Bridge: asynchronous, serialized, and batched.
Shadow Thread: the thread that calculates the layout and constructs the UI.
Native Modules: provide native features (such as photo albums and Bluetooth).
UI Thread: the main thread in Android/iOS applications (and other applications).

React Native thread schematic

For example, when we draw a UI, the JS thread serializes it into a UIManager.createView message, which is sent to the Shadow Thread through the Bridge. The Shadow Thread deserializes the message to build a Shadow Tree and then passes the native layout information on to the UI Thread. The UI Thread deserializes that in turn and draws according to the given layout information. Every operation, such as a height calculation or a UI update, travels through the Bridge. When there are many tasks, they queue up and the asynchronous operations are processed in batches, so some updates are not reflected on the UI in time. This is especially true of animations, which update at high frequency and generate many tasks, making it hard to render every frame on time. The optimization directions are therefore:

1. Reduce the asynchronous communication between the JS thread and the UI thread, or reduce the size of the JSON being sent.
2. Minimize the computation done on the JS thread.
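One way to picture the first direction (fewer and smaller messages across the bridge) is to coalesce updates on the JS side so that only the latest value per view goes out in each batch. The sketch below is a plain JavaScript toy model, not React Native's bridge implementation; `UpdateBatcher`, `send`, and the view ids are hypothetical names.

```javascript
// Toy model: collect property updates per view and flush them as one
// serialized message, keeping only the latest value for each property.
class UpdateBatcher {
  constructor(send) {
    this.send = send;         // receives one JSON string per flush
    this.pending = new Map(); // viewId -> merged props
  }

  // Queue an update; later values overwrite earlier ones for the same prop.
  update(viewId, props) {
    const merged = { ...(this.pending.get(viewId) || {}), ...props };
    this.pending.set(viewId, merged);
  }

  // Send everything queued so far as a single message, then reset.
  flush() {
    if (this.pending.size === 0) return;
    const payload = JSON.stringify([...this.pending.entries()]);
    this.pending.clear();
    this.send(payload);
  }
}

const sent = [];
const batcher = new UpdateBatcher((message) => sent.push(message));

// Five position updates for the same overlay within one frame...
for (let x = 0; x < 5; x++) batcher.update("faceBox", { left: x * 10 });
batcher.update("infoCard", { opacity: 1 });
batcher.flush(); // ...leave as a single message carrying only the last value

console.log(sent.length, sent[0]);
```

Six queued updates become one serialized message, and the intermediate positions that nobody would have seen are never sent at all.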

5. Solve problems

The overall solution is to replace tracking.js with face-api.js and to optimize React Native. The React Native tuning is described in three steps.

1. Enable the animation native driver

useNativeDriver: true. Messages between the JS thread and the UI thread are sent as JSON strings, and the native driver avoids sending one per frame. The useNativeDriver option can only be used for direct events and for animated properties that are not layout related, such as transform and opacity; layout-related properties, such as height and position, raise an error when it is enabled. For the animation of the personnel information card shown after a successful face recognition, for example, we can set useNativeDriver: true to turn on the native animation driver.

Animated.timing(this.state.animatedValue, {
  toValue: 1,
  duration: 500,
  useNativeDriver: true, // <-- Add this
}).start();

With the native driver enabled, all of the animation's configuration is sent to the native side before the animation starts, and native code runs the animation on the UI thread, so the two sides no longer communicate back and forth on every frame. The animation is detached from the JS thread from the very beginning, so it is not affected if the JS thread stalls.

2. Use the InteractionManager

Use InteractionManager to defer some of the heavier tasks until interactions and animations have finished, for example around the jump animation to the venue layout page. The goal is to balance the timing of complex tasks against interactive animations.

const handle = InteractionManager.createInteractionHandle();
// Run the animation... (`runAfterInteractions` tasks are now waiting in the queue)
// Later, when the animation has finished, clear the handle:
InteractionManager.clearInteractionHandle(handle);
// Once all handles have been cleared, the queued tasks run in order

According to the official documentation, runAfterInteractions accepts either a plain callback function or a PromiseTask object with a gen method that returns a Promise. If a PromiseTask is supplied, it is fully resolved, even when it is asynchronous, before the next task in the queue starts. This lets you tune animation smoothness as needed.
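The handle-and-queue behaviour described above can be pictured with a toy model in plain JavaScript. This is not React Native's implementation (the real InteractionManager schedules tasks asynchronously and supports deadlines and PromiseTask objects); `ToyInteractionQueue` is a hypothetical class whose method names merely mirror the real API so the flow is easy to follow.

```javascript
// Toy model of "run after interactions": tasks wait while any handle is open.
class ToyInteractionQueue {
  constructor() {
    this.nextHandle = 1;
    this.openHandles = new Set();
    this.tasks = [];
  }

  // Mark the start of an interaction or animation.
  createInteractionHandle() {
    const handle = this.nextHandle++;
    this.openHandles.add(handle);
    return handle;
  }

  // Mark the end of an interaction; queued tasks may now run.
  clearInteractionHandle(handle) {
    this.openHandles.delete(handle);
    this.drain();
  }

  // Queue a task; it runs immediately only if nothing is in progress.
  runAfterInteractions(task) {
    this.tasks.push(task);
    this.drain();
  }

  // Run queued tasks in order while no interaction is in progress.
  drain() {
    while (this.openHandles.size === 0 && this.tasks.length > 0) {
      const task = this.tasks.shift();
      task();
    }
  }
}

const queue = new ToyInteractionQueue();
const log = [];

const handle = queue.createInteractionHandle(); // animation starts
queue.runAfterInteractions(() => log.push("load venue data"));
log.push("animation frame");                    // the task is still waiting
queue.clearInteractionHandle(handle);           // animation ends, task runs

console.log(log);
```

The heavy task is registered first but runs last, which is exactly the ordering that keeps the animation frames free of competing work.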

3. Re-render

First of all, in RN and React, when setState is called in a parent component, all of its child components re-render, even if no state value in the parent actually changed. The same happens when the props passed from the parent to a child change, whether or not the child uses those props. To limit re-rendering, use PureComponent or shouldComponentUpdate for class components, and memo for function (hook) components.

After verification, overall performance had improved, interaction was relatively smooth, and the basic performance indicators were met. What remained were the intermittent problems that are hard to reproduce, for which I enlisted the help of my testing colleagues.
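PureComponent and memo both decide whether to re-render through a shallow comparison. The sketch below shows what such a comparison does in plain JavaScript; it is an illustration rather than React's own source, and `shallowEqual`, `shouldUpdate`, and the sample props are hypothetical.

```javascript
// Shallow comparison: same keys, and each value identical by Object.is.
function shallowEqual(a, b) {
  if (Object.is(a, b)) return true;
  if (typeof a !== "object" || a === null || typeof b !== "object" || b === null) {
    return false;
  }
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return false;
  return keysA.every(
    (key) => Object.prototype.hasOwnProperty.call(b, key) && Object.is(a[key], b[key])
  );
}

// Mirrors the decision a shouldComponentUpdate(nextProps) would return.
function shouldUpdate(prevProps, nextProps) {
  return !shallowEqual(prevProps, nextProps);
}

const person = { name: "Visitor A" }; // made-up props

// Same object reference: no re-render needed.
console.log(shouldUpdate({ person, visible: true }, { person, visible: true }));

// Equal content but a new object: the shallow check reports a change.
console.log(shouldUpdate({ person, visible: true }, { person: { ...person }, visible: true }));
```

The second case is the usual trap: an object or callback recreated on every parent render defeats the shallow check, so stable references matter as much as the PureComponent or memo wrapper itself.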

6. Verify the problem (using a performance monitoring platform)

First, why use a performance monitoring platform:

1. To deduplicate information, so that the same problem is not handled repeatedly across several apps or within one app.
2. To keep capturing important suspicious information, which improves efficiency and reduces labor costs.

Second, when and where to use one: besides testers and operations staff, developers should also learn to use a performance monitoring platform to help locate and solve problems. Two solutions are recommended:

1. Google Android Vitals + Firebase

Android Vitals is a Google initiative to improve the stability and performance of Android devices. The Android Vitals console on Google Play highlights indicators such as crash rate, ANR rate, excessive wakeups, and stuck wake locks. It covers the functions developers use most, and above all it requires no code changes, which makes it convenient to adopt. In addition, Firebase provides detailed, customizable crash reports about your application. It groups crashes by similar stack traces and ranks them by the severity of their impact on users. Besides the automatically generated reports, you can log custom events to learn which actions led to a crash.

Vitals + Firebase

In general, Android Vitals handles most simple problems, and Firebase adds the flexibility to handle custom events. The inconvenient part is that Google services are restricted in China: companies have to apply for a dedicated cross-border network line, and, annoyingly, re-authentication is often required when the network fluctuates. Cost: Android Vitals is free to use, but registering a developer account costs $25; Firebase has a free plan and paid plans. This option suits foreign companies, multinational companies, or companies with the relevant qualifications.

2. Umeng+ U-APM

2.1 Product Overview:

Because of the restrictions on Google services in China, many enterprises cannot reach them without a registered cross-border connection. Umeng+ U-APM meets the requirements above just as well, so for my project I chose to integrate the Umeng+ SDK to help detect problems. Umeng's push and statistics products are well regarded in the industry, and anyone familiar with Umeng will know the stability feature of U-App. U-APM is a stability data product for monitoring applications, which Umeng+ built for developers by upgrading U-App's stability feature.

Why choose the Umeng+ U-APM application performance monitoring platform: the product builds a systematic online quality monitoring platform around discovering online problems, locating them quickly, and solving them efficiently. It also offers real-time monitoring of online app crash trends, 7*24 alarm monitoring and fix verification, reproduction of the user's crash scene, focused monitoring of key links, fix testing, and more.

The key point is that, backed by Alibaba's technology, it can offer long-term, stable product iteration, project service, and expert consulting. That is what matters: enterprise engineering needs long-term stability. With products from small vendors, you may use them for a while and then find that nobody is left to support them.

2.2 Development Preparations

If you have used U-App before, you can read the upgrade instructions on the official website and start using U-APM directly. If you have not used Umeng products, go to the Umeng+ official website, register, and add a new app to obtain the AppKey. Note: read the U-APM Compliance Guide carefully so that the app meets the relevant compliance requirements of the Ministry of Industry and Information Technology and is not removed from app stores because of privacy policy risks.

2.3 Integrate the SDK

Maven automatic integration: this is the simple and quick way. First add the Maven repository address of the Umeng+ SDK to the buildscript and allprojects sections of the project-level build.gradle script, as shown in the figure below.

Then add the SDK library dependencies to the dependencies section of the app module's build.gradle script.

dependencies {
    implementation fileTree(include: ['*.jar'], dir: 'libs')

    // The following SDKs are introduced on demand, depending on which services the host app uses.
    implementation 'com.umeng.umsdk:common:9.4.4' // required
    implementation 'com.umeng.umsdk:asms:1.4.1'   // required
    implementation 'com.umeng.umsdk:apm:1.4.2'    // required
}


Manual Android Studio integration (the method I used): 1. Select the U-APM SDK component and download it, then extract the .zip file to obtain the corresponding component packages.

The following files are obtained:

A .jar package: the statistics SDK (required)
umeng-asms-armeabi-v1.4.1.aar (required)
umeng-apm-armeabi-v1.4.2.aar: the U-APM SDK, in the apm directory (required)
If UTDID is required, integrate utdid4all-1.5.2.1-proguard.jar (the UTDID service) under thirdparties.
If the ABTest module is required, integrate umeng-abtest-v1.0.0.aar (the ABTest module) under common.

2. Copy the packages above into the libs directory of the Android Studio project. Right-click the project and select Open Module Settings. In the Project Structure dialog, select the Dependencies tab, click “+” in the lower left, choose the package type, and import the corresponding component packages.

3. Import the corresponding component packages in the app's build.gradle file. The following is an example:

repositories {
    flatDir {
        dirs 'libs'
    }
}
dependencies {
    implementation fileTree(include: ['*.jar'], dir: 'libs')
    implementation(name: 'umeng-asms-armeabi-v1.4.1', ext: 'aar')
    implementation(name: 'umeng-apm-armeabi-v1.4.2', ext: 'aar')
}

Note: if you need to support a platform other than armeabi, or if loading the .so library fails on a build with multiple CPU architectures [SA10070], then in addition to importing the corresponding package you also need to download the matching .so files and check them in.

2.4 Grant permissions

Grant the following permissions according to the tutorial on the official website:

<manifest ...>
    <uses-sdk android:minSdkVersion="8"></uses-sdk>
    <uses-permission android:name="android.permission.ACCESS_NETWORK_STATE"/>
    <uses-permission android:name="android.permission.ACCESS_WIFI_STATE"/>
    <uses-permission android:name="android.permission.READ_PHONE_STATE"/>
    <uses-permission android:name="android.permission.INTERNET"/>
    <application ...>

2.5 Obfuscation settings

If code obfuscation is used in your APP, add the following configuration

-keep class com.umeng.** { *; }

-keep class com.uc.** { *; }

-keep class com.efs.** { *; }

-keepclassmembers class * {
    public <init>(org.json.JSONObject);
}
-keepclassmembers enum * {
    public static **[] values();
    public static ** valueOf(java.lang.String);
}

2.6 Initializing the SDK

Call the initialization function provided by the base component package in the Application.onCreate function of RN's Android native project:

/**
 * Note: even if you have configured the appkey and channel values in AndroidManifest.xml,
 * you still need to call the initialization interface in your app code. (If you want to use
 * the appkey and channel values configured in AndroidManifest.xml, set the appkey and
 * channel parameters to null in the UMConfigure.init call.)
 */
UMConfigure.init(Context context, String appkey, String channel, int deviceType, String pushSecret);

Or call this pre-initialization function:

public static void preInit(Context context, String appkey, String channel)

Then turn on the log switch:

/**
 * Set the componentized log switch
 * Parameter: boolean, default false; set it to true to view logs
 */
UMConfigure.setLogEnabled(true);

You can now use the basic features: lag (jank) analysis, Java and native crash analysis, ANR analysis, and so on. The principle is to judge lag by the response time of the main thread and to report the device information and logs of the affected sessions. After the device reports, the web console shows the uploaded Error (SDK integration or runtime errors), Warn (SDK warnings), Info (SDK notices), and Debug (SDK debugging information) messages, together with the reports.
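The idea of detecting lag from main-thread response time can be sketched without any SDK: schedule a heartbeat on the thread at a fixed interval and flag any gap that is much longer than expected. This is a generic illustration in JavaScript, not U-APM's actual algorithm; `findStalls`, the interval, the threshold, and the sample log are all assumptions.

```javascript
// Given the times (ms) at which a periodic heartbeat actually ran, return
// the gaps whose delay beyond the expected interval exceeds the threshold.
function findStalls(timestamps, intervalMs, thresholdMs) {
  const stalls = [];
  for (let i = 1; i < timestamps.length; i++) {
    const gap = timestamps[i] - timestamps[i - 1];
    const delay = gap - intervalMs;
    if (delay > thresholdMs) {
      stalls.push({ at: timestamps[i - 1], blockedMs: delay });
    }
  }
  return stalls;
}

// Made-up heartbeat log: scheduled every 100 ms, one beat arrives 900 ms late.
const beats = [0, 100, 200, 1200, 1300];
console.log(findStalls(beats, 100, 500));
```

A real monitor would capture the main thread's stack at the moment a stall is detected, which is what makes the report useful for locating the blocking code.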

However, it is very hard to read the error stack directly from these messages. U-APM therefore uses an aggregation algorithm to provide a stuck-module view: it selects the 200 stacks affecting the most users, aggregates them in both directions (from the top of the stack down and from the bottom up), and displays the 10 modules that occur most frequently. The subtree depth goes up to 50 levels, which helps dig out the details of the stuck modules. U-APM also provides advanced features such as startup analysis, memory analysis, network analysis, and a user details module. Apart from memory analysis, these features must be configured before use. In the end U-APM let me verify and solve the problem smoothly, which closed the whole R&D loop. If you are interested, you can try U-APM for free.
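Aggregating stacks by shared frames is what turns a large number of individual reports into a short ranked list. The sketch below groups stack traces by their top frames and ranks the groups by how often they occur. It is a simplified, one-directional illustration and not U-APM's algorithm; `aggregateStacks`, the `depth` and `topN` parameters, and the sample frames are hypothetical.

```javascript
// Group stacks (arrays of frames, top of stack first) by their first `depth`
// frames and return the groups sorted by occurrence count, most frequent first.
function aggregateStacks(stacks, depth = 2, topN = 10) {
  const counts = new Map();
  for (const frames of stacks) {
    const key = frames.slice(0, depth).join(" <- ");
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts.entries()]
    .map(([key, count]) => ({ key, count }))
    .sort((a, b) => b.count - a.count)
    .slice(0, topN);
}

// Made-up reports for illustration.
const reports = [
  ["drawKeyPoints", "onFrame", "loop"],
  ["drawKeyPoints", "onFrame", "init"],
  ["parseJson", "onResponse", "loop"],
];
console.log(aggregateStacks(reports));
```

With `depth` set to 2, the first two reports collapse into one group even though they differ further down the stack, which is the effect that makes the most frequent stuck module stand out.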

III. Project summary

1. Do not stare at the problem. In app and system performance optimization, the visible symptom may be a side effect of something deeper. In this project the local symptom was stuttering, and had we looked only at the symptom we might have been stuck optimizing rendering, reducing canvas drawing, or even trimming business features. The bottleneck was finally broken by changing the implementation to one that suits the business scenario better and makes better use of the machine. And all of this needs data to support it.
2. Let the numbers speak. Do not detect performance problems or judge optimizations by intuition; have quantifiable performance criteria and quantitative, visual tools. A team that relies on feeling and guessing accumulates nothing, while data and tools can be handed down. If there is no standard for the optimization and no data for the result, the whole effort is meaningless and success depends on the leader's gut feeling.
3. Test on low-end devices. The same application shows the same problems more clearly on low-end hardware. For example, the real devices used in early Android development showed no problem, while the industrial device exposed it at once. Delivering a good user experience on both high-end and low-end devices has always been an important issue.
4. Weigh the pros and cons. Optimize only on the premise that the product stays stable and requirements are delivered on time. When the cost is too high for the benefit, switch to another approach instead of over-optimizing. Never forget that performance optimization is about improving the user experience, not about showing off skills.
5. Discard sunk costs. Costs that have already been paid during development and cannot be recovered should not affect future decisions. For example, the face recognition module had already been built with tracking.js, but the data proved that this choice hurt performance. The cost of replacing it was within an acceptable range, and the earlier the replacement, the higher the expected return.