I am participating in the 2022 Spring Recruitment series (experience review).

I. Project Introduction:

Review: dex2oat compiler-level optimization scheme

Goals: improve application startup and running speed, and optimize overall performance, etc.

My main role: researching the native compilation scheme, collecting performance data, exploring optimization paths, etc.

II. Project Background:

First, what is the business scenario for this project?

A compiler optimization scheme on the Android platform. Because of the in-vehicle chip and other hardware limitations, application startup speed, running speed and general performance could not meet the manufacturers' requirements. It was therefore urgent to extend Android's original compilation optimization strategy with additional compiler-level optimizations.

Second, the business pain points to be solved, and their causes

We collected performance data such as the CPU and memory usage of foreground and background processes, and found that CPU usage repeatedly spiked for a period of time and then recovered. Our guess at the cause: the applications' JIT activity surges, and because the chip is weak, the system stutters or appears frozen during that period. And even after it recovers, the performance data collected by the JIT is still not enough to meet the conditions for Android's optimized compilation.

III. Relevant Knowledge:

Optimization

Why optimize

Why OAT code executes faster than JIT code: it is similar to learning a foreign language.

Different stages of learning correspond to different levels of optimization.

One, the beginner

Interpreted execution (the starting point on the JIT path) is like having to look up every English word in the dictionary. (Reading an English document is slowest at this stage: you don't know the words, you aren't familiar with the grammar, and you can't tell what the sentences mean ~)

Two, knowing the words

When you have seen a word many times, you remember it without checking the dictionary. At this point, though, you only know what the word means; you don't understand the grammar, so you still can't tell what the sentence means. (This corresponds to compiling to machine code: the dex bytecode is translated into machine code.)

Three, knowing common phrases and grammar

Once you have learned grammar and phrases, you apply them to sentences to understand them better. At this stage the compiler builds a syntax tree and the control flow from your application logic, and mainly optimizes your application code.

High-level intermediate representation optimizations: constant propagation, common subexpression elimination, method inlining, etc.
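To make "constant propagation" concrete, here is a minimal sketch on a toy three-address instruction list. The instruction format and the function name `propagate_constants` are made up for illustration; this is not ART's real intermediate representation or its optimizer.

```python
# Toy IR: each instruction is (dest, op, arg1, arg2). Purely illustrative.
def propagate_constants(instructions):
    """Replace variables whose value is a known constant and fold constant arithmetic."""
    ops = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}
    known = {}      # variable name -> constant value discovered so far
    optimized = []
    for dest, op, a, b in instructions:
        # Substitute operands that are known constants.
        a = known.get(a, a)
        b = known.get(b, b)
        if op == "const":
            known[dest] = a
            optimized.append((dest, "const", a, None))
        elif isinstance(a, int) and isinstance(b, int):
            # Both operands are constants: compute the result at compile time.
            value = ops[op](a, b)
            known[dest] = value
            optimized.append((dest, "const", value, None))
        else:
            # Result depends on a runtime value, so dest is no longer a known constant.
            known.pop(dest, None)
            optimized.append((dest, op, a, b))
    return optimized


program = [
    ("x", "const", 4, None),   # x = 4
    ("y", "*", "x", 2),        # y = x * 2
    ("z", "+", "y", "n"),      # z = y + n   (n is only known at runtime)
]
result = propagate_constants(program)
```

After the pass, `y` becomes the constant 8 and `z` is computed as `8 + n`, so one multiplication disappears from the runtime path.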

Four, learning pronunciation ~

You have learned to read English documents, but you will still be embarrassed if you try to show off your spoken English when you travel abroad.

There are further optimizations, but they target specific hardware platforms. That is, low-level intermediate representation optimization optimizes the code again for the target platform ~ selecting the most efficient machine instructions that platform offers.

Five, traveling around the world ~

The tree generated from the intermediate representation is converted into machine code for the target platform. Only at this stage do you finally get the most efficient machine code ~

The unit of optimization

As you can see, the basic unit of learning a foreign language is the word, and only by learning step by step can you finally travel around the world. (The longer you study English, the faster you can read English documents.)

Compiler optimization also has a basic unit: the function. The step-by-step optimizations correspond to optimization levels. (The longer the compiler spends optimizing, the more efficient the resulting code.)

The more performance data is generated at runtime, the more evidence there is for deciding which functions to optimize at compile time, and the more optimizations can be applied.

None of this is a problem in itself; it is all quite normal ~ But when the hardware is not very good, that is another story ~

The basis for optimization

The code first runs interpreted under the JIT for a period of time, and optimization is then driven by the runtime data collected during interpreted execution.

1. When a function has run a certain number of times and reaches the threshold for compilation to machine code, it is compiled and optimized. The next time the function is called, the runtime checks whether it has been compiled to machine code. If it has, it runs as machine code; if not, it keeps running in the previous interpreted mode. In this case the switch happens at function entry, before the function has started executing and before its stack frame has been set up.

2. When a method containing a loop reaches the corresponding execution-count threshold, compilation is also triggered. This time the stack frame already exists, so switching to machine code requires on-stack replacement (OSR).
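The two cases above can be sketched with a toy model. The threshold value, the `Method` class and the function names are all hypothetical, and the model assumes compilation finishes instantly; ART's real counters and thresholds are more involved.

```python
COMPILE_THRESHOLD = 3  # hypothetical value, not ART's real threshold


class Method:
    def __init__(self, name):
        self.name = name
        self.hotness = 0       # how many times the method ran interpreted
        self.compiled = False  # whether machine code exists for it


def invoke(method):
    """Case 1: the check happens at method entry, before a frame exists."""
    if method.compiled:
        return "machine code"
    method.hotness += 1
    if method.hotness >= COMPILE_THRESHOLD:
        method.compiled = True  # model: compiled code is ready for the next call
    return "interpreter"


def run_loop(iterations, osr_threshold):
    """Case 2: a long-running loop switches mid-execution (on-stack replacement)."""
    modes = []
    compiled = False
    for i in range(iterations):
        if not compiled and i >= osr_threshold:
            compiled = True  # the frame already exists, so it must be replaced in place
        modes.append("machine code" if compiled else "interpreter")
    return modes


m = Method("onDraw")
modes = [invoke(m) for _ in range(5)]
```

In this model the first three calls run interpreted and only later calls benefit from machine code, which is exactly the warm-up cost that hurts on a weak chip.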

The optimization process

So how is compiler optimization carried out? It is handled by a separate process, the dex2oat process.

When optimized compilation happens

1. At APK installation. (Earlier versions of Android compiled at the highest level at install time to maximize the application's running speed, but nobody wants installation to take too long ~)

2. When the JIT performance data meets the conditions for optimized compilation.

Existing strategies:

Compilation optimization does not optimize all methods in one pass. It is divided into levels, and dex2oat optimizes to a different degree at each compilation level (the higher the level, the longer the compilation takes and the more efficiently the code runs).

The lowest optimization level is the one dex2oat applies when the APK is installed. It only optimizes the logic covered by the performance statistics; all other logic is still interpreted and handled by the JIT.

Extension points

Make full use of system idle time for optimization

When the system is idle but the compilation conditions are not met, nothing gets compiled, so the application later still runs slowly in interpreted mode ~ This is one optimization point.

Dynamic configuration optimization scheme.

We can distribute different optimization configurations to different user groups of the product, update the optimization scheme dynamically based on the performance users actually see, and deliver optimization policies that match user preferences.
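The two extension points above (idle-time compilation and per-group configuration) could be combined roughly as follows. This is only a sketch: the group names, the policy keys and both function names are hypothetical, and the filter strings are borrowed from dex2oat's compiler filter names purely as labels.

```python
# Policy used when a user group has no specific configuration.
DEFAULT_POLICY = {"filter": "speed-profile", "compile_when_idle": False}

# Hypothetical per-group overrides delivered from a configuration service.
GROUP_POLICIES = {
    "heavy_navigation_users": {"filter": "speed", "compile_when_idle": True},
    "low_storage_devices": {"filter": "verify"},
}


def resolve_policy(user_group, overrides=None):
    """Merge the default policy, the group's policy, and per-user overrides."""
    policy = dict(DEFAULT_POLICY)
    policy.update(GROUP_POLICIES.get(user_group, {}))
    policy.update(overrides or {})
    return policy


def should_compile(policy, system_idle, profile_ready):
    """Compile when the normal condition is met, or when idle and the policy allows it."""
    if profile_ready:
        return True
    return system_idle and policy["compile_when_idle"]
```

Keeping the original condition (`profile_ready`) as the first check means the extra idle-time rule only adds compilation opportunities and never blocks the existing ones.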

Ensure that existing policies are not disturbed

The optimization strategies above are, of course, added on top of the original compilation levels of the dex2oat process, so this logic only schedules them after the native workflow has finished its normal work.

Implementation principle

Would simply lowering the compilation threshold help? Yes, and that is what we did in the beginning. But the threshold is a global parameter, so lowering it lowers it for every method that can be compiled. A lower threshold means compilation requests soar, and because those requests carry no priority, a high-priority process (a lower priority value means more important) may be pushed behind others. The risk is highest for the foreground process. So that approach was abandoned.

Since the generated statistics are not up to par, we can instead draw up a generic compilation profile in which functions are prioritized by process, by the data generated from Monkey runs, by flame graph analysis, and so on. (Compared with the first approach, this sets priorities for both the functions to compile and the processes.)
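The idea of ordering compilation work by process and function priority can be sketched with a small priority queue. The class `CompileQueue`, the priority values and the method names below are hypothetical; as in the text, a lower value means more important.

```python
import heapq
import itertools


class CompileQueue:
    """Orders compile requests by (process priority, method priority)."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker: FIFO within equal priorities

    def submit(self, process_priority, method_priority, method):
        entry = (process_priority, method_priority, next(self._order), method)
        heapq.heappush(self._heap, entry)

    def next_method(self):
        """Return the most important pending method, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[-1]


queue = CompileQueue()
queue.submit(5, 1, "background.Sync.run")
queue.submit(0, 2, "foreground.Map.render")
queue.submit(0, 1, "foreground.Map.onTouch")
order = [queue.next_method() for _ in range(3)]
```

Even though the background request arrived first, both foreground methods are compiled before it, which is the behavior the global-threshold approach could not guarantee.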

Optimization goes far beyond compilation. For example, while the GC processes GC roots it uses threadList to set a flag on the threads, and each thread checks that flag when it switches state; this is how all threads are suspended and later resumed. This part can also be optimized. A suspended thread saves the reference variables in its current stack frame, which are also root objects; perhaps we could keep them in some other memory region instead.

In addition, Linux has many signals, and Android's SignalCatcher only handles a few of them. There is also a class dedicated to signal handling called FaultManager. Do we need to handle more signals according to our own business needs?

The thread's TLAB buffer is something we can tune as well.

Following the system's example, we can compile some commonly used classes into .art image files, bypassing the JIT entirely. Of course, the system only does this for system apps, but since system apps are what we are working on, we can modify this as well.

And so on…

IV. Summary and Reflections:

  • In the process of exploration, there were many questions I didn’t understand, most of which were doubts about why it was designed in this way. With the generous guidance and help from colleagues and leaders, the project met the requirements, the KPI of the team was saved, and the project was delivered ~~

  • There are a lot of interesting designs in the Android source code, and you may have a lot of questions: Learn with questions, ask your colleagues, and always keep learning

When an ANR occurs, the trace contains information about all processes. How are "suspend all threads" and "resume all threads" implemented ~

What is the difference between the classes created when the virtual machine starts and the classes created in your app? The former are certainly optimized, so how is that done?

The structure of .art files, OAT files and dex files. Are OAT files stored inside .art files, are OAT files stored inside dex files…..

  • Of course, we will keep exploring optimizations. Many parts of the system are built on one general-purpose solution, while real businesses and scenarios differ. It is therefore necessary to understand the underlying principles and build an optimization that suits your own platform.

I think the best way to learn something new is:

Starting from your existing technology stack and body of knowledge, list the knowledge involved and try to connect it. You may come up with some creative ideas in the process; try them out. Learn as you run into problems along the way. Repeat this over and over, and even if you end up with little or no success, you will have made great progress.

Final words

Of course, these modifications were made for our own system and are not suitable for every device.

There is no best, only more appropriate.

If you want your app to benefit from this kind of optimization, see the Android Developers documentation on configuring Baseline Profiles.