Preface

Some time ago, I walked you through Android layout optimization and Android jank optimization. That material is fairly difficult, so this article lays the foundation for those two articles; once you have mastered it, they will be much easier to read.

As we all know, the most visible symptom of poor drawing performance is jank (stuttering). Jank shows up in many scenarios, which fall into four broad categories: UI drawing, application startup, page navigation, and event response. These can be subdivided as follows:

1. UI

  • Drawing
  • Refreshing

2. Startup

  • Launch after installation
  • Cold start
  • Warm start

3. Navigation

  • Page-to-page navigation
  • Foreground/background switching

4. Response

  • Key presses
  • System events
  • Scrolling

The root causes can be divided into two categories:

1. Interface drawing

  • The view hierarchy is too deep
  • The page is too complex
  • Refreshing is done unreasonably

2. Data processing

  • Data is processed on the UI thread
  • CPU usage is high, so the main thread cannot obtain time slices
  • Growing memory usage triggers frequent GC, which causes stuttering

I. Android system display principle

The Android display process can be summarized as follows: after measuring, laying out, and drawing, the Android application caches the resulting surface data; SurfaceFlinger renders that data to the display, and Android's refresh mechanism refreshes it. In other words, the application layer is responsible for drawing and the system layer is responsible for rendering. The data the application layer draws is passed to the system-layer service through inter-process communication, and the system-layer service updates it to the screen through the refresh mechanism.

1. Drawing principle

Application layer

In Android, each View goes through Measure and Layout to determine its size and position, and is then drawn onto the surface by Draw. In the Android source, the overall drawing flow lives in the performTraversals() method of the ViewRootImpl class. From that method we can see that Measure and Layout compute the size and position of each View recursively, depth first. Clearly, the deeper the hierarchy and the more elements it contains, the longer this takes.

For drawing, Android supports two drawing methods:

  • Software Drawing (CPU)
  • Hardware Rendering (GPU)

Hardware acceleration, supported since Android 3.0, is far more efficient at UI display and drawing than software drawing. It does have limitations:

  • Power consumption: the GPU consumes more power than the CPU.
  • Compatibility: some interfaces and functions are not supported.
  • Memory: the OpenGL interface occupies 8 MB of memory.

System layer

Rendering data to the screen is done through the SurfaceFlinger service in a system-level process, which works as follows:

  • 1. Responds to client events and creates a Layer that is connected to the client's Surface.
  • 2. Receives client data and properties and modifies Layer properties such as size, color, and transparency.
  • 3. Refreshes the content of the created Layers to the screen.
  • 4. Maintains the Layer ordering and performs clipping calculations on the final Layer output.

The SurfaceFlinger system process and the application processes communicate through anonymous shared memory (SharedClient); one SharedClient is created between each application and SurfaceFlinger. Each SharedClient can hold at most 31 SharedBufferStacks, and each SharedBufferStack corresponds to a Surface, that is, a window. (Each SharedBufferStack contains two buffers before Android 4.1, or three from 4.1 onward.)

Therefore, an Android application can contain up to 31 windows. Finally, the overall process is as follows:

  • 1. The application layer draws to the buffer.
  • 2. SurfaceFlinger renders the buffer data to the screen. Android's anonymous shared memory (SharedClient) is used to cache the data that needs to be displayed.

During drawing, the CPU first prepares the data and hands it to the GPU for rendering through the driver layer. The CPU is mainly responsible for computation, including Measure, Layout, Record, and Execute, while the GPU is responsible for rasterization and rendering. The graphics API does not allow the CPU to communicate with the GPU directly, so the two are connected through an intermediate layer, the graphics driver. The driver maintains a queue: the CPU adds display lists (lists of data to be displayed) to the queue, and the GPU takes data out of the queue and draws it. The result finally appears on the screen.

The Android system emits a VSYNC signal every 16 ms to trigger a render of the UI. If every render completes in time, the app reaches the 60 FPS required for smooth graphics.
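The 16 ms figure is simply one second divided by the refresh rate. As a minimal sketch of that arithmetic (the class and method names here are illustrative, not part of any Android API), we can compute the per-frame budget and how many VSYNC intervals a slow frame ends up occupying:

```java
final class FrameBudget {
    /** Time available per frame, in milliseconds, at the given refresh rate. */
    static double frameIntervalMs(int refreshRateHz) {
        return 1000.0 / refreshRateHz;
    }

    /** Number of VSYNC intervals a frame occupies; anything above 1 means frames were dropped. */
    static int vsyncsUsed(double frameTimeMs, int refreshRateHz) {
        return (int) Math.ceil(frameTimeMs / frameIntervalMs(refreshRateHz));
    }
}
```

At 60 Hz the budget is about 16.67 ms, so a frame that takes 20 ms occupies two VSYNC intervals and the user sees one frame repeated.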

2. Refresh mechanism

The Android display system was reworked in version 4.1 as part of Project Butter, which introduced three core elements: VSYNC (vertical synchronization), Triple Buffer (triple buffering), and Choreographer. VSYNC, which is at the heart of Project Butter, can be thought of as a timed interrupt. Choreographer acts as a scheduler: it aligns drawing work to the VSYNC signal so that applications draw in an orderly rhythm.

So why Project Butter?

The goal is to solve the problem of out-of-sync refresh.

Before the Triple Buffer, Android displays used double buffering.

Why use double buffering?

On Linux, the Framebuffer is usually used for display output. When a user process updates the data in the Framebuffer, the display driver pushes the value of each pixel in the Framebuffer to the screen. If the Framebuffer is updated again while that is happening, ghosting appears and the user perceives flicker, which is why double buffering is used.

What does double buffering mean?

Double buffering means using two buffers (in the SharedBufferStack mentioned above), one called the Front Buffer and the other the Back Buffer. The UI is always drawn into the Back Buffer, which is then swapped with the Front Buffer and rendered to the display device. That is, the display device is told to switch buffers, through the ioctl system call, only when the data in the other buffer is ready.
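To make the swap concrete, here is a toy model in plain Java. It is only a sketch under simplifying assumptions: DoubleBuffer is a hypothetical class, the buffers are plain int arrays, and the swap is a reference exchange rather than an ioctl call to a display driver.

```java
import java.util.Arrays;

final class DoubleBuffer {
    private int[] front; // what the display is currently showing
    private int[] back;  // what the UI is currently drawing into

    DoubleBuffer(int pixelCount) {
        front = new int[pixelCount];
        back = new int[pixelCount];
    }

    /** Draws a whole frame into the back buffer; the display never sees a partial frame. */
    void draw(int color) {
        Arrays.fill(back, color);
    }

    /** Called once the back buffer is complete: the two buffers trade roles. */
    void swap() {
        int[] tmp = front;
        front = back;
        back = tmp;
    }

    /** The pixel the display would show right now. */
    int displayedPixel(int index) {
        return front[index];
    }
}
```

Until swap() is called, displayedPixel() keeps returning the previous frame, which is exactly why the user never sees a half-updated image.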

When the first frame of data is not processed in time, why can’t the CPU start working at the second 16ms when VSync arrives?

Because there are only two buffers. That is why version 4.1 added a third buffer: the Triple Buffer. It uses the time the CPU/GPU would otherwise spend waiting idle to prepare data ahead of time, although that data will not necessarily be used.

Note:

Double buffering is used in most cases, and the third buffer comes into play only when necessary. More buffers are not always better, so a balance has to be struck.

With all the optimizations Google has made, why do apps still lag in actual development?

The thread that handles the VSync interrupt must have the highest priority; otherwise, even when a VSync interrupt is received, it is useless if it cannot be handled in time.

What does Choreographer do?

When a VSYNC signal is received, the user-set callback function is called. CALLBACK_INPUT, CALLBACK_ANIMATION, and CALLBACK_TRAVERSAL are the callback types in descending order of priority.

3. Root causes of jank

  • Drawing tasks are too heavy and it takes too long to draw a frame of content.
  • The main thread is so busy that the VSync signal arrives with no data ready, resulting in frame loss.
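Both causes show up the same way: the gap between two consecutive frames becomes longer than one VSYNC interval. The sketch below counts dropped frames from frame timestamps in nanoseconds. It is plain Java so that it runs anywhere; DroppedFrameCounter is a hypothetical helper, and on a device the timestamps would typically come from the frameTimeNanos argument of Choreographer.FrameCallback.doFrame.

```java
final class DroppedFrameCounter {
    private final long frameIntervalNanos;
    private long lastFrameTimeNanos = -1;
    private int droppedFrames;

    DroppedFrameCounter(int refreshRateHz) {
        this.frameIntervalNanos = 1_000_000_000L / refreshRateHz;
    }

    /** Feeds one frame timestamp; returns how many VSYNC slots were skipped since the previous frame. */
    int onFrame(long frameTimeNanos) {
        int skipped = 0;
        if (lastFrameTimeNanos >= 0) {
            long elapsed = frameTimeNanos - lastFrameTimeNanos;
            // One interval between frames is normal; every extra whole interval is a missed frame.
            skipped = (int) Math.max(0, elapsed / frameIntervalNanos - 1);
        }
        lastFrameTimeNanos = frameTimeNanos;
        droppedFrames += skipped;
        return skipped;
    }

    int totalDropped() {
        return droppedFrames;
    }
}
```

A gap of 50 ms between two frames at 60 Hz spans three intervals, so the counter reports two dropped frames.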

II. Performance analysis tools

Commonly used Android drawing optimization tools are:

  • Hierarchy Viewer: inspects the layout hierarchy
  • The CPU Profiler that comes with Android Studio
  • Lint, the static code checking tool
  • Profile GPU Rendering
  • TraceView
  • Systrace

Here we look at the last three.

1. Jank detection tool: Profile GPU Rendering

Profile GPU Rendering is an auxiliary tool built into Android phones. When you turn it on, you see a color bar chart that refreshes in real time, where each vertical bar represents one frame and is made up of several colors.

Before Android M

Before Android M, each bar is composed of red, orange, blue, and purple segments, which correspond to the time each frame actually spent in different phases. The colors are explained as follows:

  • Blue: the draw time, that is, the time spent creating and updating the DisplayList. A tall blue segment may mean that the view needs to be redrawn or that the onDraw function of a custom view is doing too much work.
  • Red: the time Android's 2D renderer spends executing the display list. A tall red segment may be caused by views being resubmitted.
  • Orange: the processing time, that is, the time during which the CPU tells the GPU to render a frame. A tall orange segment means the GPU is too busy.
  • Purple: the time spent transferring resources to the render thread. (Available in version 4.0 and above)

Android M and beyond

Starting with Android M, rendering is broken into eight stages:

1. Orange: Swap Buffers

The processing time of the GPU.

2. Red: Command Issue

The time spent issuing the 2D display list. The taller the bar, the more views need to be drawn.

3. Light blue: Sync & Upload

The time spent preparing the images to be drawn. The taller the bar, the more images there are or the larger they are.

4. Dark blue: Draw

The time spent drawing views. A taller bar means there are more views or that an onDraw method is doing time-consuming work.

5. Green (first shade): Measure/Layout

The time spent in onMeasure and onLayout.

6. Green (second shade): Animation

The time spent running animations. A taller bar suggests that a non-standard animation tool is being used or that reads and writes happen while the animation runs.

7. Green (third shade): Input Handling

The time the system spends processing input events.

8. Misc Time/Vsync Delay

The main thread is doing too much work, so UI rendering cannot keep up with the VSync signal and frames are dropped.

In addition, the specific render time can be analyzed by exporting it to the log with the following ADB command:

adb shell dumpsys gfxinfo com.**.** 

2. TraceView

TraceView is mainly used to analyze function call flow; it can profile both Android application code and Framework-layer code.

Use TraceView to inspect elapsed time, focusing on Calls + Recur Calls / Total (the number of calls to the method plus recursive calls) and Cpu Time / Call (the time per call of the method). Then optimize the logic and the call counts of those methods to reduce the elapsed time.

Note:

Real Time (the actual wall-clock duration) is longer than Cpu Time because it also includes the time lost to context switches, blocking, GC, and so on.

3. Systrace UI performance analysis

Systrace is a performance data sampling and analysis tool provided by Android 4.1 and later. Its main functions can be summarized as follows:

  • 1. Collects runtime information from key Android subsystems (such as SurfaceFlinger, WindowManagerService and other key Framework modules and services, the View system, and so on), making it easier to analyze system bottlenecks and improve performance.
  • 2. Tracks system I/O operations, kernel work queues, CPU load, and so on, providing good data for analyzing UI display performance, especially problems such as choppy animation and slow rendering.

1. Use of Systrace

Instructions for use are as follows:

  • Android 4.1 or later is supported.
  • On versions earlier than 4.3, open Settings > Developer options > Monitoring > Enable traces.

Normally we use the command line to produce an HTML report. On version 4.3 and above, the trace category tags can be omitted to use the defaults. The command is as follows:

cd android-sdk/platform-tools/systrace
python systrace.py --time=10 -o mynewtrace.html sched gfx view wm

Common options:

  • -o: the name of the output file.
  • -t N, --time=N: the duration of the trace in seconds. The default is 5 s.

See the Systrace documentation for the remaining category tags.

We can also instrument the code. On Android 4.3 and above, the Trace class provides the Trace.beginSection() and Trace.endSection() methods for this. Note:

  • 1. Ensure that the number of beginSection and endSection calls match.
  • 2. Begin and end of Trace must be executed in the same thread.
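The usual way to satisfy both rules is to wrap the traced work in try/finally. The sketch below shows the pattern with a stand-in class so that it runs off-device: SectionTracer is hypothetical and only mimics the pairing rules, whereas in an app the two calls would be android.os.Trace.beginSection(String) and Trace.endSection(). Unlike the real API, the stand-in throws on an unmatched end call, purely to make mistakes visible.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class SectionTracer {
    // One stack per thread: sections opened on a thread must be closed on that same thread.
    private static final ThreadLocal<Deque<String>> OPEN =
            ThreadLocal.withInitial(ArrayDeque::new);

    static void beginSection(String name) {
        OPEN.get().push(name);
    }

    static void endSection() {
        Deque<String> stack = OPEN.get();
        if (stack.isEmpty()) {
            throw new IllegalStateException("endSection() without a matching beginSection()");
        }
        stack.pop();
    }

    static int openSections() {
        return OPEN.get().size();
    }

    /** Runs the work inside a named section; finally guarantees the end call even on exceptions. */
    static void traced(String name, Runnable work) {
        beginSection(name);
        try {
            work.run();
        } finally {
            endSection();
        }
    }
}
```

Because the end call sits in a finally block, the begin and end counts still match when the traced work throws.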

2. Analyze Systrace reports

After opening the file in Chrome, the two pieces of data most closely related to UI drawing are Alerts and Frames:

  • Alerts: Indicates a point with a performance problem. You can click this point to view the details. The Alerts box on the right also shows the number of Alerts for each type.
  • Frame: Each application has a row dedicated to Frame, and when drawn normally each Frame is shown as a green circle. When shown as yellow or red, the render time is over 16.6ms.

Finally, here’s a list of handy shortcuts in Systrace:

  • W: zoom in
  • S: zoom out
  • A: pan left
  • D: pan right

III. Layout optimization

1. Reduce hierarchy levels

  • Use RelativeLayout and LinearLayout wisely.
  • Use Merge wisely.

Use RelativeLayout and LinearLayout wisely

RelativeLayout performs worse because it measures its child views twice. A LinearLayout that uses the weight attribute also measures twice, but because its children have no further dependencies on one another, it is still more efficient than a RelativeLayout.
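The cost of double measurement compounds with nesting. As a rough worst-case model (an assumption for illustration, not a measurement of any real layout), if every container in a chain measures its children a fixed number of times, the innermost view is measured that number raised to the nesting depth:

```java
final class MeasureCost {
    /**
     * Number of times the innermost leaf is measured when it sits under `depth`
     * nested containers, each of which measures its children `passesPerLevel` times.
     */
    static long leafMeasureCount(int depth, int passesPerLevel) {
        long count = 1;
        for (int i = 0; i < depth; i++) {
            count *= passesPerLevel;
        }
        return count;
    }
}
```

Under this model, three nested double-measuring containers measure the leaf 8 times, whereas three single-pass containers measure it once. This is why reducing the depth of the hierarchy matters more than the choice of any single container.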

Note:

Because Android is highly fragmented, using RelativeLayout makes the layout you build more adaptable.

Use Merge wisely

Merge: when the root of a layout is a merge tag, its child elements are added directly to the merge tag's parent, which removes one level from the hierarchy.

Note:

  • Merge can only be used as the root element of a layout XML file.
  • When inflating a layout whose root is merge, you must specify a ViewGroup as its parent and set the attachToRoot parameter to true.
  • The merge tag cannot be used in a ViewStub, because the inflate method of ViewStub does not set attachToRoot to true.

2. Improve display speed

A ViewStub is a lightweight View that is invisible, takes up no layout space, and consumes very few resources. You can specify a layout for a ViewStub; when the enclosing layout is loaded, only the ViewStub itself is initialized. The layout the ViewStub points to is loaded and instantiated only when the ViewStub is made visible or ViewStub.inflate() is called. The ViewStub's layout properties are then passed on to the layout it points to.

Note:

  • A ViewStub can only be inflated once, after which the ViewStub object is set to null. It is therefore not suited to content that must be shown and hidden on demand.
  • A ViewStub can only be used to load a layout file, not a specific View.
  • Merge tags cannot be nested in a ViewStub.

3. Layout reuse

Android layout reuse can be achieved through the include tag.

4. Summary

Finally, here are some tips I use to do layout optimization on a regular basis:

  • Use the ViewStub tag to load rarely used layouts.
  • Use wrap_content as little as possible. wrap_content increases the cost of the measure pass; when the width and height are known and fixed, do not use wrap_content.
  • Use a TextView in place of a RelativeLayout or LinearLayout where possible.
  • Test on low-end devices to find performance bottlenecks.
  • Use TextView's line spacing attributes, lineSpacingExtra/lineSpacingMultiplier, for multi-line text.
  • Use Spannable or Html.fromHtml to render text of mixed sizes.
  • Use LinearLayout's built-in dividers where possible.
  • Use Space to add spacing.
  • Use Lint together with the Alibaba coding guidelines to fix problems.
  • If nesting is too deep, consider using ConstraintLayout.

IV. Avoid overdraw

The main reasons for over-drawing are as follows:

  • XML layout: controls overlap and each sets its own background.
  • View drawing: View.onDraw draws the same area multiple times.

1. Overdrawing detection tool

Enable the “Show GPU Overdraw” option in the phone's developer options. Different colors then indicate how many times a pixel has been overdrawn: no tint, blue, green, light red, and dark red correspond to 0, 1, 2, 3, and 4 times of overdraw respectively.
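To see how stacked backgrounds turn into those colors, here is a small model that counts how many times a pixel is painted by a set of opaque rectangles. It is a sketch only: OverdrawMap and Box are hypothetical names, and the real overlay is computed by the GPU rather than by code like this.

```java
final class OverdrawMap {
    /** Axis-aligned rectangle: left/top inclusive, right/bottom exclusive. */
    static final class Box {
        final int left, top, right, bottom;

        Box(int left, int top, int right, int bottom) {
            this.left = left;
            this.top = top;
            this.right = right;
            this.bottom = bottom;
        }

        boolean contains(int x, int y) {
            return x >= left && x < right && y >= top && y < bottom;
        }
    }

    /** How many times the pixel is painted beyond the first time (0 = no overdraw). */
    static int overdrawAt(int x, int y, Box... painted) {
        int paints = 0;
        for (Box b : painted) {
            if (b.contains(x, y)) {
                paints++;
            }
        }
        return Math.max(0, paints - 1);
    }

    /** Tint shown by the overlay, following the color order given above. */
    static String tint(int overdraw) {
        switch (Math.min(overdraw, 4)) {
            case 0: return "none";
            case 1: return "blue";
            case 2: return "green";
            case 3: return "light red";
            default: return "dark red";
        }
    }
}
```

For instance, a window background, a card background, and a text background stacked on the same pixel paint it three times, which is 2x overdraw and shows up green.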

2. How to avoid overdrawing

1. Optimization of layout

  • Remove non-essential backgrounds from the XML, or set them based on conditions.
  • Selectively remove window background: getWindow().setBackgroundDrawable(null).
  • Display placeholder background images as required.

For example, once the avatar image has been obtained, set the ImageView's background to transparent; set the placeholder background image only while the image has not yet been obtained.

2. Custom View optimization

Use Canvas.clipRect() to help the system identify the visible areas. This method specifies a rectangular region, and only content inside that region is drawn. Drawing commands that fall outside the clipRect region are not executed, which saves CPU and GPU resources.

Before drawing an item, first check whether its region lies within the Canvas's clip region. If it does not, return immediately to avoid the CPU and GPU calculation and rendering work.
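The check itself is a rectangle intersection test. The sketch below models it in plain Java with hypothetical names (ClipCheck, drawnItems) and rectangles given as {left, top, right, bottom} arrays; in a custom View, Canvas.quickReject() and Canvas.getClipBounds() are the platform calls that serve the same purpose.

```java
final class ClipCheck {
    /** True when the item rectangle lies completely outside the clip rectangle. */
    static boolean quickReject(int clipL, int clipT, int clipR, int clipB,
                               int l, int t, int r, int b) {
        return r <= clipL || l >= clipR || b <= clipT || t >= clipB;
    }

    /** Counts how many items would actually be drawn for a given clip. */
    static int drawnItems(int[] clip, int[][] items) {
        int drawn = 0;
        for (int[] it : items) {
            if (quickReject(clip[0], clip[1], clip[2], clip[3], it[0], it[1], it[2], it[3])) {
                continue; // skip: no draw commands are issued for an invisible item
            }
            drawn++;
        }
        return drawn;
    }
}
```

With a 100 x 100 clip, an item that only touches the clip edge or lies fully outside is skipped, while an item that overlaps the clip even partially is still drawn.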

V. Reasonable refresh mechanism

1. Reduce the number of refreshes

  • Control refresh frequency
  • Avoid unnecessary refreshes

2. Avoid the influence of background threads

For example, by listening to the ListView's onScrollStateChanged event, you can pause the image download thread while the list is scrolling and resume it once scrolling stops, which makes ListView scrolling smoother. The same applies to RecyclerView.

3. Narrow down the refresh area

If a custom View is refreshed using the invalidate method, you can use the following overloaded methods to specify the area to refresh:

  • invalidate(Rect dirty);
  • invalidate(int left, int top, int right, int bottom);

VI. Improve animation performance

Animation performance can be improved along three dimensions:

  • 1. Fluency: keep each animation frame within 16 ms.
  • 2. Memory: avoid memory leaks and reduce memory overhead.
  • 3. Power consumption: reduce the amount of computation, optimize algorithms, and lower CPU usage.

1. Frame animation

Frame animation consumes the most resources and gives the poorest results; avoid it whenever possible.

2. Tween animation

Implementing effects with tween animation causes frequent redrawing of the View and excessive updates of the DisplayList. It also has the following disadvantages:

  • It can only be applied to View objects.
  • It supports only 4 animation operations.
  • It only changes how the View is displayed; it does not actually change the View's properties.

3. Property animation

Compared to tween animation, property animation triggers far fewer redraws and should be preferred.

4. Use hardware acceleration

1. Hardware acceleration principle

Core class: DisplayList, one for each View.

When a View is drawn with hardware rendering turned on, the draw() method that performs the drawing records all drawing commands into a new DisplayList. The DisplayList contains the drawing code output for the view hierarchy, but the commands are not executed as soon as they are added. Once the DisplayLists of the whole view tree have been recorded, OpenGLRenderer renders them to the screen, starting from the root View. The invalidate() method simply records and updates the display hierarchy in the display list and marks the views that do not need to be redrawn.

2. Hardware acceleration control level

If your application uses only standard Views and Drawables, you can turn on hardware acceleration globally for the entire application.

3. Use hardware acceleration for animation

Here a View is animated by manipulating a hardware texture, which avoids calling invalidate() and so reduces frequent redrawing of the View itself. The property animations introduced in Android 3.0 also reduce repainting: when a View is backed by a hardware layer, all layers are ultimately composited onto the screen, and the View's properties are applied during that composition. Setting these properties therefore does not require the View to be redrawn; the View refreshes automatically once they are set, which greatly improves drawing efficiency. As a result, property animations trigger far fewer recursive draws than tween animations.

Prior to Android 3.0, off-screen buffers were produced with the View's drawing cache or the Canvas.saveLayer() function. From Android 3.0 on, the View.setLayerType(type, paint) method is used instead. The type can be one of the following layer types:

  • LAYER_TYPE_NONE: normal rendering without an off-screen buffer; this is the default.
  • LAYER_TYPE_HARDWARE: If the application uses hardware acceleration, this View will be rendered as a hardware texture in the hardware.
  • LAYER_TYPE_SOFTWARE: This View is rendered as a Bitmap by software.

The process for designing an animation is as follows:

1. Set the LayerType of the View to be animated to LAYER_TYPE_HARDWARE.

2. Compute the animated View properties and update the View with them.

3. When the animation ends, set the LayerType back to LAYER_TYPE_NONE.

Points to note when using hardware acceleration:

  • In software rendering, you can reuse bitmaps to save memory, but this doesn’t work if hardware acceleration is turned on.
  • Hardware-accelerated views consume extra memory when running in the foreground, and extra memory may not be released when the accelerated UI is switched to the background.
  • Hardware acceleration can be problematic when there are transitions in the UI.

VII. Jank monitoring scheme and implementation

At present, the most popular scheme is to use the Printer in Looper for monitoring.

1. Monitoring principle

The approach relies on the main thread's message queue processing: install a custom Printer on the main Looper, and in the Printer measure the time difference between the two calls made around each message. That difference is the message's execution time. If it exceeds the configured jank threshold (for example 1000 ms), the main thread is considered blocked, and useful diagnostic information is dumped for the developer to analyze. (Alternatively, you can start a separate thread outside the UI thread that periodically posts a task to the UI thread and records the send time; the task reports its execution time back to the sending thread. If the UI thread is blocked, the task cannot run on time. This approach adds system overhead, however, and is not recommended.)
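The timing logic can be sketched without Android at all. In the block below, BlockDetector is a hypothetical class whose println method has the same shape as android.util.Printer.println(String); on a device it would be registered through Looper.getMainLooper().setMessageLogging(...), and the Looper would call it with a line starting with ">>>>> Dispatching" before each message and "<<<<< Finished" after it. The clock is injected, which is an assumption made here so that the sketch can be exercised without sleeping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

final class BlockDetector {
    private final long thresholdMs;
    private final LongSupplier clockMs; // injectable clock, in milliseconds
    private final List<Long> blocks = new ArrayList<>();
    private long startMs = -1;

    BlockDetector(long thresholdMs, LongSupplier clockMs) {
        this.thresholdMs = thresholdMs;
        this.clockMs = clockMs;
    }

    /** Same shape as Printer.println(String): called before and after each message. */
    void println(String line) {
        if (line.startsWith(">>>>> Dispatching")) {
            startMs = clockMs.getAsLong();
        } else if (line.startsWith("<<<<< Finished") && startMs >= 0) {
            long elapsed = clockMs.getAsLong() - startMs;
            if (elapsed > thresholdMs) {
                blocks.add(elapsed); // a real monitor would dump stack and CPU info here
            }
            startMs = -1;
        }
    }

    /** Durations, in milliseconds, of every message that exceeded the threshold. */
    List<Long> detectedBlocks() {
        return blocks;
    }
}
```

In production the clock would be something like SystemClock.uptimeMillis(), and the stack of the main thread would usually be sampled from a background thread while the message is still running, since by the time the "Finished" line arrives the slow code has already returned.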

Jank information capture

The following four types of information should be captured immediately to make locating the jank faster and more accurate.

  • 1. Basic information: system version, device model, process name, application version, disk space, UID, etc.
  • 2. Timing information: the start and end time of the jank.
  • 3. CPU information: basic CPU information, the overall CPU usage, and the CPU usage of the current process (from which you can roughly tell whether the current application is using too much CPU or something else is the cause).
  • 4. Stack information.

Note:

It is recommended to report this information by sampling, or to save it locally and upload it to the server in compressed form for analysis once a certain amount has accumulated and the timing is right. For a concrete monitoring implementation, refer to the code of the BlockCanary open source project.

VIII. Summary

To wrap up, these are the steps to go through when optimizing rendering:

  • 1. Find the problem: besides noticing lag during use, measure the overall time with a jank monitoring tool, or turn on some of the auxiliary tools in the developer options.
  • 2. Analyze the problem: use Systrace and TraceView to analyze timing, and use Hierarchy Viewer to analyze the page hierarchy.
  • 3. Find the cause: dig down to the root cause of the problem.
  • 4. Solve the problem.

In addition to drawing problems, memory is another factor that can cause application lag. Improper use of memory not only causes lag but also has a significant impact on power consumption and application stability. In the next performance optimization article, I will give a comprehensive explanation of memory optimization in Android. If you find anything poorly written or wrong, criticism and corrections are welcome. I hope we can make progress and grow together!
