Buffer1: graphic buffer

Buffer2: Hardware frame buffer

1. Graphic rendering process

1.1 App layer drawing

performTraversals, initiated by ViewRootImpl, kicks off the View’s rendering:

  1. Measure the width and height of the View
  2. Set the width and height of the View
  3. Create a display list and Draw
  4. Drawing is done through a graphics processing engine, generating polygons and textures (Record, Execute)

Engines include:

**2D:** Canvas. The APIs Canvas exposes wrap the underlying Skia implementation.

**3D:** OpenGL ES. When an application window sets the flag WindowManager.LayoutParams.MEMORY_TYPE_GPU, the OpenGL interface is used to render the UI.

1.2 Surface

Each window corresponds to a surface, and any view should be drawn on the Surface canvas.

Graphics are transferred with a Buffer as the carrier; a Surface is a further encapsulation of that Buffer.

The surface has multiple buffers for use by the upper layer.

The Java Surface is the producer-side proxy object; the native Surface corresponds to the producer-side local object.

Process:

  1. The upper-layer app obtains a buffer through the Surface for drawing. The drawing itself is done through Canvas, whose underlying implementation is the Skia engine.

  2. After drawing, the data is queued to the BufferQueue via the surface

  3. The listener then notifies surfaceFlinger to consume buffer

  4. SurfaceFlinger then acquires the buffer and composites the data

  5. When the composition is complete, release buffer back to bufferQueue

  6. This loop forms the buffer recycling process

Status:

  1. Free: available for the upper layer to use
  2. Dequeued: taken out of the queue and being drawn into by the upper layer
  3. Queued: drawn by the upper layer and put back in the queue, waiting for SurfaceFlinger composition
  4. Acquired: acquired by SurfaceFlinger, which holds it for composition
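The buffer lifecycle above can be sketched as a small state machine. This is a minimal illustration of the Free → Dequeued → Queued → Acquired → Free cycle; the class and method names are hypothetical, not the real BufferQueue API.

```java
// Illustrative sketch of a BufferQueue slot's state cycle (not real Android code).
public class BufferSlot {
    enum State { FREE, DEQUEUED, QUEUED, ACQUIRED }

    private State state = State.FREE;

    // Producer (app/Surface) side
    void dequeue() { require(State.FREE);     state = State.DEQUEUED; } // app draws into it
    void queue()   { require(State.DEQUEUED); state = State.QUEUED;   } // handed back for composition
    // Consumer (SurfaceFlinger) side
    void acquire() { require(State.QUEUED);   state = State.ACQUIRED; } // SF holds it for composition
    void release() { require(State.ACQUIRED); state = State.FREE;     } // returned to the pool

    State state() { return state; }

    private void require(State expected) {
        if (state != expected)
            throw new IllegalStateException("expected " + expected + " but was " + state);
    }
}
```

Driving one full cycle (`dequeue`, `queue`, `acquire`, `release`) returns the slot to Free, which is exactly the recycling loop described in the process steps.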

Main functions of Surface:

  1. Get a Canvas to do the drawing
  2. Request a buffer and put the final graphics and texture data produced by Canvas into it

1.3 SurfaceFlinger

  1. A single service that accepts all surfaces as input
  2. Create a Layer (whose main component is a BufferQueue) that corresponds to the Surface
  3. According to Zorder, transparency, size, position and other parameters, calculate the position of each layer in the final composite image
  4. The final rasterized data is generated by HWComposer or OpenGL
  5. Write the result to the hardware frame buffer
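Step 3 (ordering layers by Z for composition) can be sketched as follows. This is an illustrative fragment, not SurfaceFlinger code; the layer names are made up.

```java
import java.util.Comparator;
import java.util.List;

// Sketch of Z-order handling: layers are sorted by Z, then painted
// back-to-front so higher-Z layers end up on top in the composite image.
public class Compositor {
    record Layer(String name, int z) {}

    // Returns the paint order: lowest Z first, highest Z painted last (on top).
    static List<String> paintOrder(List<Layer> layers) {
        return layers.stream()
                .sorted(Comparator.comparingInt(Layer::z))
                .map(Layer::name)
                .toList();
    }
}
```

Transparency, size and position would further clip each layer's visible region in a real compositor; only the ordering step is shown here.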

1.4 Layer

  1. The Layer is SurfaceFlinger’s basic unit of composition; its main component is a BufferQueue
  2. Created inside SurfaceFlinger when the application requests to create a Surface
  3. One Surface corresponds to one Layer
  4. A Layer is effectively a frame buffer with two GraphicBuffers, the FrontBuffer and the BackBuffer

1.5 Hardware Composer

The main goal of HWC is to determine the most efficient way to combine buffers based on available hardware

  1. SurfaceFlinger provides HWC with the complete list of layers and asks, “How do you want to handle these?”
  2. HWC responds by marking each layer for overlay or GLES composition, depending on what the hardware can handle
  3. SF composites the GPU-marked layers itself and submits the result to HWC for display (via the HAL); HWC composites the layers marked for the hardware compositor on its own
  4. HWC is preferred for layer composition; otherwise SF falls back to the default 3D composition, calling the standard OpenGL interfaces to draw each layer onto the frame buffer
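The negotiation in steps 1–2 can be sketched like this. It is a hypothetical simplification: the real decision depends on layer properties and hardware capabilities, which are stood in for here by a simple overlay-plane limit.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of the SF↔HWC negotiation: every layer is offered to HWC, which
// marks each one OVERLAY (composited by dedicated display hardware) or
// GLES (falls back to GPU composition by SurfaceFlinger).
public class HwcNegotiation {
    enum Composition { OVERLAY, GLES }

    // maxOverlayPlanes stands in for the hardware's real capability limits.
    static Map<String, Composition> validate(List<String> layers, int maxOverlayPlanes) {
        Map<String, Composition> result = new LinkedHashMap<>();
        int used = 0;
        for (String layer : layers) {
            boolean fits = used < maxOverlayPlanes;
            result.put(layer, fits ? Composition.OVERLAY : Composition.GLES);
            if (fits) used++;
        }
        return result;
    }
}
```

With two overlay planes available, the first two layers are handled in hardware and the rest fall back to GLES, matching the "mark each layer" response in step 2.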

There are two ways of synthesis:

  1. Offline composition: 3D (GLES) composition
  2. Online composition: Overlay technology

Graphic data stream:

1.6 Screen display

The contents of the display are read from the hardware frame buffer.

Reading process:

Starting from the starting address of the Buffer, scan the entire Buffer from top to bottom and from left to right to map the contents to the display.

Double buffer:

A FrontBuffer is used to provide on-screen display content

A BackBuffer is used to compose the next frame in the background

  1. The first frame is displayed, the next frame is ready

  2. The screen will start reading the contents of the next frame (the contents of the back buffer)

  3. Perform a role reversal for the front and back buffers

  4. The previous back buffer becomes the front buffer for the graphical display

  5. The previous front buffer becomes the back buffer for graphic composition
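The role reversal in steps 3–5 can be sketched as a two-slot swap. This is a minimal illustration; buffer names "A" and "B" are placeholders.

```java
// Sketch of the double-buffer swap: on each VSYNC the roles flip, so the
// freshly composed back buffer becomes the displayed front buffer.
public class DoubleBuffer {
    private String front = "A"; // currently scanned out to the display
    private String back  = "B"; // currently being composed in the background

    String displayed() { return front; }

    void onVsync() {            // role reversal at the frame boundary
        String tmp = front;
        front = back;           // previous back buffer is now displayed
        back = tmp;             // previous front buffer receives the next frame
    }
}
```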

The official diagram shows how the key components work together:

Summary of the rendering process for Android App View:

  1. The measurement process is used to determine the size of the view
  2. The layout process is used to determine the position of the view
  3. The drawing process finally draws the view on the application window
  • Android application Windows are first drawn on a Canvas using the Skia graphics library API

  • The Canvas actually draws into a graphic buffer supplied by the Surface

  • This graphic buffer is eventually synthesized by SurfaceFlinger in the form of layer

  • HWC or OpenGL rasterizes the composited data and renders the graphics buffer to the hardware frame buffer for screen display.

2. Synchronizing the CPU/GPU composite frame rate with the Display refresh rate

**Screen refresh rate (Hz):** (the speed of content consumption)

Represents the number of times the screen refreshes in 1 second, which is generally 60Hz for Android phones (60 frames refresh per second, about 16.67ms refresh per frame).

**System frame rate (FPS):** (the speed of content production)

The number of frames the system composites in 1 second. This value is determined by the system's algorithms and the hardware.
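The per-frame budget quoted above follows directly from the refresh rate: at 60 Hz each frame has 1000 ms / 60 ≈ 16.67 ms before the next refresh.

```java
// Per-frame time budget for a given screen refresh rate.
public class FrameBudget {
    static double frameTimeMillis(int refreshRateHz) {
        return 1000.0 / refreshRateHz; // 60 Hz -> ~16.67 ms per frame
    }
}
```

The same arithmetic gives roughly 11.11 ms at 90 Hz and 8.33 ms at 120 Hz, which is why higher-refresh-rate screens leave the CPU/GPU less time per frame.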

Three core elements are introduced to solve this synchronization problem: VSYNC, Triple Buffer, and Choreographer.

2.1 Vsync

  1. After the screen has scanned a frame from the buffer onto the screen and before it begins to scan the next frame, a synchronization signal is emitted, which is used to switch between the front and back buffers.
  2. The VSYNC signal lets the composite frame rate be benchmarked against the screen refresh rate, which is fixed, while the system frame rate can vary.

CPU and GPU division:

  • CPU: Measure, Layout, texture and polygon formation, send texture and polygon to GPU
  • GPU: Rasterize and compose textures and polygons generated by CPU

1) When no VSYNC signal is synchronized

  1. The first 16ms: the Display shows frame 0; the CPU finishes processing frame 1, and the GPU processes it immediately afterwards. All three are working properly.
  2. The second 16ms: frame 1 was already processed by the CPU and GPU in the previous 16ms, so the Display can show it directly with no problem. However, during this 16ms the CPU and GPU did not draw frame 2 in time (the blank area indicates they were busy with other work), and they did not get to frame 2 until near the end of the period.
  3. The third 16ms: the Display should now show frame 2, but since the CPU and GPU have not finished it, the Display can only keep showing frame 1. As a result, frame 1 is drawn twice (the corresponding period is marked as a Jank), and frame 2 misses its display slot.

2) After introducing VSYNC signal synchronization

After VSYNC signal synchronization is added, the CPU starts processing each frame as soon as a VSYNC interrupt is received.

The issue of out-of-sync flushes has been resolved.

However, if the frame rate of CPU/GPU is lower than the refresh rate of Display, the situation is as follows:

  1. In the second 16ms period, the Display should show frame B, but because the GPU is still processing frame B, frame A is displayed again.
  2. Likewise, during the second 16ms period the CPU is idle, because buffer A is in use by the Display and buffer B is in use by the GPU. Note that once the VSYNC point has passed, the CPU cannot be triggered to start drawing.

Question:

  1. The CPU/GPU frame rate is higher than the refresh rate of Display, and the CPU/GPU is in idle state. Can this state be exploited to prepare the data ahead of time for the next round?
  2. Why can’t the CPU start drawing at the second 16ms? Because there are only two buffers

2.2 Triple Buffer

Google’s solution to these problems is to add a third Buffer

CPU, GPU and SurfaceFlinger each occupy a Buffer to process graphics in parallel.

In the figure above, the CPU uses the C buffer to draw during the second 16ms period. Frame A is still displayed once more, but the subsequent display is smoother.

Conclusion:

  1. One buffer: reads (display) and writes (composition) hit the same Buffer, so the screen can show content from more than one frame at once and the image tears.

  2. Two buffers: solves the tearing problem, but jank remains and CPU/GPU utilization is low

  3. Three buffers: improves CPU/GPU utilization and reduces jank, but introduces latency (one extra buffer of delay)
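The triple-buffer rotation can be sketched as three slots cycling between the three consumers. This is an illustrative model only; in reality the assignment is managed by the BufferQueue, not a fixed rotation.

```java
// Sketch of triple buffering: with buffers A, B and C, the Display, GPU and
// CPU can each hold one buffer in the same 16ms period, so the CPU no longer
// idles waiting for a free buffer.
public class TripleBuffer {
    private final String[] buffers = {"A", "B", "C"};
    private int displayIndex = 0;

    String displayBuffer() { return buffers[displayIndex]; }            // scanned out
    String gpuBuffer()     { return buffers[(displayIndex + 1) % 3]; }  // being composited
    String cpuBuffer()     { return buffers[(displayIndex + 2) % 3]; }  // being drawn

    void onVsync() { displayIndex = (displayIndex + 1) % 3; }           // rotate roles
}
```

After one VSYNC, the buffer the GPU just finished moves to the display, and the buffer just released by the display becomes available to the CPU, which is the extra slack that removes the two-buffer stall.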

2.3 Choreographer

The Vsync signal on the screen is only used to control the switch of the frame buffer, but does not control the upper drawing rhythm, that is, the upper production rhythm and the screen display rhythm are separated:

If the Vsync signal only switches the buffer without notifying the upper layer to start working on the CPU GPU, then the composition of the display content will not be synchronized.

So Google added logic for the upper layer to receive the VSYNC signal.

So how does the upper layer receive this Vsync message?

Google has designed a Choreographer class for the upper layer to act as the upper receiver of VSYNC signals.

Choreographer needs to register a VSYNC signal receiver, DisplayEventReceiver, with SurfaceFlinger.

At the same time, Choreographer maintains an internal CallbackQueue, which holds the upper-layer components that care about the VSYNC signal, including ViewRootImpl, TextView, ValueAnimator, etc.
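The CallbackQueue idea can be sketched as below. This is a heavily simplified model, not the real android.view.Choreographer API (which keys callbacks by type and frame time); the names here are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of Choreographer's callback queue: components that care about VSYNC
// (ViewRootImpl, animations, ...) register callbacks, and all pending
// callbacks run when the next VSYNC arrives.
public class MiniChoreographer {
    private final List<Runnable> callbackQueue = new ArrayList<>();

    void postFrameCallback(Runnable callback) { callbackQueue.add(callback); }

    // Called when the VSYNC signal is delivered (via DisplayEventReceiver in Android).
    void onVsync() {
        List<Runnable> due = new ArrayList<>(callbackQueue);
        callbackQueue.clear();   // a posted callback fires for exactly one frame
        due.forEach(Runnable::run);
    }
}
```

Note the one-shot behavior: a component that wants to run on every frame (e.g. an animation) must re-post itself each time, which matches how the real Choreographer callbacks work.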

The upper layer receives the timing diagram of Vsync:

Typically, new UI needs to be drawn because the View’s requestLayout or invalidate method is called.

  1. requestLayout or invalidate triggers a view update request

  2. The update request is passed to ViewRootImpl, which posts a synchronization barrier to the main thread’s MessageQueue that blocks all synchronous messages. From this point, ordinary messages sent to the main thread MessageQueue via a Handler will not be executed.

  3. ViewRootImpl registers for the VSYNC signal with Choreographer

  4. Choreographer registers for the VSYNC signal with the framework layer through DisplayEventReceiver

  5. When a VSYNC signal is generated at the bottom layer, it is delivered to DisplayEventReceiver and finally to Choreographer.

  6. Choreographer, upon receiving the Vsync signal, sends an asynchronous message to the main thread MessageQueue that will not be intercepted.

  7. Finally, the executor of the asynchronous message is ViewRootImpl, which actually starts drawing the next frame.
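The interplay of steps 2 and 6 can be sketched with a toy message queue. This is a hypothetical model, not the real android.os.MessageQueue: a synchronization barrier holds back ordinary messages, while the asynchronous VSYNC message posted by Choreographer still gets through.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of the sync-barrier mechanism: while a barrier is active, only
// asynchronous messages (like Choreographer's frame message) may run.
public class BarrierQueue {
    record Message(String what, boolean async) {}

    private final Deque<Message> queue = new ArrayDeque<>();
    private boolean barrier = false;

    void postSyncBarrier()   { barrier = true; }
    void removeSyncBarrier() { barrier = false; }
    void post(Message m)     { queue.add(m); }

    // Returns the next message allowed to run, or null if everything is blocked.
    Message next() {
        for (Message m : queue) {
            if (!barrier || m.async()) {
                queue.remove(m);
                return m;
            }
        }
        return null;
    }
}
```

With the barrier up, a normal Handler message stays parked while the async "doFrame" message runs, which is why the frame drawing in step 7 is not delayed by other main-thread work.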