1. Background

In everyday app use, page smoothness is second only to crashes in its impact on user experience. Apple's recently launched iPhone 13 Pro and iPhone 13 Pro Max support ProMotion, with a maximum refresh rate of 120Hz, which makes users more sensitive to frame-rate drops caused by poor page smoothness. This article summarizes the Snowball iOS client's work on smoothness optimization for the feed stream page and the body page in the community business, covering the tools used to identify and measure lag and the optimization practices themselves.

2. Tools

Defining lag

An iPhone without a high-refresh-rate display refreshes at 60Hz, which is also the frequency of the VSync signal, so each frame must be rendered within 16.67ms. If rendering the next frame B takes longer than 16.67ms, that is, it finishes after the VSync signal has arrived, the current frame A stays on screen and frame B has to wait for the next VSync signal before it is shown to the user. Apple calls frames that miss their expected VSync signal hitches [1].
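The frame budget described above can be worked through in a few lines of plain C (which also compiles inside an Objective-C file). This is a minimal sketch of the simplified one-frame model used in this section, not Apple's full render pipeline. The function names frame_budget_ms and missed_vsyncs are hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Time available to produce one frame, in milliseconds,
   for a display that refreshes refresh_hz times per second. */
double frame_budget_ms(double refresh_hz) {
    assert(refresh_hz > 0.0);
    return 1000.0 / refresh_hz;
}

/* Number of VSync deadlines a frame misses if it takes render_ms to produce.
   0 means the frame was ready in time; 1 means it is shown one VSync late. */
int missed_vsyncs(double render_ms, double refresh_hz) {
    double budget = frame_budget_ms(refresh_hz);
    int intervals = (int)(render_ms / budget);
    if (intervals * budget < render_ms) {
        intervals++; /* round up to whole VSync intervals */
    }
    return intervals > 0 ? intervals - 1 : 0;
}
```

Under this model a frame that takes 10ms is on time at 60Hz but one VSync late at 120Hz, which is why high-refresh-rate devices leave less room for main-thread work.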

When users interact with a page, for example by scrolling up and down or navigating to another page, their attention is on the gesture, and lag is perceived as "jitter". A good interactive experience responds smoothly. Otherwise users notice the lag, which hurts the experience and can even make them lose interest in the app.

Identifying lag

Instruments Animation Hitches

The Animation Hitches template in Instruments can detect hitches. The Hitches track in the following figure shows the hitches that occurred. Click one of them, and its type is displayed below. In the figure, the Hitch Type of hitch 6 is Expensive Commits, indicating that a long commit phase caused the hitch.

Enter the name of the current project in the filter box in the upper left corner, and select the commit that caused the lag. Switch to Profile in the lower left corner to view the Time Profiler call stacks for that commit.

The Animation Hitches tool detects lag and, combined with Time Profiler, lets us analyze the time spent in the calls that cause it. Time Profiler samples the call stacks of running threads and aggregates them statistically, so it does not show the actual execution time of the code, only how long each stack appears in the samples, as shown in the figure below. Time Profiler is therefore only suitable for coarse-grained analysis.
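To see why sampling is coarse-grained, the sketch below simulates a profiler that looks at a thread at fixed intervals and counts how many samples land inside a function's runs. It is an illustration of the sampling idea only, not Time Profiler's implementation. The type Run, the function count_samples, and the interval values are assumptions made for this example.

```c
#include <assert.h>

/* One execution of the function of interest: [start_ms, end_ms). */
typedef struct {
    double start_ms;
    double end_ms;
} Run;

/* Counts the samples, taken every interval_ms from t = 0 up to total_ms,
   that fall inside any of the given runs. A sampling profiler would report
   roughly hits * interval_ms as the function's time. */
int count_samples(const Run *runs, int count, double interval_ms, double total_ms) {
    assert(interval_ms > 0.0);
    int hits = 0;
    for (int k = 0; k * interval_ms < total_ms; k++) {
        double t = k * interval_ms;
        for (int i = 0; i < count; i++) {
            if (t >= runs[i].start_ms && t < runs[i].end_ms) {
                hits++;
                break; /* one sample is counted at most once */
            }
        }
    }
    return hits;
}
```

With a 1ms interval, two short runs of 0.4ms and 0.6ms that happen to fall between samples contribute no samples at all, while a single 3ms run is reported accurately. Short, frequent calls are exactly what a finer-grained tool has to catch.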

Flame chart

os_signpost is a reliable choice when you need fine-grained timing of the calls that cause lag, but manually inserting a large number of os_signpost calls is inefficient. Hooking objc_msgSend makes it possible to measure the time spent in methods invoked through message sending, and flame charts are a very effective way to analyze where CPU time goes across calls. We combine the two for fine-grained analysis of function time.

Trace Event Format [2] defines a data format for flame charts. Combined with the objc_msgSend hook, which records the start and end of each method call, it can be used to generate a flame chart [3]. Below is a flame chart of the test code. Calls made inside a function appear as "flames" extending vertically downward, and the deeper the call stack, the further down they extend. In a flame chart, a wide, flat "flame" at the bottom indicates that the function itself may have a performance problem. In the test code, -testFunction1_1_1_1_1 and -testFunction1_1_1_2 each put the thread to sleep for a while, and they appear in the chart as two flat-bottomed functions, meaning that optimizing these two functions first yields the largest gain. Two parameters control the flame chart: the maximum call depth and the minimum function duration, which together determine how fine-grained the statistics are.
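As a rough illustration of the output side of this approach, the sketch below serializes recorded calls into a Trace Event Format JSON array while applying the two filters mentioned above. It is a sketch under stated assumptions, not the hook implementation used in the project: CallRecord and write_trace_events are hypothetical names, the pid and tid values are placeholders, and each call is written as one complete ("X") event rather than a begin/end pair, which the format also allows. Method names are assumed to contain no characters that need JSON escaping.

```c
#include <assert.h>
#include <stdio.h>

/* One finished method call, as a hooked objc_msgSend might record it. */
typedef struct {
    const char *name;   /* e.g. "-[Foo bar]" */
    long long start_us; /* start timestamp in microseconds */
    long long dur_us;   /* duration in microseconds */
    int depth;          /* 0 = outermost call */
} CallRecord;

/* Writes the calls that pass both filters as a JSON array of "X" events.
   Returns the number of events written, or -1 if out is too small. */
int write_trace_events(const CallRecord *calls, int count, int max_depth,
                       long long min_dur_us, char *out, size_t cap) {
    assert(out != NULL && cap > 0);
    int written = 0;
    int n = snprintf(out, cap, "[");
    if (n < 0 || (size_t)n >= cap) return -1;
    size_t used = (size_t)n;

    for (int i = 0; i < count; i++) {
        /* Skip calls that are too deep or too short to be interesting. */
        if (calls[i].depth > max_depth || calls[i].dur_us < min_dur_us) continue;
        n = snprintf(out + used, cap - used,
                     "%s{\"name\":\"%s\",\"ph\":\"X\",\"ts\":%lld,\"dur\":%lld,"
                     "\"pid\":1,\"tid\":1}",
                     written ? "," : "", calls[i].name,
                     calls[i].start_us, calls[i].dur_us);
        if (n < 0 || (size_t)n >= cap - used) return -1;
        used += (size_t)n;
        written++;
    }

    n = snprintf(out + used, cap - used, "]");
    if (n < 0 || (size_t)n >= cap - used) return -1;
    return written;
}
```

The resulting text can be saved to a file and opened in a viewer that reads Trace Event Format. Raising max_depth or lowering min_dur_us produces a more detailed chart at the cost of more recorded data.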

Hitch ratio

A reduction in function time cannot be translated directly into a page smoothness metric, so an objective metric is needed to evaluate the optimization. WWDC20 [1] defines hitch time and hitch ratio: hitch time is the time by which a frame is displayed late (in ms), and hitch ratio is the hitch time per second (in ms/s) during a page scroll or other animation. Apple uses the hitch ratio to quantify page lag and gives a recommended value: the user experience is considered good when the hitch ratio is below 5ms/s.

The hitch ratio can be collected in UI tests with the XCTest framework. We measured the hitch ratio of the feed stream page and the body page before and after optimization to determine the effect.
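The definition above reduces to a short calculation. The sketch below is a worked example of that arithmetic, not XCTest's implementation. The names hitch_ratio_ms_per_s and is_within_target are hypothetical, and the 5ms/s threshold is the value quoted in the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hitch ratio in ms/s: the sum of all hitch times (ms) divided by the
   duration of the scroll or animation (s). */
double hitch_ratio_ms_per_s(const double *hitch_ms, size_t count, double duration_s) {
    assert(duration_s > 0.0);
    double total_ms = 0.0;
    for (size_t i = 0; i < count; i++) {
        total_ms += hitch_ms[i];
    }
    return total_ms / duration_s;
}

/* Returns 1 if the ratio is below the 5ms/s value quoted in the text. */
int is_within_target(double ratio_ms_per_s) {
    return ratio_ms_per_s < 5.0;
}
```

For example, two hitches of 8ms each during a 4 second scroll give a hitch ratio of 4ms/s, which is within the recommended value. Normalizing by duration is what makes scrolls of different lengths comparable.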

3. Optimization practice

Based on the tool analysis above, the lag on the Snowball community feed stream page and body page is concentrated in the parsing and drawing of rich text, and in the remakeConstraints calls that accumulate as page styles grow more complex. The optimizations below target these areas.

Rich text optimization

Snowball’s community business revolves around rich text. When the rich text is complex, parsing and drawing take a long time, which is the main cause of page lag. Rich text optimization is described in two parts: parsing and drawing.

Rich text parsing

The figure above shows the original rich text parsing process. Steps such as inserting <img> tags into special <a> tags and stripping the angle brackets of HTML tags each require a separate traversal of the rich text. When the rich text contains many <a> or <img> tags, this takes up a lot of main-thread time and causes serious lag.

Single-pass parsing

We used DTCoreText to optimize the existing rich text parsing process. DTCoreText is an open-source iOS rich text component that converts HTML+CSS rich text to an NSAttributedString in a single pass. Its data parsing process is shown in the figure above:

  1. The rich text HTML string is passed to DTHTMLAttributedStringBuilder, which receives DTHTMLParser callbacks to build a DOM tree. Handling for special tags can be added in the DTHTMLParser callbacks.
  2. Each node of the generated DOM tree is a DTHTMLElement. DTCSSStylesheet is used to resolve the style of each element, at which point each DTHTMLElement contains both the content and the style of its node. Finally, the NSAttributedString is generated from the DTHTMLElement nodes.

While parsing rich text, DTCoreText exposes the parsing process to the caller. Through callbacks, it reports which element is being parsed and lets the caller decide how to handle it. Because parsing and processing happen together, the rich text only needs to be traversed once, so requirements such as inserting <img> tags into special <a> tags can be met efficiently.
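The idea of handling tags inside the parser's callback, instead of in separate passes, can be sketched in plain C. This is a deliberately simplified illustration and not DTCoreText's API: the TagHandler type and the functions strip_tags_once and mark_images are hypothetical, and the scanner assumes that attribute values never contain a '>' character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Called once for every tag the scanner meets. Returns text to emit in place
   of the tag, or NULL to emit nothing. */
typedef const char *(*TagHandler)(const char *tag, size_t tag_len);

/* Example handler: replaces every <img ...> tag with a placeholder. */
const char *mark_images(const char *tag, size_t tag_len) {
    if (tag_len >= 3 && strncmp(tag, "img", 3) == 0) return "[image]";
    return NULL;
}

/* Scans html once from left to right: plain text is copied, tags are dropped,
   and the handler decides what (if anything) replaces each tag.
   Returns the length written, or -1 if out is too small. */
long strip_tags_once(const char *html, TagHandler handler, char *out, size_t cap) {
    assert(html != NULL && out != NULL && cap > 0);
    size_t used = 0;
    for (const char *p = html; *p != '\0'; p++) {
        if (*p == '<') {
            const char *close = strchr(p, '>');
            if (close == NULL) break; /* unterminated tag: stop scanning */
            const char *extra =
                handler ? handler(p + 1, (size_t)(close - p - 1)) : NULL;
            if (extra != NULL) {
                size_t n = strlen(extra);
                if (used + n >= cap) return -1;
                memcpy(out + used, extra, n);
                used += n;
            }
            p = close; /* the loop increment then steps past '>' */
        } else {
            if (used + 1 >= cap) return -1;
            out[used++] = *p;
        }
    }
    out[used] = '\0';
    return (long)used;
}
```

Stripping tags and handling special tags happen in the same traversal here, so adding another special case means extending the handler rather than adding another pass over the text.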

Asynchronous parsing

DTHTMLAttributedStringBuilder creates three queues: _dataParsingQueue for parsing the HTML, _treeBuildingQueue for building the DOM tree, and _stringAssemblyQueue for assembling the NSAttributedString, and distributes the parsing work across them. It then blocks in dispatch_group_wait until all tasks are complete and returns the result. The parsing work therefore runs off the main thread, and rich text can be parsed asynchronously to further reduce time spent on the main thread.

Rich text drawing

Custom text asynchronous drawing

To meet business requirements, the feed stream page uses CoreText to render rich text. When the content is complex, especially when it contains many emoji, drawing on the main thread causes serious lag, so we adopted asynchronous drawing.

On iOS, UIView handles event delivery, while drawing is done by CALayer, which draws in its "-(void)display" method. Asynchronous drawing is implemented by subclassing CALayer and overriding "-(void)display" so that the drawing task runs off the main thread. YYAsyncLayer [4] is a CALayer subclass that implements asynchronous drawing. When it needs to display content, it requests an asynchronous drawing task from its delegate, namely the UIView.

We use YYAsyncLayer to draw the rich text of the feed stream asynchronously. SNBTextLabel, our rich text display component, implements the YYTextAsyncLayerDelegate protocol defined by YYAsyncLayer. It creates the asynchronous drawing task in "- (YYTextAsyncLayerDisplayTask *)newAsyncDisplayTask", and the task internally calls the text drawing process already encapsulated by SNBTextData. This change is minimally invasive to the original drawing process, which reduces the number of regression test points and protects release quality.

YYLabel asynchronous drawing optimization

The comments section of the body page uses YYLabel, which provides the displaysAsynchronously property to control whether asynchronous drawing is enabled. However, once asynchronous drawing is enabled, the comments flicker when reloadData is called while loading the next page [5]. This is because YYLabel's clearContentsBeforeAsynchronouslyDisplay property defaults to YES. In the YYLabel setters that change display properties, if displaysAsynchronously and clearContentsBeforeAsynchronouslyDisplay are both YES, the existing contents are cleared first. So even if the text of a YYLabel in a cell is unchanged, the existing layer.contents is cleared before the new layer.contents is drawn asynchronously. For a short time in between, the YYLabel is empty, which causes the flicker.

    // YYLabel.m
    - (void)setTextColor:(UIColor *)textColor {
        if (!textColor) {
            textColor = [UIColor blackColor];
        }
        if (_textColor == textColor || [_textColor isEqual:textColor]) return;
        _textColor = textColor;
        _innerText.yy_color = textColor;
        if (_innerText.length && !_ignoreCommonProperties) {
            if (_displaysAsynchronously && _clearContentsBeforeAsynchronouslyDisplay) {
                [self _clearContents];
            }
            [self _setLayoutNeedUpdate];
        }
    }

So in this scenario, we set YYLabel's displaysAsynchronously to YES and clearContentsBeforeAsynchronouslyDisplay to NO to avoid the flicker.

Constraint layout optimization

Reduce the number of remakeConstraints calls

The cells of the Snowball feed stream page carry a lot of business logic and contain many remakeConstraints calls, which are the most time-consuming function calls apart from rich text drawing and parsing. Switching to frame-based layout would eliminate this cost, but it involves a large number of regression test points and is relatively risky. On the other hand, in most cases a feed stream cell displays the same UI components, so there is no need to call remakeConstraints on every subview each time data is set.

Therefore, as shown in the code below, each time data is set for the view component viewX, we compare the previously bound data _model with the new data model to decide whether the change requires updating viewX's constraints, thereby reducing the number of remakeConstraints calls.

    - (void)setModel:(Model *)model {
        // Compare against the old _model before it is overwritten.
        BOOL needReLayoutViewX = [self measureRelayoutViewNecessary:model];
        if (needReLayoutViewX) {
            [self.viewX mas_remakeConstraints:^(MASConstraintMaker *make) {
                // set constraints
            }];
        }
        _model = model;
    }

    - (BOOL)measureRelayoutViewNecessary:(Model *)model {
        // viewXLayoutIsUnchangedFrom:to: is a placeholder for the business check
        // that the fields affecting viewX's display have not changed.
        if (model && self.model && [self viewXLayoutIsUnchangedFrom:self.model to:model]) {
            return NO;
        }
        return YES;
    }

Other optimizations

Reduce the number of views created and removed

  • Creating and removing views frequently is also time-consuming. For example, in the nine-image grid of a feed stream cell, avoid creating a new UIImageView every time an image is displayed. Instead, reuse the UIImageViews that have already been created and hide the extra ones when fewer than nine images are displayed.
  • Avoid triggering the lazy loading of UI components unnecessarily. Call the lazy-loading getter only when the display condition is met. Otherwise, access the instance variable directly.

4. Experimental results and summary

The hitch ratio of the feed stream page and the body page was tested using the XCTest framework with simulated, extremely complex data. After several iterations of optimization, the hitch ratio of the feed stream page on the iPhone XS decreased from 16ms/s to 3.5ms/s. On the iPhone 6s, the hitch ratio of the body page dropped from 60ms/s before optimization to 5.5ms/s.

In optimizing the smoothness of the Snowball iOS community pages, we used the Instruments Animation Hitches template and flame charts to locate time-consuming function calls in the commit phase, and optimized the most expensive of them. The effect was quantified with the hitch ratio, and the experience on low-end phones improved greatly. The optimizations covered in this article mainly target the commit phase. For optimization of the render phase, see Apple's technical talk [6].

5. References

[1] Session 10077 – Eliminate animation hitches with XCTest

[2] Trace Event Format

[3] www.speedscope.app

[4] iOS tips for keeping the interface smooth

[5] github.com/ibireme/yyk…

[6] WWDC – Demystify and eliminate hitches in the render phase