Author: Li Kai (Shen Bu)

Rejecting lag: a look inside the Hema Fresh app's iOS short video instant-playback optimization

“Content, as a new growth point for app products, has been receiving more and more attention and investment. Short video is a powerful tool for increasing user stickiness and extending user dwell time, and the content and experience of short videos directly determine whether users are willing to stay longer. Hema has also put forward a plan for full-link content video to improve how product appeal is expressed. Short video scenarios currently include: home page, search, product details, talent show, immersive video, 'true fragrance' video, Hema service area home page feed, topics, UGC content, topic collection landing pages, community, recipes, Hema Shoot one-click editing, live playback, Weex, etc.”

The objective of this optimization is to bring the Hema app in line with the experience of mainstream short video apps such as Douyin and Mobile Taobao. The specific hard metrics are playback success rate, stall rate, and instant-start rate. In addition, to reflect users' real experience when watching short videos, Hema added a perceived-experience metric: time to render the first frame.

Optimization effect comparison

From 350ms down to 80ms: building a silky-smooth iOS short video experience for the new retail scenario

The video test above was run on an iPhone 6S. You can see that in most cases Douyin starts playing immediately after the user swipes to the next video. Before optimization, Hema would first show the cover image after the swipe and only then start playback, with a visible jump in between. After optimization, Hema's behavior is close to Douyin's.

To compare the experience with Douyin before and after optimization, we currently count frames in a screen recording to measure the time from when the video page is fully displayed to when the first frame is rendered. The perceived-experience data are as follows:

In addition, the hard metrics were also improved; the results are as follows:

Optimization scheme

In the early stage of this work we surveyed many excellent solutions inside the Alibaba Group, most of which integrate the Mobile Taobao player with a kernel based on the open source ijkPlayer. The player layer itself has a high barrier to entry, however, and Mobile Taobao has already optimized it well, so this round of optimization focuses mainly on preloading schemes in the upper business layer. Specifically, it covers the following aspects:

Unified video playback proxy and cache

The loading speed of a video is largely determined by the time it takes to download it from the network, so adding a video cache can effectively speed up playback. To implement the caching mechanism, a proxy server needs to be introduced to take over the video download process, as follows:

A. Playback flow before optimization:

B. Playback flow after optimization:

Before the videoUrl is handed to the player, the business layer encodes the original videoUrl into a local 127.0.0.1 proxyUrl, which directs requests to the proxy web server. The proxy module then parses out the original video URL, reads the cache or makes a remote request, and finally returns the data to the player through that local server.
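As a rough illustration of that URL rewriting step (not the actual Hema implementation; the HMVideoProxy class name and the local port are assumptions), the mapping from the original URL to the local proxy URL might look like this:

#import <Foundation/Foundation.h>

// Hypothetical sketch: rewrite the original video URL so that the player
// talks to the local proxy web server instead of the CDN directly.
static NSString * const kProxyHost = @"http://127.0.0.1:8910"; // assumed local port

@interface HMVideoProxy : NSObject
+ (NSString *)proxyUrlForOriginalUrl:(NSString *)originalUrl;
@end

@implementation HMVideoProxy
+ (NSString *)proxyUrlForOriginalUrl:(NSString *)originalUrl {
    // Encode the original URL so it can travel as a query parameter and be
    // decoded back by the proxy when the player requests data.
    NSString *encoded = [originalUrl stringByAddingPercentEncodingWithAllowedCharacters:
                         [NSCharacterSet alphanumericCharacterSet]];
    return [NSString stringWithFormat:@"%@/video?target=%@", kProxyHost, encoded];
}
@end

// Usage: player.videoUrl = [HMVideoProxy proxyUrlForOriginalUrl:originalUrl];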

Adding an intermediate proxy for video playback is a common practice in the industry. The Mobile Taobao player that Hema depends on also ships with a ready-made proxy service, but that proxy functionality lives in a separate, independent DW library, which is redundant for Hema. In addition, the SDK currently exposes no standalone pre-download interface, so the upper layer cannot optimize the first playback. Hema therefore built its own proxy layer, which supports flexible customization by the upper layers.

Another benefit of the self-built proxy is that businesses that do not use the unified Mobile Taobao player can also enjoy the caching service, such as the system player used on some Flutter pages. As for cache management, a maximum cache size is currently enforced, and the video cache is cleaned up each time the app returns to the foreground.
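A minimal sketch of that cache guard, assuming the cache lives in a single directory and using an illustrative 200 MB ceiling (the real limit and cleanup policy are not specified in the text):

#import <UIKit/UIKit.h>

static const unsigned long long kMaxVideoCacheBytes = 200 * 1024 * 1024; // illustrative ceiling

static unsigned long long HMVideoCacheSize(NSString *cacheDir) {
    unsigned long long total = 0;
    NSFileManager *fm = [NSFileManager defaultManager];
    for (NSString *name in [fm contentsOfDirectoryAtPath:cacheDir error:nil]) {
        NSString *path = [cacheDir stringByAppendingPathComponent:name];
        total += [[fm attributesOfItemAtPath:path error:nil] fileSize];
    }
    return total;
}

// Install once at startup: whenever the app comes back to the foreground,
// measure the video cache directory and clear it if it exceeds the ceiling.
static void HMInstallVideoCacheCleaner(NSString *cacheDir) {
    [[NSNotificationCenter defaultCenter]
        addObserverForName:UIApplicationWillEnterForegroundNotification
                    object:nil
                     queue:[NSOperationQueue mainQueue]
                usingBlock:^(NSNotification *note) {
        if (HMVideoCacheSize(cacheDir) > kMaxVideoCacheBytes) {
            // Simplest policy shown here: drop the whole cache directory and
            // recreate it; an LRU trim would also work.
            [[NSFileManager defaultManager] removeItemAtPath:cacheDir error:nil];
            [[NSFileManager defaultManager] createDirectoryAtPath:cacheDir
                                      withIntermediateDirectories:YES
                                                       attributes:nil
                                                            error:nil];
        }
    }];
}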

Proxy and cache for M3U8

Besides the common MP4 videos, M3U8 videos are also encountered day to day. Unlike MP4, requesting this kind of URL does not return the video stream directly; it returns a playlist text, i.e. a list of the playable video segments, as follows:
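An illustrative (not Hema-specific) playlist of this kind looks roughly like:

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXTINF:9.98,
segment0.ts
#EXTINF:9.98,
segment1.ts
#EXT-X-ENDLIST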

The cache handling for this kind of video works by rewriting the URLs inside the M3U8 playlist, replacing them with proxy URLs so that requests go through the proxy. M3U8 cache support on the iOS side used to have a problem that caused crashes: after the URL of the first segment in the playlist was replaced with the proxyUrl, the first segment played normally, but the URLs of the subsequent segments were still the original relative URLs. When loading those relative paths, the player internally spliced them with the domain and path of the first segment, producing broken URLs from the second segment onward and crashing outright. The current approach is to rewrite every URL in the playlist to the full path of a proxy URL.
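A hedged sketch of that rewrite, reusing the hypothetical HMVideoProxy from the earlier sketch: every non-directive line of the playlist is resolved to an absolute URL and then routed through the proxy, so no relative path ever reaches the player.

// Rewrite every segment URI in the fetched playlist (lines that are not "#"
// directives) to a full proxy URL.
static NSString *HMRewriteM3U8(NSString *playlistText, NSURL *playlistUrl) {
    NSMutableArray<NSString *> *output = [NSMutableArray array];
    for (NSString *line in [playlistText componentsSeparatedByString:@"\n"]) {
        NSString *trimmed = [line stringByTrimmingCharactersInSet:
                             [NSCharacterSet whitespaceCharacterSet]];
        if (trimmed.length == 0 || [trimmed hasPrefix:@"#"]) {
            [output addObject:line]; // keep directives and blank lines as-is
            continue;
        }
        // Resolve relative segment paths against the playlist URL, then
        // route the absolute URL through the local proxy.
        NSURL *segmentUrl = [NSURL URLWithString:trimmed relativeToURL:playlistUrl];
        [output addObject:[HMVideoProxy proxyUrlForOriginalUrl:segmentUrl.absoluteString]];
    }
    return [output componentsJoinedByString:@"\n"];
}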

With both MP4 and M3U8 handled, the complete flow is as follows:

Independent preloading capability

The proxy cache above speeds up repeat playback, but a video being played for the first time still has no cache, and its download is still time-consuming. An independent preloading capability is therefore needed so that, in cooperation with the business layer, video data can be downloaded in advance at the right moment (download only, no rendering).

At present the underlying layer provides a [HMVideoLoader preLoadUrls:urls] method, which internally caches video data keyed by URL and caps each download at 1 MB. When several videos are pre-downloaded at the same time, they are executed serially so that they do not take too much bandwidth away from business requests. When the user scrolls to a video's position, it can start playing immediately, which optimizes the first-start speed.

Note that preloading here reuses the proxy class described above and also caches data keyed by URL, so later repeat playback reads from the same cache. If a video starts playing while it is still being preloaded, the preloading task is stopped to avoid cache conflicts from downloading the same video twice.
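Only the preLoadUrls: entry point is named in the text above; everything else in the following sketch (the serial queue, the Range-capped download, the cancel helper) is an assumption about how such a preloader could be put together:

@interface HMVideoLoader : NSObject
+ (void)preLoadUrls:(NSArray<NSString *> *)urls;  // entry point named in the text above
+ (void)cancelPreload;                            // assumed helper, called when playback starts
@end

@implementation HMVideoLoader

// A serial queue so that at most one preload request runs at a time and
// playback traffic is not starved of bandwidth.
+ (NSOperationQueue *)preloadQueue {
    static NSOperationQueue *queue;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        queue = [[NSOperationQueue alloc] init];
        queue.maxConcurrentOperationCount = 1;
    });
    return queue;
}

+ (void)preLoadUrls:(NSArray<NSString *> *)urls {
    for (NSString *url in urls) {
        [[self preloadQueue] addOperationWithBlock:^{
            // Fetch only the first 1 MB (bytes 0-1048575); that is enough to
            // render the first frames, and in the real implementation the
            // bytes would be written into the shared proxy cache keyed by URL.
            NSMutableURLRequest *request =
                [NSMutableURLRequest requestWithURL:[NSURL URLWithString:url]];
            [request setValue:@"bytes=0-1048575" forHTTPHeaderField:@"Range"];
            dispatch_semaphore_t done = dispatch_semaphore_create(0);
            [[[NSURLSession sharedSession]
                dataTaskWithRequest:request
                  completionHandler:^(NSData *data, NSURLResponse *resp, NSError *err) {
                dispatch_semaphore_signal(done);
            }] resume];
            // Block this background operation until the download finishes,
            // keeping the overall execution strictly serial.
            dispatch_semaphore_wait(done, DISPATCH_TIME_FOREVER);
        }];
    }
}

+ (void)cancelPreload {
    // Called when a video starts playing while still preloading, so the same
    // file is not downloaded twice into the cache (simplified: cancels all
    // pending preloads).
    [[self preloadQueue] cancelAllOperations];
}

@end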

Video bitrate and resolution optimization

Video preloading and proxy caching both attack the problem from the angle of preparing video data ahead of time, but they rest on a premise: the preparation must be quick enough for the business to use the data in time. If the video is very large, the network is poor, and the business needs to consume it immediately, the optimization may not help at all. So we need to go one step further and optimize video bitrate and resolution.

In the early days Hema played H.264 videos, all of them HD, which many feed streams do not actually need; it hurt loading speed and wasted traffic. H.265 transcoding has now been applied for and configured on CloudVideo: after a Hema video is uploaded, both codec variants are produced, each available in HD, SD, and LD resolutions, leaving the client free to choose according to the business scenario. Here is a size comparison of the same video after the switch:

A. H.264 switched to H.265 (both HD): the original H.264 file is 10.6 MB; after the switch it is 7.1 MB.

B. Switched to H.265 with a lower resolution as well: the original H.264 file was 21 MB; after the switch it is 8.3 MB.

As these two examples show, for the same video at the same HD resolution, switching to H.265 shrinks the file by roughly 30% (10.6 MB down to 7.1 MB is about a 33% reduction). Lowering the resolution to SD at the same time shrinks it far more, which means a lower bitrate, so users can download the first frame's worth of data much faster.

The Hema server interface has now been modified to return H.265 video addresses directly. The iOS strategy is to prefer H.265 and to request different resolutions depending on the current environment:

A. Codec: use H.264 below iOS 11; on iOS 11 and above, use H.265 (hardware decoding is enabled by default).

B. Resolution: different resolution request orders are defined according to the device grade (high, medium, low), network type (Wi-Fi/4G), and current network quality (strong or weak), as shown below. The returned array is sorted by resolution priority; for example, hd#sd#ld means HD is preferred.

static NSString * const VIDEO_HD = @"hd";
static NSString * const VIDEO_SD = @"sd";
static NSString * const VIDEO_LD = @"ld";
static NSString * const VIDEO_HD_H265 = @"hd_265";
static NSString * const VIDEO_SD_H265 = @"sd_265";
static NSString * const VIDEO_LD_H265 = @"ld_265";

+ (NSArray *)getExpectedVideoDefinition {
    NSArray *VIDEO_PRIORITY_GOOD_ENV = nil;
    NSArray *VIDEO_PRIORITY_NORMAL_ENV = nil;
    NSArray *VIDEO_PRIORITY_BAD_ENV = nil;

    // H.264 below iOS 11; H.265 on iOS 11 and above (hardware decoding available).
    if ([[[UIDevice currentDevice] systemVersion] compare:@"11.0" options:NSNumericSearch] == NSOrderedAscending) {
        VIDEO_PRIORITY_GOOD_ENV = @[VIDEO_HD, VIDEO_SD, VIDEO_LD];
        VIDEO_PRIORITY_NORMAL_ENV = @[VIDEO_SD, VIDEO_LD, VIDEO_HD];
        VIDEO_PRIORITY_BAD_ENV = @[VIDEO_LD, VIDEO_SD, VIDEO_HD];
    } else {
        VIDEO_PRIORITY_GOOD_ENV = @[VIDEO_HD_H265, VIDEO_SD_H265, VIDEO_LD_H265];
        VIDEO_PRIORITY_NORMAL_ENV = @[VIDEO_SD_H265, VIDEO_LD_H265, VIDEO_HD_H265];
        VIDEO_PRIORITY_BAD_ENV = @[VIDEO_LD_H265, VIDEO_SD_H265, VIDEO_HD_H265];
    }

    // Device grade, current network quality, and network type decide which
    // resolution priority list is returned.
    AliHADeviceEvaluationLevel deviceLevel = [AliHADeviceEvaluation evaluationForDeviceLevel];
    NetworkQualityStatus networkQualityStatus = [[NWNetworkQualityMonitor shareInstance] currentNetworkQualityStatus];
    NetworkStatus nwStatus = [[NWReachabilityManager shareInstance] currentNetworkStatus];

    NSArray *videoPriority = VIDEO_PRIORITY_NORMAL_ENV;
    if (networkQualityStatus == SEMP_StrongSemaphore) {
        if (deviceLevel == HIGH_END_DEVICE) {
            videoPriority = VIDEO_PRIORITY_GOOD_ENV;
        } else {
            if (nwStatus == ReachableViaWiFi) {
                videoPriority = VIDEO_PRIORITY_NORMAL_ENV;
            } else {
                videoPriority = VIDEO_PRIORITY_BAD_ENV;
            }
        }
    } else {
        if (deviceLevel == HIGH_END_DEVICE || deviceLevel == MEDIUM_DEVICE) {
            videoPriority = VIDEO_PRIORITY_NORMAL_ENV;
        } else {
            videoPriority = VIDEO_PRIORITY_BAD_ENV;
        }
    }
    return videoPriority;
}

Perceived-experience optimization of immersive video page turning

After the solutions above went live, a look back at the data showed that average loading speed had improved, but loading still took nearly 200ms, covering player initialization, downloading or reading cached data, and rendering the first frame. With a large user base and complex network environments, it is hard to guarantee the best experience for everyone. In a full-screen immersive video scene, 200ms, although much faster than before, still leaves users with a momentary feeling of jank: after turning to the next page, there is a short pause before the first frame plays. Worse, many of Hema's videos have cover images uploaded by the content creators themselves, which are often different from the first frame, making the jump from cover to first frame even more noticeable.

To achieve Douyin's silky feel, the measures above need another layer of preprocessing at the perceived-experience level, using the dual-player strategy below:

The basic flow: while the current video is playing, pre-instantiate a second player, load the next video's URL, play it up to the first frame, and pause; the third and fourth videos are queued for serial pre-download (pre-download is pure downloading with no rendering logic). With this "pre-play" mechanism for the next video, when the user swipes to it, playback resumes immediately from the paused first frame, so the cover image no longer needs to be shown first and the perceived start-up speed improves. Rendering of the business data other than the video can happen while the user is swiping between pages.
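A minimal sketch of the dual-player idea, written against the system AVPlayer purely for illustration (the actual Hema code sits on top of the Mobile Taobao player, and the class below is hypothetical):

#import <AVFoundation/AVFoundation.h>

@interface HMPrePlayController : NSObject
@property (nonatomic, strong) AVPlayer *currentPlayer;   // video currently on screen
@property (nonatomic, strong) AVPlayer *nextPlayer;      // video one page ahead
- (void)prepareNextVideoWithUrl:(NSURL *)url;
- (void)userDidSwipeToNextPage;
@end

@implementation HMPrePlayController

- (void)prepareNextVideoWithUrl:(NSURL *)url {
    AVPlayerItem *item = [AVPlayerItem playerItemWithURL:url];
    self.nextPlayer = [AVPlayer playerWithPlayerItem:item];
    // Start playback so the first frame gets decoded, then hold at that frame.
    [item addObserver:self forKeyPath:@"status"
              options:NSKeyValueObservingOptionNew context:NULL];
    [self.nextPlayer play];
}

- (void)observeValueForKeyPath:(NSString *)keyPath ofObject:(id)object
                        change:(NSDictionary *)change context:(void *)context {
    AVPlayerItem *item = (AVPlayerItem *)object;
    if ([keyPath isEqualToString:@"status"] && item.status == AVPlayerItemStatusReadyToPlay) {
        // The first frame is ready: pause immediately and wait for the swipe.
        [self.nextPlayer pause];
        [item removeObserver:self forKeyPath:@"status"];
    }
}

- (void)userDidSwipeToNextPage {
    // Promote the pre-played player; resuming from the paused first frame is
    // near-instant, so no cover image is needed.
    self.currentPlayer = self.nextPlayer;
    [self.currentPlayer play];
    self.nextPlayer = nil;
}

@end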

First video load optimization

The above optimizes the page-turning experience, but the loading of the first video on an immersive page still needs separate optimization, because there is no time to preload it before the page is entered. As follows:

As shown in the figure, on entering an immersive page the client used to request the page's videoList data first and only then request the first video's data, one after the other; even with a cover image, this felt slow to the user. The strategy has therefore been changed to the one on the right: when jumping to the immersive page, the previous page passes the videoUrl in advance so that playback starts immediately, while the MTOP request and the rendering of business data happen at the same time. The video and the business data thus load asynchronously, and since the user's attention is on the video, the page feels faster from the user's point of view.
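In code, the parallel kick-off might look roughly like the fragment below; startWithUrl: and requestVideoListWithCompletion: are stand-ins for the real player and MTOP calls, which are not named in the text:

// Fragment of a hypothetical immersive-page view controller.
- (void)viewDidLoad {
    [super viewDidLoad];

    // 1. Start the first video immediately with the URL handed over by the
    //    previous page; no need to wait for the list request.
    [self.player startWithUrl:self.preloadedVideoUrl];

    // 2. In parallel, fetch the videoList and the remaining business data
    //    (MTOP request), and render the non-video parts when it returns.
    [self requestVideoListWithCompletion:^(NSArray *videoList) {
        [self renderBusinessData:videoList];
    }];
}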

Audio experience optimization

Hema paid little attention to audio in the early stage and received a lot of feedback about it. The current optimization strategy is as follows:

  1. Launching the app does not interrupt music that is already playing;

  2. Music is interrupted when entering a page with exclusive audio (such as 'true fragrance' videos or immersive video);

  3. Music is restored when exiting the app or moving it to the background;

  4. Video audio is not controlled by the mute switch (similar to Douyin); a possible mapping onto the system audio session is sketched below.
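A possible AVAudioSession mapping for these four rules follows. This is an assumption about the implementation (Hema's actual code may differ), but the system APIs behave as described in the comments:

#import <AVFoundation/AVFoundation.h>

// Rule 1: at app startup, stay in a mixable category so any background music
// keeps playing.
static void HMConfigureStartupAudio(void) {
    [[AVAudioSession sharedInstance] setCategory:AVAudioSessionCategoryAmbient error:nil];
}

// Rules 2 & 4: entering an exclusive-audio page switches to Playback, which
// interrupts other audio and keeps playing even when the mute switch is on.
static void HMEnterExclusiveAudioPage(void) {
    [[AVAudioSession sharedInstance] setCategory:AVAudioSessionCategoryPlayback error:nil];
    [[AVAudioSession sharedInstance] setActive:YES error:nil];
}

// Rule 3: when leaving that context, deactivate the session and ask the
// system to let the interrupted music app resume.
static void HMLeaveExclusiveAudioPage(void) {
    [[AVAudioSession sharedInstance]
        setActive:NO
      withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation
            error:nil];
}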

Directions for subsequent optimization

  1. Further encapsulation at the player layer: wrap up all the boundary logic such as video loading, preloading, the dual player, detecting the first video on screen, exit, pause, and so on. At present each business has to handle these boundary conditions itself; they could be handled once in an encapsulation layer;

  2. Seamless hand-off of playback progress between pages: when tapping a small inline video to enter the immersive full-screen page, continue playing from the previous page's progress without interrupting the audio. This would further improve the first-video experience on immersive pages and deliver a truly "zero-wait" feel;

  3. Performance optimization for playing multiple videos at once: in most scenarios Hema plays only one video at a time, but some businesses need several videos playing simultaneously, which puts heavy pressure on memory and scrolling performance;

  4. Video to GIF: in some scenes the screen is full of videos that should all be playing at once; instantiating N players simultaneously is clearly untenable. Consider producing a GIF alongside the video during content production, so that in certain scenarios the app can show the GIF instead of a player as a preview;

  5. Video editing with speech-to-subtitles: a video editing function has been built into the Hema app on top of the Panpai capability, giving content producers common, easy-to-use editing tools. Consider adding a speech-to-subtitles module to strengthen how Hema's product appeal is expressed in video.

In the next installment we will continue to share Hema's short-video experience optimization practice on iOS/Android.

Follow the [Alibaba Mobile Technology] WeChat official account: three articles of mobile technology practice and hands-on insights every week to get you thinking!