By Deng Xiaolong (Bai Zhan)

This is the first article in Youku's playback "black technology" series, "Free-view technical experience optimization practice". It will be followed by "WebRTC-based live 'cloud multi-view' technology analysis" and "Full-link strategy and rollout practice for free-view technology". Follow [Alibaba mobile technology] so you don't miss them.

Have you watched the fourth season of This is Hip-hop? Perhaps some of you, like me, felt the urge to dance along. Beyond the explosive stage, the free-view video gave users an immersive experience that feels almost like stepping into the scene.

As a novel viewing mode on Youku, free-view video gives users a brand new viewing experience. It is also a highlight in many of Youku's external partnerships and has attracted a lot of attention. However, as the product's visibility keeps growing, the playback experience and delivery link of free view still have many problems, such as playback that is not smooth, unclear picture quality and low device coverage, all of which need to be optimized and solved.

For this reason, the Youku technology team carried out a comprehensive optimization and upgrade of free view in the first half of the year. Below, we explain the Youku Player team's overall optimization strategy and plan, starting from the overall goal of the upgrade and focusing on two aspects: playback experience and user scale.

What is free perspective

Figure 1

Figure 1 shows what each frame of a free-view video looks like. Below, such a frame is referred to as the depth map.

Principle of free view: on top of the original playback link, a free-view algorithm SDK is added. It processes the depth map of each frame and generates the picture at a specified angle for display to the user.

Client architecture design

This section introduces the core logic of the free-view implementation. The architecture is divided into two modules, each enclosed in a dotted box: the left module is the free-view logic implemented in the player SDK, and the right module is the set of optimization strategies for the free-view playback experience, which are described in detail later.

Playback business layer:

  1. The core user interaction includes the angle rotation control (which mainly generates angle information for the algorithm side), the user prompt for free-view video, and the transition animation;
  2. The online function can be turned on or off at any time through a switch.

Player middle layer: consists of two parts. One is the modification of the middle-layer link to support free view; the other downloads the algorithm file required by free-view video and, once the download completes, passes the file path to the algorithm layer.

Player kernel layer: handles the data exchange between the kernel and the algorithm layer, then composites the texture data processed by the algorithm SDK and renders it directly to the screen.

Downloader: responsible for downloading Youku on-demand and live video files. It needs no special modification for free view and mainly relies on its multi-fragment download capability to improve download efficiency.

Algorithm: this layer reconstructs the picture at a specified angle from the depth map.

Free perspective performance optimization solution

  • Direction of optimization: to know how to optimize, we first need to understand why playback stalls. Our investigation showed that stalls occur when the player does not have enough data and has to wait for more before it can continue playing. We therefore concluded that the stall rate can be reduced by downloading data in advance, downloading over multiple channels, and lowering the bit rate of the video.

  • Schemes tried: in the early stage we tried 9 schemes: pre-cache, video stream intelligent file, kernel dynamic buffer, multi-channel download, dynamic reduction of free-view angles, dual-player-instance switching to reduce the bit rate, preloading for continuous play, overspeed mode, and AV1 video coding to reduce the bit rate. After studying their practical feasibility, we settled on four: pre-cache, video stream intelligent file, kernel dynamic buffer and multi-channel download.
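The stall analysis above can be illustrated with a toy buffer model. This is only a sketch under stated assumptions, not the Youku player's logic: the function name `count_stall_seconds`, the one-second tick and the "play one second if at least one second is buffered" rule are all hypothetical simplifications.

```python
from typing import Sequence


def count_stall_seconds(
    throughput_kbps: Sequence[float],
    bitrate_kbps: float,
    prebuffer_s: float = 0.0,
) -> int:
    """Toy model: count wall-clock seconds in which playback has to wait.

    Assumptions (not from the article): each tick is one second, the player
    fetches throughput/bitrate seconds of media per tick, and it plays one
    second of media whenever at least one second is buffered.
    """
    if bitrate_kbps <= 0:
        raise ValueError("bitrate_kbps must be positive")
    buffer_s = prebuffer_s
    stalls = 0
    for kbps in throughput_kbps:
        buffer_s += kbps / bitrate_kbps  # media seconds fetched this tick
        if buffer_s >= 1.0:
            buffer_s -= 1.0  # enough data: play one second
        else:
            stalls += 1  # not enough data: the player waits
    return stalls


if __name__ == "__main__":
    link = [1000.0] * 10  # a steady 1000 kbps link for ten seconds
    print("2000 kbps stream:", count_stall_seconds(link, 2000.0))
    print("1000 kbps stream:", count_stall_seconds(link, 1000.0))
    print("2000 kbps, 5 s pre-buffered:", count_stall_seconds(link, 2000.0, 5.0))
```

In this model, a 2000 kbps stream on a 1000 kbps link waits every other second, while either halving the bit rate or starting with 5 seconds already buffered removes the waits within the ten-second window. That mirrors the two levers named above: lower the bit rate, or get data ahead of playback.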

Optimization in practice

Stall rate optimization

Video stream intelligent file

The figure above is a schematic of how the intelligent file works. The intelligent file algorithm dynamically decides the bit rate of the next TS fragment, so the bit rate can be lowered on the fly when conditions require it.
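As a rough illustration of choosing the bit rate per TS fragment, here is a minimal rule-based sketch. It is not Youku's intelligent file algorithm: the `PlayerState` fields, the `safety` margin and the `low_buffer_s` threshold are assumptions picked for the example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PlayerState:
    buffer_seconds: float   # media already buffered in the player
    throughput_kbps: float  # recently measured download throughput


def choose_next_bitrate(
    ladder_kbps: Sequence[int],
    state: PlayerState,
    safety: float = 0.8,        # hypothetical margin under measured throughput
    low_buffer_s: float = 5.0,  # hypothetical "buffer is nearly empty" threshold
) -> int:
    """Pick the bit rate (kbps) for the next TS fragment."""
    if not ladder_kbps:
        raise ValueError("ladder_kbps must not be empty")
    ladder = sorted(ladder_kbps)
    # Nearly empty buffer: drop to the lowest rung to avoid a stall.
    if state.buffer_seconds < low_buffer_s:
        return ladder[0]
    # Otherwise take the highest rung that fits under the throughput budget.
    budget = state.throughput_kbps * safety
    affordable = [rate for rate in ladder if rate <= budget]
    return affordable[-1] if affordable else ladder[0]


if __name__ == "__main__":
    ladder = [800, 1500, 3000, 6000]
    print(choose_next_bitrate(ladder, PlayerState(20.0, 4000.0)))
    print(choose_next_bitrate(ladder, PlayerState(2.0, 10000.0)))
```

The decision is made once per fragment, so the bit rate can go down as soon as the buffer or the network degrades and back up once they recover.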

Intelligent file algorithm architecture diagram

The following points need to be highlighted:

  • Interaction and control between the intelligent file controller and the data source and other modules: collecting video metadata, playback state (such as buffer length) and network information; selecting the bit rate/definition at fragment level; controlling definition switches; and handling event responses and timeouts on other data source links.
  • Policy engine framework: an interface/environment/container that supports multiple policy implementations. Each algorithm strategy takes input from the player kernel and network environment information and outputs a definite choice.
  • Data link closed loop: the client reports its decision information through tracking points, the cloud analyzes and processes the data, and optimized configurations or models are delivered back to the client.

Of these, the strategy framework and the algorithm strategies that choose the definition are the core of the intelligent file. The strategy framework provides the platform. At present, Youku's intelligent file uses A/B testing to support multiple algorithm strategies, ranging from strategies based on discrete rules to neural network models based on reinforcement learning. These algorithms can adjust their parameters dynamically according to the delivered configuration or model, and can be compared with, tuned against and used to complement one another.
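The idea of a policy framework that hosts several interchangeable strategies under A/B testing can be sketched as follows. Everything here is hypothetical: the `PolicyEngine` class, the two rule-based policies and the hash-based bucketing are illustrations, not Youku's framework. A learned model would simply be another callable with the same signature.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

# A policy maps (bitrate ladder, buffer seconds, throughput kbps) to one bitrate.
Policy = Callable[[List[int], float, float], int]


def buffer_rule_policy(ladder: List[int], buffer_s: float, throughput_kbps: float) -> int:
    """Discrete rule on buffer level only (thresholds are illustrative)."""
    rungs = sorted(ladder)
    if buffer_s < 5.0:
        return rungs[0]
    if buffer_s > 20.0:
        return rungs[-1]
    return rungs[len(rungs) // 2]


def throughput_rule_policy(ladder: List[int], buffer_s: float, throughput_kbps: float) -> int:
    """Highest rung under 80% of measured throughput (margin is illustrative)."""
    affordable = [rate for rate in ladder if rate <= throughput_kbps * 0.8]
    return max(affordable) if affordable else min(ladder)


class PolicyEngine:
    """Container that registers policies and assigns users to one of them."""

    def __init__(self) -> None:
        self._policies: Dict[str, Policy] = {}

    def register(self, name: str, policy: Policy) -> None:
        self._policies[name] = policy

    def assign(self, user_id: str) -> str:
        """Deterministic A/B bucket: the same user always gets the same policy."""
        names = sorted(self._policies)
        if not names:
            raise RuntimeError("no policies registered")
        digest = hashlib.sha256(user_id.encode("utf-8")).digest()
        return names[int.from_bytes(digest[:4], "big") % len(names)]

    def decide(
        self, user_id: str, ladder: List[int], buffer_s: float, throughput_kbps: float
    ) -> Tuple[str, int]:
        name = self.assign(user_id)
        return name, self._policies[name](ladder, buffer_s, throughput_kbps)


if __name__ == "__main__":
    engine = PolicyEngine()
    engine.register("buffer_rule", buffer_rule_policy)
    engine.register("throughput_rule", throughput_rule_policy)
    print(engine.decide("user-42", [800, 1500, 3000], 10.0, 2000.0))
```

Returning the policy name together with the decision is what lets the client report which strategy made each choice, which the closed data loop described above depends on.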

Kernel dynamic Buffer

The policy configuration platform delivers a specified policy that dynamically sets the size of the kernel buffer, so that downloaded resources are used as fully as possible.
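A delivered policy that sets the buffer size could look roughly like the sketch below. The article does not say what the policy is keyed on, so keying it by network type, the `"default"` entry and the clamping bounds are all assumptions made for illustration.

```python
from typing import Mapping


def resolve_buffer_seconds(
    policy: Mapping[str, float],
    network_type: str,
    fallback_s: float = 10.0,  # hypothetical value used if the policy has no match
    min_s: float = 2.0,        # hypothetical lower safety bound
    max_s: float = 60.0,       # hypothetical upper safety bound
) -> float:
    """Resolve the kernel buffer size (seconds) from a delivered policy.

    Looks up the current network type, falls back to the policy's "default"
    entry, and clamps the result so a bad config cannot starve or flood the
    player.
    """
    value = policy.get(network_type, policy.get("default", fallback_s))
    return max(min_s, min(max_s, float(value)))


if __name__ == "__main__":
    delivered = {"wifi": 30, "4g": 15, "default": 10}
    for net in ("wifi", "4g", "3g"):
        print(net, resolve_buffer_seconds(delivered, net))
```

Clamping on the client is a defensive choice: the policy can change at any time from the server side, so the player keeps a safe range no matter what is delivered.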

Multichannel download

As shown in the figure above, multi-channel download splits each independent file into N small chunks, and each chunk corresponds to one download channel in the right-hand part of the figure. The channels download in parallel, which improves download efficiency and reduces stalls.
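The split-and-download-in-parallel idea can be sketched as below. This is a minimal illustration, not the Youku downloader: `split_ranges`, `download_parallel` and the `fetch_range` callback are hypothetical names, and the demo reads from an in-memory byte string instead of making real range requests.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def split_ranges(total_size: int, channels: int) -> List[Tuple[int, int]]:
    """Split [0, total_size) into at most `channels` inclusive byte ranges."""
    if total_size <= 0 or channels <= 0:
        raise ValueError("total_size and channels must be positive")
    chunk = -(-total_size // channels)  # ceiling division
    return [
        (start, min(start + chunk, total_size) - 1)
        for start in range(0, total_size, chunk)
    ]


def download_parallel(
    fetch_range: Callable[[int, int], bytes],
    total_size: int,
    channels: int,
) -> bytes:
    """Fetch every range on its own worker and join the parts in order."""
    ranges = split_ranges(total_size, channels)
    with ThreadPoolExecutor(max_workers=channels) as pool:
        # map() yields results in input order, so the parts line up correctly.
        parts = list(pool.map(lambda r: fetch_range(r[0], r[1]), ranges))
    return b"".join(parts)


if __name__ == "__main__":
    source = bytes(range(256)) * 4  # stand-in for a remote video file

    def fake_fetch(start: int, end: int) -> bytes:
        return source[start : end + 1]  # inclusive range, like HTTP Range

    assert download_parallel(fake_fetch, len(source), 4) == source
    print(split_ranges(len(source), 4))
```

In a real downloader, `fetch_range` would issue a ranged request for its chunk. Reassembling in range order is what keeps the file intact even when channels finish at different times.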

Pre-cache

  1. Cache the broadcast control information and video stream files in advance (as shown in Figure 2 below);
  2. Deliver the video cache size dynamically through a policy (as shown in Figure 2 below);
  3. Unify the free-view capability: the free-view capability configuration is delivered only by the broadcast control backend, and the corresponding client-side configuration item is removed. This avoids unpredictable problems caused by the backend and the client holding inconsistent configurations (as shown in Figure 3 below).

Figure 2

Figure 3

Scene coverage

The free-view algorithm SDK supports two rendering modes: the normal mode based on DIBR, and a degraded camera-switching mode with DIBR turned off. Given this, for devices whose performance is not good enough to support DIBR, entering free view through the degraded mode is feasible from both a technical and a product point of view.
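The mode choice described here can be sketched as a small capability check. The enum and function names are hypothetical, and the two boolean capability flags stand in for whatever device profiling the real client performs. The 4K requirement comes from the scenario coverage notes later in this article.

```python
from enum import Enum


class RenderMode(Enum):
    DIBR = "dibr"                    # normal mode: synthesize the requested angle
    CAMERA_SWITCH = "camera_switch"  # degraded mode: DIBR off, switch cameras
    UNSUPPORTED = "unsupported"      # free view is not offered on this device


def select_render_mode(can_decode_4k: bool, supports_dibr: bool) -> RenderMode:
    """Pick a free-view rendering mode from two (assumed) capability flags."""
    # The algorithm needs a video input source of at least 4K, so a device
    # that cannot decode 4K cannot use either mode.
    if not can_decode_4k:
        return RenderMode.UNSUPPORTED
    return RenderMode.DIBR if supports_dibr else RenderMode.CAMERA_SWITCH


if __name__ == "__main__":
    for flags in [(True, True), (True, False), (False, False)]:
        print(flags, select_render_mode(*flags).value)
```

The middle case, a device that decodes 4K but cannot sustain DIBR, is the group that gains free view through the degraded mode.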

Before modification

After modification

Data comparison & effect

Business & Technical optimization effect:

Over one month, total plays of the Hip-hop 4 free-view videos increased by nearly 2 times compared with Hip-hop 3 in the same period last year. Fluency improved by nearly 70%.

Scenario coverage benefits:

  1. Not all low-end models can use the degraded free-view mode. Because this scenario is special, the algorithm requires a video input source of at least 4K resolution. As a result, this optimization is only expected to help devices in the range [4K decoding, supporting DIBR], that is, devices that can decode 4K but fall short of supporting DIBR.
  2. With this upgrade to degraded rendering, coverage was extended to nearly 30% more devices (low-end models), raising total coverage from 50% (mid-range and high-end models only) to nearly 80%.

Conclusion

Giving users a better and richer viewing experience has always been Youku's goal, and it is what drives us to keep exploring and experimenting. We want users to feel the warmth of technology through a genuinely better experience, not just see cold improvements in text and numbers. That is one direction of our future work. We are also building free view for live broadcasting, and we will keep exploring more and newer ways of watching.

Next week we will publish the next article in this series, "WebRTC-based live 'cloud multi-view' technology analysis". Thank you for following [Alibaba mobile technology]; we will continue the discussion in the next part.
