Abstract: As the Internet has developed, live streaming has become more and more popular, and Baidu Live has grown rapidly with it. To improve the user experience, this article walks through the complex startup process of Baidu Live and explains in detail the series of optimizations applied to stream startup.

Full text 4216 words, estimated reading time 9 minutes.

1. Background

Baidu has two goals for live streaming:

1. Replicate the real world online, so that online and offline offer the same experience across media, consulting, e-commerce, shows, and so on.

2. Give shape to what people imagine. The arrival of 5G, VR, and AI opens up new room for imagination.

The first priority in live streaming is QoE (Quality of Experience). As engineers, we must not only meet the business objectives but also improve the user experience.

Experience has many dimensions: first-screen time, latency, picture quality, audio-video synchronization, clarity, noise, echo, and so on.

Behind QoE there are many QoS (Quality of Service) indicators: push-stream success rate, push-stream stutter rate, push-stream slow ratio, end-to-end delay, CDN fluency rate, startup time, pull-stream stutter rate, video bit rate, memory consumption, CPU and GPU consumption, and so on. Together, these technical indicators form the standard by which live streaming service quality is measured.

From the viewer’s perspective, however, tapping a live stream should show the picture immediately; in other words, the stream must start fast. When swiping up and down in the immersive broadcast room, users likewise expect the stream on the current screen to start playing quickly. Because startup time is the user’s first impression of a live stream, it is the primary optimization target.

2. Current status

Baidu Live is divided into pan-service live streaming and pan-entertainment live streaming. Pan-service live streaming covers media, consulting, e-commerce, and similar services, while pan-entertainment live streaming covers entertainment formats such as show venues, audio broadcasts, and voice rooms.

Pan-service live streaming is the more complex of the two. Its startup process has the following characteristics:

1. Complicated flow: there are two separate flows, one for entering a broadcast room from outside and one for the immersive broadcast room, and each has many steps;

2. Many live states: live, replay, and the intermediate states that arise while a replay is being generated;

3. Wide coverage: multiple Baidu APP teams are involved, including live streaming, player, kernel, network, and CDN;

4. Changing the wheels on a moving car: Baidu attaches great importance to live streaming and the business iterates quickly, so optimizations have to land while the product keeps shipping.

With these characteristics in mind, the next step is to analyze the whole startup process quantitatively and measure every link in the chain with data.

3. Data analysis

Looking at the data for the whole process, startup time can be roughly divided into three stages: time spent in the live streaming business layer, time spent in the player, and time spent in the kernel. The following diagram is a rough breakdown:

The actual statistics record each link in more detail. Here is a brief look at the data collected for the live streaming business layer and for the kernel:

For the live streaming business layer, every link is measured down to the time taken by each step, so that the analysis can be based on data.

Take the kernel as an example:

You can then get a chart with detailed data about the kernel.

From the moment the user taps to enter a broadcast room, or swipes to switch rooms in the immersive broadcast room, until playback finally succeeds, more than 60 tracking points are expected to be analyzed in series. The resulting table shows the elapsed time of each step in the startup process, and optimization then targets the steps where the elapsed time is high.

In the report, the largest share is the time spent in business scenarios, which accounts for more than 60%, so this is the first thing to address. The second most expensive stage is pulling the stream. Once the large costs are dealt with, the smaller ones need targeted optimization as well.
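The kind of breakdown described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the team’s actual tooling: the tracking-point names and timestamps are hypothetical, and each interval is attributed to the tracking point that ends it.

```python
from typing import Dict, List, Tuple

# Hypothetical tracking point: (name, timestamp in ms since the user's tap).
# The real instrumentation has 60+ points; names and values here are made up.
Point = Tuple[str, float]


def step_durations(points: List[Point]) -> Dict[str, float]:
    """Return the elapsed time of each step, keyed by the point that ends it."""
    ordered = sorted(points, key=lambda p: p[1])
    return {
        name: t - t_prev
        for (_, t_prev), (name, t) in zip(ordered, ordered[1:])
    }


def shares(durations: Dict[str, float]) -> Dict[str, float]:
    """Return each step's fraction of the total elapsed time."""
    total = sum(durations.values())
    if total == 0:
        return {name: 0.0 for name in durations}
    return {name: d / total for name, d in durations.items()}


def slowest_first(durations: Dict[str, float]) -> List[str]:
    """Order steps from most to least expensive, i.e. by optimization priority."""
    return sorted(durations, key=durations.get, reverse=True)
```

Sorting by share gives the order in which stages are worth attacking, which is the reasoning the report applies when it puts business time first and stream pulling second.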

Optimization of business scenarios

After the media live streaming of Baidu APP was merged with the nationwide show live streaming, both are distributed together in the same broadcast room. There are two ways to enter a broadcast room: an external jump through a Scheme, and a swipe within the immersive broadcast room.

Here is the simplified flow of an external jump to the broadcast room via Scheme:

There are two cases when jumping to the broadcast room through a Scheme:

1. Without a Roomid: first request the List interface, then request the information for the current broadcast room using the Roomid from the List. Based on that information, install the components and plug-ins and render the page. Then start playback and wait for the playback-success callback;

2. With a Roomid: request the information for the current broadcast room directly, then proceed as in case 1.
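The two cases can be compared with a small sketch. It is a simplified model rather than the app’s real code: `fetch_list` and `fetch_room_info` are hypothetical stand-ins for the List and room-information interfaces, and taking the first Roomid in the List is an assumption.

```python
from typing import Callable, List, Optional


def enter_room(
    room_id: Optional[str],
    fetch_list: Callable[[], List[str]],
    fetch_room_info: Callable[[str], dict],
    trace: List[str],
) -> dict:
    """Walk through the external-jump flow, appending each step to `trace`."""
    if room_id is None:
        # Case 1: no Roomid, so one extra serial request is needed first.
        trace.append("request_list")
        room_id = fetch_list()[0]  # assumption: use the first Roomid in the List
    # Both cases continue identically from here.
    trace.append("request_room_info")
    info = fetch_room_info(room_id)
    trace.append("install_components_and_render")
    trace.append("start_playback")
    return info
```

The trace makes the cost difference visible: the case without a Roomid pays for one additional request before anything else can begin.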

Here is the swipe flow in the immersive broadcast room:

Compared with the external jump, switching within the immersive broadcast room adds the user’s swipe. Measured from the moment the swipe stops until the playback-success callback, startup takes about 1700ms in total, of which the live streaming business logic accounts for about 1000ms. The ratio of external jumps to internal jumps is 1:N, where N is much greater than 1, so optimizing the jump within the broadcast room takes priority.

After qualitative and quantitative analysis, the optimization plan is formulated:

To account for page lag, the overall scheme has two variants, Plan A and Plan B:

Plan A applies to iPhone 8 and later: when the user starts to swipe, create the player and call play directly.

Plan B applies to models older than iPhone 8: when the user starts to swipe, create the player and prepare its resources; once the swipe stops, destroy the previous broadcast room and start the next one.

With both plans in place, models above and below iPhone 8 all perform well on stutter rate, and faster startup does not come at the cost of a worse stutter experience. The model cutoff is itself a temporary balance point between startup speed and stutter rate, found through repeated A/B testing. Later work will keep reducing stutter by avoiding frequent object creation and destruction and by cutting sustained CPU load, so that the range of models suitable for Plan A keeps expanding.
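A minimal sketch of the two plans follows. The class, method names, and the `high_end` flag are hypothetical; the real player is replaced by log entries so that only the ordering of operations is shown. What Plan A does with the previous room when the swipe stops is not described in the text, so the sketch leaves it out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SwipeController:
    """Records which player operations run at each swipe event."""

    high_end: bool  # assumption: True stands for "iPhone 8 and later"
    log: List[str] = field(default_factory=list)

    def on_swipe_begin(self, next_room: str) -> None:
        # Both plans create the next player as soon as the swipe starts.
        self.log.append(f"create_player:{next_room}")
        if self.high_end:
            # Plan A: start playing immediately.
            self.log.append(f"play:{next_room}")
        else:
            # Plan B: only prepare resources while the page is still moving.
            self.log.append(f"prepare:{next_room}")

    def on_swipe_end(self, prev_room: str, next_room: str) -> None:
        if not self.high_end:
            # Plan B: defer the heavier work until the swipe has stopped.
            self.log.append(f"destroy:{prev_room}")
            self.log.append(f"play:{next_room}")
```

Plan B trades a later start for less work during the swipe animation, which is why it suits slower devices.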

With the jump within the broadcast room optimized, the approach for the external Scheme jump follows naturally:

Components in Baidu APP communicate with each other through Schemes. The live stream URL is therefore added to the Scheme, so that when jumping to the live page the player is created directly from the URL carried by the Scheme and playback starts in parallel with the business logic.
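The idea of starting playback in parallel with the business logic can be sketched as below. The Scheme layout and the `room_id` and `stream_url` parameter names are hypothetical (the real Scheme format is not given in the article), and `start_player` and `load_room` are injected placeholders.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional, Tuple
from urllib.parse import parse_qs, urlsplit


def parse_live_scheme(scheme: str) -> Tuple[Optional[str], Optional[str]]:
    """Extract (room_id, stream_url) from a Scheme; both names are assumptions."""
    query = parse_qs(urlsplit(scheme).query)
    room_id = query.get("room_id", [None])[0]
    stream_url = query.get("stream_url", [None])[0]
    return room_id, stream_url


def enter_from_scheme(
    scheme: str,
    start_player: Callable[[str], str],
    load_room: Callable[[Optional[str]], dict],
) -> Tuple[Optional[str], dict]:
    """Start the player from the Scheme URL while the business logic loads."""
    room_id, stream_url = parse_live_scheme(scheme)
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Playback no longer waits for the room-information request.
        player = pool.submit(start_player, stream_url) if stream_url else None
        room = pool.submit(load_room, room_id)
        return (player.result() if player else None), room.result()
```

If the Scheme carries no stream URL, the sketch simply falls back to the serial flow: only the room information is loaded and the caller starts the player afterwards.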

4. DNS pre-resolution

DNS (Domain Name System) maps a domain name to an IP address. It is a prerequisite for HTTP: only after the domain name has been correctly resolved to an IP address can the rest of the process continue. When the kernel processes a live stream, the 80th percentile of live DNS resolution time is about 60ms, which is a significant cost.

In Baidu APP, to prevent DNS hijacking and to reduce network latency, we use HTTPDNS:

To reduce the time spent on live DNS resolution, a pre-resolution strategy is adopted: the corresponding IPs are resolved and cached in advance.

At moments such as 10s after a cold start, a network switch, or a switch between foreground and background, a model decides whether to pre-resolve live DNS. The model’s criteria include how long the user has watched live streams (M) over the past N days, the current network state, back-end control, and so on. When the model approves, HTTPDNS is called asynchronously to resolve the live domain name into a list of IPs, and the speed of each IP is then measured. After the speed test, the fastest IP is not simply chosen. Instead, one IP is picked at random from those whose results fall within an acceptable range, and that pre-resolved result is cached. This prevents large numbers of users from clustering on a few nodes and unbalancing the node load.
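The selection step can be sketched as follows. The article does not say how the "acceptable range" is defined, so the fixed tolerance relative to the fastest IP is an assumption, as are the function and parameter names.

```python
import random
from typing import Dict, Optional


def pick_ip(
    rtt_ms: Dict[str, Optional[float]],
    tolerance_ms: float = 20.0,  # assumption: "acceptable" = within 20ms of the best
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Pick a random IP among those close enough to the fastest measured one."""
    rng = rng or random.Random()
    # Drop IPs whose speed test failed (represented here as None).
    reachable = {ip: rtt for ip, rtt in rtt_ms.items() if rtt is not None}
    if not reachable:
        return None
    best = min(reachable.values())
    candidates = sorted(
        ip for ip, rtt in reachable.items() if rtt <= best + tolerance_ms
    )
    # Random choice spreads users across nodes instead of piling onto the fastest.
    return rng.choice(candidates)
```

A wider tolerance spreads load more evenly at the cost of sometimes picking a slightly slower node.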

The validity period of an IP cached by HTTPDNS is obtained from the server and defaults to 300s. When resolution of a live stream domain name begins, the cache is searched first. If a valid IP exists (cached for less than 300s), it is returned directly and HTTPDNS is called asynchronously to refresh it. If no valid IP exists, a refresh is requested asynchronously and resolution falls back to LocalDNS in the meantime.

When the network changes (WiFi <-> 4G), the cached IPs are cleared and pre-resolution restarts for every domain name in the domain name list.
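The cache behaviour described above can be modelled roughly as follows. The class and its callbacks are hypothetical: `refresh` stands in for the asynchronous HTTPDNS request, `local_dns` for the LocalDNS fallback, and the clock is injectable so the expiry logic can be exercised without waiting.

```python
import time
from typing import Callable, Dict, Iterable, Tuple


class PreResolveCache:
    """In-memory model of the pre-resolved IP cache with a TTL."""

    def __init__(
        self,
        refresh: Callable[[str], None],
        local_dns: Callable[[str], str],
        ttl_s: float = 300.0,  # default validity period from the article
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._refresh = refresh
        self._local_dns = local_dns
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def store(self, host: str, ip: str) -> None:
        """Record a pre-resolved IP together with the time it was cached."""
        self._entries[host] = (ip, self._clock())

    def resolve(self, host: str) -> str:
        entry = self._entries.get(host)
        # In both branches an update is requested (asynchronous in the real app).
        self._refresh(host)
        if entry is not None and self._clock() - entry[1] < self._ttl_s:
            return entry[0]
        # No valid cached IP: degrade to LocalDNS for this lookup.
        return self._local_dns(host)

    def on_network_change(self, hosts: Iterable[str]) -> None:
        """Drop every cached IP and restart pre-resolution for all hosts."""
        self._entries.clear()
        for host in hosts:
            self._refresh(host)
```

Clearing on a network change matters because an IP measured as fast on WiFi may not be the best choice on a cellular network.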

Final benefit: with HTTPDNS pre-resolution, PVs whose DNS time is under 3ms account for more than 90% of all live streaming PVs.

5. Kernel optimizations

Based on the time-consuming stages identified in the report, a series of targeted optimizations was carried out:

1. Forced rendering of the first screen: as soon as the first video frame is decoded, it is pushed through the rendering pipeline without waiting for audio-video synchronization;

2. On weak networks, start playback with the low bit rate startup strategy;

3. On high-end models, load and prepare the kernel of the next player in advance, and perform frame chasing when switching live streams. If the buffer delay is longer than 3 seconds, 200ms of audio is dropped every 2 seconds. If the buffer delay is 3 seconds or less, no audio is dropped. If it is longer than 16 seconds, the buffer is discarded entirely and the stream reconnects.
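The frame-chasing thresholds in item 3 can be written down as a small decision function. This sketch assumes that no audio is dropped at 3 seconds or less (the original wording repeats "longer than 3 seconds" for both cases), and that the buffer does not grow for other reasons while chasing. All names are hypothetical.

```python
CHASE_MS = 3_000       # above this, start dropping audio
RECONNECT_MS = 16_000  # above this, discard the buffer and reconnect
DROP_MS = 200          # audio dropped per chasing tick
TICK_MS = 2_000        # one chasing tick every 2 seconds


def chase_action(buffer_delay_ms: int) -> str:
    """Decide what to do for a given buffered delay."""
    if buffer_delay_ms > RECONNECT_MS:
        return "reconnect"
    if buffer_delay_ms > CHASE_MS:
        return "drop_audio"
    return "none"


def ticks_to_catch_up(buffer_delay_ms: int) -> int:
    """Count the 2-second ticks needed until no more audio has to be dropped."""
    ticks = 0
    while chase_action(buffer_delay_ms) == "drop_audio":
        buffer_delay_ms -= DROP_MS
        ticks += 1
    return ticks
```

Under these assumptions a 4-second delay is worked off in 5 ticks, i.e. `5 * TICK_MS` = 10 seconds of playback, which keeps the speed-up gentle enough to be hard to notice.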

6. Live stream startup optimization

Startup optimization: the media information parsing module

Video width and height, image encoding format, and similar information are required by the Android hardware decoder MediaCodec and the iOS hardware decoder VideoToolbox. The playback kernel must have this information ready before it configures the platform hardware decoder.

Some container formats (such as MP4) describe the video width and height, image encoding format, and so on in the header. If the container does not carry this information (such as FLV), the native FFmpeg flow keeps downloading the video stream and decodes it with a software decoder to obtain it.

In the H.264/AVC video coding standard, the system framework is divided into two layers: the Video Coding Layer (VCL) and the Network Abstraction Layer (NAL). The former is responsible for representing the content of the video data efficiently, while the latter formats the data and provides header information so that the data is suitable for transmission over a variety of channels and storage media. The SPS and PPS in the NAL contain the width and height information and the description of the image coding format, so the necessary information can be obtained by parsing the SPS and PPS, which saves the software decoding pass.
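To make the SPS step concrete, here is a minimal sketch of reading the picture size from an SPS NAL unit. It is not the kernel’s implementation. It assumes a raw SPS NAL unit (header byte included), 4:2:0 chroma, and a profile without the extended chroma and scaling-list fields (for example Baseline or Main); High-profile streams are rejected rather than parsed.

```python
from typing import Tuple

# Profiles whose SPS carries extra chroma/scaling fields (not handled here).
EXTENDED_PROFILES = {100, 110, 122, 244, 44, 83, 86, 118, 128, 138, 139, 134, 135}


class BitReader:
    """Reads bits MSB-first from a bytes object."""

    def __init__(self, data: bytes) -> None:
        self.data, self.pos = data, 0

    def u(self, n: int) -> int:
        value = 0
        for _ in range(n):
            byte = self.data[self.pos // 8]
            value = (value << 1) | ((byte >> (7 - self.pos % 8)) & 1)
            self.pos += 1
        return value

    def ue(self) -> int:
        """Unsigned Exp-Golomb code."""
        zeros = 0
        while self.u(1) == 0:
            zeros += 1
        return (1 << zeros) - 1 + self.u(zeros)

    def se(self) -> int:
        """Signed Exp-Golomb code."""
        k = self.ue()
        return (k + 1) // 2 if k % 2 else -(k // 2)


def strip_emulation_prevention(data: bytes) -> bytes:
    """Remove the 0x03 byte inserted after each 0x00 0x00 pair."""
    out, zeros = bytearray(), 0
    for b in data:
        if zeros >= 2 and b == 3:
            zeros = 0
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


def sps_dimensions(nal: bytes) -> Tuple[int, int]:
    """Return (width, height) in pixels from an SPS NAL unit."""
    if nal[0] & 0x1F != 7:
        raise ValueError("not an SPS NAL unit")
    r = BitReader(strip_emulation_prevention(nal[1:]))
    profile_idc = r.u(8)
    r.u(16)  # constraint flags and level_idc
    r.ue()   # seq_parameter_set_id
    if profile_idc in EXTENDED_PROFILES:
        raise NotImplementedError("extended-profile SPS fields are not parsed")
    r.ue()   # log2_max_frame_num_minus4
    poc_type = r.ue()
    if poc_type == 0:
        r.ue()
    elif poc_type == 1:
        r.u(1), r.se(), r.se()
        for _ in range(r.ue()):
            r.se()
    r.ue()   # max_num_ref_frames
    r.u(1)   # gaps_in_frame_num_value_allowed_flag
    width_mbs = r.ue() + 1
    height_units = r.ue() + 1
    frame_mbs_only = r.u(1)
    if not frame_mbs_only:
        r.u(1)  # mb_adaptive_frame_field_flag
    r.u(1)      # direct_8x8_inference_flag
    left = right = top = bottom = 0
    if r.u(1):  # frame_cropping_flag
        left, right, top, bottom = r.ue(), r.ue(), r.ue(), r.ue()
    height_scale = 2 - frame_mbs_only
    width = width_mbs * 16 - 2 * (left + right)  # crop unit 2 for 4:2:0
    height = height_scale * (height_units * 16 - 2 * (top + bottom))
    return width, height
```

Reading these few fields costs microseconds, whereas the fallback described above has to download and software-decode frames before the hardware decoder can even be configured.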

7. Live replay (HLS) M3U8 prefetching

Finished broadcasts are usually transcoded into HLS format for storage; this is called live replay. The 80th-percentile startup time of video in the HLS format is 1250ms. Startup speed is the core indicator of playback experience, and live replay starts significantly slower than live streaming (HTTP-FLV), which takes 510ms.

The main reason HLS starts slowly is that the video index file (M3U8) must be downloaded first, and the actual video files can only be located by parsing that index. In other words, there is at least one more HTTP request than for live streaming (HTTP-FLV). To solve this, the HLS index file (M3U8) is prefetched to the local SDCard in advance, which saves the M3U8 download time at startup. A/B experiment result: comparing sessions that hit the prefetched M3U8 with those that missed it, a hit saves 346ms.
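A simplified sketch of the prefetch idea is shown below. The cache layout (one file per URL, named by a hash) and the injected `download` function are assumptions, and the sketch ignores expiry, which is reasonable only if the replay playlist no longer changes after it is generated.

```python
import hashlib
import os
from typing import Callable, List, Tuple
from urllib.parse import urljoin


def _cache_file(cache_dir: str, url: str) -> str:
    """Map an index URL to a file name inside the cache directory."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".m3u8")


def prefetch_index(url: str, cache_dir: str, download: Callable[[str], str]) -> None:
    """Download the M3U8 ahead of time and store it locally."""
    os.makedirs(cache_dir, exist_ok=True)
    with open(_cache_file(cache_dir, url), "w", encoding="utf-8") as f:
        f.write(download(url))


def load_index(url: str, cache_dir: str, download: Callable[[str], str]) -> Tuple[str, bool]:
    """Return (playlist text, cache hit); a hit skips the extra HTTP request."""
    path = _cache_file(cache_dir, url)
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            return f.read(), True
    return download(url), False


def segment_urls(index_url: str, index_text: str) -> List[str]:
    """Resolve the media segment URIs listed in a media playlist."""
    lines = (line.strip() for line in index_text.splitlines())
    return [urljoin(index_url, line) for line in lines if line and not line.startswith("#")]
```

On a cache hit the player can go straight to requesting the first segment, which is where the saving measured in the A/B experiment comes from.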




How to Optimize User Experience Like Baidu Live (Part 1)
