Introduction

Live video over the network has been around for a long time. As mobile bandwidth has increased and tariffs have fallen, live video has taken on more entertainment and social attributes. People broadcast and watch anytime, anywhere; hosts are no longer satisfied with one-way broadcasting, and viewers are increasingly eager to interact. As a result, opening time and latency have become important metrics that shape product features. So the question is: how do we achieve low latency and "second-open" (playback that starts within about a second)?

First, let’s look at the five key stages of live video broadcasting: recording -> encoding -> network transmission -> decoding -> playback. Each stage affects live latency to a different degree. The focus here is on mobile devices. Given the constraints of technology maturity and hardware, we have summarized four areas of latency optimization for mobile live broadcasting: network, protocol, codec and mobile terminal. The series is divided into four installments that explain the technical details of how UCloud live cloud achieves low latency and second-open.


In “All the technical details about live streaming are here (III)”, we described the principles and optimization of the live broadcasting back-end system. Is there anything that can be optimized on the push-stream end and the playback end as well? The answer is yes. Client-side optimization is crucial to the second-open and latency experience of live broadcasting. This installment focuses on the mobile terminal.





DNS resolution optimization

Refer to the DNS process described earlier, as shown below:

For reasons of control and disaster recovery, mobile code generally does not hardcode the IP addresses of the push-stream and playback servers; it uses domain names instead. If an IP goes down or the network is interrupted, the faulty IP address can be removed by changing the DNS records. However, domain name resolution takes anywhere from tens of milliseconds to several seconds. For newly created domain names with low popularity, the average resolution delay is typically around 300 ms. If any link in the figure above suffers network fluctuation or high device load, the delay rises to the order of seconds. The tens-of-milliseconds case occurs when the domain name is popular enough for the ISP NS layer to cache its resolution, as in the diagram below:

According to the analysis above, network delay within the same province is about 15 ms, so domain name resolution can take as little as about 15 ms. However, because of the nature of live broadcasting, the domain names used for push-streaming and playback are rarely popular enough to meet the ISP NS caching threshold, so resolution often has to walk the full query path back to the Root NS.

This leads to the principle of client-side resolution optimization: cache domain name resolution results locally and pre-resolve domain names, so that starting a push stream or a playback no longer has to go through the DNS process each time. This saves tens to hundreds of milliseconds of opening delay.
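The caching and pre-resolution idea above can be sketched as follows. This is a minimal illustration, not the UCloud SDK's implementation: the class name `DnsCache`, the 300-second TTL and the injectable `resolver` and `clock` parameters are all assumptions made for the example.

```python
import socket
import time


class DnsCache:
    """Caches resolved addresses so push/playback start can skip the DNS round trip."""

    def __init__(self, ttl_seconds=300.0, resolver=None, clock=time.monotonic):
        self._ttl = ttl_seconds            # assumed cache lifetime, not from the article
        self._resolver = resolver or self._system_resolve
        self._clock = clock
        self._entries = {}                 # host -> (expiry_time, [ip, ...])

    @staticmethod
    def _system_resolve(host):
        # getaddrinfo returns (family, type, proto, canonname, sockaddr) tuples;
        # sockaddr[0] is the IP address string.
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
        return sorted({info[4][0] for info in infos})

    def prefetch(self, hosts):
        """Resolve hosts ahead of time, e.g. at app launch."""
        for host in hosts:
            self._refresh(host)

    def lookup(self, host):
        """Return cached IPs, resolving only on a miss or after expiry."""
        entry = self._entries.get(host)
        if entry and entry[0] > self._clock():
            return entry[1]
        return self._refresh(host)

    def _refresh(self, host):
        ips = self._resolver(host)
        self._entries[host] = (self._clock() + self._ttl, ips)
        return ips
```

An app would typically call `prefetch` with its push and playback domain names at launch, then call `lookup` when the user actually starts a stream, so the connection can begin without waiting for DNS.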





Playback optimization

The key technical concerns for a live player are: live latency, first-screen time (the time from starting playback to seeing the first picture), audio-video synchronization, and software versus hardware decoding. The playback process is as follows:




Step description:


  1. Establish a connection with the server according to the protocol type (such as RTMP, RTP, RTSP, or HTTP) and receive data;
  2. Parse the binary data to find the relevant stream information;
  3. Demultiplex (demux) according to the container format (such as FLV or TS);
  4. Obtain the encoded H.264 video data and AAC audio data separately;
  5. Decompress the audio and video data using hardware decoding (the corresponding system API) or software decoding (FFmpeg);
  6. After decoding, obtain the raw video data (YUV) and raw audio data (PCM);
  7. Because audio and video are decoded separately, they must be synchronized; otherwise audio and video drift apart, for example a speaker's lips will not match the sound;
  8. Finally, send the synchronized audio data to the headset or speaker, and the video data to the screen.
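Step 7 is often implemented by treating the audio clock as the reference and scheduling each video frame against it. The sketch below shows that decision in isolation; the function name `sync_action` and the 40 ms and 200 ms thresholds are assumptions for illustration, not values from this article.

```python
def sync_action(video_pts_ms, audio_clock_ms, tolerance_ms=40, drop_after_ms=200):
    """Decide what to do with a decoded video frame relative to the audio clock.

    Returns ("wait", delay_ms), ("render", 0) or ("drop", 0).
    """
    diff = video_pts_ms - audio_clock_ms
    if diff > tolerance_ms:
        return ("wait", diff)        # frame is early: hold it until its time comes
    if diff < -drop_after_ms:
        return ("drop", 0)           # frame is far too late: skip it to catch up
    return ("render", 0)             # close enough to the audio clock: show it now
```

Audio is chosen as the reference here because listeners tend to notice audio glitches more readily than an occasional dropped video frame.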








First-screen time optimization

  1. At Step 2, preset the decoder type to save the time spent probing the stream format.
  2. At Step 5, narrow the probing range of the video data, which also reduces the amount of data to download. Especially on a poor network, downloading less data saves a lot of time before playback starts. As soon as I-frame data is detected, return immediately and enter the decoding process.
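The second point can be sketched as a probe loop that stops at the first video keyframe or when a byte budget is exhausted. The packet tuple layout, the function name `probe_until_keyframe` and the 64 KB budget are hypothetical choices for this example.

```python
def probe_until_keyframe(packets, max_probe_bytes=64 * 1024):
    """Scan incoming packets and stop at the first video keyframe.

    `packets` is an iterable of (kind, is_keyframe, payload) tuples, where kind
    is "audio" or "video". Returns (start_index, bytes_probed); start_index is
    None if no keyframe was found within the byte budget.
    """
    probed = 0
    for index, (kind, is_keyframe, payload) in enumerate(packets):
        probed += len(payload)
        if kind == "video" and is_keyframe:
            return index, probed     # hand off to the decoder immediately
        if probed >= max_probe_bytes:
            break                    # give up early instead of downloading more
    return None, probed
```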






Latency optimization

Video buffering (caching) strategy: the principle is that when the network stalls, the user waits a little longer while a certain amount of video data is cached, so that subsequent viewing is smooth. This effectively reduces the number of stalls but adds delay to live content, so it is mainly used for video on demand. For live broadcast we have removed this strategy in order to eliminate or minimize the time it takes content to get from the network to the screen, which helps reduce latency.

Download buffer pool monitoring: a stall occurs when the user's download speed is insufficient. When the network suddenly recovers, the data that was held up on the server is delivered in a burst. To reduce the delay caused by the earlier stall, the player monitors the buffer pool, plays the backlogged video data at an accelerated rate and discards the audio data for the accelerated portion, so that the delay of the content being watched stays stable.
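One simple way to express the catch-up behavior above is to derive a playback rate from the amount of buffered media. This is only a sketch: the function name `catch_up_plan`, the 1000 ms target and the 1.5x cap are illustrative assumptions.

```python
def catch_up_plan(buffered_ms, target_ms=1000, max_rate=1.5):
    """Pick a video playback rate and audio handling from the buffered duration.

    Returns (rate, drop_audio). All thresholds are illustrative assumptions.
    """
    if buffered_ms <= target_ms:
        return 1.0, False            # latency is within target: play normally
    # Speed up in proportion to the backlog, capped at max_rate.
    rate = min(max_rate, buffered_ms / target_ms)
    return rate, True                # audio for the accelerated span is discarded
```

For example, with 5 seconds buffered the plan is to play video at the capped rate and drop audio until the backlog falls back to the target.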





Push-stream optimization

Push-stream steps: it is easy to see that push-streaming is essentially the playback process in reverse, so the details are not repeated here.

Optimization 1: an appropriate QoS policy.

The push-stream end controls packet sending and encoding according to the current state of the uplink network. When the network is poor, audio and video data cannot be sent out and piles up locally. At that point the encoder is paused to prevent unsent data from accumulating. Beyond that, an appropriate strategy is chosen according to network conditions to control how audio and video are sent.

For example, when the network is poor, the push-stream end sends audio data first so that viewers can still hear the sound, and sends keyframe data at a certain interval so that viewers still see the picture change from time to time.
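The policy described above (pause the encoder when data backs up, and send only audio and keyframes under congestion) can be sketched as two small functions. The packet dictionary layout, the function names and the 512 KB high-water mark are hypothetical and chosen only for illustration.

```python
def select_packets(queue, congested):
    """Choose which queued packets to send.

    `queue` is a list of dicts with "kind" ("audio" or "video") and, for video,
    "keyframe" (bool). Under congestion only audio and video keyframes go out.
    """
    if not congested:
        return list(queue)
    return [
        pkt for pkt in queue
        if pkt["kind"] == "audio" or pkt.get("keyframe", False)
    ]


def should_pause_encoder(queued_bytes, high_water_bytes=512 * 1024):
    """Pause encoding once unsent data exceeds an assumed high-water mark."""
    return queued_bytes >= high_water_bytes
```

How congestion is detected (for example from send-buffer growth or measured uplink throughput) is left out of this sketch.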

Optimization 2: a reasonable keyframe configuration.

Controlling the keyframe interval sensibly (1 or 2 seconds is recommended) reduces back-end processing and makes it possible to use smaller buffer settings on the back end.





Software vs. hardware codec options

There are plenty of articles online comparing software and hardware codec solutions. Here are some lessons from our experience, but the bottom line is that no single solution works best across all operating systems and device models.

Push-stream encoding: hardware encoding is recommended on Android 4.3 (API 18) and above, and software encoding on earlier versions; iOS uses hardware encoding throughout.

Playback decoding: both the Android and iOS players use software decoding. Testing and feedback from us and a large number of customers show that, although power consumption is higher, software decoding performs better in some details, is more controllable and more compatible, and produces fewer errors. We therefore recommend it.

The advantages and disadvantages of software and hardware codecs are compared below:


Cloud-based device model and network adaptation

The sections above discuss many video codec parameters, but in practice the best codec settings depend on the device model. Because iOS has relatively few device models, each one can be tested and tuned individually. For Android, tuning per model is very difficult, and many new devices are released every year. If the configuration or decision logic is hardcoded, maintenance and iteration suffer badly.

So we asked: can this decision logic and configuration be moved to the cloud? That idea led to cloud-based device model and network adaptation.

Before pushing or playing a stream, the terminal collects its current device model configuration, network status and IP information and reports them to the cloud. The cloud returns the most suitable codec policy: software or hardware codec, the values of the various parameters, the IP address of the nearest push-stream service, and the IP address of the nearest playback service. The terminal only needs to fetch this once; it does not need to fetch it again before every push or playback.

In this way, as we keep iterating on and improving the device model codec adaptation library, every live streaming app that uses this technology benefits.
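The fetch-once behavior, combined with a local fallback that follows the codec recommendations given earlier, might look like the sketch below. The names `AdaptationClient`, `default_policy` and `fetch_policy`, and the shape of the policy dictionary, are hypothetical; the real protocol and fields used by UCloud are not described in this article.

```python
def default_policy(platform, android_api_level=None):
    """Local fallback following the recommendations in this article."""
    if platform == "ios":
        encoder = "hardware"
    elif platform == "android" and android_api_level is not None and android_api_level >= 18:
        encoder = "hardware"
    else:
        encoder = "software"
    return {"encoder": encoder, "decoder": "software"}


class AdaptationClient:
    """Fetches the codec policy once and reuses it for later pushes and plays."""

    def __init__(self, fetch_policy):
        self._fetch_policy = fetch_policy  # hypothetical callable that queries the cloud
        self._cached = None

    def policy(self, device_info):
        if self._cached is None:
            try:
                self._cached = self._fetch_policy(device_info)
            except OSError:
                # Cloud unreachable: use the static rule, but do not cache it,
                # so the next call tries the cloud again.
                return default_policy(device_info["platform"], device_info.get("api_level"))
        return self._cached
```

Keeping the fallback uncached means a temporary network failure does not pin the app to the static rule for the whole session.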





Conclusion

The analysis above covers many techniques for achieving low latency and second-open on the live back end and the terminal, all of which have been put into practice in UCloud live cloud. These are relatively “static” techniques. Actually delivering a stable, low-latency and smooth live broadcast service depends on a great deal of meticulous monitoring, algorithms and dynamic day-to-day operations. Implementing a few technical points does not by itself yield a stable live broadcast service; it only lays the first brick of the Great Wall.


— — —

That concludes the series “Decrypting the Details of UCloud Live Cloud Technology”. In this series we have covered four areas of live broadcast latency optimization (network, protocol, codec and mobile terminal) with live broadcast application scenarios in mind, hoping to help readers build a framework for understanding live broadcast technology.

Technology is always evolving and optimization is still ongoing. If you have any questions or doubts, please contact us 🙂





Related reading recommendations:

All the technical details about live streaming are here.

All the technical details about live streaming are here (II)

All the technical details about live streaming are here (III)





This article is provided by the UCloud Streaming Media Development Team.

The UCloud organization account exclusively shares technical insights, industry news and everything you need to know about cloud computing.

Questions and follows are welcome.

