As someone who had never worked with real-time streaming (live streaming), I had no idea about real-time video. However, a project I recently joined happened to require video surveillance, so before taking part in the technology selection I got a feel for displaying real-time streams on the front end.

Overview

Video has the notion of a stream, which is why it is called streaming media. For live video this is easy to picture: the video is live, it has to be output continuously from somewhere, so the whole thing can be treated as a stream. Can that stream be output directly to a front-end page? Unfortunately not; if it could, this article would not exist. At present the real-time video stream from a camera generally uses the RTSP protocol, and the front end cannot play an RTSP stream directly.

  • Real-Time Streaming Protocol (RTSP) is an application-layer protocol in the TCP/UDP protocol family, sitting at the same layer as HTTP. Architecturally it sits above RTP and RTCP, and it uses TCP or RTP for data transfer. RTSP has very good real-time behavior, making it suitable for video chat, video surveillance, and similar use cases.

So we need an intermediate layer to convert RTSP into a protocol the front end can handle, which leads to the main directions of current real-time streaming technology:

  • RTSP -> RTMP
  • RTSP -> HLS
  • RTSP -> RTMP -> HTTP-FLV

RTMP

RTMP (Real-Time Messaging Protocol) is a family of video protocols that belongs to Adobe. This solution requires a dedicated RTMP streaming media server, and to play it in a browser you cannot use the HTML5 video tag; only Flash players work. (With [email protected] you can play through the video tag, but Flash still has to be loaded.) Its real-time performance is the best of the schemes covered here, but because it is tied to Flash it is a non-starter on mobile and already a thing of the past on desktop. Since the following two approaches also rely on RTMP, here is how to convert an RTSP stream to RTMP. We use ffmpeg + Nginx + nginx-rtmp-module:

Configure the rtmp block at the same level as the http block in nginx.conf
rtmp {
    server {
        # port
        listen 1935;
        # path
        application test {
            # Enable live stream mode
            live on;
            record off;
        }
    }
}

Execute ffmpeg in bash to convert RTSP to RTMP and push it to port 1935
# Note: the path segment after the port is the application name and should match the one declared in nginx.conf
ffmpeg -i "rtsp://xxx.xxx.xxx:xxx/1" -vcodec copy -acodec copy -f flv "rtmp://127.0.0.1:1935/live/"

This gives us an RTMP stream, which we can play directly using VLC or IINA.
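For a quick command-line sanity check (assuming ffplay is installed alongside ffmpeg), you can also point it at the same address the stream was pushed to:

# use the exact address from the ffmpeg push above
ffplay "rtmp://127.0.0.1:1935/live/"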

HLS

HLS (HTTP Live Streaming) is an HTTP-based network transmission protocol proposed by Apple. It cuts the whole stream into small HTTP-delivered files and downloads only a few of them at a time. HLS is cross-platform, supporting iOS/Android/browsers, so it is very versatile. But it is not truly real-time: Apple officially recommends buffering three segments before playback starts, so HLS is rarely used as the transport protocol for internet live broadcasts. Assuming the playlist contains five TS files, each holding five seconds of video, the overall delay is 25 seconds. Apple's officially recommended segment length is 10s, which gives a delay of 30s (3 x 10s). Here is the complete pipeline of an HLS live stream:


# Enable HLS on the RTMP server
# This acts as the Server in the figure above and is responsible for processing the stream
application hls {
    live on;
    hls on;
    hls_path xxx/;      # Folder to save HLS files
    hls_fragment 10s;
}
Add the HLS configuration to the http server
# This acts as the Distribution in the figure above and serves the segment files and the index file
location /hls {
    # Serve HLS fragments with their declared MIME types
    types {
        application/vnd.apple.mpegurl m3u8;
        video/mp2t ts;
    }
    root /Users/mark/Desktop/hls;
    # Cache-Control: no-cache
    expires -1;
}

Then use FFMPEG again to push the stream to the HLS path:

ffmpeg -i "rtsp://xxx.xxx.xxx:xxx/1" -vcodec copy -acodec copy -fFLV RTMP: / / 127.0.0.1:1935 / HLSCopy the code

At this point, you can see that there are already many stream files in the folder, and they are constantly updated:
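For reference, the index file is a plain-text playlist that keeps rolling forward. A hypothetical test.m3u8 (segment names and durations depend on the stream name and the hls_fragment setting) looks roughly like this:

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-MEDIA-SEQUENCE:12
#EXT-X-TARGETDURATION:10
#EXTINF:10.000,
test-12.ts
#EXTINF:10.000,
test-13.ts
#EXTINF:10.000,
test-14.ts

With the index and segments in place, the page below plays the m3u8 through video.js and videojs-contrib-hls: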


<html>
<head>
<title>video</title>
<!-- Add CSS -->
<link href="https://unpkg.com/video.js/dist/video-js.min.css" rel="stylesheet">


</head>
<body>
<div class="videoBox">
    <video id="video" class="video-js vjs-default-skin" controls>
        <source src="http://localhost:8080/hls/test.m3u8" type="application/x-mpegURL"> 
    </video>
</div>

<script src="https://unpkg.com/video.js/dist/video.min.js"></script>
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/videojs-contrib-hls/5.15.0/videojs-contrib-hls.min.js"></script>
<script>
    // Flash fallback location for video.js, then initialize the player and start playback
    videojs.options.flash.swf = "./videojs/video-js.swf";
    videojs('video', {"autoplay": true}).play();
</script>
</body>
</html>

In my tests, HLS latency was around 10-20 seconds. We can reduce it by shrinking the segment size, but because of the architecture itself, latency remains a problem that cannot be ignored.
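If you want to experiment with that trade-off, nginx-rtmp-module lets you shrink both the segments and the playlist. A sketch, assuming the hls application from the config above (shorter values mean lower latency, but more requests and less tolerance for network jitter):

application hls {
    live on;
    hls on;
    hls_path xxx/;
    hls_fragment 2s;           # smaller slices
    hls_playlist_length 6s;    # keep fewer segments in the index
}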

HTTP-FLV

Next comes the highlight: HTTP-FLV, which combines the versatility of HLS with the real-time performance of RTMP. It can play a real-time stream with low latency in the browser through the HTML5 video tag. HTTP-FLV relies on MIME: the client selects the appropriate handler according to the Content-Type header in the response, which is what allows streaming media to be transmitted over HTTP. In addition, it can do flexible scheduling and load balancing via HTTP 302 redirects, supports encrypted transmission over HTTPS, and is compatible with Android, iOS, and other mobile platforms. HTTP-FLV essentially turns the RTMP stream into an FLV stream delivered over HTTP. On Nginx we can use nginx-http-flv-module to do this conversion.

The FLV format is still Adobe's, and the native video tag cannot play it directly. Fortunately we have Bilibili's flv.js, which remuxes the FLV stream into ISO BMFF (fragmented MP4) segments and then feeds those MP4 segments to the browser through Media Source Extensions. Among the protocols browsers can be made to support, latency ranks roughly as: RTMP = HTTP-FLV = WebSocket-FLV < HLS, and performance ranks as: RTMP > HTTP-FLV = WebSocket-FLV > HLS.
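Because this path depends on Media Source Extensions, it is worth a feature check before creating a player. A minimal sketch using flv.js's isSupported() (fall back to HLS when it returns false):

import flvJs from 'flv.js';

// true when the browser exposes the MSE capabilities flv.js needs
if (!flvJs.isSupported()) {
  console.warn('flv.js is not supported in this browser; consider falling back to HLS');
}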

  1. First we need a new nginx plug-in: nginx-http-flv-module
  2. Make some new configurations in nginx.conf:
# RTMP server
application myvideo {
    live on;
    gop_cache on;
}

# HTTP server
location /live {
    flv_live on;
}
  3. Push the stream with ffmpeg as before, using the same RTMP command shown earlier
  4. The front end imports flv.js and uses it to play the stream
// Use flv.js in live mode, pointing at the path exposed by nginx
import flvJs from 'flv.js';

export function playVideo(elementId, src) {
  const videoElement = document.getElementById(elementId);
  const flvPlayer = flvJs.createPlayer({
    isLive: true,
    type: 'flv',
    url: src,
  });
  flvPlayer.attachMediaElement(videoElement);
  flvPlayer.load();
  flvPlayer.play();
}

playVideo('video', 'http://localhost:8080/live?port=1985&app=myvideo&stream=streamname');

As you can see, the data comes back with the video/x-flv MIME type.


  1. flv.js can be configured with the enableStashBuffer field, which controls flv.js's internal cache buffer. Turning it off gives the lowest latency, but because there is no buffer you may see stuttering caused by network jitter (see the sketch after this list).
  2. Try turning off gop_cache in the nginx configuration. gop_cache, also known as the key-frame cache, controls whether the cache between video key frames is enabled.
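A minimal sketch of that flv.js tweak, reusing the setup from playVideo above (enableStashBuffer lives in the optional second argument of createPlayer):

import flvJs from 'flv.js';

const src = 'http://localhost:8080/live?port=1985&app=myvideo&stream=streamname';
const lowLatencyPlayer = flvJs.createPlayer(
  { isLive: true, type: 'flv', url: src },  // media data source, same shape as before
  { enableStashBuffer: false },             // disable the IO stash buffer to minimize latency
);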

Here we need the concept of a key frame. Take H.264, the most widely used video compression format: it applies intra-frame and inter-frame predictive compression, and the result is three kinds of frames: I, P, and B.

  • I frame: key frame, produced with intra-frame compression.
  • P frame: forward-predicted frame. During compression it only references the previously processed frame and records the difference between the current frame and the previous frame (which may be an I frame or a P frame). Inter-frame compression is used.
  • B frame: bidirectionally predicted frame. During compression it references both the preceding and the following frame and records the differences against them. Inter-frame compression is used.

A typical video sequence contains I, B, and P frames. A P frame only needs to reference the preceding I or P frame, while a B frame references both the preceding and the following I or P frames. Since P/B frames depend directly or indirectly on I frames, the player must decode an I frame before it can decode and render a frame sequence. Suppose the GOP (the time between two I frames in the stream) is 10 seconds, i.e. there is a key frame only every 10 seconds: if a user starts playing at the fifth second, no key frame is available yet. This is where gop_cache comes in: **gop_cache controls whether the most recent key frame (GOP) is cached.** With gop_cache enabled, the client receives a key frame immediately when playback starts and can render a picture right away. Of course, since the previous frames are now cached, latency naturally increases. If latency matters more to you than first-frame time or playback smoothness, you can try turning gop_cache off to get a lower delay.
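A sketch of that switch, assuming the myvideo application from the nginx-http-flv-module config above:

# rtmp server block in nginx.conf
application myvideo {
    live on;
    gop_cache off;   # lower latency; the first picture may take longer to appear
}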

Thoughts

Latency and stuttering

Latency and stuttering are the two most important quality metrics for real-time video, and in theory they pull against each other: lower latency requires shorter buffers on both the server and the player, so any network jitter easily causes stuttering; if the service can tolerate higher latency, both sides can keep longer buffers to absorb jitter and deliver a smoother experience.

How do live-streaming providers do it?

Today, live-streaming platforms have basically abandoned the hand-rolled approaches above in favor of CDN services from cloud vendors, though these are still built on the same protocols and methods. The following picture shows Alibaba Cloud's live streaming service; the process is roughly divided into these steps:

  1. Video stream push (the broadcaster's side pushes the stream over RTMP)
  2. The SDK pushes the stream up to a CDN edge node (upstream)
  3. The CDN node forwards it to the live broadcast center, which has strong computing capacity and can provide extra services such as storage (recording / saving to cloud storage / VOD), transcoding, content review, and output in multiple protocols
  4. The live broadcast center distributes the stream back out to CDN edge nodes
  5. User playback (Alibaba Cloud supports three playback protocols: RTMP, FLV, and HLS)


PS: If you have read this far and think the write-up is OK, please give it a thumbs up. Thank you! 🙏