Building a simple live streaming system from 0 to 1 (1)

1. Foreword

With the arrival of the 5G era, the audio and video industry is likely to enter a boom period. Live streaming has always been an important product form in new video businesses. From the early show and game live streams to this year's surge, driven by the pandemic, in online education and e-commerce ("live with goods") streams, new forms of live streaming keep appearing in front of the public.

As developers, today we take a brief look at how to quickly build the simplest possible live streaming system, and at the architecture model used by mainstream live streaming services.

2. Push-pull streaming model

First, let's look at a complete diagram of the live streaming push and pull model, which shows the overall structure of a live broadcast.

2.1 Three main modules of live broadcasting

Push module

The push module starts with audio and video capture. For show-style live streams, it can also apply beauty filters to improve picture quality and user experience. The captured data is then encoded and compressed to reduce its size, and finally packaged in a fixed format and sent to the RTMP server over a streaming media transport protocol. This completes the push side of the stream.

RTMP server module

In the traditional sense, an RTMP server may do little more than transcoding: it converts the data stream sent by the push side into a web-friendly format such as FLV so that viewers can play it conveniently. Cloud vendors, however, offer complete solutions on top of this, with features such as resolution transcoding, health checks, live cover generation, data statistics, and recording for playback. These are services built on top of the RTMP server to form a complete solution.

Playback module

The playback logic is relatively simple: in short, obtain the stream address and play the audio and video. In actual development, however, playback is where most of the business work and technical optimization happens. Features such as opening the first screen within a second, decoding optimization, and switching between live rooms take a lot of effort and must be optimized continuously as the business evolves.

3. Construction steps

This introductory tutorial is divided into the following parts:

Setting up the live streaming server;

Pushing a stream with OBS;

Watching the live stream;

Implementing live room messages.

3.1 Set up the live broadcast server

The live streaming server parses and transcodes, in real time, the video stream uploaded by the push side, and serves it to viewers over live streaming protocols such as RTMP, HLS, or HTTP-FLV.

There are many open source live streaming servers, such as LiveGo, SRS, and Nginx-RTMP, as well as mainstream cloud solutions: Ali Cloud, Qiniu Cloud, and Tencent Cloud all provide standard, mature offerings. Since this article aims to build a simple live stream quickly, it uses LiveGo. LiveGo is written in pure Go, performs well, and is cross-platform. It is simple to install and use, and supports the common transport protocols, file formats, and encoding formats. Alternatively, you can launch a live streaming service directly on one of the cloud providers mentioned above.

There are three main ways to install LiveGo: 1) download the prebuilt binary; 2) start it with Docker; 3) compile it from source. The Docker command is:

docker run -p 1935:1935 -p 7001:7001 -p 7002:7002 -p 8090:8090 -d gwuhaolin/livego

The ports have the following meanings:

8090: HTTP management API listening address

1935: RTMP service listening address

7001: HTTP-FLV service listening address

7002: HLS service listening address

3.2 Pushing a stream with OBS

OBS (Open Broadcaster Software) is free, open source software for video recording and live streaming. Download the installer for your platform from the official OBS website.

To push a stream, the first question is what to push, that is, the source of the stream. Open OBS and click the “New Source” button, as shown in step 1 of the figure below. OBS supports a variety of sources, such as media source, display capture, browser, and window capture. Here an existing MP4 file is played in a loop as the stream, so select “Media Source” and keep the default name. Click “OK”, choose the video file to play, and click “OK” again.

The next question is where to push, which requires a usable push address.

The LiveGo server set up earlier provides a default push address: rtmp://localhost:1935/live. A standard RTMP push URL has the form rtmp://domain/AppName/StreamName. To push to this address, however, you need an authorized channelkey.

Visit http://localhost:8090/control/get?room=movie to obtain the channelkey used for pushing. The response is shown below; the data field is the channelkey returned for this request.

{
    "status": 200,
    "data": "rfBd56ti2SMtYvSgD5xAV0YU99zampta7Z7S575KLkIZ9PYk"
}
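The same request can also be made from code. The following is a minimal sketch, assuming Java 11 or later (for java.net.http) and a LiveGo instance running locally as set up above. The class and method names are hypothetical, and the regular expression is a shortcut that only handles the simple response shape shown above, not general JSON.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of LiveGo.
final class ChannelKeyFetcher {
    // Matches the string value of the "data" field in the response shown above.
    private static final Pattern DATA_FIELD =
            Pattern.compile("\"data\"\\s*:\\s*\"([^\"]*)\"");

    /** Extracts the channelkey from the response body, or returns null if there is none. */
    static String extractChannelKey(String json) {
        Matcher m = DATA_FIELD.matcher(json);
        return m.find() ? m.group(1) : null;
    }

    /** Calls the management endpoint on port 8090 for the given room. */
    static String fetchChannelKey(String host, String room) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://" + host + ":8090/control/get?room=" + room))
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return extractChannelKey(response.body());
    }

    public static void main(String[] args) throws Exception {
        // Requires the Docker container from section 3.1 to be running.
        System.out.println(fetchChannelKey("localhost", "movie"));
    }
}
```

In a real project a JSON library would be a more robust choice than a regular expression.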

Now that both the push address and the channelkey are available, all that remains is to configure OBS. First, click the “Settings” button under “Controls” to open the settings panel.

Then select the “Stream” option. Choose the “Custom” service, set the server to rtmp://localhost:1935/live, and set the stream key to the channelkey obtained earlier: rfBd56ti2SMtYvSgD5xAV0YU99zampta7Z7S575KLkIZ9PYk. Finally, click “Start Streaming” under “Controls” to begin pushing.

The default output configuration is enough for most scenarios; skip this part if you don't need more. For finer control, set the output mode to “Advanced” under the “Output” option. As shown in the figure below, the advanced output settings cover streaming, recording, audio, and the replay buffer, with streaming being the most important. The encoder can be x264 (software) or QuickSync H.264, and x264 is capable enough for the job. “Rescale Output” sets the output resolution, which defaults to the resolution of the source video.

Bit rate is the amount of data per second after the video has been compressed and encoded, measured in kbps, where k = 1000. The higher the value, the more data is pushed per second and the better the video quality, at the cost of more bandwidth, so adjust it as needed. Show-style live streams typically use 2000-2500 kbps, while game streams may need a higher bit rate.
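To get a feel for these numbers, the arithmetic can be sketched in a few lines. This is an illustrative calculation only: the class and method names are hypothetical, and it counts the video bit rate alone, ignoring audio and protocol overhead.

```java
// Hypothetical helper for back-of-the-envelope bandwidth estimates.
final class BitrateMath {
    /** Bytes produced per second at the given bit rate, with k = 1000. */
    static long bytesPerSecond(long kbps) {
        return kbps * 1000L / 8L;
    }

    /** Approximate stream size in megabytes (10^6 bytes) over a duration in seconds. */
    static double megabytes(long kbps, long seconds) {
        return bytesPerSecond(kbps) * seconds / 1_000_000.0;
    }

    public static void main(String[] args) {
        // One hour of a 2500 kbps show-style stream.
        System.out.println(megabytes(2500, 3600) + " MB per hour");
    }
}
```

At 2500 kbps this comes to 312,500 bytes per second, or about 1.1 GB per hour for each viewer served.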

There are several ways to control the bit rate, including CBR, ABR, VBR, and CRF.

CBR (Constant Bitrate): the bit rate stays essentially constant over a given time window. In this mode, picture quality drops in scenes with a lot of motion and improves in mostly static scenes.

VBR (Variable Bitrate): the bit rate varies with the complexity of the picture. Fewer bits are allocated to scenes with simple content and more to scenes with complex content. This preserves quality while still respecting bandwidth limits, with priority given to picture quality.

ABR (Average Bitrate): a variant of VBR. Like VBR, it gives simple scenes a low bit rate and complex scenes enough bits; like CBR, it keeps the average bit rate over a period of time close to the configured target. ABR can be seen as a compromise between CBR and VBR.

CRF (Constant Rate Factor): the CRF value can be understood as a fixed target for perceived clarity and smoothness, so that subjective video quality stays stable in both complex and simple scenes.

The keyframe interval is the length of a GOP (Group of Pictures), a sequence consisting of one I frame followed by multiple P and B frames. The frame types are:

I frame (intra coded picture): a complete picture containing all of its information, which can be decoded without reference to other frames. Every GOP starts with an I frame;

P frame (predictive coded picture): an inter-frame predicted picture that needs a previous I frame or P frame for decoding, with a high compression rate;

B frame (bipredictive coded picture): a bidirectionally predicted picture that uses both previous and subsequent frames as references, with the highest compression rate.

For ordinary videos, a longer GOP helps reduce file size. In live streaming, however, a GOP that is too long makes the client wait longer for the first screen, because decoding can only start from an I frame. The smaller the GOP, the higher the picture quality. A keyframe interval of 2 seconds is recommended, and it should not exceed 4 seconds.
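The link between the keyframe interval and first-screen time can be made concrete with a small calculation. This is a sketch under one stated assumption: the server does not cache the most recent GOP, so a viewer who joins just after an I frame has to wait for the next one. The names are hypothetical.

```java
// Hypothetical helper relating keyframe interval, frame rate, and first-screen wait.
final class GopMath {
    /** Number of frames in one GOP for a keyframe interval given in seconds. */
    static int framesPerGop(int fps, double keyframeIntervalSeconds) {
        return (int) Math.round(fps * keyframeIntervalSeconds);
    }

    /**
     * Upper bound, in milliseconds, on the wait for the next I frame when a viewer
     * joins mid-GOP and the server sends no cached keyframe (an assumption).
     */
    static long worstCaseKeyframeWaitMillis(double keyframeIntervalSeconds) {
        return Math.round(keyframeIntervalSeconds * 1000);
    }

    public static void main(String[] args) {
        System.out.println(framesPerGop(30, 2.0) + " frames per GOP");
        System.out.println(worstCaseKeyframeWaitMillis(2.0) + " ms worst-case wait");
    }
}
```

With the recommended 2-second interval at 30 fps, each GOP holds 60 frames and the wait for a decodable frame is bounded by about 2 seconds; at 4 seconds the bound doubles.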

3.3 Watching the live stream

We have now set up the RTMP server and pushed a stream with OBS, a mature and feature-rich tool. Next, we need to solve how users watch the stream.

FLV (Flash Video) is a streaming media format for network video, and it is the format most commonly used by mainstream live streaming sites. It can be played without installing any plug-ins.

3.3.1 Quick test: watching with VLC

VLC is an audio and video player that can play both local and network media. Download the installer from https://www.videolan.org/index.zh.html.

Open the “Media” menu, choose “Open Network Stream”, and set the network address to rtmp://localhost:1935/live/movie. After clicking “OK”, you can see the video pushed by OBS.

VLC is mainly convenient for developers to test playback, for example to check stuttering, inspect the resolution, or track down latency. It is a fairly professional tool that helps locate and solve problems.

3.3.2 Browser-side viewing with flv.js

flv.js is a popular HTML5 FLV player written in pure JavaScript, and is currently the mainstream way in China to play FLV in the browser. This section uses flv.js for a simple playback test. Open the following URL: http://bilibili.github.io/flv.js/demo/.

As shown in the figure, enter the following stream URL in the input field: http://127.0.0.1:7001/live/movie.flv. Then click Switch to MediaDataSource and Load to start playback.
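The playback addresses used so far follow one pattern per protocol, built from the ports listed in section 3.1. The sketch below collects them in one place. The RTMP and HTTP-FLV forms match the addresses used above; the HLS form (port 7002 with a .m3u8 suffix) is an assumption based on the same pattern and should be checked against the LiveGo documentation. The class name is hypothetical.

```java
// Hypothetical helper that builds playback URLs for a locally running LiveGo.
final class PlaybackUrls {
    /** RTMP playback address, as used with VLC above. */
    static String rtmp(String host, String app, String room) {
        return "rtmp://" + host + ":1935/" + app + "/" + room;
    }

    /** HTTP-FLV playback address, as used with flv.js above. */
    static String httpFlv(String host, String app, String room) {
        return "http://" + host + ":7001/" + app + "/" + room + ".flv";
    }

    /** HLS playback address (assumed form, see the note above). */
    static String hls(String host, String app, String room) {
        return "http://" + host + ":7002/" + app + "/" + room + ".m3u8";
    }

    public static void main(String[] args) {
        System.out.println(rtmp("localhost", "live", "movie"));
        System.out.println(httpFlv("127.0.0.1", "live", "movie"));
        System.out.println(hls("127.0.0.1", "live", "movie"));
    }
}
```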

3.3.3 Brief introduction to live streaming protocols

So far, we have built a small RTMP-based setup and walked through the whole push and pull process. Next, we need a basic understanding of the live streaming transport protocols most closely related to it.

The live streaming protocols commonly used in China are:

RTMP

HLS

HTTP-FLV

HLS stands for HTTP Live Streaming, a live streaming protocol proposed by Apple. iOS and recent versions of Android both support it. HLS is built around two kinds of files: .m3u8 playlist files and .ts media segment files. The server buffers the incoming video stream, and once enough has accumulated it encodes and packages that data, producing an updated .m3u8 file and a number of .ts files. The main advantage of HLS is good cross-platform support: it can be opened and played directly in HTML5, and mobile compatibility is good. The disadvantage is equally obvious: latency is relatively high. It is suitable for live streams with little interaction, and the HLS format is also well suited to video on demand (VOD).
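The relationship between the .m3u8 file and the .ts files is easy to see in code. In a media playlist, lines beginning with # are tags and every other non-blank line is the URI of a segment. The following is a minimal sketch that lists the segments of a playlist; the class name and the segment file names in the example are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the segment URIs of an HLS media playlist.
final class M3u8Segments {
    /** Returns the segment URIs in playlist order. */
    static List<String> segmentUris(String playlist) {
        List<String> uris = new ArrayList<>();
        for (String line : playlist.split("\\R")) {
            String trimmed = line.trim();
            // Lines starting with '#' are tags or comments; blank lines are ignored.
            if (!trimmed.isEmpty() && !trimmed.startsWith("#")) {
                uris.add(trimmed);
            }
        }
        return uris;
    }

    public static void main(String[] args) {
        // Illustrative playlist; segment names are invented for this example.
        String playlist = String.join("\n",
                "#EXTM3U",
                "#EXT-X-VERSION:3",
                "#EXT-X-TARGETDURATION:4",
                "#EXT-X-MEDIA-SEQUENCE:10",
                "#EXTINF:4.0,",
                "movie-10.ts",
                "#EXTINF:4.0,",
                "movie-11.ts");
        System.out.println(segmentUris(playlist));
    }
}
```

Because a player can only fetch a segment after the server has finished writing it, the segment duration contributes directly to the latency mentioned above.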

RTMP stands for Real Time Messaging Protocol. For developers, the first thing to be clear about is that RTMP is an application layer protocol for audio and video that uses TCP as its underlying transport. Using TCP as the main transport layer leaves RTMP plenty of room to follow later network evolution. In the live streaming industry, and especially on the push side, RTMP lives up to its reputation: essentially all mainstream live streaming sites support pushing over RTMP. The details of the RTMP protocol will be analyzed in later articles.

FLV (Flash Video) is a video format launched by Adobe. It is a container format for storing streaming media data transmitted over the network. The format is simple and lightweight and does not require a large amount of media header information, so it loads very quickly. An FLV file consists of an FLV header followed by an FLV body, which is made up of a sequence of tags. Files packaged in this format have the suffix .flv.
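The FLV header is small enough to parse by hand, which shows how lightweight the format is. The header is 9 bytes: the signature "FLV", a version byte, a flags byte (bit 0x04 for audio, bit 0x01 for video), and a 4-byte big-endian offset to the body. The sketch below reads those fields; the class name is hypothetical and the code does not parse the tags that follow.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: parses the 9-byte FLV file header.
final class FlvHeader {
    final int version;
    final boolean hasAudio;
    final boolean hasVideo;
    final long dataOffset;

    private FlvHeader(int version, boolean hasAudio, boolean hasVideo, long dataOffset) {
        this.version = version;
        this.hasAudio = hasAudio;
        this.hasVideo = hasVideo;
        this.dataOffset = dataOffset;
    }

    /** Parses the header from the first 9 bytes of an FLV stream. */
    static FlvHeader parse(byte[] bytes) {
        if (bytes.length < 9 || bytes[0] != 'F' || bytes[1] != 'L' || bytes[2] != 'V') {
            throw new IllegalArgumentException("not an FLV stream");
        }
        int version = bytes[3] & 0xFF;
        int flags = bytes[4] & 0xFF;
        // ByteBuffer reads big-endian by default, which matches the FLV layout.
        long dataOffset = ByteBuffer.wrap(bytes, 5, 4).getInt() & 0xFFFFFFFFL;
        return new FlvHeader(version, (flags & 0x04) != 0, (flags & 0x01) != 0, dataOffset);
    }

    public static void main(String[] args) {
        byte[] header = {'F', 'L', 'V', 1, 5, 0, 0, 0, 9};
        FlvHeader h = parse(header);
        System.out.println("version=" + h.version + " audio=" + h.hasAudio
                + " video=" + h.hasVideo + " offset=" + h.dataOffset);
    }
}
```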

RTMP, HTTP-FLV, HLS Streaming Protocols

3.3.4 Messages in the live room

In a show-style live streaming system, if the audio and video features are the gorgeous new clothes the system wears, then the message system is the soul beneath them. How to build a highly available live message system is a problem every live streaming system must solve.

Before designing the message system for show-style live streams, we first need to sort out the types of messages in a live room.

Notification messages, such as gifts, bullet comments (danmaku), room entries, leaderboard changes, and level changes. They inform users of events in the live room, build its atmosphere, and improve the viewing experience.

Functional messages, such as kicking a user out, anti-spam review, red envelopes, and PK messages. They support the live streaming business itself, connecting the three roles involved: the broadcaster, the viewers, and the server.

From a business point of view, live room messages take many forms and are displayed in many ways. From a development point of view, however, they can be characterized by two properties: whether a message may be discarded, and how real-time it must be. Based on these, all business messages can be grouped into the following categories:

Among live streaming systems, show streams and e-commerce streams exchange relatively more messages and signaling, because both are highly interactive and offer more varied features. Under the classification above, each business message has different requirements for discardability and real-time delivery, so a message system needs to prioritize messages and consider how quickly each kind is distributed.

Real-time delivery and discardability are business-related properties. The first requirement for live room messages, though, is stability: how to distribute messages accurately and reliably to the intended live room is a problem we must consider. In general, messages are distributed in one of two ways. The first relies on an instant messaging (IM) system. The second relies on HTTP short polling: for example, the client requests the server once per second, the server returns the messages that occurred since the last request, and the client renders the corresponding UI according to each message's business type. Technically, the first is a “push” model and the second is a “pull” model. Since we are building a simple live message system today, we start with a simple implementation of the “pull” model.

The basic idea: at short intervals, such as every second or less, the client calls a server interface with the live room ID to poll for messages in that room. On the server side, messages are stored in a Redis SortedSet, where the key is the live room ID, the score is the timestamp generated when the server receives the message event, and the value is simply the serialized message string. Messages are thus stored in chronological order, deletion of expired messages can be added on top, and the whole message store is easy to set up.

The message store is shown in Java pseudocode:

long time = System.currentTimeMillis();
try {
    // Insert the message into the room's sorted set, scored by the server receive time
    jedisTemplate.zadd(V_UNIQUE_ROOM_ID, time, JSON.toJSONString(roomMessage));
    // Only occasionally pay the cost of trimming old messages
    if (probability()) {
        deleteOverTimeCache(V_UNIQUE_ROOM_ID);
    }
} catch (Exception e) {
    log.error("message save error", e);
}

As shown, storing messages in a Redis SortedSet is convenient. Next, we need to delete expired messages from Redis, because expired messages have no value here (if required, all messages can also be persisted elsewhere). If a single key holds too many messages, lookups slow down and memory usage keeps growing, which we want to avoid. Since this is sample code, the deletion logic is kept simple.

private void deleteOverTimeCache(String roomId) {
    Long totalCount = jedisTemplate.zcard(roomId);
    log.info("deleteOverTimeCache size is {}", totalCount);
    if (totalCount < 600) {
        return;
    }
    // Fetch the newest 601 entries (ascending by score); the first one marks the cut-off
    Set<Tuple> tuples = jedisTemplate.zrangeWithScores(roomId, -601, -1);
    if (CollectionUtils.isNotEmpty(tuples)) {
        for (Tuple tuple : tuples) {
            // Delete everything up to and including the cut-off score, then stop
            double score = tuple.getScore();
            jedisTemplate.zremrangeByScore(roomId, 0d, score);
            break;
        }
    }
}

In the pseudo-code above, probability() first makes a probabilistic decision, for example a 1% random chance, about whether this request should run the cleanup (note that the deletion logic sits inside the insert logic). If every insert had to check for stale data, insert performance would suffer. After the probabilistic check, we first look at the number of messages in the live room. If it is still small, we exit the deletion logic. If it exceeds the threshold, we delete the oldest messages by score and keep only the most recent ones.
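The article does not show probability() itself. One possible implementation, given here as a sketch rather than the original code, uses ThreadLocalRandom so that concurrent request threads do not contend on a shared random generator. The class name and the percent parameter are assumptions made for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper: decides whether the current insert should also run the cleanup.
final class CleanupSampler {
    /** Returns true for roughly `percent` out of every 100 calls. */
    static boolean probability(int percent) {
        // nextInt(100) is uniform over 0..99, so values below `percent` occur percent% of the time.
        return ThreadLocalRandom.current().nextInt(100) < percent;
    }

    public static void main(String[] args) {
        int hits = 0;
        for (int i = 0; i < 100_000; i++) {
            if (probability(1)) {
                hits++;
            }
        }
        System.out.println("cleanup ran on " + hits + " of 100000 inserts");
    }
}
```

With a 1% rate the cleanup cost is spread over roughly one insert in a hundred, at the price of the set temporarily growing past the threshold between cleanups.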

With message storage for HTTP short polling out of the way, let's briefly look at how the client queries messages. The client sends two fields to the server: the live room ID and a timestamp. The timestamp is the one returned by the server in its previous response and increases over time: on each request the client passes back the timestamp it last received. The pseudo-code is as follows:

@Override
public RoomMessage queryRoomMessages(MessageMessageReq messageMessageReq) {
    RoomMessage result = new RoomMessage();
    long timestamp = messageMessageReq.getTimestamp();
    Set<Tuple> tuples;
    if (timestamp == 0) {
        // First request from the client (or a reset): return only the single latest message
        tuples = jedisTemplate.zrevrangeWithScores(UNIQUE_ROOM_ID, 0, 0);
    } else {
        // Add one millisecond so the last delivered message is not returned again.
        // Return at most 5 messages per poll so low-end phones are not flooded with rendering work.
        tuples = jedisTemplate.zrangeByScoreWithScores(UNIQUE_ROOM_ID, timestamp + 1, System.currentTimeMillis(), 0, 5);
    }
    List<EachRoomMessage> eachRoomMessages = new ArrayList<>();
    // Start from the client's own timestamp so an empty result does not reset its cursor to 0
    long lastTimestamp = timestamp;
    if (!CollectionUtils.isEmpty(tuples)) {
        for (Tuple tuple : tuples) {
            // After the loop, lastTimestamp holds the score of the last message returned
            lastTimestamp = (long) tuple.getScore();
            eachRoomMessages.add(JSON.parseObject(tuple.getElement(), EachRoomMessage.class));
        }
    }
    result.setTimestamp(lastTimestamp);
    result.setEachRoomMessages(eachRoomMessages);
    return result;
}

The three code listings above outline a live room message capability based on HTTP short polling. It is relatively rough, but it is a workable approach. Some of our online businesses currently implement certain modules based on polling as well.
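The store and query logic can be tried out without Redis. The sketch below is an in-memory stand-in written for illustration, not the production code: a list kept in ascending score order plays the role of the SortedSet, trimming is done on every insert instead of probabilistically, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for one room's Redis SortedSet.
final class InMemoryRoomMessages {
    static final class Entry {
        final long score;      // server receive time in milliseconds
        final String payload;  // serialized message
        Entry(long score, String payload) {
            this.score = score;
            this.payload = payload;
        }
    }

    private final List<Entry> entries = new ArrayList<>(); // ascending by score
    private final int maxSize;

    InMemoryRoomMessages(int maxSize) {
        this.maxSize = maxSize;
    }

    /** Stores a message and drops the oldest ones beyond maxSize. */
    synchronized void add(long score, String payload) {
        int i = entries.size();
        while (i > 0 && entries.get(i - 1).score > score) {
            i--;
        }
        entries.add(i, new Entry(score, payload));
        while (entries.size() > maxSize) {
            entries.remove(0);
        }
    }

    /** timestamp == 0 returns only the latest message; otherwise up to `limit` newer ones. */
    synchronized List<Entry> poll(long timestamp, long now, int limit) {
        List<Entry> out = new ArrayList<>();
        if (entries.isEmpty()) {
            return out;
        }
        if (timestamp == 0) {
            out.add(entries.get(entries.size() - 1));
            return out;
        }
        for (Entry e : entries) {
            if (e.score > timestamp && e.score <= now) {
                out.add(e);
                if (out.size() == limit) {
                    break;
                }
            }
        }
        return out;
    }
}
```

A client would call poll(0, ...) once, then keep passing back the score of the last entry it received. As in the Redis version, two messages that share a millisecond with the cursor can be missed, which is acceptable only for messages that may be discarded.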

This approach has one small pitfall worth avoiding. If an Android client loses its network connection, the polling thread does not stop. Suppose the network drops at 8:00 and is restored at 8:01. The first poll after recovery can cause the server to return a large number of messages. This needs to be handled: otherwise the server slows down from returning too much data, and the client briefly flashes through stale messages as it renders them, for example with the public chat area scrolling wildly. The client and server can agree on a rule to avoid this: when the client's network has been unavailable for more than, say, 5 seconds, it sets the timestamp to 0 and asks the server for only the latest live room message. The messages lost in between can be discarded as far as the business allows.
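The reset rule described above fits in a single client-side function. This is a minimal sketch under the 5-second example threshold; the class, method, and parameter names are hypothetical.

```java
// Hypothetical client-side helper: decides which timestamp to send on the next poll.
final class PollCursor {
    /** Longest gap between successful polls before the cursor is reset (example value). */
    static final long MAX_GAP_MILLIS = 5_000;

    /**
     * Returns 0 to request only the latest message after a long gap,
     * otherwise the timestamp from the server's previous response.
     */
    static long nextRequestTimestamp(long lastServerTimestamp, long lastSuccessMillis, long nowMillis) {
        boolean gapTooLong = nowMillis - lastSuccessMillis > MAX_GAP_MILLIS;
        return gapTooLong ? 0L : lastServerTimestamp;
    }

    public static void main(String[] args) {
        long lastSuccess = 1_000_000L;
        // One second after the last successful poll: keep the cursor.
        System.out.println(nextRequestTimestamp(999_500L, lastSuccess, lastSuccess + 1_000));
        // One minute after the last successful poll: reset to 0.
        System.out.println(nextRequestTimestamp(999_500L, lastSuccess, lastSuccess + 60_000));
    }
}
```

Measuring the gap from the last successful response, rather than from a network state callback, keeps the rule independent of how the platform reports connectivity.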

4. Summary

The purpose of this article is to give a preliminary understanding of live streaming, its basic model, and some basic concepts. Later articles will study the individual modules in more depth and look further into the principles of live streaming, which will also help us do better work on live streaming business.

By Li Guolin, Vivo Internet Server Team