Alibaba Cloud PTS builds on the experience of Double 11 and is an extension of Alibaba's full-link stress testing. Its elasticity lets users generate traffic at the scale of millions without paying for machines or labor. With precise flow control and real-time pulse capability, PTS is well suited to the rapidly rising traffic pulses of live video.

Author: Zi Jin

“From January this year until now, the number of Taobao Live users has exceeded 500 million. As of August, Taobao Live traffic had grown by 59%, and the GMV of its core merchants had grown by 55%. Double 11 starts on the evening of October 20, and we hope Taobao Live will be the main venue for it.” Cheng Daofang, head of the live streaming division of the Taobao business group, said this in a recent interview with the 21st Century Business Herald, adding that live streaming was lively over the past year and will be more professional this year.

With such a large user base, what particular challenges do live streaming applications pose for back-end services? This article introduces a common live streaming architecture and the challenges it brings to the application.

Architecture of live streaming

We usually see the following kinds of live streams:

1. Single-host live streams, such as Taobao Live, usually accompanied by business logic such as flash sales (seckill), bullet comments, and virtual gifts such as rockets;

2. Multi-person live streams, such as co-hosting (lianmai) and video conferences;

3. Recording. For scenarios such as training sessions and meetings, the live video needs to be saved for distribution and retention, so the stream has to be recorded. Real-time requirements here are usually low.

When a user watches a live stream and the service is connected to a CDN, the player pulls the stream from the nearest CDN node, so the pull-stream pressure falls on the CDN. If no CDN is connected, the player pulls the stream directly from the origin server.

Below is a common video streaming architecture with two data flows:

1. The push-pull logic of the video stream, shown by the blue line;

2. The general business logic, shown by the yellow line.

You can see that there are four main modules:

1. Push (streaming) end: collects the host's audio and video data and pushes it to the streaming media server;

2. Streaming media server: converts the data received from the push end into the required formats and delivers it to the player end so that different players can watch it. Cloud vendors now offer complete solutions for the streaming media server;

3. Business server: handles general business logic, such as flash sales and bullet comments;

4. Player end: plays the audio and video and presents the content to the user.

The protocols used between the four modules are mainly streaming media transport protocols, and the modules do not all have to use the same one. Most live streaming systems follow the structure shown in the figure above; the main difference is whether a CDN is introduced. In general, we recommend introducing a CDN to reduce the impact of live streaming traffic on the server.

Next, following this architecture, we discuss which parts are most at risk and how stress testing can identify those risks early.

Challenges in live broadcasting

Challenge 1: Pressure from video streaming on streaming media servers

In the push-pull path, the streaming media server comes under pressure because the video traffic is large and the route is long. When a user starts watching, the player first pulls the stream from the nearest CDN node. If the content is not yet cached on that node, the CDN goes back to the origin, that is, it fetches the content from the streaming media server.

The risk lies in the moment when a large number of users start watching through the CDN at the same time and trigger a large number of back-to-origin requests. This kind of traffic pulse can have unpredictable effects on the streaming media server.
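The back-to-origin pulse can be illustrated with a toy model. The sketch below is only an assumption-laden simulation, not a description of any real CDN: the function name `simulate_cold_start`, the random assignment of viewers to nodes, and the `coalesce` switch (whether a node merges concurrent cache misses into one origin fetch) are all hypothetical.

```python
import random
from collections import Counter


def simulate_cold_start(viewers: int, cdn_nodes: int, coalesce: bool, seed: int = 0) -> int:
    """Return the number of origin fetches caused by one burst of viewers.

    Every viewer asks a CDN node for the same, not-yet-cached content.
    If `coalesce` is True, a node merges its concurrent misses into a single
    origin fetch; otherwise every miss is forwarded to the origin.
    """
    rng = random.Random(seed)
    # Assign each viewer to a node (a stand-in for "nearest CDN node").
    requests_per_node = Counter(rng.randrange(cdn_nodes) for _ in range(viewers))
    if coalesce:
        return len(requests_per_node)        # one fetch per node that saw traffic
    return sum(requests_per_node.values())   # every request reaches the origin


if __name__ == "__main__":
    for coalesce in (False, True):
        fetches = simulate_cold_start(viewers=100_000, cdn_nodes=200, coalesce=coalesce)
        print(f"coalesce={coalesce}: {fetches} origin fetches")
```

Even in this simplified model, the load that reaches the streaming media server depends heavily on how the CDN handles simultaneous misses, which is why it is worth measuring with a real test rather than assuming.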

We usually verify the link in advance through stress testing, and can even use the stress test to pre-warm the video in the CDN. However, traditional HTTP stress testing does not support this scenario well, because:

1. Open source tools such as srs_bench exist, and JMeter offers plug-ins, but they require users to understand the video protocols in depth, so the barrier to entry is fairly high;

2. Video stress testing requires very high bandwidth, which makes the load-generating machines relatively expensive;

3. Video stress testing has to take into account the effect of region on transmission quality.

To solve these problems, PTS adds support for the RTMP and HLS protocols and builds an abstraction around the stress testing scenario, so that users can work with different protocols in a single interface.

In addition, PTS provides several orchestration modes that make it easy to arrange scenarios. More importantly, PTS can customize the geographic distribution of the load across the country, simulating user requests from different locations so that problems are found more quickly.
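To make the bandwidth point concrete, here is a rough back-of-the-envelope calculation. All numbers are illustrative assumptions (viewer count, per-viewer bitrate, NIC speed, and the fraction of NIC capacity treated as usable); the function names are hypothetical.

```python
import math


def required_bandwidth_gbps(viewers: int, bitrate_mbps: float) -> float:
    """Aggregate pull bandwidth if every simulated viewer streams at `bitrate_mbps`."""
    return viewers * bitrate_mbps / 1000


def machines_needed(viewers: int, bitrate_mbps: float,
                    nic_gbps: float, usable_fraction: float = 0.7) -> int:
    """Load generators needed if each one can sustain only part of its NIC speed."""
    total = required_bandwidth_gbps(viewers, bitrate_mbps)
    return math.ceil(total / (nic_gbps * usable_fraction))


if __name__ == "__main__":
    # Assumed scenario: 100,000 simulated viewers at 2 Mbps each.
    print(required_bandwidth_gbps(100_000, 2.0), "Gbps in total")
    print(machines_needed(100_000, 2.0, nic_gbps=10.0), "machines with 10 Gbps NICs")
```

Under these assumptions the test needs about 200 Gbps of pull bandwidth, which is far beyond what a handful of self-managed machines can provide.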
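For readers unfamiliar with what a video pull involves, the following minimal sketch shows the first step of an HLS pull: reading a media playlist and listing the segments a client would then download over HTTP. It illustrates the protocol only and is not how PTS implements it; the sample playlist, the `example.com` URL, and the function name `parse_media_playlist` are assumptions made for illustration.

```python
from urllib.parse import urljoin

# A small, made-up HLS media playlist.
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:120
#EXTINF:6.0,
seg120.ts
#EXTINF:6.0,
seg121.ts
#EXTINF:4.5,
seg122.ts
"""


def parse_media_playlist(text: str, base_url: str) -> list[tuple[str, float]]:
    """Return (absolute segment URL, duration in seconds) pairs from a media playlist."""
    segments = []
    duration = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXTINF:"):
            # Tag format: "#EXTINF:<duration>,[<title>]"
            duration = float(line[len("#EXTINF:"):].split(",", 1)[0])
        elif line and not line.startswith("#") and duration is not None:
            # A non-tag line following #EXTINF is the segment URI.
            segments.append((urljoin(base_url, line), duration))
            duration = None
    return segments


if __name__ == "__main__":
    for url, seconds in parse_media_playlist(SAMPLE_PLAYLIST,
                                             "https://example.com/live/stream.m3u8"):
        print(f"{seconds:>4}s  {url}")
```

A load generator for a live HLS stream would repeat this cycle, re-fetching the playlist and downloading new segments as they appear, for every simulated viewer.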

Challenge 2: Low latency interactive protocols

Unlike traditional promotions, live streaming depends on interaction with the audience through bullet comments, comments, chat, flash sales, and so on. If the host is chatting in full swing and the users do not respond, the live stream has failed. Ordinary HTTP requests cannot meet this demand for timeliness, because HTTP is a stateless, request-response protocol. These functions are therefore usually implemented with WebSocket, which keeps a long-lived connection between client and server to deliver messages in real time and reduce performance overhead.

Every WebSocket connection starts with an HTTP request during the handshake phase. Through this request the client tells the server the WebSocket version it supports, the subprotocol version, the origin, and the host address. The key part of the request is the Upgrade header, which asks the server to upgrade the current HTTP request to the WebSocket protocol. If the server supports it, the status code returned must be 101:

    HTTP/1.1 101 Switching Protocols
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Accept: XXXXXXXXXXXXXXXXXXXX

Once this response is returned, the WebSocket connection is established, and data is then transferred entirely according to the WebSocket protocol.

JMeter provides plug-ins that simulate the whole WebSocket communication process, but they also require users to understand how the protocol works and are relatively obscure to use. PTS abstracts this into business terms, and users only configure the scenario and the load. A few simple parameters, such as the target URL, output parameter settings, and checkpoint settings, are enough to work with a complex protocol.

Besides live streaming, WebSocket is also widely used in online games, stock and fund trading, live sports updates, chat rooms, bullet comments, online education, and other scenarios with high real-time requirements.
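A client or test tool can verify the handshake response by recomputing the expected Sec-WebSocket-Accept value. The sketch below follows the rule in RFC 6455 (SHA-1 of the client's Sec-WebSocket-Key concatenated with a fixed GUID, then Base64). The helper names `compute_accept` and `handshake_ok` are hypothetical, and the sample key is the one used as an example in the RFC.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value the server must send for this key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_ok(status: int, headers: dict[str, str], sent_key: str) -> bool:
    """Check a handshake response: status 101, upgrade headers, and matching accept value."""
    h = {name.lower(): value for name, value in headers.items()}
    return (
        status == 101
        and h.get("upgrade", "").lower() == "websocket"
        and "upgrade" in h.get("connection", "").lower()
        and h.get("sec-websocket-accept") == compute_accept(sent_key)
    )


if __name__ == "__main__":
    key = "dGhlIHNhbXBsZSBub25jZQ=="  # sample key from RFC 6455
    print(compute_accept(key))
```

A checkpoint on the status code alone would miss a server that answers 101 with a wrong accept value, so comparing the header as well gives a stricter test.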

Challenge 3: High concurrent pulse traffic

Unlike ordinary applications, live streaming applications are used in a very concentrated time window, so a large number of users flood in within a few hours. A live stream by a major influencer (a "big V") often draws millions of logins, which places very high demands on the system's ability to cope with pulse traffic. Product grabs also differ from conventional flash sales: they usually start when the host suddenly announces them, so the timing is often imprecise. This raises the pulse requirement further, and problems that never appear at ordinary times, or even under traditional heavy traffic, begin to surface, such as lazy loading, JIT warm-up, and switching between hot and cold data.

These two characteristics require the testing tool to be able to generate large traffic instantly. This takes not only more machines and engines but also more precise flow control, so that the test can reproduce a rapid increase in traffic.

These two points are exactly the strengths of Alibaba Cloud PTS. PTS builds on the experience of Double 11 and is an extension of Alibaba's full-link stress testing. Its elasticity lets users generate traffic at the scale of millions without paying for machines or labor. With precise flow control and real-time pulse capability, PTS is well suited to the rapidly rising traffic pulses of live video.
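What a pulse means for a load generator can be sketched as a per-second target rate. The example below is a hypothetical illustration and not PTS's configuration format: `pulse_profile` builds a schedule that jumps from a baseline to a peak at a chosen second, and `max_step` reports the largest one-second jump the tool would have to deliver.

```python
def pulse_profile(duration_s: int, baseline_rps: int, peak_rps: int,
                  pulse_start_s: int, pulse_len_s: int) -> list[int]:
    """Target requests per second for each second of the test."""
    pulse_end_s = pulse_start_s + pulse_len_s
    return [
        peak_rps if pulse_start_s <= t < pulse_end_s else baseline_rps
        for t in range(duration_s)
    ]


def max_step(profile: list[int]) -> int:
    """Largest increase in target rate between two consecutive seconds."""
    return max((b - a for a, b in zip(profile, profile[1:])), default=0)


if __name__ == "__main__":
    # Assumed scenario: 100 rps of background traffic, 5,000 rps for 2 seconds
    # when the host announces the sale at second 3.
    profile = pulse_profile(duration_s=10, baseline_rps=100, peak_rps=5000,
                            pulse_start_s=3, pulse_len_s=2)
    print(profile)
    print("total requests:", sum(profile), "largest 1s jump:", max_step(profile))
```

A tool that ramps up gradually cannot reproduce the jump in this schedule, which is the reason instant traffic generation and precise flow control matter for this kind of test.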

Conclusion

In response to changes in the video and live streaming industries, PTS has comprehensively updated the protocols it supports. Beyond traditional HTTP requests, it now supports HTTP/2, streaming media, MQTT, and other protocols, allowing users to Test Anywhere!

This article is the original content of Aliyun and shall not be reproduced without permission.