We are honored to invite Mr. Lin Zhengxian, chief 5G architect at Huya Live, to introduce the misconceptions and opportunities around 5G low latency. This article starts with the principles behind 5G low latency, then works through five common public misconceptions about it, and finally shares Huya Live's thinking on building a low-latency deterministic network and on applying 5G in other scenarios.

Hello everyone, my name is Lin Zhengxian, from Huya Live. Today I would like to talk about some issues related to 5G low latency. The reason I chose this topic is that while Huya was doing some 5G landing practice, we found that the numbers we measured in our projects were very different from the numbers in publicity and media reports. We did a lot of exploration and analysis, and today I want to share what we have thought about and what we have seen.

First, let's talk a little about what 5G low latency is and how it works. Then I'll go through the myths around 5G low latency; then MEC (Multi-access Edge Computing), which is closely related to 5G; then some of our practice with 5G low latency and MEC; and finally Huya's thinking on 5G low latency, as well as future challenges and opportunities.

Some (misleading) news about 5G

In the past year or two, 5G has been a hot topic covered by a lot of media. The graph on the left tells us that Wi-Fi bandwidth is poor when there are lots of people, 4G is so-so, but 5G is very fast. Another application of 5G is remote driving: it is claimed that, thanks to ultra-low latency, 5G lets you drive a car thousands of kilometres away without leaving home. Others say 5G can greatly speed up downloads. Whether these reports are true or false, you will have your own understanding after listening to my explanation today.

Definition of 5G latency

What is the low latency of 5G? The definition in 3GPP is very clear: for URLLC (ultra-reliable low-latency communication scenarios), the standard value is 0.5 ms in each of the uplink and downlink (about 1 ms RTT). For eMBB (the enhanced mobile broadband scenario), the uplink and downlink latency targets are 4 ms each. What we usually do, such as surfing the Internet, live streaming and watching video, is almost all eMBB service.

Why 5G can achieve low latency

How does 5G achieve low latency? On the radio side, from the mobile terminal to the base station, what measures are taken? And inside the core network, what is the difference between 5G and 3G/4G?

Let's start with this diagram. Whether it is 5G, 4G or 3G, if a phone has not been sending data, it has to request permission before it can send uplink data to the base station, a bit like raising a hand to answer a question in class. First it applies to the base station for uplink radio resources; the base station allocates them according to its current load and returns an available time and radio resource to the user, and only then can the user transmit the data. This "hand-raising" opportunity is a periodic window rather than something available at any moment, and we call it a scheduling request (SR). Scheduling requests introduce latency, and they account for a significant portion of the overall latency on the radio side. Because 5G wants low latency, this scheduling step can be removed through configuration: as long as data arrives, it can be sent directly in the configured time window, with no extra application required.

The second measure is the slot. 5G supports mini-slots. In 4G and 5G, a slot is divided into 7 or 14 symbols, and the duration of one symbol is essentially one period of the subcarrier waveform, i.e. the reciprocal of the subcarrier spacing. In 4G the subcarrier spacing is 15 kHz, so a symbol lasts about 1/15 ms and a full 14-symbol scheduling unit is about 1 ms, which is the basic scheduling unit of 4G. In 5G the slot is still the basic scheduling unit, but the subcarrier spacing can be wider, reaching 30 kHz, 60 kHz, 120 kHz or even 240 kHz; the symbol duration then shrinks proportionally, so the whole scheduling cycle becomes shorter. In extreme cases, a complete 5G scheduling cycle can be as short as tens of microseconds. Furthermore, 5G supports mini-slots, where as few as two symbols form the basic scheduling unit, pushing the scheduling cycle down to the microsecond level.

Another measure is preemptive scheduling. Because the eMBB scheduling unit is on the order of 1 ms, if I am in the middle of sending eMBB data and URLLC data (which has higher priority) arrives, the eMBB transmission can be paused and the URLLC data sent first on the already-granted radio resources. This ensures that higher-priority data gets out faster.

The last diagram describes transmission reliability, which is closely tied to low latency: unreliable transmission causes packet loss and retransmission, which objectively increases latency. 5G's answer is redundancy. A packet is duplicated, and the same data is sent over different subcarriers and channels; the different carriers may belong to the same base station or even to two different base stations, so that reliability and stability are guaranteed through redundancy. In addition, to guarantee reliability in extreme cases, some radio efficiency may be sacrificed in favour of more interference-resistant modulation, for example choosing QPSK instead of 16QAM for the sake of stability. These are, broadly speaking, the measures taken on the radio side.
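To make the slot arithmetic above concrete, here is a rough back-of-the-envelope sketch (my own illustration, not from the talk): the symbol duration is approximated as the reciprocal of the subcarrier spacing, the cyclic prefix is ignored, and a mini-slot is taken to be 2 symbols.

```python
# Rough sketch: how subcarrier spacing (SCS) shrinks the scheduling unit.
# Symbol duration ~ 1/SCS (cyclic prefix ignored); a full slot is 14 symbols.
SYMBOLS_PER_SLOT = 14
MINI_SLOT_SYMBOLS = 2

for scs_khz in (15, 30, 60, 120, 240):       # 15 kHz is the 4G numerology
    symbol_us = 1e3 / scs_khz                # microseconds per symbol
    slot_us = symbol_us * SYMBOLS_PER_SLOT   # full-slot scheduling unit
    mini_us = symbol_us * MINI_SLOT_SYMBOLS  # mini-slot scheduling unit
    print(f"SCS {scs_khz:>3} kHz: symbol ~ {symbol_us:5.1f} us, "
          f"slot ~ {slot_us:6.1f} us, mini-slot ~ {mini_us:5.1f} us")
```

At 15 kHz this reproduces the roughly 1 ms 4G scheduling unit; at 120 or 240 kHz the full slot drops to tens of microseconds and the mini-slot to single-digit microseconds, matching the figures above.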

In the core network, starting with 3G, traffic from the mobile network to the public network goes through a gateway (the data gateway between the mobile network and the Internet). In WCDMA it is called the GGSN, in 4G the P-GW, and in 5G the UPF; the names differ, but they are all gateways. In the 3G and 4G eras, each province typically deployed these gateways in only one or two locations; in Guangdong, for example, they were most likely in Guangzhou, Shenzhen or Dongguan. Users elsewhere in the province, even when accessing a nearby server, still had to go through the base station and the core network to that distant gateway, loop around, and only then come back to the original server, a very roundabout path. 5G, however, can sink the gateway much closer to the base station we are attached to, even into the same machine room, which we call UPF sinking. Look at the green line on the right: we go through the base station, the base station connects to the local UPF, straight onto the metropolitan area network (MAN), and then we can reach the local server. We no longer need to travel hundreds of kilometres to Guangzhou like the yellow line, across the backbone network and then back down through the MAN. Operators have their own strategies for how to do this, such as uplink traffic diversion, which we won't go into here. It sounds like 5G is doing a great deal for low latency, and it does seem to work, so 5G's low-latency benefits are widely talked about by the public. Unfortunately, many people's understanding of the topic has some deviations when they discuss it.
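As a rough illustration of why sinking the UPF matters, consider just the fibre propagation delay. The distances below are made-up examples, and light in fibre is taken as roughly 200 km per millisecond:

```python
# Propagation delay only, for an illustrative "detour vs. local UPF" comparison.
FIBER_KM_PER_MS = 200.0   # ~2/3 of the speed of light, in optical fibre

def propagation_rtt_ms(one_way_km: float) -> float:
    """Round-trip propagation delay over fibre, ignoring all other delay sources."""
    return 2 * one_way_km / FIBER_KM_PER_MS

print(f"detour via a provincial gateway ~1500 km away: {propagation_rtt_ms(1500):.1f} ms RTT")
print(f"via a local (sunk) UPF ~30 km away:            {propagation_rtt_ms(30):.2f} ms RTT")
```

Propagation alone does not explain the whole gap, but it shows why keeping traffic on the local MAN is a prerequisite for low end-to-end latency.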

Misconception #01: Confusing theoretical limit latency with actual latency

First, it is easy to confuse the theoretical latency with the actual latency, especially when friends in the media, intentionally or unintentionally, steer the discussion that way.

Let's take a closer look at the "scheduling" process mentioned earlier. When a piece of data is about to be produced on the phone, we cannot send it right away; we have to wait for a periodically configured scheduling request (SR) window (the period might be configured to 1 ms, or to 80 ms). After the SR is sent (think of it as pressing the button at a traffic light to ask to cross), the base station considers the user's priority, the occupancy of the channel and so on, and decides when to send a grant to the phone. The granted channel may be small, perhaps just enough for the user to send a BSR (buffer status report), with which the phone tells the base station that it now has, say, 20 KB of data to send; on receiving it, the base station knows there is a lot of data and grants a larger channel for the user to transmit. This exchange happens, and most users will experience it, whether on 4G or 5G. Grant-free scheduling in 5G simply removes the step of applying via SR, but it is unlikely to be used for eMBB services because of the high cost of reserving resources. The longer the uplink scheduling request period, the greater the latency, because this process introduces on average half an SR period of extra delay: you are faster if you are lucky, and if you are unlucky the SR window has just passed and you must wait for the next one. If the period is configured small enough, the latency is still under control; in theory, even 4G can get down to a few milliseconds.
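A minimal sketch of that "half the SR period on average" point, with illustrative configuration values (my own numbers, not measurements):

```python
# Average and worst-case wait for the next periodic SR opportunity,
# assuming data arrives at a uniformly random time within the period.
for sr_period_ms in (1, 10, 20, 80):
    avg_wait_ms = sr_period_ms / 2    # mean of a uniform wait over one period
    worst_wait_ms = sr_period_ms      # just missed the previous window
    print(f"SR period {sr_period_ms:>2} ms -> average wait {avg_wait_ms:4.1f} ms, "
          f"worst case {worst_wait_ms:2d} ms, before the grant/BSR/data steps even start")
```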

Unfortunately, although 4G latency can theoretically reach a few milliseconds, our 4G measurements are not ideal. During Huya's peak hours, the RTT from 4G users to a same-city server is more than 40 ms, while the RTT from Wi-Fi users to the same server at the same time is generally under 20 ms. A white paper last year listed many 5G-to-B scenarios; with 5G combined with a sunk UPF, the end-to-end latency was basically between 16 and 20 ms, which is close to the data we see on 5G networks. Why is the gap between reality and the ideal so large? When we talk about latency, it includes processing delay, queuing delay, sending (serialization) delay and propagation delay. Processing delay is introduced by operations such as validating the packet, and is negligible. Propagation delay depends on the speed of light in fibre from A to B, a speed we cannot change, and when the distance is short enough the time is negligible. What remains are queuing delay and sending delay, which I will discuss in detail later; in typical theoretical analyses we tend to ignore them, and that is the main reason for the gap between reality and the ideal. Another problem is retransmission caused by packet loss: in many cases the network RTT itself is not bad, but occasional packet loss triggers retransmissions that push the actual latency up.
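Here is a toy decomposition of a one-way latency budget into those four components. All the numbers are made up for illustration (a 1200-byte packet, a 20 Mbit/s effective uplink, 50 km of fibre), but they show how queuing, rather than propagation or processing, tends to dominate:

```python
# Toy one-way latency budget: processing + propagation + sending + queuing.
def serialization_ms(packet_bytes: int, link_mbps: float) -> float:
    """Time to clock the packet onto the link (sending/transmission delay)."""
    return packet_bytes * 8 / (link_mbps * 1e3)   # Mbit/s -> bits per millisecond

packet_bytes   = 1200                                 # a typical video payload
processing_ms  = 0.05                                 # header checks etc., negligible
propagation_ms = 50 / 200.0                           # 50 km of fibre at ~200 km/ms
sending_ms     = serialization_ms(packet_bytes, 20)   # 20 Mbit/s effective uplink
queuing_ms     = 8.0                                  # waiting for grants and other users

total_ms = processing_ms + propagation_ms + sending_ms + queuing_ms
print(f"processing {processing_ms} ms + propagation {propagation_ms} ms + "
      f"sending {sending_ms:.2f} ms + queuing {queuing_ms} ms = {total_ms:.2f} ms")
```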

Misconception #02: Confusing the latency of different technologies and application scenarios

The second misconception is often seen in the media. 5G covers many different scenarios and technologies, but when it comes to reporting, the numbers quoted are always those of the most advanced technology.

For example, 5G has three application scenarios: eMBB (enhanced mobile broadband) for large bandwidth, URLLC for highly reliable, low-latency services, and mMTC for massive machine-type communication, which corresponds to the wide-area connectivity of the Internet of Things. Typically, the extremely low latency of URLLC is only required in scenarios such as autonomous driving, where the end-to-end latency may need to be as low as two or three milliseconds. But one thing cannot be ignored: achieving very low latency comes at a considerable price. Frankly, in many scenarios extremely low latency is simply not necessary. When browsing a web page, do you really need 3 ms of end-to-end latency? Because the costs differ so widely, it is not feasible to provide the lowest latency in every scenario. Beyond the high cost of redundancy, modulation and other radio-side measures, end-to-end low latency also requires very strict end-to-end QoS guarantees.

What does 4G QoS look like? 4G data transmission uses two kinds of bearers: the default bearer and the dedicated bearer. A default bearer is created when the data connection is set up, while dedicated bearers build additional channels for flows that require different QoS. Setting up a bearer is not free, and the granularity is coarse: a phone can typically establish only a handful to a dozen or so dedicated bearers. That is why 4G QoS is rarely heard of in daily life; it tends to appear only in the operators' own services, for example to give VoLTE a high-priority guarantee, although operators are now slowly opening it up.

5G QoS is relatively more flexible: we can set different PCC (policy and charging control) rules for different flows, corresponding to different charging and quality-assurance rules. It is no longer based on bearers but on flows, and flows are often identified by address/port tuples or other traffic characteristics. The figure above differs from other 5G QoS diagrams because it shows QoS across multiple access technologies, i.e. multi-access. For example, besides the 3GPP access network, 5G can reach the target address not only over the air interface but also through non-3GPP media such as Wi-Fi, forming multiple access channels to ensure the reliability of data access. However, 3GPP represents the interests of operators and equipment vendors, so this feature may not be widely used. When we do transmission today, we usually consider using Wi-Fi, 4G, 5G or MPTCP for simultaneous multi-path transmission, and this is usually realized at the application layer. 3GPP is trying to push it down below the application, but from discussions I have had with other app vendors, no one seems willing to pay for this; app vendors would rather keep control of it themselves. Another thing that goes hand in hand with QoS is slicing, which I won't go into deeply because it is too big a topic. The purpose of slicing is to guarantee different priorities for different services, but the technology has been a pitfall on the handset side: support for URSP in the 5G phones currently on the market is minimal, almost non-existent. Although 3GPP has defined it, in the short term this technique can only be used in to-B scenarios, for example accessing different slices through different DNNs. For today's phones it is difficult to switch from one slice to another: in theory we could map different apps or flows with different characteristics to different slices and enjoy different QoS guarantees, but terminals do not currently support this.

Misconception #03: Confusing air-interface latency with full-link latency

A more common misconception: when we talk about 5G's low latency, we are really talking about its air-interface latency, but many people confuse it with the end-to-end latency of the network, or even the end-to-end latency of the business.

We keep saying 5G low latency, but what we really mean is the low latency between the phone and the base station. The end-to-end path goes from the air interface to the transport (bearer) network to the core network, from the core network to the public network, possibly across several IDCs and even into another operator's core network, and then through the access network to finally reach the other end. Frankly, even if the 5G air interface achieved zero or one millisecond of latency, at best that would only level the gap with fixed-line access; with wired access, that first hop is no worse than the air interface anyway. What we really need is low end-to-end latency, so what about the rest of the link?

We can look at TSN (Time-Sensitive Networking), which is used in the industrial Internet; by integrating 5G into such a network, we hope to get an end-to-end, highly stable, ultra-low-latency solution. The core idea is redundancy. As the figure shows, when the controller wants to control a remote device with a 5G network in the middle, the traffic is split at the end station into two flows, which attach through two terminals and are carried over different transmission paths. 5G has a lot of similar multi-link, multi-path mechanisms worth learning from, which I don't have time to expand on here. The point I want to emphasize is that low end-to-end latency matters more than low air-interface latency alone, and the industrial Internet is a good place to look for inspiration.

Misconception #04: Confusing no-load latency with loaded latency

Another misconception is that people confuse no-load latency (when the network is idle) with latency under load, or even under overload.

On holidays, for example, when the freeways are jammed with traffic, we don't expect to move quickly. In the network this corresponds to queuing delay: queuing for resources, waiting for the previous user's buffer to drain, which is a major source of latency for us. On the radio side, everyone wants to upload data, and since the base station has to schedule it, queuing is inevitable, unless my "road" is extremely wide, with plenty of lanes for everyone, which is hard to achieve. If you are interested in wireless networking, there are plenty of places to test this. The classic one is a subway station at rush hour, for example Xierqi station in Beijing. Measured at the peak, the intra-city RTT there can be 200 to 300 ms. Even with no load, measured early in the morning or in the middle of the night, we still found a latency of around 80 ms. This is related to how the base station is configured. Going back to the "scheduling cycle" mentioned earlier, the operator usually tunes this value according to how busy the base station is. If I want to make sure a large number of people can attach to the base station at the same time, I have to fit more users into a limited control channel, so the scheduling cycle has to be stretched so that every user gets access. My guess is that the scheduling cycle configured at that site is no less than 80 ms, which by itself adds about 40 ms to the average RTT, an interesting phenomenon you can observe in subway stations.
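A quick Monte-Carlo check of that last figure (my own sketch): with an 80 ms scheduling cycle and data arriving at a uniformly random moment, the average extra wait for the next scheduling opportunity comes out at roughly 40 ms.

```python
# Simulate the extra wait introduced by an 80 ms scheduling cycle.
import random

CYCLE_MS = 80.0
SAMPLES = 100_000

# For an arrival at a uniformly random point in the cycle, the wait until
# the next scheduling window is uniform on [0, CYCLE_MS].
waits = [random.uniform(0.0, CYCLE_MS) for _ in range(SAMPLES)]
print(f"average extra wait ~ {sum(waits) / SAMPLES:.1f} ms "
      f"(analytically {CYCLE_MS / 2:.0f} ms for an {CYCLE_MS:.0f} ms cycle)")
```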

Misconception #05: Ignoring the impact of bandwidth on latency

A final misconception is that we tend to ignore the impact of bandwidth on latency. For example, while only small packets are being exchanged the latency looks very low, but once real business traffic is loaded on, the latency becomes 15 ms, or even 40 ms or 100 ms.

I think many of you in this room work on video. Except in infinite-GOP scenarios, encoded video contains an I frame, followed by P frames and perhaps B frames, and so on. An I frame is very large, possibly many times the size of a P frame, depending on the encoding parameters. Suppose the stream has an average bitrate of 10 Mbps: if the network gave us a perfectly smooth 10 Mbps, we could transmit it comfortably, but video is bursty, especially around I frames. In the end we find that transmitting the I frame takes a long time and can even block the transmission of the next P or B frame. For example, in a 10 Mbps video stream an I frame may reach 200 KB; even over a 100 Mbps link, that takes about 16 ms to transmit. What does that mean? In a cloud game running at 60 fps, the interval between two frames is about 16 ms. So for a 10 Mbps video stream I need roughly 100 Mbps of bandwidth so that the I frame does not delay the next frame. That is why we say 5G is an advantage for cloud gaming: large bandwidth brings low latency. It is hard to get that much downlink with 4G, and many cloud-gaming vendors are working with 20 to 25 Mbps of bandwidth.
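A back-of-the-envelope check of that I-frame example, using the figures from the talk (a 200 KB I frame, 60 fps) plus a couple of link rates for comparison:

```python
# Serialization time of a single I frame versus the 60 fps frame interval.
def transmit_ms(frame_bytes: int, link_mbps: float) -> float:
    """Time to push one frame onto a link of the given rate."""
    return frame_bytes * 8 / (link_mbps * 1e3)   # Mbit/s -> bits per millisecond

i_frame_bytes = 200 * 1024          # ~200 KB I frame from a ~10 Mbps stream
frame_interval_ms = 1000 / 60       # ~16.7 ms between frames at 60 fps

for link_mbps in (10, 25, 100):
    t = transmit_ms(i_frame_bytes, link_mbps)
    verdict = "fits in one frame interval" if t <= frame_interval_ms else "blocks the next frame"
    print(f"{link_mbps:>3} Mbps link: I frame takes {t:6.1f} ms ({verdict})")
```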

Low Latency and "5G + Edge Computing"

Now we can put all of this together and talk about MEC. In its narrow sense, MEC originally meant Mobile Edge Computing; later the "M" was changed to Multi-access (Multi-access Edge Computing) to broaden its scope beyond mobile networks. The initial idea was to use the UPF sinking described earlier (shortening the distance between the egress gateway and the base station) and to place computing and storage resources in the machine room where the UPF sits. The benefit is that latency becomes very low because the distance to the end user is very short. The latency gap between putting compute in the UPF machine room and putting it in a central machine room is considerable, especially since the central machine room may not even be in the same province as the UPF.

Huya has made an attempt at edge computing: we wanted to apply a comic-style transformation to the anchor's image in a live broadcast. However, we found that many anchors' streaming devices were not particularly powerful, especially low-end and mid-range phones, so it was difficult to run a demanding AI style transfer on the phone itself. So we thought: can we move the computation into the cloud and still do it with low latency? The general pipeline is as follows: the anchor's image passes through the lens, is captured by the CMOS sensor, processed by the ISP and handed to the app; the app does some pre-processing, encodes the video, and sends it over the network; the AI node in the edge machine room first decodes it and runs the AI processing; one stream is then re-encoded and distributed to the audience, while the other is returned to the anchor. For the anchor, the whole thing should feel just like processing done locally on the phone. It is a beautiful idea. The latency the anchor perceives covers steps 1 to 4 in the figure. So which stage do you think contributes the most latency: capture, encoding, transmission, decoding or rendering? It is actually the capture stage. On Android phones, and on the iPhone 11 and earlier, sensitive users can feel the delay even in a purely local camera preview; it is between 80 and 120 ms. What the phone does is nothing more than imaging: the frame is read out from the sensor after exposure, sent to the ISP for processing, and then sent on to be rendered. The depth of this pipeline and the overall architecture of the Android camera stack determine the latency, and we had not paid attention to that at the time. In some scenarios the network is not the bottleneck at all: the network can reach very low latency, especially in transmission, where an RTT of 20 to 30 ms is achievable; a typical 1080p encode on a phone takes 30 to 40 ms, but capture can easily cost up to three frames. So when we try edge computing, we have to ask ourselves: is our bottleneck really the network?
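Summing the ballpark figures from this example gives a rough latency budget for the loop the anchor perceives. The edge-side decode-plus-AI number below is my own assumption for illustration; the talk does not give one:

```python
# Rough latency budget for the anchor-perceived edge-processing loop.
budget_ms = {
    "camera capture (sensor + ISP pipeline)": (80, 120),
    "encode on the phone (1080p)":            (30, 40),
    "uplink + edge network RTT":              (20, 30),
    "decode + AI processing at the edge":     (10, 20),   # assumed, not from the talk
}

low = sum(lo for lo, _ in budget_ms.values())
high = sum(hi for _, hi in budget_ms.values())
for stage, (lo, hi) in budget_ms.items():
    print(f"{stage:<42} {lo:>3}-{hi:<3} ms")
print(f"{'total (anchor-perceived loop)':<42} {low:>3}-{high:<3} ms")
```

Even with these crude numbers, capture dominates the budget, which is exactly the point: the network is not always the bottleneck.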

5G versus Wi-Fi

Now I want to talk about Huya's thinking on 5G low latency. We have had a lot of internal discussion about the relationship between Wi-Fi and 5G. Looking at data across the whole network, users reach us over two channels, Wi-Fi and 4G, and frankly most of the traffic still comes from Wi-Fi, that is, from home broadband. Mobile 4G users also rarely use any ultra-low-latency service offered by the carriers. Even on 5G, most of the time we will not be using URLLC, nor mMTC, the IoT-oriented network; we will mostly be using eMBB services. Over the air interface, the RTT of eMBB is around 8 ms, while good Wi-Fi is around 2 ms. Where Wi-Fi performs badly today, it is usually because people are still on the 2.4 GHz band (which is easily interfered with), on Wi-Fi 4, or behind poor APs and home gateways. Frankly, if you could upgrade the Wi-Fi of every user on your platform, I believe you could cut the video stall rate considerably. The theoretical latency of eMBB is somewhat higher than the actual latency of Wi-Fi, and Wi-Fi 6 and the interference-resistant Wi-Fi 6E (which uses a new frequency band and is not affected by congestion at 2.4 GHz or 5 GHz) may widen the gap further. What we can do is wait and see how 5G optimizes the existing network and how millimetre wave gets applied.

Millimetre wave is not widely used in our country; we use the sub-6 GHz bands. Millimetre-wave coverage is limited, but there is a lot of spectrum available, so it can deliver very large bandwidth; in China it is likely to be used as a supplement in hotspots. Once millimetre wave reaches the market, I think 5G can close the gap with Wi-Fi on low latency, but probably only close it; that is just my opinion. That raises the question: how do we position 5G if it cannot beat Wi-Fi on latency? 5G has advantages over Wi-Fi in mobility and wide-area coverage. We cannot carry Wi-Fi everywhere, especially in outdoor scenes. So we need to judge for ourselves which businesses and scenarios suit 5G. 5G's advantage in wide-area coverage is worth paying attention to, and we need to consider whether it has an affinity with our business. Personally, I think services such as AR, AR glasses and outdoor live streaming have a good affinity with 5G. By the way, I would not pay for things like "5G + VR", because VR has little to do with 5G: VR is mostly an indoor application, and indoors, isn't Wi-Fi better than 5G? On top of that, 5G has data charges. AR is different, because you have to move around and blend with the live scene, and the same goes for outdoor live streaming.

Interaction at the 10-millisecond level

Another consideration: with the progress of 5G technology, the era of 10-millisecond interaction is approaching. There is another kind of "5G" called F5G, the so-called fifth-generation fixed network, which means fibre to the room plus Wi-Fi 6; its latency is very low, its coverage is strong and its bandwidth is high. Back to 5G itself: in theory, if the network is well built out, an RTT of 10 to 20 ms can be met. Even out in the wild, with no Wi-Fi or 5G coverage, low-orbit satellites (at a distance of 300 to 500 km) can achieve an RTT of 20 to 30 ms, which means the whole world can be wrapped in a relatively low-latency network. If the whole chain can be built, we can achieve an end-to-end latency within a hundred milliseconds. Whereas before we thought mostly in terms of seconds or hundreds of milliseconds, now we need to think in terms of 10 ms. In the future, Wi-Fi will mainly be used indoors, and 5G may mainly be used outdoors, in shopping malls and so on. As for the wild, we have always wanted to live-stream from Tibet, and with the advent of low-orbit satellites that could become an option.
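A rough sanity check on the low-orbit-satellite figure (my own sketch): assuming a simple bent-pipe path with the satellite roughly overhead, propagation alone costs only a few milliseconds, and the rest of the quoted 20 to 30 ms comes from processing, queuing and terrestrial backhaul.

```python
# Propagation-only RTT through a LEO satellite (bent pipe, satellite overhead):
# up to the satellite, down to a ground station, and the same on the way back.
SPEED_OF_LIGHT_KM_PER_MS = 300.0

for altitude_km in (300, 500):
    propagation_ms = 4 * altitude_km / SPEED_OF_LIGHT_KM_PER_MS   # four traversals
    print(f"altitude {altitude_km} km: propagation-only RTT ~ {propagation_ms:.1f} ms")
```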

Huya's practice mainly develops in two directions: real-time content operation and interactive live streaming. We believe that as latency comes down, interaction will become a different experience. So we have invested in the cloud games you are familiar with, in multi-player interaction, in multi-player interactive games combined with live-streaming scenes, and in the virtual-plus-real interactive broadcasts we have been exploring. At the technical level, though, I need a good end-to-end network to guarantee ultra-low-latency interaction. Besides Wi-Fi (which still has a lot of room for optimization), operators will also step up their efforts to open up 5G QoS. We will rely on 5G QoS plus multi-access (dual links, with Wi-Fi and 5G connected simultaneously) and multi-path on the public network, plus SD-WAN-like means, to build an LDDN (low-delay deterministic network).
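As a minimal application-layer sketch of the multi-access-plus-redundancy idea (not Huya's actual LDDN implementation): each packet is duplicated over two UDP sockets bound to two local addresses, one standing in for the Wi-Fi interface and one for the cellular interface, and the receiver keeps only the first copy of each sequence number so that the faster path wins. The addresses and port below are hypothetical placeholders.

```python
import socket
import struct

WIFI_LOCAL_ADDR = ("192.168.1.23", 0)    # hypothetical Wi-Fi interface address
CELL_LOCAL_ADDR = ("10.64.0.5", 0)       # hypothetical cellular interface address
SERVER_ADDR = ("203.0.113.10", 9000)     # hypothetical low-latency relay server

def make_sender_sockets():
    """One UDP socket per access network, pinned to that network's local address."""
    socks = []
    for local_addr in (WIFI_LOCAL_ADDR, CELL_LOCAL_ADDR):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind(local_addr)
        socks.append(s)
    return socks

def send_redundant(socks, seq: int, payload: bytes):
    """Send the same sequence-numbered packet on every path."""
    packet = struct.pack("!I", seq) + payload
    for s in socks:
        s.sendto(packet, SERVER_ADDR)

class Deduplicator:
    """Receiver side: deliver each sequence number once; late duplicates are dropped."""
    def __init__(self):
        self.seen = set()

    def accept(self, packet: bytes):
        seq = struct.unpack("!I", packet[:4])[0]
        if seq in self.seen:
            return None
        self.seen.add(seq)
        return packet[4:]
```

On a real device the sockets would be bound to the actual Wi-Fi and cellular interface addresses, and the duplication could be made selective, for example only for I frames or control messages, to keep the extra traffic cost down.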

Other landing directions for 5G low latency

Setting aside live streaming, and setting aside Huya, let's look at where else 5G low latency can land. On the to-B side, I am very optimistic about vehicle-road collaboration and edge intelligence. For example, a video application uploads video to the edge, where it is processed for structuring or feature extraction, so the raw video no longer has to be sent to a central machine room in the cloud thousands of kilometres away; this offloads both computing and the network nicely. There is also remote control and collaboration, for example industrial AR applications where we can sit at home and guide a worker in Europe through repairing a car. And of course there is autonomous driving. On the to-C side, the key thing to grasp is the division of labour between 5G and Wi-Fi: what Wi-Fi can already do, nobody wants to do over 5G. In China, more than 90% of households have broadband, 94% of which is optical fibre, so there should be little need to use 5G for indoor broadband access; we need to focus on which applications have an affinity with 5G. Maybe in the future, once autonomous driving reaches the market and cars are driverless, people sitting in the car will want some entertainment; a car can only connect wirelessly, so that will have to be 5G.

However, there are still many problems with 5G low latency. The whole country's 5G network is lightly loaded, and the current latency is only so-so; it remains to be seen what the latency will look like under heavy load. Millimetre wave has its advantages, but when will it come online in China?

In addition, what we care about in the business is end-to-end low latency, which depends on the maturity of the whole ecosystem. Take camera capture: if there is always a delay of 2 to 3 frames there, the end-to-end latency bottleneck will be hard to break. Camera latency is also inseparable from the operating system itself, for example how the camera pipeline is set up on Android phones; the depth of the whole pipeline determines the latency floor. As another example, Android 11's MediaCodec / Codec 2.0 adds some enhancements for low-latency decoding, and we will see more of this in the future. Also, do we prepare for low latency when we write our applications? I talk to a lot of developers who still use the Java-layer API for audio playback; if you don't use OpenSL ES or AAudio, audio playback latency at the Java layer can be as high as 200 ms on the phone, and all the earlier effort at the network level is wasted.

On the technical side there are also the RTC challenges mentioned above. For Huya, we want to follow the QoS of the radio access network, and we may rely on the QoS the operators give us. At the same time, because optimizing the air interface alone is far from enough for the whole link, we will also adopt multi-access (Wi-Fi plus 5G, or Wi-Fi plus 4G dual links), multi-path routing on the public network, and even low-correlation multi-path routes to ensure transmission reliability. At this point I am not building a carrier-grade deterministic network; we only need a simple version, which is enough to support our ultra-low latency.

At the market level, operators keep publicizing 5G, but 5G has no price advantage over Wi-Fi, and data allowances are still a problem. In the future there will surely be latency-differentiated products (normal, low and ultra-low latency), which comes back to how QoS is promoted. As a result, operators' strategies have a profound impact on the development of 5G.

Conclusion

To sum up: there is a considerable gap between the theoretical and the engineering numbers for 5G latency, and we need to face up to this; what we need is not just low air-interface latency but low end-to-end latency; 5G is not a panacea, and its real value lies in expanding your business in the spatial dimension, taking it to places it could not reach before; and technically, to meet the coming 10-millisecond low-latency era, we need to build up a low-latency deterministic network. That is the end of today's sharing, thank you!