This article is based on a talk by Qi Jiyue, senior engine engineer at NetEase Yunxin, at MCtalk Live #4: "The Balance of Video QoE." A live-stream replay and a Q&A compilation appear at the end of the article.
Takeaway
With the rapid development of the Internet, demand for Real-Time Communication (RTC) grows by the day. Achieving the best video Quality of Experience (QoE) under complex network Quality of Service (QoS) conditions and across uneven hardware terminals is a key part of RTC technology.
Starting from the Video Quality Controller (VQC) module, this article introduces some of the work NetEase Yunxin NERTC has done to improve video QoE.
Role of VQC in video QoE
Video QoE mainly comprises three indicators: video clarity, video fluency, and video delay. They are jointly determined by network QoS, the video processing algorithms, and the VQC.
- Network QoS: provides the most available bandwidth possible
- Video processing algorithms: output the best possible video quality at a given bit rate
- VQC:
  - works with QoS to control the bit rate, ensuring fluency and low delay
  - works with the video algorithms to guard performance and balance clarity against fluency
The VQC monitors the video QoS state and the video algorithm state, and outputs control signals to achieve the best QoE for the scene, balancing clarity, fluency, and delay. Today we mainly share the VQC implementation and QoE tuning work in NetEase Yunxin NERTC.
NetEase Yunxin's VQC implementation
The VQC module of NetEase Yunxin partly draws on WebRTC's module design. As the overall structure figure shows, it consists mainly of four monitoring modules and one policy module. Input parameters pass through the monitoring modules, which produce the current state results; the VQC policy module then determines the final control signals that drive the video pipeline. Each module is described in detail below.
QualityScaler
The QualityScaler module monitors the current encoding quality; it is mainly responsible for clarity and encoder stability.
Its inputs are the QP thresholds (determined by the encoder type and encoding algorithm), the QP of each encoded output frame, and statistics on current frame drops; its output is a judgment of video quality.
The QP smoother module uses an exponentially weighted moving average to compute the QP statistic, as follows:

Y = alpha × Y + (1 − alpha) × sample
Let’s look at the composition of this formula in detail. In the formula:
- Sample is the QP of the current frame
- Y is the QP statistic
- Alpha is a set of coefficients chosen from the QP behavior of each encoder, with different values used for the upper and lower bounds. For example, when we tested the OpenH264 encoder, the QP upper-bound coefficient could be 0.9995 and the QP lower-bound coefficient 0.9999. This asymmetry makes the statistic respond quickly when video quality deteriorates (QP rises) and somewhat sluggishly when video quality improves (QP falls).
The final upper and lower QP statistics are compared against the input QP thresholds to judge the current picture quality. The thresholds also depend on the encoder: on OpenH264, for example, the lower threshold is 24 and the upper threshold is 37. The hardware encoders on iOS devices use other values, and hardware encoders on Android differ again; all of these require extensive validation on real devices.
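The two smoothed statistics and their comparison against the thresholds can be sketched as follows. This is a minimal Python sketch: the coefficients (0.9995 / 0.9999) and the OpenH264 thresholds (24, 37) come from the article; the class shape and method names are illustrative.

```python
class ExpFilter:
    """Exponentially weighted moving average: y = a*y + (1-a)*sample."""
    def __init__(self, alpha):
        self.alpha = alpha
        self.y = None

    def apply(self, sample):
        if self.y is None:
            self.y = float(sample)
        else:
            self.y = self.alpha * self.y + (1.0 - self.alpha) * sample
        return self.y


class QpSmoother:
    """Tracks two QP statistics with different smoothing coefficients and
    compares them against encoder-specific thresholds (OpenH264: 24/37)."""
    def __init__(self, qp_low=24, qp_high=37):
        self.high = ExpFilter(0.9995)  # upper statistic: reacts faster to rising QP
        self.low = ExpFilter(0.9999)   # lower statistic: reacts more slowly
        self.qp_low, self.qp_high = qp_low, qp_high

    def add_frame(self, qp):
        self.high.apply(qp)
        self.low.apply(qp)

    def quality_high(self):
        # lower statistic at or below the lower threshold -> good quality
        return self.low.y is not None and self.low.y <= self.qp_low

    def quality_low(self):
        # upper statistic above the upper threshold -> poor quality
        return self.high.y is not None and self.high.y > self.qp_high
```

With a steady low QP the lower statistic settles below 24 and the smoother reports good quality; a sustained high QP pushes the upper statistic past 37 and it reports poor quality.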
MovingAverage is a sliding-window function: the frame-drop ratio within the window is computed, and when it exceeds a certain threshold the quality is considered to have deteriorated.
Finally, an internal periodic query collects the results of the QP smoother and MovingAverage and outputs one of two verdicts (otherwise no verdict is emitted):
- Good video quality
  - The QP smoother's lower statistic is at or below the QP lower threshold
- Poor video quality
  - The QP smoother's upper statistic exceeds the QP upper threshold
  - The MovingAverage frame-drop ratio exceeds its threshold
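The frame-drop side of the check can be sketched as a sliding window. The window size and threshold below are illustrative assumptions, not Yunxin's actual values.

```python
from collections import deque

class MovingAverage:
    """Sliding window over per-frame drop flags; a drop ratio above the
    threshold is treated as quality deterioration."""
    def __init__(self, window=30, threshold=0.6):
        self.samples = deque(maxlen=window)  # 1 = dropped, 0 = encoded
        self.threshold = threshold

    def add(self, dropped):
        self.samples.append(1 if dropped else 0)

    def degraded(self):
        if not self.samples:
            return False
        return sum(self.samples) / len(self.samples) > self.threshold
```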
OveruseFrameDetector
The OveruseFrameDetector module monitors whether current device performance can sustain the current frame rate; it is responsible for video fluency.
Its inputs are the current target frame rate, the resolution, the CPU usage thresholds, and the capture and send timestamps of each video frame; its output is a good/bad verdict on current performance.
The ProcessingUsage module uses each frame's capture and send timestamps to measure the processing time of the whole send pipeline, from capture to send. After smoothing, this statistic is compared with the theoretical frame interval at the current frame rate, and frames whose processing time exceeds the theoretical value are counted. The counts are collected periodically: if the count exceeds an upper threshold, the module reports poor CPU performance; if it stays below a lower threshold, it reports good CPU performance.
The module must guard against spurious good/bad verdicts, for example:
- With a small sample size (e.g., at a low frame rate), little data accumulates per collection period, which easily produces wrong results
- Just after a new frame rate or resolution takes effect, the per-stage processing times are not yet representative and also need special handling
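A minimal sketch of this over-budget counting, including the small-sample guard. The smoothing factor and all count thresholds here are illustrative assumptions.

```python
class ProcessingUsage:
    """Compares smoothed per-frame processing time (capture -> send) with the
    theoretical frame interval and counts over-budget frames per period."""
    def __init__(self, target_fps, over_budget_limit=10, under_budget_limit=2):
        self.frame_interval_ms = 1000.0 / target_fps  # theoretical frame interval
        self.over_budget_limit = over_budget_limit
        self.under_budget_limit = under_budget_limit
        self.smoothed_ms = None
        self.over_count = 0
        self.total = 0

    def add_frame(self, capture_ms, send_ms):
        elapsed = send_ms - capture_ms
        # Simple exponential smoothing of the processing time.
        if self.smoothed_ms is None:
            self.smoothed_ms = float(elapsed)
        else:
            self.smoothed_ms = 0.9 * self.smoothed_ms + 0.1 * elapsed
        self.total += 1
        if self.smoothed_ms > self.frame_interval_ms:
            self.over_count += 1

    def check_and_reset(self):
        """Periodic query: 'overuse', 'underuse', or None (inconclusive)."""
        result = None
        if self.total >= 15:  # guard against small sample sizes (low frame rate)
            if self.over_count > self.over_budget_limit:
                result = "overuse"
            elif self.over_count < self.under_budget_limit:
                result = "underuse"
        self.over_count = 0
        self.total = 0
        return result
```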
RateAllocator
The RateAllocator module decides how the current bit rate is used and acts as the policy module for large/small stream usage in simulcast scenarios.
This module has several key functions:
- When there are multiple remote users, some subscribing to the large stream and some to the small stream, the module decides how to split the limited bit rate between the two
- In the same scenario, when the bit rate is severely insufficient, the module may decide to merge the large and small streams into a single stream to improve picture quality
- When downlink bandwidth is limited, the module decides whether the sender needs to reduce its send bandwidth
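A toy version of the split-or-merge decision. The bit-rate numbers and the merge condition are invented for illustration; the real module also weighs the subscriber mix and downlink feedback.

```python
def allocate_simulcast(total_kbps, big_min_kbps=500, small_kbps=150):
    """Hypothetical simulcast split.

    Returns (big_kbps, small_kbps); small_kbps == 0 means the two streams
    are merged into a single stream so all bits go to one better picture.
    """
    if total_kbps >= big_min_kbps + small_kbps:
        # Enough budget for both: give the small stream its fixed share,
        # let the large stream take the rest.
        return total_kbps - small_kbps, small_kbps
    # Bandwidth too tight for a useful large stream plus a small one:
    # merge into a single stream.
    return total_kbps, 0
```

For example, 1000 kbps splits into an 850 kbps large stream and a 150 kbps small stream, while 400 kbps collapses to a single 400 kbps stream.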
MediaOptimization
The MediaOptimization module monitors and corrects the real-time bit rate and frame rate, preventing the network congestion that bit-rate overshoot would cause; once congested, the network degrades further, reducing picture quality, fluency, and delay across the board.
The module controls the real-time bit rate mainly through its internal FrameDropper, which uses a funnel (leaky-bucket) algorithm to decide whether the current bit rate is overshooting and whether frames must be dropped to stabilize it.
Around each encoded frame, the frame's actual encoded size pours into the funnel while its target bit budget drains out. The funnel is then checked: if it is full, the bit rate is overshooting and the next frame to be encoded is dropped to rein it in. The funnel's capacity corresponds to the allowable delay and must be defined per scene.
The drop/no-drop result is also fed to the QualityScaler module as one basis for evaluating encoding quality.
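A leaky-bucket frame dropper along these lines might look like the sketch below. The capacity formula, target values, and accounting details are assumptions for illustration.

```python
class FrameDropper:
    """Leaky-bucket ('funnel') frame dropper sketch.

    Each encoded frame's actual size fills the bucket; the per-frame byte
    budget drains it. A full bucket means sustained overshoot, so the next
    frame is dropped. Capacity is tied to the allowable delay.
    """
    def __init__(self, target_kbps, fps, max_delay_ms=200):
        bytes_per_second = target_kbps * 1000 / 8
        self.budget_per_frame = bytes_per_second / fps          # bytes allowed per frame
        self.capacity = bytes_per_second * max_delay_ms / 1000  # backlog tolerated
        self.level = 0.0                                        # current backlog in bytes

    def drop_next_frame(self):
        return self.level >= self.capacity

    def on_frame_encoded(self, actual_bytes):
        # Actual output fills the funnel; the per-frame budget drains it.
        self.level = max(0.0, self.level + actual_bytes - self.budget_per_frame)
```

At 800 kbps and 20 fps the per-frame budget is 5000 bytes and a 200 ms delay allows 20000 bytes of backlog, so four consecutive 10000-byte frames trigger a drop.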
VQC decision module
The VQC decision module decides the current video strategy based on the results of all the preceding VQC modules and the user's scene settings.
It contains two state machines and a decision module.
The two state machines operate independently of each other:
- Video quality state machine
- Performance state machine
The decision module's key functions include:
- Set thresholds for various internal adjustments based on user-set scenarios and desired video parameters
- Based on the state machines' results, decides whether to raise or lower video parameters (resolution, frame rate) and the strategy for stepping them up or down
- Based on other information, decides other encoding parameters for the current frames, such as whether the large stream, the small stream, or both are encoded in a Simulcast dual-stream scenario
- Based on other information, decides whether algorithms need adjusting, such as the encoding algorithm, post-processing algorithms, etc.
Video QoE tuning via VQC
The VQC ensures good video QoE through full-link monitoring and regulation of video quality. Below we introduce some of Yunxin RTC's QoE tuning work done through the VQC.
Judging encoding quality correctly
Many metrics characterize encoding quality: PSNR, SSIM, QP, VMAF, and so on. Because of the peculiarities of hardware encoders and the computational cost of obtaining these metrics, QP was chosen as the evaluation standard.
For QP to correctly reflect encoding quality, the following points must be considered:
- The conventional slice QP in H.264/H.265 generally reflects only the quality of the first few encoded macroblocks. In software encoding, the average QP is a better per-frame QP and yields a better judgment of software-encoding quality.
- Different coding algorithms need different QP thresholds. For example, OpenH264 can use (24, 37) as the lower and upper QP limits, but the values must be adjusted per encoder and per coding algorithm; our NE264, NE265, and NEVC coding algorithms each need corresponding adaptation.
- The QP thresholds of hardware encoders vary across acceleration platforms, such as iOS, Android, and even individual Android chip platforms.
- The smoothing coefficients of the QP statistics must also be tuned to capture each encoder's characteristics.
Correctly identify performance problems
To prevent QoE degradation caused by performance problems, we must identify them accurately and adjust correctly and effectively. Currently our VQC uses video-frame processing time to represent the performance state. To identify that state correctly, the following aspects must be considered:
- The measurement should cover the performance of the whole pipeline, including preprocessing and encoding
- Some hardware has pipeline latency that must be taken into account
- Non-uniform frame intervals can cause performance problems to be misjudged; this pattern must be recognized
To make effective adjustments, we need to consider the following:
- Adjust according to the performance-cost priority measured in testing. For example, the priority of some modules in our tests is: preprocessing > coding algorithm > frame-rate adjustment > resolution adjustment
- If the measured performance state does not change after an adjustment, there must be a fallback: feed the adjustment and its result back to the state machine, which reports to the decision module for the next decision
- If the performance state swings widely, enlarge the adjustment step
Optimization adjustment
An adjustment is effective when it clearly improves video QoE. The main dimensions we adjust are:
- Resolution adjustment
- Frame rate adjustment
- Simulcast stream tuning
- Some algorithm switches for preprocessing
- Codec adjustment
The VQC itself is optimized along these lines:
- Users can configure multiple scenarios and policies
  - Communication mode, live mode
  - Deep customization: special scene modes, fixed-resolution mode, fixed-frame-rate mode, minimum frame rate, minimum bit rate, and other settings
- Internal adaptive adjustment: extensive testing determines the parameter combinations, adjustment step sizes, and best paths for specific scenes, such as the step sizes and paths for resolution and frame-rate adjustment
Conclusion
This article introduced the design of the VQC video quality control system in NetEase Yunxin RTC and some of our QoE tuning work. No strategy is perfect, and you can't have your cake and eat it too: our QoE tuning balances clarity, fluency, and latency in many ways under given conditions, and the optimal strategy is found through combinations of strategies and extensive data testing.
Q&A compilation
The following is compiled from the Q&A records of the online live-stream group:
- Questions from @Yiwei Yihang:
Q: In the RateAllocator, SFU forwarding is generally used. In that case, does Simulcast adjust the sender's stream bit rates based on feedback relayed by the server from all subscribers?
A: Our server implements strategies for various scenarios. The default is a configurable top-N strategy: the leading portion of viewers receive the high-definition large stream where possible, while the small number of users with poor network quality receive the small stream.
Q (follow-up): My question is not about the server's SFU forwarding policy, but the sender-side bit-rate adjustment policy across the large and small streams. When you discussed rate control, you mentioned that the sender receives congestion-control (CC) bandwidth feedback while sending Simulcast. Can you adjust the bit rates of the large and small streams taking the network state of different receivers into account?
A: We have downlink CC. The server estimates bandwidth from the downlink CC output, synthesizes an appropriate bandwidth, and feeds it back to the sender. On the sender, Simulcast works from the total bit rate, and our module decides whether to send the large stream, the small stream, or both. The server also decides per receiver whether to forward the large or the small stream downstream.
Q: How much delay do beauty filters and super-resolution add?
A: If their processing time is smaller than the frame interval, they add no delay to our pipeline. We adapt dynamically: if the processing time exceeds the frame interval, the feature is dynamically disabled. The delay of the whole pipeline stays under one frame interval.
Q: Work like this is generally based on WebRTC, where switching the codec mid-call seems to require re-creating the PeerConnection and renegotiating. Do you support mid-call codec switching because you added it yourselves, or does WebRTC itself support it?
A: We reference WebRTC in places, but codec switching does not require renegotiation. We negotiate capabilities in-channel over a private protocol, so unlike WebRTC we do not need an SDP exchange; we have our own capability-negotiation protocol and switch inside the audio/video engine.
Q: Another question: how do you handle keyframe requests in an RTC conference room? If every newly joined user triggers a keyframe request, the room generates a lot of traffic; if not, new users may wait up to a full GOP. What strategy balances this?
A: 1. The general logic is that each newly joined user sends an intra request, and the sender rate-limits how often keyframes are generated, so there are neither too many keyframes nor slow first-frame rendering. 2. We have also made some optimizations for fast first-frame rendering, such as having the server issue the intra request in advance and caching the most recent keyframe. These are details we tune in practice and they must be adapted to your own scenarios; our strategies also differ between live streaming and communication.
- Question from @Galen:
Q: Could you explain how stutter is detected? Is there a particular threshold for the interval between frames?
A: We count 200 ms and 500 ms stutters: we measure the actual rendering frame intervals and count a small or large stutter whenever an interval exceeds 200 ms or 500 ms, respectively.
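The stutter counting described in the answer can be sketched as follows; the 200/500 ms thresholds come from the answer, while the function shape is illustrative.

```python
def count_stutters(render_times_ms, small_ms=200, large_ms=500):
    """Count small (>200 ms) and large (>500 ms) gaps between rendered frames.

    render_times_ms: monotonically increasing render timestamps in ms.
    Returns (small_count, large_count); a large stutter is not double-counted
    as a small one.
    """
    small = large = 0
    for prev, cur in zip(render_times_ms, render_times_ms[1:]):
        gap = cur - prev
        if gap > large_ms:
            large += 1
        elif gap > small_ms:
            small += 1
    return small, large
```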
About the author
Qi Jiyue, senior engine engineer at NetEase Yunxin, has long worked on audio and video development, with deep experience in the WebRTC engine, audio/video conferencing systems, and video encoding/decoding. He is currently responsible for the video experience of NetEase Yunxin's NERTC engine.
Video replay address: McTalk.yunxin.163.com/details-liv…