DVQA, a deep-learning-based full-reference video quality assessment algorithm developed by Tencent Multimedia Lab, has been officially open-sourced on GitHub. Its model achieves industry-leading results on public test datasets.

Open source: github.com/Tencent/DVQ…

Domestic mirror address: git.code.tencent.com/Tencent_Ope… Tencent's Worker Bee source code system provides open source developers with complete, up-to-date domestic mirrors of Tencent open source projects.

In the audio-visual era, audio and video applications are increasingly widespread: live streaming, short video, video programs, audio and video calls… Recently, with the rise of online collaborative office and online education products driven by COVID-19, demand for online audio and video has exploded, and users expect ever higher audio and video quality.

Most modules in the end-to-end video pipeline can be measured accurately, such as capture, upload, preprocessing, transcoding and distribution. What remains unknown is the most critical part: how users actually experience the video they watch. Video quality assessment methods in the industry currently fall into two categories: objective quality assessment and subjective quality assessment. The former computes a quality score for the video, and is further subdivided by whether a high-definition source video is available as a reference and whether the source video is professionally produced or user-generated. The latter relies mainly on human viewers watching and scoring, which directly reflects how the audience perceives video quality. However, these methods still have drawbacks: they are time-consuming and labor-intensive, costly, and subject to bias in subjective perception.

Tencent Multimedia Lab proposes a three-part solution for video quality assessment. First, based on business needs, it uses an “online subjective quality evaluation platform” to build a large-scale subjective quality database. Second, it uses the collected subjective data to train objective quality assessment algorithms based on deep learning. Finally, the trained algorithms are deployed in the business lines to monitor possible quality problems in a closed loop. Working from these three perspectives, DVQA can meet the two requirements of efficiency and accuracy while accommodating different businesses and scenarios.

DVQA contains multiple quality assessment models; this release open-sources C3DVQA, the algorithm for PGC (professionally generated content) video. The project is developed in Python, with PyTorch as the deep learning framework. The code is modular, which makes it easier to integrate newer deep learning techniques, customize models flexibly, and train and test on new datasets.

In terms of algorithm design, the network structure used by C3DVQA is shown in the figure below. Its inputs are the distorted video and the residual video. The network first uses two 2D convolution layers to extract spatial features frame by frame. After these features are cascaded, four 3D convolution layers learn joint spatio-temporal features. The output of the 3D convolutions describes the spatio-temporal masking effect of the video, which is then used to simulate how the human eye perceives the video residual: where the masking effect is weak, the residual is more easily perceived; where the masking effect is strong, a complex background masks the distortion better.

Finally, the network has a pooling layer and a fully connected layer. The input of the pooling layer is the residual frame after the masking effect has been applied, which represents the residual as perceived by the human eye. The fully connected layer learns the nonlinear regression between the globally perceived quality and the target quality score range.
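The structure described above can be sketched in PyTorch roughly as follows. This is an illustrative sketch and not the released C3DVQA code: the class name, channel widths, kernel sizes, activations, single-channel (luma) input, the reading of "cascaded" as channel-wise concatenation, the multiplication of the mask with the absolute residual, and the small two-layer regressor are all assumptions made for the example.

```python
import torch
import torch.nn as nn


class C3DVQASketch(nn.Module):
    """Hypothetical sketch: 2D convs per frame, 3D convs for a mask, pooling, regression."""

    def __init__(self, channels: int = 16):
        super().__init__()

        def spatial_branch() -> nn.Sequential:
            # Two 2D convolution layers, applied to every frame independently.
            return nn.Sequential(
                nn.Conv2d(1, channels, kernel_size=3, padding=1), nn.ReLU(),
                nn.Conv2d(channels, channels, kernel_size=3, padding=1), nn.ReLU(),
            )

        self.distorted_branch = spatial_branch()
        self.residual_branch = spatial_branch()
        # Four 3D convolution layers that turn the joined features into a
        # spatio-temporal mask with values in (0, 1) (sigmoid is an assumption).
        self.spatiotemporal = nn.Sequential(
            nn.Conv3d(2 * channels, channels, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(channels, channels, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(channels, channels, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(channels, 1, kernel_size=3, padding=1), nn.Sigmoid(),
        )
        # Regression from the pooled perceived residual to a quality score.
        self.regressor = nn.Sequential(nn.Linear(1, 8), nn.ReLU(), nn.Linear(8, 1))

    @staticmethod
    def _per_frame(branch: nn.Module, video: torch.Tensor) -> torch.Tensor:
        # video: (batch, frames, height, width)
        b, t, h, w = video.shape
        feats = branch(video.reshape(b * t, 1, h, w))          # (b*t, c, h, w)
        return feats.reshape(b, t, -1, h, w).permute(0, 2, 1, 3, 4)  # (b, c, t, h, w)

    def forward(self, distorted: torch.Tensor, residual: torch.Tensor) -> torch.Tensor:
        feats = torch.cat(
            [self._per_frame(self.distorted_branch, distorted),
             self._per_frame(self.residual_branch, residual)],
            dim=1,
        )                                                       # (b, 2c, t, h, w)
        mask = self.spatiotemporal(feats)                       # (b, 1, t, h, w)
        perceived = mask * residual.unsqueeze(1).abs()          # masked residual
        pooled = perceived.mean(dim=(2, 3, 4))                  # global pooling -> (b, 1)
        return self.regressor(pooled).squeeze(1)                # one score per video


# Random tensors stand in for a distorted clip and its residual.
model = C3DVQASketch()
scores = model(torch.rand(2, 4, 16, 16), torch.rand(2, 4, 16, 16))
```

Here the mask scales the residual before pooling, so a residual in a strongly masked region contributes less to the pooled value that the regressor maps to a score.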

In terms of evaluation results, Tencent Multimedia Lab verified the performance of the proposed algorithm on the LIVE and CSIQ video quality datasets. The standard PLCC (Pearson linear correlation coefficient) and SROCC (Spearman rank-order correlation coefficient) are used as criteria to compare the performance of different algorithms. The proposed C3DVQA was compared with commonly used full-reference quality assessment algorithms, including PSNR, MOVIE, ST-MAD, VMAF and DeepVQA. The results are shown in the table below.

(Performance comparison of different full reference algorithms on LIVE and CSIQ databases)
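For readers unfamiliar with the two criteria, the following minimal sketch shows how PLCC and SROCC could be computed between predicted scores and subjective scores using only the Python standard library. The function names and the sample numbers are made up for illustration and are not taken from the DVQA repository or from the reported results. Some evaluations also fit a nonlinear mapping to the predictions before computing PLCC; that step is not shown here.

```python
import math


def pearson(x, y):
    """PLCC: Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """SROCC: Pearson correlation computed on the ranks of the values."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up scores: the relationship is monotonic but not linear.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
subjective = [1.0, 4.0, 9.0, 16.0, 25.0]
plcc = pearson(predicted, subjective)
srocc = spearman(predicted, subjective)
```

In this toy case the ordering of the two score lists agrees perfectly, so SROCC is 1, while PLCC is slightly below 1 because the relationship is not linear. That is why the two criteria are usually reported together: one reflects prediction monotonicity, the other linear agreement.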

At present, the assessment algorithm has been used and validated in a number of Tencent's internal and external products. For example, Tencent Conference is evaluated in the laboratory against hundreds of indicators that conform to ITU, 3GPP, AVS and other international and domestic standards, and the quality of user experience across the whole network is monitored in a closed loop, so that product performance is continuously optimized based on real user experience.

As one of the earliest companies to enter the audio and video field, Tencent has been solving audio and video communication problems since the early QQ platform, under the network conditions of the time. With the development of 5G, cloud computing, big data and artificial intelligence, Tencent Multimedia Lab has gradually refined a complete, high-quality audio and video technology chain, built on years of technical accumulation and industry experience.