The 5G era is the era of ultra-high definition, but Rome was not built in a day: developing services such as UHD live streaming and video on demand always brings plenty of frustrating problems. For the LiveVideoStackCon 2020 online summit, we invited Cai Yuan, senior technical director at Kingsoft Cloud (Jinshan Cloud). Starting from the Kingsoft Cloud Magic Mirror platform itself, she explains how the platform helps users evaluate picture quality more efficiently, guarantees the quality of the evaluation, and resolves users’ picture quality evaluation difficulties in a one-stop way.

Text/Cai Yuan

Compiled by/LiveVideoStack

Hello everyone, I’m Cai Yuan from the video cloud team at Kingsoft Cloud, and I’m very glad to meet you at the first LiveVideoStackCon 2020 online summit. My topic this time is “Everything for HD: the Kingsoft Cloud Magic Mirror platform boosts 5G HD applications”. In the 5G era, UHD live streaming and video-on-demand services are developing rapidly. Do you face questions like these: Will AI super-resolution or beautification produce bad cases? How well do local dark-scene enhancement and color enhancement work? Does skin smoothing or denoising lose detail? The Kingsoft Cloud Magic Mirror platform was created to answer these questions. Its purpose is to establish an evaluation system that matches users’ “perceived pleasure” and to analyze in depth how algorithms affect subjective picture quality.

01

PART

Introduction to Kingsoft Cloud Video Cloud

This talk covers the following aspects: first, an overview of Kingsoft Cloud Video Cloud; second, an analysis of the pain points of the video industry; third, how the Kingsoft Cloud Magic Mirror platform (https://kqoe.ksyun.com/) solves the pain points of video picture quality, along with its core technical difficulties; and finally, customer cases of the Magic Mirror platform.

1.1 Brief introduction to Kingsoft Cloud Video Cloud

Founded in 2012, Kingsoft Cloud is one of the top three Internet cloud service providers in China. It was listed on NASDAQ in the United States in May 2020, and its business covers many countries and regions around the world. In the eight years since its founding, Kingsoft Cloud has adhered to a customer-centered service philosophy. It offers more than 120 industry solutions, serves 243 top customers, and provides safe, reliable, stable and high-quality cloud computing services.

Kingsoft Cloud Video Cloud has six core technologies: video encoding and decoding, image processing, the quality evaluation system, artificial intelligence, edge computing and network transmission. Five products are built on these six technologies: Kingsoft Cloud Magic Mirror, AV1, KSC265, face restoration and super-resolution. On top of the five products sit eight industry solutions: live streaming, on-demand, cloud gaming, the evaluation system, smart HD, KIE, intelligent content review and VR. These solutions serve the nine industries shown in the figure.

1.2 Advantages of Kingsoft Cloud Video Cloud

The advantages of Kingsoft Cloud Video Cloud are as follows. First, its services are built on Kingsoft Cloud’s large-scale cloud computing and cloud storage, so it has a large-scale cache system, a traffic scheduling system and a range of codec and transcoding systems. It can meet the needs of various Internet video business scenarios and provide customers with complete solutions, and its big data platform helps customers with real-time big data analysis. Second, it has a strong foundation of experience and technology in the video vertical, with deep technical reserves in storage, CDN, video codecs and artificial intelligence. It won the “Best Innovation Award” at the 2018 Asia-Pacific CDN Annual Conference and has won the MSU Coding Contest several times. Its AI-based cloud transcoding product, “Jizhi HD”, can save customers more than 58% of bitrate, which is at the leading level in the industry.

02

PART

Pain points of the video industry

2.1 Pain points in the video industry

The video industry is developing rapidly, and users’ pain points keep changing. They show up mainly in the following aspects. First, client devices have changed, from TV to laptop to Pad to mobile phone, and mobile applications have become far richer.

Second, the HD experience has improved greatly: video resolution has risen from 1K to 4K and 8K ultra-high definition, audio has moved from mono and dual-channel to stereo sound, and frame rates have gone from 10 FPS to 30 FPS and then 60 FPS. As bandwidth and networks become far more abundant, demand for HD and for higher bitrates will keep growing. Third, video content has changed, with the emergence of PGC, UGC and cloud games. Once mobile entertainment reaches a bottleneck, it will spiral toward large screens, including Pads and 8K ultra-HD TVs. Netflix and Google Stadia are good examples.

Netflix is the largest streaming service and the most popular video site in the US, and its TV membership is very healthy; Google Stadia is more impressive on a big screen than on a mobile phone.

2.2 Difficulties in quantifying subjective quality

In the 5G era, as video keeps moving toward higher definition, the difficulty in quantifying subjective quality is how to keep improving the HD experience and how to measure the effect of processing such as super-resolution, restoration and enhancement.

We crawled 140,000 pieces of data from the web and selected 1,000 videos from them, each about five seconds long. By applying random encoding and scaling distortions to these 1,000 videos, we produced 2,000 distorted clips, which were then annotated 150,000 times in total.

When we computed VMAF and PSNR for these clips, the SROCC between each metric and the subjective scores was not high. This shows that although objective metrics are available, they struggle to quantify the subjective viewing effect. If coding distortion alone is already this hard to measure, the effects of enhancement, such as AI enhancement and restoration, are likely to be even harder to measure.
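To make the SROCC comparison concrete, here is a minimal sketch of how a Spearman rank-order correlation between an objective metric and subjective scores could be computed. The PSNR and MOS values are invented for illustration and are not data from the talk; the helper names are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def srocc(objective, subjective):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(objective), average_ranks(subjective))


# Invented example: five clips with a PSNR value and a mean opinion score each.
psnr = [30.1, 28.4, 35.2, 33.0, 38.7]
mos = [1.8, 2.5, 3.1, 3.9, 4.6]
print(round(srocc(psnr, mos), 3))  # prints 0.8
```

An SROCC of 1.0 would mean the metric orders the clips exactly as viewers do; the lower the value, the less the metric can stand in for subjective viewing.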

2.3 Key points to improve video HD experience

The key to continually improving the video HD experience is a clear and quantifiable quality goal. The old story of the traveler heading in the opposite direction from his destination sums up the importance of goals: before acting, we need a clear and correct goal, which is the precondition for achieving anything. Improving the video HD experience likewise requires a clear, quantifiable primary goal to support iterating along the HD path.

2.4 Difficulties in establishing quantifiable indicators of subjective video picture quality

Establishing quantifiable indicators of subjective video picture quality involves difficulties in four areas:

The first is video screening. How to select representative videos from a massive pool, and how much content they cover, are key issues, because different test sets can produce completely different test results. So how should the test sequences be screened?

The second is defining the measurement dimensions: which dimensions must be measured, and how many score levels should each have? Full-reference and no-reference modes also differ.

The third is evaluation quality. An evaluation needs several people so that an average score can be taken, so how many evaluators ensure accuracy without wasting labor costs? We should also consider whether the evaluation results need to be screened and how to analyze the data.

The last is data mapping. Evaluation scores come from different periods and different scenarios, and mapping data across them is also difficult. These four points are what we consider the difficult parts of measurement.

03

PART

Magic Mirror

3.1 The role of Kingsoft Cloud Magic Mirror

This part introduces how the Kingsoft Cloud Magic Mirror platform solves the problems above. First, the platform improves the efficiency and quality of evaluation. It provides image and video evaluation as an online service with nothing to download. Evaluation tasks are managed as a workflow to improve efficiency, and the quality of subjective evaluation can be monitored through the platform.

Second, the platform provides professional evaluation methods. It supports three evaluation modes (full-reference, no-reference and OAA), supports custom measurement dimensions, and automatically calculates objective evaluation metrics for customers.

Third, the platform is backed by evaluation experts who help customers screen videos sensibly and define evaluation dimensions, and who provide professional evaluation reports and data analysis.

Ultimately, all of these platform functions and the expert team’s support help customers improve the subjective quality of their image and video products and increase the perceived pleasure those products deliver.

3.1.1 Video screening

Let us look at each in detail, starting with video screening. Traditional content-based screening has significant defects. It classifies videos into categories such as basketball, football and variety shows, but that is only one screening dimension. Videos also need to be screened by quality characteristics, including color, brightness, noise and jitter. These characteristics are continuous rather than discrete, so a suitable screening set cannot be obtained by exhaustive enumeration, yet the screening algorithm still needs to achieve that effect.

The Magic Mirror platform screens data in the following steps: the massive data set first goes through data filtering, and equalization algorithms are then used for screening. These include feature equalization, quality equalization and content equalization. Feature equalization mainly covers brightness, color, edges and so on. Quality equalization balances the data set across noise, blur and jitter. Content equalization covers dynamic and static scenes and long-range and close-range shots. The equalization algorithms can be combined to produce a data set that takes all of these considerations into account.

The two figures compare the data before and after screening. Before screening, the distribution of the original data is extremely uneven: large in the middle and small at both ends. After balanced screening, the data set is more evenly distributed in every dimension.
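The talk does not describe the equalization algorithms themselves, so the following is only a minimal sketch of the general idea for a single continuous feature: split the feature range into equal-width bins and keep at most a fixed number of videos per bin. The function name, the `brightness` field, the bin count and the quota are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def balanced_sample(items, feature, n_bins, per_bin, seed=0):
    """Pick at most `per_bin` items from each equal-width bin of `feature`."""
    values = [feature(item) for item in items]
    low, high = min(values), max(values)
    width = (high - low) / n_bins or 1.0  # avoid zero width if all values match
    bins = defaultdict(list)
    for item, value in zip(items, values):
        index = min(int((value - low) / width), n_bins - 1)
        bins[index].append(item)
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    selected = []
    for index in sorted(bins):
        bucket = bins[index]
        rng.shuffle(bucket)
        selected.extend(bucket[:per_bin])
    return selected


# Invented pool: few dark and bright clips, many mid-brightness clips.
levels = [0.05, 0.10] + [round(0.40 + 0.01 * i, 2) for i in range(20)] + [0.90, 0.95]
videos = [{"id": i, "brightness": b} for i, b in enumerate(levels)]

picked = balanced_sample(videos, lambda v: v["brightness"], n_bins=3, per_bin=2)
print(sorted(v["brightness"] for v in picked))
```

In this toy pool the middle bin shrinks from 20 clips to 2, which mirrors the "large in the middle, small at both ends" distribution being flattened. Balancing several features at once (for example brightness, noise and motion together) would need a joint binning or a more elaborate scheme than this sketch.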

3.1.2 Video measurement dimensions

After the data set is screened, the measurement dimensions need to be defined. Which dimensions are measured, how score levels are set for each dimension, how scores are described and quantified, and whether the dimensions are correlated can all have a critical impact on the success or failure of a measurement.

Video quality problems arise in many scenarios: noise introduced during capture and compression; dark scenes during shooting and blur caused by losing focus; color problems; stuttering; loss of detail caused by skin smoothing; an unnatural look caused by AI or dark-scene enhancement; and deformation and anomalies caused by sharpening, beautification and super-resolution. These scenarios are highly relevant to how the measurement dimensions are defined, as are the customer’s own scenarios.

After scene analysis, the evaluation method needs to be determined. The Magic Mirror platform provides three evaluation modes:

Expert mode, also called full-reference mode, compares two videos with the same content but different quality. It suits a panel of 10 to 15 people who need to see results quickly.

User mode, also called no-reference mode, asks evaluators to score a single video, which better simulates users’ actual experience. According to our results from real user evaluations and the accompanying analysis, SROCC keeps increasing as the number of users grows, but once the number of users reaches 55 the gains become smaller and smaller. The number of evaluators in user mode is therefore set between 50 and 100.

Fine-grained mode, also called OAA mode, selects one category of video for head-to-head (PK) comparison with a smaller panel of 20 to 50 people. Customers can choose the evaluation mode that fits their actual situation.

Before scoring begins, attention should be paid to evaluation details such as the viewing device, resolution requirements, viewing distance, number of video playbacks and picture viewing time, each of which has corresponding requirements.
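One plausible reason for the diminishing returns seen in user mode (an interpretation, not something the talk states) is that the uncertainty of a mean opinion score shrinks only with the square root of the panel size. The sketch below computes the half-width of an approximate 95% confidence interval for several panel sizes; the rating spread of 0.8 is an assumed value, not a measurement from the platform.

```python
import math


def mos_ci_half_width(rating_std, n_raters, z=1.96):
    """Half-width of an approximate 95% confidence interval for a mean score."""
    return z * rating_std / math.sqrt(n_raters)


ASSUMED_STD = 0.8  # hypothetical per-video rating spread on a 5-point scale

for n_raters in (10, 25, 55, 100):
    print(n_raters, round(mos_ci_half_width(ASSUMED_STD, n_raters), 3))
```

Under this assumption, going from 10 to 55 raters tightens the interval far more than going from 55 to 100, which is consistent with choosing a panel somewhere in the 50 to 100 range rather than growing it indefinitely.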

There are two dimensions of measurement: global dimension and local dimension.

The global dimension includes the overall subjective effect, blur, color, brightness, detail processing and so on. Take the overall subjective effect as an example: evaluators are shown not a score from 1 to 5 but specific descriptions, such as unbearable, unpleasant, ordinary, comfortable and pleasing to the eye. These descriptions guide users very strongly. For one of the labels, for example, the choice between “impeccable” and “pleasing to the eye” was settled only after much discussion. We believe “pleasing to the eye” fits users’ choice habits better and makes the quality levels clearer.

For the local dimension, more attention is paid to the face, including blur, skin color, light and shade, detail and texture, noise and other dimensions. Take blur: the options are significant, slight or none. Local dimensions can also include hair, lip color and so on, depending on the user scenario.

In the experimental validation, evaluators score each dimension as well as the overall quality, so that we can analyze the correlation between dimensions and the influence of each dimension on the overall score. In the table, the correlation between subjective picture quality and blur, noise and color is very high, while the correlation with brightness is low. For users of this data set, the algorithm can therefore focus on blur, noise and color and pay less attention to brightness.

After this analysis, the annotators themselves need to be analyzed, for example how widely their scores for a video are spread. Each annotator should be interviewed one-to-one, and questionnaires should be issued to gather feedback on finer-grained questions, so as to understand the specific places that affect the user experience.

3.1.3 Evaluation quality assurance

The Kingsoft Cloud Magic Mirror platform safeguards evaluation quality at the platform level with a complete evaluation process. First, evaluators’ information is recorded, including basic information, position, equipment and hobbies. They are then trained to make sure that their operation follows the platform’s guidance.

Next, evaluators take a test drawn from a question bank provided by evaluation experts, and only then move on to the evaluation tasks. Validation questions are inserted at random during the evaluation to monitor the validity of the overall evaluation.

Finally, event-tracking analytics on the platform analyze evaluators’ behavior to ensure the validity of the whole evaluation.
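As a minimal sketch of how randomly inserted validation questions could be used to monitor evaluators, the code below keeps only raters whose accuracy on those questions reaches a threshold. The data layout, the function name and the 0.8 threshold are assumptions for illustration; the talk does not describe the platform’s actual rule.

```python
def filter_raters(answers, answer_key, min_accuracy=0.8):
    """Keep raters whose accuracy on the validation questions meets the threshold."""
    kept = {}
    for rater, given in answers.items():
        checked = [q for q in answer_key if q in given]
        if not checked:
            continue  # no validation question seen, so the rater cannot be verified
        correct = sum(given[q] == answer_key[q] for q in checked)
        accuracy = correct / len(checked)
        if accuracy >= min_accuracy:
            kept[rater] = accuracy
    return kept


# Hypothetical validation questions with known answers (which clip is better).
answer_key = {"v1": "A", "v2": "B", "v3": "A", "v4": "B", "v5": "A"}
answers = {
    "rater_1": {"v1": "A", "v2": "B", "v3": "A", "v4": "B", "v5": "A"},
    "rater_2": {"v1": "B", "v2": "B", "v3": "B", "v4": "A", "v5": "A"},
    "rater_3": {"v1": "A", "v2": "B", "v3": "A", "v4": "B", "v5": "B"},
}

valid = filter_raters(answers, answer_key)
print(valid)
```

Scores from raters who are filtered out would then be excluded before averaging, so that careless or random answers do not distort the result.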

3.1.4 Analysis of evaluation results

When the entire evaluation is complete, an evaluation report and a conformity report are generated automatically. The evaluation report includes evaluator analysis (validity determination and evaluator information analysis), evaluation data analysis (score distribution, evaluation confidence and type analysis), evaluation dimension analysis (correlation between dimensions and subjective-objective correlation), and conformity analysis (chart drawing and interface display).

3.1.5 Data mapping

Evaluations may be run in different periods and in different scenarios, and the definition of HD differs from period to period, so scores from different periods and scenarios are not directly comparable. How, then, should the data be mapped?

On the left of the figure are two test sets from different periods, KonIQ and KADID. A model trained on one data set shows relatively low SROCC and PLCC on the other. If the two data sets are used for training together, SROCC and PLCC are better on both. If data mapping is applied before joint training, SROCC and PLCC improve by a further four or five points, which is a step up. On the right of the figure is an analysis of the algorithms used for the data computation.
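The talk does not say which mapping is used, so the following is only a sketch of one simple possibility: if a few anchor clips are scored in both campaigns, a least-squares linear fit can map scores from one scale onto the other before the data sets are combined. The anchor scores, scales and function name are hypothetical.

```python
def fit_linear_mapping(source_scores, target_scores):
    """Least-squares fit of target = slope * source + intercept."""
    n = len(source_scores)
    mean_s = sum(source_scores) / n
    mean_t = sum(target_scores) / n
    sxy = sum((s - mean_s) * (t - mean_t) for s, t in zip(source_scores, target_scores))
    sxx = sum((s - mean_s) ** 2 for s in source_scores)
    slope = sxy / sxx
    return slope, mean_t - slope * mean_s


# Hypothetical anchor clips that were scored in both campaigns.
old_campaign = [1.0, 2.0, 3.0, 4.0]      # for example, a 5-point scale
new_campaign = [25.0, 45.0, 65.0, 85.0]  # for example, a 100-point scale

slope, intercept = fit_linear_mapping(old_campaign, new_campaign)

# Map other scores from the old campaign onto the new scale.
mapped = [slope * s + intercept for s in [1.5, 3.5]]
print(slope, intercept, mapped)
```

A linear map only corrects offset and scale. If raters in the two periods used the scale differently in a non-linear way, a monotonic non-linear fit would be a more suitable choice.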

3.2 Magic Mirror platform functions

The Magic Mirror platform is built for the convenience of evaluators and evaluation task managers. Its main functions include project and evaluation task management, online evaluation of pictures and videos, customizable subjective evaluation criteria, support for multiple evaluators working at the same time, integrated objective quality evaluation algorithms, and automatic generation of professional evaluation reports.

3.2.1 Evaluation mode

Jinshan Cloud Mirror provides three evaluation modes on the platform:

The first is expert mode (full-reference mode), which makes subtle differences easier to see and is suitable for expert evaluation.

The second is user mode (no-reference mode), which matches the user’s final usage scenario and judges the subjective effect directly. A large volume of data is needed to eliminate errors.

The third is fine-grained comparison mode (OAA mode), which simulates the user scenario while still allowing fine-grained comparison, and can cope with scenes that differ only slightly.

3.2.2 Automatic generation of professional evaluation reports

The evaluation platform automatically generates professional evaluation reports, including the scores for each dimension, an analysis of the evaluators, a display of bad cases, and a comparison of strengths and weaknesses in each dimension. The degree of fit between subjective evaluation and objective scores is also presented in the report.

3.2.3 Application scenarios of the Magic Mirror platform

The Magic Mirror platform is mainly suitable for three scenarios:

Scenario 1 applies to the algorithm team’s internal algorithm iteration. How to evaluate the effectiveness of the algorithm after iteration? Firstly, an accurate test set and test dimension should be developed. Secondly, the comparative evaluation task is generated, and the effect of the old version and the new version are compared and evaluated on the platform, and finally the evaluation report is produced.

Scenario 2 is suitable for evaluating users’ actual viewing experience. This requires pulling video and picture data from the product line, determining the evaluation dimensions, then generating a no-reference evaluation task, and finally producing the evaluation report.

Scenario 3 applies to the comparative analysis of competing products. For example, to compare the video quality of two competing products, Douyin and Kuaishou, it is necessary to first collect videos of different categories of competing products as a test set, and then generate the evaluation task of OAA evaluation method, compare the effects of different competing products, and finally produce the evaluation report.

3.2.4 Evaluation service

In addition to the platform, we also provide an evaluation service, which comes in three options:

Option 1 uses only the Magic Mirror platform: model effect comparison, in-house subjective evaluation, and calculation of open-source objective evaluation metrics.

Option 2 adds a third-party manual evaluation service on top of option 1.

Option 3 adds expert consultation on top of the third-party manual evaluation service, providing customers with a more in-depth picture quality consulting service.

The vision for the Kingsoft Cloud Magic Mirror platform is for it to become a focal point that attracts more cooperation, so that together we can build a complete picture quality evaluation system, one that matches Chinese aesthetics and can be applied to different terminals. We also hope to connect with the academic community through the platform, conveying industrial needs to academia more accurately and promoting the development of the technology. Finally, we hope that clear and measurable evaluation scores will promote the progress and application of high-definition technology.

04

PART

Successful cases

In this part, I will share some successful cases of the Magic Mirror platform. Customers have achieved very good results with the platform and their feedback has been very positive. I will share two highlight cases.

4.1 Case 1: Xiaomi KIE

Case 1 is Xiaomi. We provide AI-based super-resolution technology that supports image enhancement in MIUI. The technology greatly improves picture quality and can be deployed on mobile devices. Over the whole optimization and debugging process with Xiaomi, we spent three months iterating through 5 versions, with 600 pictures chosen as test samples. Through optimization, the overall score rose from 3.78 to 4.21, and the whole tuning process was carried out on the Magic Mirror platform.

The project was promoted as a feature highlight at the launch of Xiaomi’s new phone and was officially released in MIUI 10.

4.2 Case 2: CCTV Collection Intelligent HD

The second customer case is CCTV. CCTV’s bandwidth during the World Cup and the Spring Festival Gala is very high, and its live streaming business is very important. Its pain points during live broadcasts were high bandwidth consumption, blur, noise and insufficient color saturation.

For these pain points, we provide the customer with a smart HD solution that reduces bandwidth on the one hand and improves picture quality on the other. Bandwidth reduction mainly applies image preprocessing, video classification, perceptual coding and per-tile coding to greatly reduce the bitrate, and a cloud transcoding scheme improves the stalling rate and the time to first frame. Picture quality improvement mainly uses AI for noise removal, deblurring and color enhancement. We selected 163 test videos and optimized for two months, and the subjective score reached 3.64.

Throughout the tuning process, the Magic Mirror platform was essential for confirming the optimization effect, and it made the tuning work much more efficient.

4.3 Partners

Our partners are mainly in academia and industry. In academia, we mainly conduct academic exchanges with City University of Hong Kong and cooperate with it on algorithm iteration. In industry, we mainly cooperate in depth with Intel on picture quality and codecs. In addition, the Magic Mirror platform supports joint evaluation of the subjective picture quality of SVT-AV1.

We hope more friends will use the Magic Mirror platform (https://kqoe.ksyun.com/) and join us in further exchange and cooperation on subjective picture quality evaluation.