Preface

With the advent of 5G and AI, audio and video technology continues to iterate and upgrade, and NetEase Yunxin has launched a new generation of audio and video technology architecture. What are the core technologies of this next-generation architecture, and how has each of them been put into practice in real online scenarios?

At the NetEase Innovative Enterprise Conference, NetEase Yunxin released its new generation of audio and video technology architecture, comprising a new-generation audio and video fusion communication server system, a new-generation audio and video SDK, and a new-generation audio and video engine. This article is compiled from a talk by Wu Tong, chief streaming-media architect and multimedia development expert at NetEase Yunxin.

The talk covers the following topics:

  • Audio and video development trends;
  • Next-generation audio and video architecture upgrades;
  • New-generation audio and video core technologies;
  • New-generation audio and video scene implementations;
  • Summary and outlook.

Audio and video development trends in 2020

2020 is bound to be an extraordinary year. The epidemic has shifted constantly since the beginning of the year, and it is not easy to be here today for a face-to-face conversation; with people overseas still suffering, we should be thankful for the strength of our country. The COVID-19 outbreak also triggered explosive growth in audio and video services. During the work-from-home period in particular, video conferencing became a pressing need for everyone and cultivated the habit of using it.

Beyond the pandemic, the arrival of 5G, with its large bandwidth, low latency, and massive connectivity, has further enriched audio and video application scenarios. Traditional scenarios such as video conferencing, entertainment and social networking, online education, finance, and IoT continue to flourish, while emerging areas such as cloud interviews and cloud gaming are taking off. Each scenario also places more differentiated demands on the audio and video experience: lower latency, higher concurrency, and so on. Video conferencing, online education, and cloud gaming will all grow rapidly in the next few years; cloud gaming in particular is projected to reach the 100-billion scale in 2023. I therefore believe the audio and video industry will keep growing at high speed, with boundless opportunities waiting for its practitioners.

With the growth of the audio and video market, NetEase Yunxin is constantly strengthening its core competitiveness in the field. Driven by the technological changes of 5G and AI, Yunxin iterated on its original audio and video architecture and launched a new generation of audio and video technology architecture, with major upgrades across the entire pipeline. The three major architecture upgrades are: a new-generation audio and video fusion communication server system, a new-generation audio and video SDK, and a new-generation audio and video engine.

NetEase Yunxin's next-generation audio and video architecture upgrade

Architecture of the next-generation audio and video fusion communication system

First, let's look at the overall architecture diagram of NetEase Yunxin's new-generation audio and video fusion communication server system.

1 Streaming media transmission and processing service

In the middle of the architecture diagram are the streaming media transmission and processing services, covering edge media access, the real-time transmission network, streaming media processing services, and live streaming and video-on-demand (VOD) services.

In the new-generation architecture, the streaming media system is compatible with a variety of protocols. The edge media server supports both access from the Yunxin SDK and standard WebRTC access from the web. Yunxin also developed a SIP gateway server to enable SIP and PSTN access, and a universal media gateway to support standard RTMP streaming tools, mini programs, and RTSP surveillance cameras.
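The article does not describe the gateway internals, so the following is only a minimal, hypothetical sketch of how an edge access layer could dispatch incoming connections by protocol to the components named above; all class and function names (`IncomingConnection`, `dispatch`, the handler functions) are illustrative, not Yunxin's actual API.

```python
from dataclasses import dataclass

@dataclass
class IncomingConnection:
    protocol: str          # e.g. "webrtc", "sip", "rtmp", "rtsp"
    remote_addr: str

# Each supported ingress protocol is routed to a dedicated handler,
# mirroring the SDK/WebRTC, SIP/PSTN, and RTMP/RTSP access paths above.
def handle_webrtc(conn): return f"edge media server <- WebRTC {conn.remote_addr}"
def handle_sip(conn):    return f"SIP gateway <- SIP/PSTN {conn.remote_addr}"
def handle_rtmp(conn):   return f"universal media gateway <- RTMP {conn.remote_addr}"
def handle_rtsp(conn):   return f"universal media gateway <- RTSP {conn.remote_addr}"

GATEWAYS = {
    "webrtc": handle_webrtc,
    "sip": handle_sip,
    "pstn": handle_sip,
    "rtmp": handle_rtmp,
    "rtsp": handle_rtsp,
}

def dispatch(conn: IncomingConnection) -> str:
    handler = GATEWAYS.get(conn.protocol)
    if handler is None:
        raise ValueError(f"unsupported ingress protocol: {conn.protocol}")
    return handler(conn)

print(dispatch(IncomingConnection("rtmp", "203.0.113.7:1935")))
```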

After the edge media service receives media packets from clients on each protocol, it uses the edge and routing nodes of Yunxin's self-developed real-time transmission network to distribute real-time media data around the world, ensuring the optimal end-to-end experience. Meanwhile, the universal MCU and transcoding servers of the streaming media processing service can bypass-push mixed media streams to Yunxin's interactive live broadcast server for distribution through Yunxin Live's fused CDN, or record them in the cloud and store them in Yunxin's VOD service.

2 Global streaming media control service

On the left side of the architecture diagram is the global streaming media control service, which includes the channel and stream management service, the unified media scheduling service, and the real-time transmission network scheduling service. It is the brain of the entire audio and video fusion communication system and dynamically controls its operation.

3 Big Data and Configuration Services

On the right side of the architecture diagram are Yunxin's big data and configuration services. The global big data analysis and mining system is responsible for processing the data collected across the full link, alarming, and quality transparency, and the results of big data mining guide the design of algorithms and strategies for every module in the link. The other part is the intelligent global configuration management and delivery service, which delivers various cloud-controlled parameters, including QoS parameters, audio and video codec parameters, and A/B test switches.

Here is a summary of NetEase Yunxin's new-generation audio and video fusion system architecture:

  • First, the new-generation audio and video fusion communication system is a hybrid network that combines real-time media edge servers, the real-time transmission network, and a fused CDN, meeting users' varied requirements for scenario and real-time performance.
  • Second, the media edge servers and media gateways are sunk to the edge, greatly shortening the distance between users and their first-hop access service and making better use of 5G edge computing.
  • Third, the control plane and the media forwarding plane of the audio and video system have been re-abstracted and isolated: the control servers manage audio and video rooms and streams, while the media servers and real-time transmission network are only responsible for distributing streaming media data. The purpose and advantages of this design are elaborated in the core technology section below.

A new generation of audio and video SDK technology

For the new-generation audio and video SDK, we adopted a clear layered design that decouples modules as much as possible. Basic modules are sunk so that more upper-layer modules can share code, and platform differences are abstracted in a cross-platform encapsulation layer that provides unified call logic to the SDK interface layer.

In the SDK interface layer, we insist that every interface be simple and easy to use, and we avoid strong ordering dependencies between interfaces wherever possible, to reduce failures caused by inconsistent call orders. Another key point of the interface design is that interfaces behave identically and can be switched seamlessly across platforms, whether Android, iOS, PC, or Web. To help users get the most out of the SDK, NetEase Yunxin also launched an ease-of-use system beyond documentation, providing developers with sample code, industry solution demos, and general-purpose components, all open source at the source-code level, for a high-quality onboarding experience.

New generation audio and video engine

We all know that a good audio and video engine is the cornerstone of a good audio and video experience. So when designing the new-generation engine, Yunxin took user-experience (QoE) metrics as the top-level goal and decomposed the design downward from there.

From the QoE perspective, audio metrics include stalling, echo, noise, and sound quality; video metrics include clarity, smoothness, color, and overall picture perception. End-to-end delay and audio-video synchronization are also very important QoE metrics.

To achieve these QoE goals, the engine core is critical. The middle part of the figure is the core layer of Yunxin's engine, containing the audio engine, the video engine, and the QoS engine:

  • Audio engine: the core algorithms are the 3A algorithms (AEC, ANS, AGC), which Yunxin has extensively upgraded and optimized for the new generation. For example, AEC not only optimizes the original linear filter and NLP but also adds self-developed double-talk detection and noise injection, improving AEC performance in more scenarios. To further improve sound quality, we built an automatic online problem diagnosis and analysis tool based on real user data, and use cloud-based model adaptation and a no-reference scoring system to continuously improve quality through feedback from real data.
  • Video engine: Yunxin's self-developed NE264 and NEVC video encoders greatly improve compression efficiency and speed compared with open-source encoders. Yunxin has also implemented a high-performance AI super-resolution scheme: in video post-processing, low-resolution pictures can be super-resolved to 720p or even higher, greatly improving the video experience when uplink bandwidth is insufficient. Yunxin's super-resolution algorithm engineers have done extensive model optimization so that super-resolution no longer stays in the laboratory; it is widely used in the new-generation audio and video engine.
  • QoS engine: to guarantee quality across the full link on weak networks, the QoS engine not only applies many intelligent strategies in FEC/RED/in-band FEC and ARQ, but also integrates excellent congestion control and bandwidth estimation algorithms such as GCC, BBR, and PCC. We developed our own congestion control algorithm so that flow control can quickly adapt to all kinds of complex networks.

All engine core algorithms depend on the platform's base capabilities. Yunxin's self-developed lightweight AI models and high-performance inference engine allow all kinds of AI algorithms to run in highly real-time RTC scenarios. Meanwhile, data collection and mining plus a standardized evaluation system give every algorithm and strategy an evidence base, forming a complete closed loop between algorithms and real-world results. We will go into more depth in the second half of the article.

NetEase Yunxin's new generation of audio and video core technologies

Having covered the architecture upgrades, we come to the core of today's talk. I will analyze the core technologies of Yunxin's new-generation audio and video architecture in depth from six aspects: the new-generation audio and video fusion communication server architecture, the large-scale real-time transmission network, the network QoS engine, the video engine, the audio engine, and full-link data and quality transparency.

A new generation of audio and video fusion communication server architecture

1 Service modules are decoupled

The first part concerns the fusion communication server architecture. Let's look at the concrete scheme for decoupling service modules, i.e. the separation of the control plane from the media forwarding plane mentioned above. After a client connects to an edge media server over a signaling connection, the edge server synchronizes the client's publish and subscribe information for audio and video streams to the channel and stream management service, and that management service handles all channel-level business and stream operations in a unified manner.

The biggest advantage of this approach is that no complex channel state lives on the edge media servers. When multiple media servers are cascaded, there is no need to synchronize room and stream state between them, which greatly reduces the difficulty of cascading and avoids the problems caused by inconsistent state across servers. With this scheme, we can relatively easily support rooms with thousands or even tens of thousands of concurrent users.
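As a rough illustration of this control/media split, here is a minimal Python sketch under the stated design: channel and stream state lives only in a central management service, and edge media servers merely forward packets. `ChannelManager` and `EdgeMediaServer` are invented names for illustration, not Yunxin's implementation.

```python
class ChannelManager:
    """Control plane: the single owner of room and stream state."""
    def __init__(self):
        self.subscriptions = {}   # stream_id -> set of subscriber ids

    def publish(self, stream_id):
        self.subscriptions.setdefault(stream_id, set())

    def subscribe(self, stream_id, client_id):
        self.subscriptions.setdefault(stream_id, set()).add(client_id)

    def subscribers(self, stream_id):
        return self.subscriptions.get(stream_id, set())


class EdgeMediaServer:
    """Media plane: stateless forwarding; asks the control plane who to send to."""
    def __init__(self, manager: ChannelManager):
        self.manager = manager

    def on_media_packet(self, stream_id, packet):
        # No local room state: cascaded edge servers never need to sync
        # channel state with each other, only forward packets.
        for client_id in self.manager.subscribers(stream_id):
            print(f"forward {packet!r} of {stream_id} -> {client_id}")


mgr = ChannelManager()
mgr.publish("cam-1")
mgr.subscribe("cam-1", "viewer-42")
EdgeMediaServer(mgr).on_media_packet("cam-1", b"\x80rtp...")
```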

Second, the cascade between edge media servers adopts an undirected-graph structure in which all cascaded media server nodes are peers. With no central vertex, there is no single point of failure, which improves the reliability of the whole system.

Finally, for a cascade between two media servers, routing optimization of the intermediate links is handled by the real-time transmission network. The edge media server does not care about planning the intermediate path, which further decouples the media servers from the real-time transmission network, and the transmission network in turn does not need to care about specific audio and video business.

2 Perfect integration of SFU and MCU

Another feature of the new-generation fusion communication server is the seamless integration of SFU and MCU. We all know what an SFU and an MCU are. To put it simply, an SFU forwards every stream to downstream clients individually, while an MCU mixes all streams on the server and sends the result downstream.

SFU and MCU each have advantages and disadvantages. An SFU keeps the server simple and gives clients flexible layout, but it consumes more downstream traffic and demands more of client performance. An MCU moves the codec cost to the server: the layout is relatively fixed, but it saves downstream bandwidth and is friendlier to client performance.
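To make the contrast concrete, here is a toy sketch of the two forwarding models, treating streams as lists of audio samples and "mixing" as naive summation; it is illustrative only, not how a production SFU or MCU processes RTP.

```python
# Toy contrast of SFU vs MCU forwarding models.

def sfu_forward(streams: dict) -> dict:
    # SFU: every subscriber receives every stream unchanged ->
    # N downstream streams, more bandwidth, client-side layout freedom.
    return {name: samples for name, samples in streams.items()}

def mcu_mix(streams: dict) -> list:
    # MCU: the server mixes all streams into one -> a single downstream
    # stream, lower bandwidth, but the layout is fixed on the server.
    length = max(len(s) for s in streams.values())
    mixed = [0] * length
    for samples in streams.values():
        for i, v in enumerate(samples):
            mixed[i] += v
    return mixed

streams = {"alice": [1, 2, 3], "bob": [4, 5, 6]}
print(sfu_forward(streams))  # {'alice': [1, 2, 3], 'bob': [4, 5, 6]}
print(mcu_mix(streams))      # [5, 7, 9]
```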

In Yunxin's previous-generation architecture, we only supported pushing the MCU-mixed stream to the fused CDN via the interactive live broadcast bypass server, so that audiences could watch through a player, but with much higher delay than RTC.

In the new-generation audio and video servers, we extracted the MCU capability from interactive live broadcast into a universal MCU server, so a client can subscribe to the MCU stream directly from the edge media server and watch the mixed stream with low delay over the RTC channel. Of course, we also retain the interactive live broadcast bypass push to the fused CDN. The unified media scheduling service schedules all media servers, including the universal MCU, interactive live broadcast, and the fused CDN. Users of the new-generation system can flexibly choose between the two schemes according to their scenario and real-time requirements.

A new generation of large-scale real-time transmission networks

The large-scale distributed real-time transmission network developed by Yunxin is the core of the new-generation audio and video server architecture, and today is the first time it has been presented publicly. Let's look at the technical details of several parts.

As the figure above shows, when two edge media servers need to be cascaded, traffic is imported into the real-time transmission network through its edge access nodes. Under the unified arrangement of the real-time transmission network scheduling service, the edge nodes send data packets to the destination edge media server along the optimal path through the network's routing nodes. In this process, the network's routing detection and computing service is the control brain that selects link routes and guarantees optimal quality.
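The routing service itself is not public, but optimal-path selection over probed link costs can be illustrated with a standard Dijkstra search; the topology and costs below are invented, and each cost stands for a single number derived from delay/loss/jitter probes.

```python
import heapq

def best_path(graph, src, dst):
    """graph: {node: {neighbor: cost}}; returns (total_cost, [path])."""
    heap = [(0.0, src, [src])]
    seen = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, w in graph.get(node, {}).items():
            if nxt not in seen:
                heapq.heappush(heap, (cost + w, nxt, path + [nxt]))
    return float("inf"), []

# Edge media servers A and D, routing nodes B and C; costs are probe-derived.
net = {
    "edgeA": {"B": 12.0, "C": 30.0},
    "B": {"C": 5.0, "edgeD": 40.0},
    "C": {"edgeD": 10.0},
}
print(best_path(net, "edgeA", "edgeD"))  # (27.0, ['edgeA', 'B', 'C', 'edgeD'])
```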

This large-scale distributed real-time transmission network has four major features:

  • Low latency: edge access nodes provide global coverage, offering ultra-low-latency access with quality comparable to dedicated lines;
  • Low cost: edge computing power is used instead of core BGP data centers, reducing cost while maintaining quality;
  • High reachability: routing nodes plan paths intelligently, and multi-path backup is available globally to ensure high data reachability;
  • High reliability: the network supports tiered service, multiple link channels can switch over automatically and quickly, and faults can be isolated within seconds, guaranteeing link stability and reliability.

To sum up, Yunxin's large-scale distributed real-time transmission network supports application-layer multicast in large-scale cascade scenarios to achieve traffic reuse. It also borrows the segmented-QoS idea from the media servers, supporting segmented retransmission and FEC to guarantee transmission quality across the full-link network.

We believe a good real-time transmission network has several necessary conditions:

  • Node coverage: global node coverage is the foundation of a real-time transmission network. Yunxin has therefore deployed streaming media edge servers and transmission network nodes in major Chinese provinces and mainstream overseas countries to ensure high node coverage;
  • Quantitative network metrics: conditions between network nodes change constantly, and RTC business is particularly sensitive to such changes, so network data between nodes must be quantified. Based on probed packet loss, delay, and jitter between nodes, routing paths are scored for quality, non-optimal paths are dynamically eliminated, and corresponding alarms are raised (see the scoring sketch after this list);
  • QoE user experience: end-to-end network and QoE data are collected from clients and edge streaming media servers for traffic that traverses the real-time transmission network, each transmission node's service is scored, A/B tests are run periodically between nodes, and a horse-racing mechanism keeps only the nodes that truly deliver a high-quality QoE user experience.
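As referenced in the second item above, here is a hedged sketch of scoring a link from probed loss, delay, and jitter and eliminating non-optimal paths; the weights and thresholds are made up for illustration, and Yunxin's real scoring model is not public.

```python
def link_score(loss_pct: float, rtt_ms: float, jitter_ms: float) -> float:
    """Higher is better; 100 is a perfect link."""
    score = 100.0
    score -= loss_pct * 2.5               # packet loss hurts RTC the most
    score -= max(rtt_ms - 50, 0) * 0.2    # penalize RTT above a 50 ms budget
    score -= jitter_ms * 0.5
    return max(score, 0.0)

probes = {
    ("edgeA", "B"): (0.2, 35, 3),    # (loss %, rtt ms, jitter ms)
    ("edgeA", "C"): (4.0, 120, 15),
}
for link, (loss, rtt, jit) in probes.items():
    s = link_score(loss, rtt, jit)
    status = "keep" if s >= 70 else "eliminate + alarm"
    print(link, round(s, 1), status)
```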

The figure shows real-time data for some node paths in Yunxin's transmission network. Because of the large number of nodes, we rely mainly on a sentinel alarm system for link alarms and on the system's automatic elimination of non-optimal paths to keep the real-time transmission network operating and maintained automatically.

Next-generation network QoS engine

Good transmission requires more than good nodes; it also needs a good QoS engine.

1 Full-link QoS

Anyone who works on audio and video or streaming media is familiar with QoS. We analyze it in detail from the following six aspects.

  • Weak-network resistance: Yunxin's anti-weak-network capability has stayed at the industry-leading level, tolerating 70% packet loss both upstream and downstream. Technologies such as FEC, RED, and hybrid ARQ are well known; beyond these, Yunxin uses a machine-learning-based PLC algorithm to further improve recovery after audio packet loss, and long-term reference (LTR) frames to reduce the stalls that broken reference chains cause after a single frame loss;
  • De-jittering: jitter buffers and NetEQ are very mature techniques, but in the new-generation QoS engine we spent considerable effort on the strategies coordinating the jitter buffer, ARQ, and the audio-video synchronization module; keeping audio and video in sync raises a number of engineering challenges (see the jitter-buffer sketch after this list);
  • Intelligent flow control: this is mainly bandwidth estimation and congestion judgment. While researching the GCC, BBR, and PCC algorithms in depth, we developed a stable, reliable congestion control algorithm that combines their strengths and reaches 90%+ bandwidth utilization on constrained networks. Good flow control also requires full-link feedback and source-side response: we use Simulcast and SVC to add dimensions of regulation, and full-link VQC to safeguard video quality across the link;
  • Segmented QoS: Yunxin has also put extensive practice into server-side segmented QoS, which means splitting loss resistance and bandwidth estimation between the upstream and downstream legs so the server can optimize for each user's downstream network. Segmented QoS is an essential technique in many video conference scenarios: the server intelligently estimates each downstream user's bandwidth and combines Simulcast and SVC to choose the optimal bandwidth allocation and stream selection strategy matching each user's downlink. In addition, optimizing access routing is the key to upstream QoS, and Yunxin continuously uses big-data-driven methods to give every user optimal access to edge media servers;
  • Cloud-delivered adaptive configuration: a complete cloud-based adaptive parameter configuration service is a necessity. Besides adapting parameters to thousands of device models, it must adapt them to different scenarios, and the differing capabilities of clients must be negotiated and gracefully downgraded so that every client gets its optimal experience;
  • A/B testing: finally, we believe QoS work must include A/B testing; using the horse-racing mechanism, QoS algorithms continuously collect data on real user networks online, forming a virtuous closed loop.
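As referenced in the de-jittering item above, here is a minimal adaptive jitter-buffer sketch (illustrative, not NetEQ): the target depth tracks recent inter-arrival jitter so playout absorbs bursts without adding more delay than the network requires.

```python
import statistics

class JitterBuffer:
    def __init__(self, frame_ms=20):
        self.frame_ms = frame_ms
        self.arrival_deltas = []
        self.last_arrival = None

    def on_packet(self, arrival_ms):
        if self.last_arrival is not None:
            self.arrival_deltas.append(arrival_ms - self.last_arrival)
            self.arrival_deltas = self.arrival_deltas[-50:]  # sliding window
        self.last_arrival = arrival_ms

    def target_depth_ms(self):
        if len(self.arrival_deltas) < 2:
            return 2 * self.frame_ms
        jitter = statistics.pstdev(self.arrival_deltas)
        # Buffer enough to absorb ~2 sigma of jitter, at least one frame.
        return max(self.frame_ms, 2 * jitter + self.frame_ms)

jb = JitterBuffer()
for t in [0, 20, 45, 60, 95, 110]:   # bursty arrival times in ms
    jb.on_packet(t)
print(round(jb.target_depth_ms(), 1), "ms")   # ~35.0 ms
```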

2 Bandwidth estimation and congestion control

For bandwidth estimation, we integrate the ideas of GCC, BBR, and PCC with many algorithm and strategy improvements. In online video conferences, users complete bandwidth probing within one to two seconds of joining the conference, achieving second-level clear-picture detection. Under constrained bandwidth, utilization reaches 90%+ (a behavioral sketch follows the list below):

  • When bandwidth suddenly drops, the estimator detects it and steps down within 1~5 seconds, avoiding congestion to the greatest extent.
  • When bandwidth jumps back up, we are deliberately not aggressive: it takes only 5~10 seconds to confirm the network has really recovered, ramp up to a stable state, and finally reach 90%+ bandwidth utilization.
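The exact algorithm is proprietary, but the fast-down/slow-up behavior described above can be illustrated with a plain AIMD-style controller; the thresholds and step sizes below are invented for illustration.

```python
class RateController:
    def __init__(self, start_kbps=1000):
        self.estimate_kbps = start_kbps

    def on_feedback(self, loss_pct: float, queue_delay_ms: float):
        congested = loss_pct > 2.0 or queue_delay_ms > 100
        if congested:
            # Fast down: multiplicative decrease drains queues within seconds.
            self.estimate_kbps *= 0.7
        else:
            # Slow up: small additive probes so recovery takes several seconds.
            self.estimate_kbps += 50
        return self.estimate_kbps

rc = RateController()
# Bandwidth suddenly drops -> three congested reports, then clean reports.
for loss, delay in [(8, 250), (6, 180), (3, 120), (0, 20), (0, 20)]:
    print(round(rc.on_feedback(loss, delay)))   # 700, 490, 343, 393, 443
```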

The usual congestion control optimization process is based on a laboratory environment, which has serious limitations: real network environments are very complex and vulnerable to Wi-Fi signal, traffic competition, and other factors. To solve this, Yunxin designed a big-data-based optimization scheme. We collect data online and extract features from it, use those features to reproduce online network conditions in the laboratory, and then optimize and verify the congestion control algorithm there. Finally, A/B tests compare the big data before and after the optimization online, and the congestion control algorithm is continuously improved in a virtuous closed loop.

New generation video engine

After talking about QoS engines, let’s move on to video engines.

1 Self-developed video encoder NEVC

The core of a video engine is its video coding technology, and Yunxin developed its own video encoder, NEVC.

The table above shows our measured results. Compared head-to-head with x264, NEVC is faster while improving the compression rate by 30%. Compared with x265 at the veryslow preset, NEVC achieves the same compression rate but is at least 50 times faster, and up to 80 times faster in some cases.

NEVC is the cornerstone of Yunxin's video engine; with it, terminal performance and full-link video quality have both improved greatly.

2 AI screen sharing

Besides video coding, Yunxin has specially optimized screen sharing. After the receiving side gets a video stream containing a shared desktop, it first locates text areas via text recognition, then applies AI deep-learning enhancement to those areas to make the text clearer. Using Yunxin's self-developed lightweight deep model combined with heterogeneous acceleration, this optimization runs in real time at 1080p, greatly improving Yunxin's screen-sharing experience.
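The models involved are proprietary, so the sketch below only shows the shape of the pipeline: detect text regions, enhance just those patches, and write them back. `detect_text_regions` and `enhance_text` are trivial stand-ins for Yunxin's detection and enhancement models.

```python
def detect_text_regions(frame):
    # Placeholder for an AI text detector; returns (x, y, w, h) boxes.
    return [(40, 40, 200, 24)]

def enhance_text(patch):
    # Placeholder for the deep-learning sharpening model.
    return [[min(255, px + 10) for px in row] for row in patch]

def process_screen_frame(frame):
    # Only the detected text regions pass through the enhancement model,
    # which is what keeps 1080p real-time processing feasible.
    for (x, y, w, h) in detect_text_regions(frame):
        patch = [row[x:x + w] for row in frame[y:y + h]]
        enhanced = enhance_text(patch)
        for dy, row in enumerate(enhanced):
            frame[y + dy][x:x + w] = row
    return frame

frame = [[128] * 320 for _ in range(240)]   # dummy 320x240 grayscale frame
process_screen_frame(frame)
print(frame[50][100])   # 138: pixel inside the text region was enhanced
```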

3 AI Super resolution

Now consider Yunxin's self-developed AI super-resolution. Super-resolution has been a hot topic in recent years but is rarely applied in real RTC scenarios, mainly because achieving low power consumption and high real-time performance for super-resolution on mobile devices is difficult.

Yunxin builds on its self-developed lightweight network and heterogeneous high-performance inference engine to achieve ultra-low power consumption, greatly reducing the cost of the super-resolution algorithm so that it can truly land on mobile devices. Thanks to Yunxin's unique dataset processing techniques, the super-resolution results are better under the same network model: compared with other upsampling interpolation algorithms, Yunxin's AI super-resolution scores higher on PSNR and SSIM. Yunxin applies AI super-resolution in real online scenarios: when the uplink resolution is downgraded on a weak network, AI super-resolution at the receiving end restores picture quality, noticeably improving the video.
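PSNR, one of the two metrics cited above, is standard and easy to compute; the minimal sketch below shows how a super-resolved frame would be compared against the ground-truth high-resolution frame (the pixel values are invented, and SSIM is omitted for brevity).

```python
import math

def psnr(reference, test, max_val=255.0):
    # Peak signal-to-noise ratio in dB; higher means closer to the reference.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return float("inf")
    return 10 * math.log10(max_val ** 2 / mse)

ground_truth = [100, 120, 130, 140]
bicubic_up   = [ 90, 110, 125, 150]   # plain interpolation result
ai_sr        = [ 98, 118, 129, 142]   # hypothetical model output
print(round(psnr(ground_truth, bicubic_up), 1))  # 29.0 dB
print(round(psnr(ground_truth, ai_sr), 1))       # 43.0 dB -> better
```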

The figure above shows the real measured effect of super-resolving the 360p image on the left to 720p. The improvement in clarity after super-resolution is obvious to the naked eye.

New generation audio engine

Having looked at the video engine, let's turn to the audio engine in Yunxin's new-generation audio and video stack.

This is the processing flow chart of Yunxin's new-generation audio engine. The chart is complicated and involves many technical points, so let's analyze several important ones.

The first is the 3A processing, which is always central to an audio engine. For echo cancellation, in addition to optimizing the original linear filter and NLP, the new-generation audio engine adds self-developed double-talk detection and noise injection, further improving AEC in more scenarios.

For noise reduction, we modularized the processing pipeline. Scene detection and classification uses AI model algorithms: we trained separate speech and noise AI models, which greatly improves noise identification and makes our AI-based noise reduction clearly outperform traditional noise reduction. Beyond scene detection and noise reduction, many other modules are also deeply integrated with AI, and all of these AI algorithms run on Yunxin's self-developed lightweight network models and high-performance inference engine, greatly improving algorithm performance.
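As a rough illustration of this modular, classifier-driven flow, the sketch below routes each audio frame through a scene classifier that picks a per-scene model; the classifier and models are trivial stand-ins for Yunxin's trained speech and noise AI models.

```python
def classify_scene(frame):
    # Stand-in classifier: flag high-energy frames as "noise".
    energy = sum(x * x for x in frame) / len(frame)
    return "noise" if energy > 0.5 else "speech"

def speech_model(frame):
    return frame                        # preserve speech untouched

def noise_model(frame):
    return [x * 0.1 for x in frame]     # heavily attenuate noise

MODELS = {"speech": speech_model, "noise": noise_model}

def denoise(frames):
    # Each frame is handled by the model trained for its detected scene.
    return [MODELS[classify_scene(f)](f) for f in frames]

out = denoise([[0.1, 0.2, -0.1], [0.9, -1.0, 0.8]])
print(out)   # speech frame kept, noise frame attenuated
```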

Another important part of audio work is device-model adaptation. Relying on Yunxin's big-data collection and no-reference scoring system, we do extensive automatic model adaptation in the cloud, greatly improving adaptation efficiency and coverage. To further quantify speech quality, we also built a laboratory-standard evaluation system that combines objective metrics with subjective methods. Through it, audio algorithm optimization and quantified metrics form a virtuous closed loop, continuously polishing and improving Yunxin's audio quality.

Full-link data and quality transparency

Finally, the last core technology of the new generation is the data and quality transparency system. We all know this is an era of big data, and we believe that being driven by big data is essential for the continuous evolution of audio and video technology. To be truly data-driven, we collect and report data from the SDK, the engines, the scheduling servers, the channel and stream management servers, the edge media servers, and the real-time transmission network: every link of the audio and video business.

Real-time metric reporting covers real-time status, with more than 100 key events spanning user behavior, system behavior, QoS behavior, and many events tied to the audio and video QoE experience, such as resolution switches, login duration, and first-frame rendering time. In addition, 300+ core indicators are reported, including full-link system runtime status, QoS indicators, QoE indicators, and server status. Our big data platform processes these reports at tens of billions of rows per day with terabyte-scale storage, and its high performance makes second-level real-time processing of this data possible.
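The concrete report schema is not given in the article, so the following sketch only suggests what a client-side quality report might look like; all field names are hypothetical, and only the event/indicator split mirrors the description above.

```python
import json
import time

def build_report(session_id, events, indicators):
    # Illustrative shape of one report: a batch of key events plus
    # numeric QoS/QoE indicators, stamped with session and time.
    return {
        "session_id": session_id,
        "ts_ms": int(time.time() * 1000),
        "events": events,          # e.g. resolution switch, login duration
        "indicators": indicators,  # QoS / QoE numeric metrics
    }

report = build_report(
    "room-42:user-7",
    [{"type": "resolution_switch", "from": "720p", "to": "360p"}],
    {"rtt_ms": 48, "loss_pct": 0.3, "video_fps": 24, "first_frame_ms": 310},
)
print(json.dumps(report, indent=2))
```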

With this data we can drive many things: the quality transparency platform (open to users), business quality reports, access scheduling optimization, large-network routing optimization, QoS algorithm optimization, real-time QoE fault alarms, and aggregated quality analysis across all dimensions. Yunxin's new-generation audio and video system is continuously polished and improved under this constant big-data drive.

The figure above shows the quality transparency platform Yunxin opens to users. It visualizes every indicator, with the meaning of each quality metric clear and easy to understand, greatly reducing users' troubleshooting difficulty and giving them a clear picture of full-link quality.

In addition, the descriptions of these indicators are kept as clear and understandable as possible, and every indicator across the full link is clearly visible: the real-time status of the real-time transmission network, media edge servers, and media gateway processing is displayed directly on the quality transparency dashboard.

NetEase Yunxin's new-generation audio and video scene implementations

With the core technologies covered, the final part looks at actual implementation scenarios. No matter how good a technology is, it is useless if it cannot land, so we have stayed close to real scenarios while polishing each algorithm.

Cloud Music Listen Together – an innovative music social experience

Listen Together is a feature NetEase Cloud Music launched in July this year: you can invite friends to listen to music together while interacting over real-time audio. It is a completely new music social experience, and many core technologies of Yunxin's new-generation audio and video stack have landed in this scenario:

  • First, the chat scenario has very demanding latency requirements, so the overall low-latency and QoS capabilities are clearly exercised here, and their effectiveness is directly audible.
  • Second, its sound quality requirements are very high: no one wants to hear poor-quality voice over music, which would be unacceptable. Our high-sound-quality capabilities, including AI noise reduction, have been implemented and proven in this scenario.
  • Finally, device-model adaptation: NetEase Cloud Music has a huge user base with a wide variety of phones, so adaptation must be done well, including verifying which models are echo-free and which handle noise reduction properly.

Through these core capabilities, Yunxin's new-generation audio and video guarantees the stability, reliability, and high quality of NetEase Cloud Music's Listen Together scenario.

Cloud Music Look Live – building a pan-music entertainment ecosystem

Look Live is NetEase Cloud Music's live streaming product, featuring co-streaming between hosts over connected mics.

  • The RTC+Live fusion communication capability of Yunxin's new-generation audio and video is implemented in this scenario.
  • At the same time, this scenario demands high video quality, and the new self-developed HD video coding and video AI super-resolution capabilities were fully validated here.

Through the landing of these two scenarios, Yunxin has also helped establish a complete pan-music entertainment ecosystem.

Dubbing Show – Yunxin's multi-person voice chat room solution

Dubbing Show is a multi-person dubbing app, an interactive live scenario with multiple speakers on mic at once. The scenario is relatively complicated, and the following points are worth noting:

  • Yunxin provides customers with a multi-person voice chat room solution within its ease-of-use system. Relying on the SDK's ease of use, Dubbing Show took only one week to integrate and go live.
  • Dubbing Show also has high sound-quality requirements, so the new generation's voice AI noise reduction and intelligent echo cancellation on the host side provide high-quality audio effects for dubbing.
  • The QoS weak-network resistance and de-jittering capabilities keep the network stable even with 12 people on mic simultaneously.

Against the Water Cold cloud game – componentization of Yunxin video conferencing

On October 24, NetEase held the second International Distributed AI Conference inside the Against the Water Cold cloud game, with 300 participants attending each session. Yunxin provided componentized video conferencing support for the event.

  • With 300 participants joining from all over the world, the new generation's high-performance server cascading architecture, global real-time transmission network, and other capabilities were fully validated in this scenario.
  • With so many participants in a video conference, each user's downlink network must be matched optimally, so the new generation's segmented QoS capability was very important.
  • In addition, the ultra-HD AI screen sharing capability played a significant role when attendees shared their screens for PowerPoint presentations.

The conference was well received by participants at home and abroad. Yunxin has also productized video conferencing components, which make it very easy to help users implement video conferencing scenarios. These offerings are the distillation of what Yunxin has learned while helping NetEase land internal and external scenarios.

Conclusion

That’s all for today, just a quick summary.

This year, NetEase put forward the idea that "going from 0 to 1 is innovation, and going from 1 to 1.1 is also innovation." NetEase Yunxin's audio and video has gone from 0 to 1; the new generation must go from 1 to 1.1. We want to grow with our customers, keep fine-polishing each vertical scenario, and break out of today's red-ocean competition.

I believe today's talk has shown that NetEase Yunxin has embraced AI across every link of the audio and video pipeline, because we believe artificial intelligence can change the existing technical system. I firmly believe technology can change the world, and the audio and video field has infinite possibilities ahead!

Thank you!

Speaker introduction

Wu Tong joined NetEase in 2013 after receiving his master's degree from Zhejiang University and has worked on audio and video streaming ever since. He is now chief streaming-media architect and multimedia development expert at NetEase Yunxin, responsible for the overall architecture design and development of RTC, live streaming, interactive live streaming, and the transmission network, and participated in building Yunxin's audio and video technology from scratch.