Author: Mao Yujie, WebRTC expert at Agora (声网)

Some say that 2017 was WebRTC's turning point and that 2018 will be its breakout year. The WebRTC 1.0 draft was finalized last year, and the standard is expected to be officially released this year. Meanwhile, more and more browsers and vendors are adding broad support, and WebRTC is becoming part of the Internet's infrastructure.

According to the 2017 WeChat Data Report, as of September 2017 WeChat carried 205 million successful calls per day, and each user averaged 139 minutes of call time and 19 calls per month. These figures show that WeChat video calling has subtly changed the way people communicate with each other.

Looking back at the data from the three major carriers, voice call volume recorded negative growth for the first time in 2015, which shows how strongly Internet OTT applications have affected the traditional voice call business. Thanks to improved infrastructure, faster smartphones, faster networks, and richer usage scenarios, demand for real-time communication keeps growing. Since 2015, new products such as interactive live streaming, online Werewolf games, remote claw machines, live quiz shows, and online karaoke have appeared one after another, moving familiar offline activities online. This is further evidence that real-time audio and video communication is gaining popularity.

More and more entrepreneurs are thinking about how to move offline interactions online and build the next hit app.

Any discussion of real-time communication has to mention WebRTC. The full name of WebRTC is Web Real-Time Communication. As the word "Web" suggests, the technology was originally designed to give browsers real-time audio and video capabilities.

In practice, however, WebRTC means different things in different contexts. It can refer to Google's open source WebRTC project, to the WebRTC standard developed by the W3C working group, or to the WebRTC interface in the browser. We refer to all of these collectively as WebRTC technology. Today, applications and services with real-time audio and video capabilities almost all use WebRTC technology to some degree, and all of them owe something to Google's open source WebRTC project. Let's dig into the story behind WebRTC.

Looking back: Why WebRTC

Speaking of WebRTC, we have to mention Global IP Solutions, or GIPS. A VoIP software developer founded in Stockholm, Sweden in 1999, it offered arguably the best voice engine in the world. Skype, Tencent QQ, WebEx, Vidyo, and others used its audio processing engine, which included patented echo cancellation algorithms, low-latency algorithms that cope with network jitter and packet loss, and advanced audio codecs.

Google also licensed GIPS for Google Talk. Google acquired GIPS in 2011 and open-sourced its code, along with the VPx series of video codecs it had obtained by acquiring On2 in 2010. In short, WebRTC combines the GIPS audio and video engine with the VPx video codecs as a replacement for H.264.

Google later merged libjingle, the open source library that Google Talk used for P2P NAT hole punching, into WebRTC. As a result, WebRTC now provides APIs for the Web, iOS, Android, Mac, Windows, and Linux, and keeps those APIs consistent across platforms. The main benefits of using WebRTC are:

  • Free access to GIPS's advanced audio and video engine, which previously required a paid license.
  • Because audio and video travel peer to peer, a simple 1-to-1 call needs few server resources, and free STUN/TURN servers can greatly reduce costs.
  • Web application development is very convenient: a simple JS interface delivers audio and video communication without installing any plug-in.
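
To make the STUN/TURN point concrete, here is a minimal sketch of the configuration object a browser application might pass to `RTCPeerConnection`. The `buildPeerConfig` helper, the `example.org` server URLs, and the credentials are all hypothetical placeholders; the interfaces are declared locally (mirroring the shape of the browser's `RTCIceServer` dictionary) so that the snippet also type-checks outside a browser.

```typescript
// Local mirror of the browser's RTCIceServer dictionary (assumption: only the
// fields used here are declared).
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface PeerConfig {
  iceServers: IceServer[];
}

// Hypothetical helper: builds the object that would be passed to
// `new RTCPeerConnection(config)` in a browser.
function buildPeerConfig(
  stunUrl: string,
  turn?: { url: string; username: string; credential: string }
): PeerConfig {
  // STUN lets each peer discover its public address so a direct path can be tried.
  const iceServers: IceServer[] = [{ urls: stunUrl }];
  if (turn) {
    // TURN relays traffic when no direct path can be established.
    iceServers.push({
      urls: turn.url,
      username: turn.username,
      credential: turn.credential,
    });
  }
  return { iceServers };
}

// Placeholder servers and credentials, for illustration only.
const config = buildPeerConfig("stun:stun.example.org:3478", {
  url: "turn:turn.example.org:3478",
  username: "demo-user",
  credential: "demo-pass",
});
// In a browser: const pc = new RTCPeerConnection(config);
```

Only the TURN entry relays media through a server, which is why a plain 1-to-1 call that connects directly consumes so few server resources.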

The impact of WebRTC standards

On 2 November 2017, after 6 years of work, the W3C WebRTC 1.0 draft was finalized. Also in 2017, Microsoft Edge and Apple Safari announced support for the WebRTC 1.0 standard API in their latest releases.

Some implementation details still vary between browser vendors (Safari only supports H.264, SDP description formats differ, and so on). Even so, with the exception of IE, all major browsers, including Google Chrome, Mozilla Firefox, Apple Safari, and Microsoft Edge, already support WebRTC 1.0, which makes it possible for browsers to communicate with each other without plug-ins.

On more and more terminal devices, high-quality audio and video calls can be made by opening web links without using any plug-ins or native applications. Application developers also do not need to pay attention to the details of audio and video engine implementation, which greatly saves development costs.

A wide range of applicable scenarios

WebRTC applies to a wide range of scenarios, and many industries can build interesting products around real-time communication. Traditional real-time communication scenarios are mainly video conferencing, video interviews, VoIP calls, and call centers, served by products such as WebEx and Skype.

Today's popular scenarios include live streaming for social networking, games, sports, TV, and dating; co-hosted live streams; online education; online medical care; remote account opening for financial securities; smart hardware such as UAVs; and smart home devices such as surveillance cameras and smart voice devices.

Besides audio and video transmission, WebRTC has another, easily overlooked capability: data transmission. Using its peer-to-peer transport, some developers have created services such as WebTorrent and PeerCDN that move data without passing it through servers. So WebRTC is well suited to building real-time communication applications.
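
As a rough illustration of peer-to-peer data transmission, the sketch below splits a payload into chunks and hands each one to a channel's `send` method, which is how larger files are typically pushed over a data channel. The `sendInChunks` helper, the `SendableChannel` interface, and the 16 KiB default chunk size are assumptions made for this example; the in-memory stub stands in for a connected `RTCDataChannel` so the snippet runs outside a browser.

```typescript
// Minimal stand-in for the part of RTCDataChannel this sketch uses. The real
// browser object also exposes events, bufferedAmount, and more.
interface SendableChannel {
  send(data: Uint8Array): void;
}

// Hypothetical helper: splits a payload into fixed-size chunks and sends each
// one over the channel. Returns the number of chunks sent.
function sendInChunks(
  channel: SendableChannel,
  payload: Uint8Array,
  chunkSize: number = 16 * 1024 // assumed conservative message size
): number {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  let sent = 0;
  for (let offset = 0; offset < payload.length; offset += chunkSize) {
    // subarray creates a view over the same bytes, so nothing is copied here.
    channel.send(payload.subarray(offset, offset + chunkSize));
    sent++;
  }
  return sent;
}

// In-memory stub standing in for a connected RTCDataChannel.
const received: Uint8Array[] = [];
const fakeChannel: SendableChannel = {
  send: (data) => {
    received.push(data);
  },
};

const chunkCount = sendInChunks(fakeChannel, new Uint8Array(40000));
```

In a real application the receiving peer would reassemble the chunks in order, and the sender would usually watch the channel's `bufferedAmount` to avoid queuing data faster than the network can carry it.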

Live streaming, one of today's hottest applications, increasingly depends on WebRTC, and any discussion of live streaming also has to mention RTMP.

From RTMP to WebRTC

From the application side, users' habits are changing, and more and more live streaming products are adding interactive video calling. At the same time, applications such as video conferencing and video verification are also increasing. This affects how technology choices evolve.

RTMP (Real-Time Messaging Protocol) is an open protocol developed by Adobe Systems for audio, video, and data transmission between Flash players and servers. With the rise of live streaming, many people adopted it for that purpose.

In terms of protocol, RTMP fully meets the needs of live streaming products, but its relatively high latency makes it unsuitable for video calling products. So people naturally turn to lower-latency network protocols such as UDP and QUIC (which is built on UDP).

In terms of technical framework, building a communication system that meets the requirements of video calling is relatively complex. It involves network transmission, front-end development, and mobile development, and it also requires optimizing complex audio and video encoding and decoding algorithms. This places high demands on a developer's technology stack, so more and more people choose WebRTC.

WebRTC is now supported by a growing number of browser vendors and related technology vendors, and its application prospects will only broaden.

However, because of some shortcomings in WebRTC itself, most developers do not use it directly. Instead, they do secondary development on top of WebRTC to fit their actual scenarios. WebRTC is not a panacea: no single set of code and interfaces can solve every problem.

How to do secondary development?

WebRTC is an excellent project, but it has the following problems when used directly.

First, WebRTC transmits peer to peer. This saves server resources, but it also causes transmission quality problems in practice: quality is often hard to guarantee across countries or across carrier networks. WebRTC has excellent end-to-end quality control algorithms, but under complex network conditions their performance is rarely satisfactory.

Second, WebRTC's performance on mobile is also unsatisfactory. Because early versions lacked H.264 support, mobile devices could only use software VP8 encoding and decoding for a long time, which performed poorly on mid-range and low-end phones. Given Android's fragmentation, it is also hard to deliver a uniform user experience without adapting to individual device models.
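
Where H.264 is available, an application can ask WebRTC to prefer it over VP8. The sketch below shows one way to reorder a codec list before handing it to the browser's `setCodecPreferences` method. The `preferCodec` helper and the sample codec list are illustrative assumptions, and whether H.264 is actually hardware accelerated depends on the device; the `CodecCapability` interface is a local mirror of the fields used from the browser's codec capability dictionary.

```typescript
// Local mirror of the fields of the browser's codec capability dictionary
// that this sketch needs.
interface CodecCapability {
  mimeType: string;
  clockRate: number;
  sdpFmtpLine?: string;
}

// Hypothetical helper: moves codecs with the given MIME type to the front,
// keeping the relative order of everything else.
function preferCodec(
  codecs: CodecCapability[],
  mimeType: string
): CodecCapability[] {
  const wanted = mimeType.toLowerCase();
  const preferred = codecs.filter((c) => c.mimeType.toLowerCase() === wanted);
  const others = codecs.filter((c) => c.mimeType.toLowerCase() !== wanted);
  return preferred.concat(others);
}

// Sample capability list (illustrative values, not taken from a real device).
const sampleCodecs: CodecCapability[] = [
  { mimeType: "video/VP8", clockRate: 90000 },
  { mimeType: "video/VP9", clockRate: 90000 },
  { mimeType: "video/H264", clockRate: 90000, sdpFmtpLine: "profile-level-id=42e01f" },
];

const ordered = preferCodec(sampleCodecs, "video/H264");
// In a browser, the input would come from RTCRtpReceiver.getCapabilities("video")
// and the result would be passed to transceiver.setCodecPreferences(ordered).
```

The reordering only expresses a preference; the codec finally used still depends on what the remote peer supports.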

Third, WebRTC is designed for 1-to-1 communication. A multi-party scenario still requires a server-side solution. Many open source WebRTC server implementations exist, but deploying and maintaining a stream-forwarding server or a stream-mixing server is complex.
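
A little arithmetic shows why multi-party calls push developers toward servers. The sketch below counts the media streams each client must send and receive under three common topologies: a full peer-to-peer mesh, a forwarding server (often called an SFU), and a mixing server (often called an MCU). The function and type names are hypothetical, and the model assumes every participant publishes exactly one stream.

```typescript
type Topology = "mesh" | "forwarding" | "mixing";

interface StreamLoad {
  uplinksPerClient: number;
  downlinksPerClient: number;
}

// Streams each client sends and receives, assuming one published stream per
// participant (a simplification made for this sketch).
function streamLoad(topology: Topology, participants: number): StreamLoad {
  if (participants < 2 || Math.floor(participants) !== participants) {
    throw new Error("participants must be an integer >= 2");
  }
  const others = participants - 1;
  if (topology === "mesh") {
    // Every client sends its stream to, and receives one from, every other client.
    return { uplinksPerClient: others, downlinksPerClient: others };
  }
  if (topology === "forwarding") {
    // One upload to the server, which forwards everyone else's streams down.
    return { uplinksPerClient: 1, downlinksPerClient: others };
  }
  // Mixing: one upload, one combined stream back from the server.
  return { uplinksPerClient: 1, downlinksPerClient: 1 };
}

// Number of peer-to-peer connections in a full mesh: n * (n - 1) / 2.
function meshConnectionCount(participants: number): number {
  return (participants * (participants - 1)) / 2;
}

const meshLoad = streamLoad("mesh", 5);
const forwardingLoad = streamLoad("forwarding", 5);
const mixingLoad = streamLoad("mixing", 5);
const meshLinks = meshConnectionCount(5);
```

For five participants, a mesh needs four uploads per client and ten connections in total, while either server topology cuts each client's upload to a single stream at the cost of running and maintaining the server.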

Fourth, on the Web there are compatibility problems between browsers. AdapterJS can smooth over interface differences between browsers, but inconsistent behavior between browsers remains a problem. Taking WebRTC directly into a commercial product is therefore almost impossible. The common solutions today are in-house secondary development customized for one's own business scenario or, more simply, a third-party SDK.

The prospects of WebRTC

Going forward, WebRTC will remain a very important piece of the puzzle in real-time communication.

Both Web and native applications depend heavily on the audio and video engine that WebRTC provides, especially on the Web. Almost all browser vendors' implementations are based on Google's WebRTC project. With the WebRTC 1.0 standard finalized, the WebRTC interfaces of the major browsers are now largely unified.

WebRTC has long lacked testing tools. Late last year, Google launched the open source KITE project to help developers test WebRTC interoperability across browsers. The standardization community's next step is to provide a more complete test suite. This will help developers test the interoperability and user experience of WebRTC applications on the Web and in native apps, help keep WebRTC interfaces consistent across vendors' browsers, and gradually fill in WebRTC's missing features.

Among related technologies, QUIC is drawing wider attention. For WebRTC, QUIC can, at least in principle, speed up data channel connection setup and completely replace SCTP. The problem is that Chrome and Opera are currently the only browsers that support QUIC.

Meanwhile, browsers continue to fix problems and adapt to different hardware and system platforms, so that WebRTC runs stably on more devices than just the mainstream models and system versions.

If you are also working on a WebRTC app and have any questions, please visit the RTC Developer community and post to talk to more peers or share your work.