1. What is WebRTC

Short for Web Real-Time Communication, WebRTC is an API that enables web browsers to conduct real-time voice and video conversations. Google open-sourced the project on June 1, 2011, and it entered the World Wide Web Consortium (W3C) standardization track with support from Google, Mozilla, and Opera.

1.1 Composition

  • VideoEngine
  • VoiceEngine (audio engine)
  • Session management
  • iSAC: audio codec
  • VP8: video codec from Google's own WebM project
  • APIs (native C++ API, Web API)

1.2 Important APIs

The WebRTC native APIs follow the WebRTC specification. They can be divided into the Network Stream API, the RTCPeerConnection API, and the Peer-to-peer Data API.

1.2.1 Network Stream API

  • MediaStream: represents a stream of media data.
  • MediaStreamTrack: represents a single media source in the browser.

1.2.2 RTCPeerConnection

  • RTCPeerConnection: an RTCPeerConnection object allows two browsers to communicate with each other directly.
  • RTCIceCandidate: represents an ICE protocol candidate.
  • RTCIceServer: represents an ICE server.

1.2.3 Peer-to-Peer Data API

  • DataChannel: an interface representing a bidirectional data channel between two peers.
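To make the RTCIceCandidate entry concrete: its `candidate` attribute is a string in SDP syntax. A minimal parser sketch, assuming the standard field order from RFC 5245 (the address used below is a documentation example):

```javascript
// Sketch: parse the candidate string carried by an RTCIceCandidate.
// Format (RFC 5245): "candidate:<foundation> <component> <transport>
// <priority> <address> <port> typ <type> ..."
function parseCandidate(line) {
  const parts = line.replace(/^candidate:/, '').split(' ');
  return {
    foundation: parts[0],
    component: Number(parts[1]),           // 1 = RTP, 2 = RTCP
    transport: parts[2].toLowerCase(),     // usually "udp"
    priority: Number(parts[3]),
    address: parts[4],
    port: Number(parts[5]),
    type: parts[parts.indexOf('typ') + 1], // host | srflx | prflx | relay
  };
}

// Example: a server-reflexive candidate discovered via STUN.
const c = parseCandidate(
  'candidate:842163049 1 udp 1677729535 203.0.113.7 46154 typ srflx');
```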

2. The architecture

2.1 Description of Color Labels

  • Purple: the API layer for Web developers;
  • Solid blue: the API layer for browser vendors;
  • Dotted blue: components whose implementation the browser vendor can override.

2.2 Architecture Components

2.2.1 Your Web App

  • Using the Web API exposed by WebRTC-enabled browsers, Web developers can build real-time audio and video communication applications.

2.2.2 Web API

The WebRTC standard API (JavaScript) for third-party developers makes it easy to build Web applications such as browser-based video chat.

These APIs can be divided into the Network Stream API, the RTCPeerConnection API, and the Peer-to-peer Data API; the individual interfaces are described in section 1.2 above.
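A minimal sketch of how a page might wire these APIs together. The STUN URL and the `signaling` object are hypothetical placeholders, not part of WebRTC itself:

```javascript
// Hypothetical STUN URL; any RTCIceServer entry works here.
function makeConfig() {
  return { iceServers: [{ urls: 'stun:stun.example.org' }] };
}

// `signaling` stands in for an application-defined channel (e.g. WebSocket);
// WebRTC deliberately leaves signaling to the developer.
async function startCall(signaling) {
  const pc = new RTCPeerConnection(makeConfig());

  // Network Stream API: capture a local MediaStream, add its tracks.
  const stream = await navigator.mediaDevices.getUserMedia(
    { audio: true, video: true });
  for (const track of stream.getTracks()) pc.addTrack(track, stream);

  // Peer-to-peer Data API: a bidirectional channel alongside the media.
  const channel = pc.createDataChannel('chat');
  channel.onmessage = (e) => console.log('peer said:', e.data);

  // Trickle ICE: forward each RTCIceCandidate as it is gathered.
  pc.onicecandidate = (e) => { if (e.candidate) signaling.send(e.candidate); };

  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  signaling.send(offer); // the answer comes back via pc.setRemoteDescription()
  return pc;
}
```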

2.2.3 WebRTC Native C++ API

The native C++ API layer makes it easy for browser vendors to implement the WebRTC standard Web API and abstracts away the underlying digital signal processing.

2.2.4 Transport/Session

  • Transport/session layer
    • The session components are implemented with parts of the libjingle library, without using the XMPP/Jingle protocol.
  • a. RTP stack
    • The Real-time Transport Protocol.
  • b. STUN/ICE
    • The STUN and ICE components allow call connections to be established across different types of networks.
  • c. Session management
    • An abstract session layer that provides session establishment and management. Its protocol is left to the application developer to implement.

2.2.5 VoiceEngine

The voice engine is a framework containing a complete chain of audio processing, covering everything from sound-card capture to the network transport end.

PS: VoiceEngine is one of WebRTC's most valuable technologies; it was open-sourced after Google acquired GIPS, an industry leader in VoIP technology. More on that in future articles.

  • a. iSAC
    • Internet Speech Audio Codec
    • A wideband and super-wideband audio codec for VoIP and audio streaming; the default codec of the WebRTC audio engine
    • Sampling frequency: 16 kHz, 24 kHz, or 32 kHz (default: 16 kHz)
    • Adaptive rate: 10 kbit/s to 52 kbit/s
    • Adaptive packet size: 30–60 ms
    • Algorithmic delay: frame size + 3 ms
  • b. iLBC
    • Internet Low Bitrate Codec
    • A narrowband voice codec for VoIP and audio streaming
    • Sampling frequency: 8 kHz
    • Bit rate: 15.2 kbit/s with 20 ms frames, 13.33 kbit/s with 30 ms frames
    • Standardized in IETF RFC 3951 and RFC 3952
  • c. NetEQ for Voice
    • A speech signal-processing component implemented in the audio software
    • The NetEQ algorithm combines adaptive jitter control with speech packet-loss concealment, allowing it to adapt quickly and precisely to changing network conditions while keeping sound quality high and buffering delay minimal.
    • This GIPS technology is an industry leader at mitigating the effect of network jitter and voice packet loss on voice quality.
    • PS: NetEQ is another very valuable piece of WebRTC. It markedly improves VoIP quality and works best when integrated with the AEC, NR, AGC, and related modules.
  • d. Acoustic Echo Canceler (AEC)
    • A software-based signal-processing component that removes, in real time, the echo picked up by the microphone.
  • e. Noise Reduction (NR)
    • A software-based signal-processing component that suppresses certain kinds of background noise typical of VoIP (hiss, fan noise, etc.).
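As a quick sanity check of the codec figures above, the payload size per frame follows directly from bit rate and frame length; a small sketch (the function name is illustrative, not a WebRTC API):

```javascript
// Payload bytes per frame = bitrate (bit/s) * frame length (ms) / 8000.
function bytesPerFrame(bitrateBps, frameMs) {
  return Math.round((bitrateBps * frameMs) / 8000);
}

// iLBC figures from the list above:
const ilbc20 = bytesPerFrame(15200, 20); // 15.2 kbit/s, 20 ms frames
const ilbc30 = bytesPerFrame(13333, 30); // 13.33 kbit/s, 30 ms frames
// iSAC near the top of its adaptive range:
const isac30 = bytesPerFrame(32000, 30); // 32 kbit/s, 30 ms frames
```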

2.2.6 VideoEngine

  • WebRTC video processing engine
    • VideoEngine is a framework providing a complete video-processing pipeline, from camera capture through network transmission to video display.
  • a. VP8
    • A video codec; the default codec of the WebRTC video engine
    • VP8 is well suited to real-time communication because it was designed primarily for low latency.
    • PS: The VPx codecs were open-sourced after Google acquired On2. VPx is now part of the WebM project, one of the HTML5 standards Google is committed to promoting.
  • b. Video Jitter Buffer
    • The video jitter buffer reduces the impact of network jitter and video packet loss.
  • c. Image enhancements

The image-quality enhancement module processes the images captured by the webcam, including brightness detection, color enhancement, and noise reduction, to improve video quality.
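The Video Jitter Buffer above can be sketched as a reorder buffer keyed by sequence number; a toy version that ignores timestamps and sequence wrap-around:

```javascript
// Toy jitter buffer: holds out-of-order packets and releases them only
// in sequence-number order, smoothing network jitter for the decoder.
class JitterBuffer {
  constructor(firstSeq) {
    this.nextSeq = firstSeq;  // next sequence number owed to the decoder
    this.pending = new Map(); // seq -> payload, waiting for their turn
  }
  push(seq, payload) {
    this.pending.set(seq, payload);
  }
  pop() {
    // Release every payload that is now contiguous with nextSeq.
    const ready = [];
    while (this.pending.has(this.nextSeq)) {
      ready.push(this.pending.get(this.nextSeq));
      this.pending.delete(this.nextSeq);
      this.nextSeq += 1;
    }
    return ready;
  }
}
```

A real jitter buffer (like WebRTC's) additionally adapts its depth to measured jitter and triggers loss concealment when a gap never fills; this sketch shows only the reordering core.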

3. The video

  • The video part of WebRTC covers capture, codecs (I420/VP8), encryption, media files, image processing, display, and network transmission with flow control (RTP/RTCP).

3.1 Video Capture: Video_Capture

  • The source code is in webrtc\modules\video_capture\main, which contains the interface and the per-platform implementations.
  • On Windows, WebRTC uses DirectShow to enumerate video devices and capture video data, which means most video capture devices are supported. It cannot, however, handle capture cards that require a dedicated driver, such as the Hikon HD card.
  • Video capture supports multiple media types, such as I420, YUY2, RGB, and UYVY, and can control frame size and frame rate.
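The capture formats above differ in how many bytes one uncompressed frame takes; a small sketch of the arithmetic (the lookup table is mine, summarizing the standard pixel depths of these formats):

```javascript
// Bits per pixel: I420 is planar 4:2:0 (12 bpp), YUY2/UYVY are packed
// 4:2:2 (16 bpp), RGB24 is 24 bpp.
const BITS_PER_PIXEL = { I420: 12, YUY2: 16, UYVY: 16, RGB24: 24 };

// Bytes needed for one uncompressed frame of the given format and size.
function frameBytes(format, width, height) {
  return (width * height * BITS_PER_PIXEL[format]) / 8;
}
```

This is why capture pipelines usually convert to I420 early: at 640x480 an I420 frame is half the size of the same frame in RGB24.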

3.2 Video codec –video_coding

  • The source code is in the webrtc\modules\video_coding directory.
  • WebRTC uses I420/VP8 codec technology. VP8 was open-sourced after Google acquired On2 and is also used in the WebM project. It provides higher-quality video from less data, which makes it particularly suitable for applications such as video conferencing.

3.3 Video Encryption –video_engine_encryption

  • Video encryption is part of WebRTC's video_engine and sits roughly at the video application layer. It secures the data on both ends of a point-to-point video call and prevents video data from leaking on the Web.
  • Video data is encrypted at the sender and decrypted at the receiver with a key negotiated by the two sides, at some cost in video-processing performance. Encryption can also be skipped entirely, which is better for performance.
  • The input to encryption may be the raw stream or the encoded stream. It is presumably the encoded stream, since that keeps the encryption cost lower, but this needs further study.

3.4 Video media file –media_file

  • The source code is in the webrtc\modules\media_file directory.
  • Its function is to use a local file as the video source, somewhat like a virtual camera; the supported format is AVI.
  • WebRTC can also record audio and video to local files, which is a practical feature.

3.5 Video image processing –video_processing

  • The source code is in the webrtc\modules\video_processing directory.
  • Video image processing operates on every frame, including brightness detection, color enhancement, and noise reduction, to improve video quality.

3.6 Video display –video_render

  • The source code is in the webrtc\modules\video_render directory.
  • On Windows, WebRTC displays video with Direct3D9 and DirectDraw; this is the only supported method.

3.7 Network transmission and flow control

  • For networked video, data transmission and flow control are the core value. WebRTC uses the mature RTP/RTCP technology.
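The RTP packets carrying the media begin with a fixed 12-byte header; a minimal parser sketch (field layout per RFC 3550, function name is mine):

```javascript
// Parse the fixed 12-byte RTP header (RFC 3550).
function parseRtpHeader(buf) {
  return {
    version: buf[0] >> 6,          // always 2
    padding: !!(buf[0] & 0x20),
    marker: !!(buf[1] & 0x80),     // e.g. marks the last packet of a frame
    payloadType: buf[1] & 0x7f,    // codec id; VP8 uses a dynamic PT
    sequence: buf.readUInt16BE(2), // for loss detection and reordering
    timestamp: buf.readUInt32BE(4),
    ssrc: buf.readUInt32BE(8),     // identifies the media stream
  };
}

// Build a sample header to parse: version 2, payload type 96, seq 7.
const pkt = Buffer.alloc(12);
pkt[0] = 0x80;
pkt[1] = 96;
pkt.writeUInt16BE(7, 2);
pkt.writeUInt32BE(90000, 4);
pkt.writeUInt32BE(0x1234, 8);
const header = parseRtpHeader(pkt);
```

RTCP runs alongside RTP on a separate channel, feeding back the loss and jitter statistics that drive flow control.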

4. The audio

The audio part of WebRTC includes devices, codecs (iLBC/iSAC/G.722/PCM16/RED/AVT, plus NetEQ), encryption, sound files, voice processing, sound output, volume control, audio-video synchronization, and network transmission with flow control (RTP/RTCP).

4.1 Audio Device — Audio_device

  • The source code is in webrtc\modules\audio_device\main, which contains the interface and the per-platform implementations.
  • On Windows, WebRTC uses Windows Core Audio and Windows Wave to manage audio devices, and it also provides a mixer manager.
  • Through the audio device module you get sound output, volume control, and related functions.

4.2 Audio codec — Audio_coding

  • The source code is in the webrtc\modules\audio_coding directory.
  • WebRTC uses iLBC/iSAC/G.722/PCM16/RED/AVT codec technologies.
  • WebRTC also provides NetEQ, a jitter-buffer and packet-loss-concealment module that improves sound quality and minimizes latency.
  • Another core feature is mixing for voice conferencing.
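At its core, the conference mixing mentioned above reduces to summing samples with saturation; a toy sketch (the function name is mine, not WebRTC's):

```javascript
// Mix two PCM16 streams by summing samples and clamping to the 16-bit
// range, so loud overlapping speakers saturate instead of wrapping.
function mixPcm16(a, b) {
  const out = new Int16Array(a.length);
  for (let i = 0; i < a.length; i++) {
    out[i] = Math.max(-32768, Math.min(32767, a[i] + b[i]));
  }
  return out;
}

const mixed = mixPcm16(new Int16Array([1000, -30000]),
                       new Int16Array([2000, -30000]));
```

A production mixer also applies per-participant gain and avoids mixing a speaker's own audio back to them; clamping is just the minimum needed for correctness.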

4.3 Voice Encryption: voice_engine_encryption

  • Like video, WebRTC offers voice encryption.

4.4 Sound Files

  • This feature lets you use local files in PCM or WAV format as the audio source.
  • WebRTC can also record audio to local files.

4.5 Sound Processing — Audio_processing

  • The source code is in the webrtc\modules\audio_processing directory.
  • Sound processing operates on the audio data and includes acoustic echo cancellation (AEC), AECM (AEC for mobile), automatic gain control (AGC), noise suppression (NS), and voice activity detection (VAD) to improve sound quality.
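To illustrate the VAD component above in its simplest form: flag a frame as speech when its average energy exceeds a threshold. WebRTC's real VAD is far more sophisticated (it models speech and noise statistically), and the threshold below is an arbitrary assumption:

```javascript
// Toy energy-based voice-activity detector over a PCM16 frame.
// threshold is in units of mean squared sample amplitude (assumed value).
function isSpeech(frame, threshold = 1e6) {
  let energy = 0;
  for (const s of frame) energy += s * s;
  return energy / frame.length > threshold;
}

const silence = new Int16Array(160);          // one 10 ms frame at 16 kHz
const speech = new Int16Array(160).fill(5000); // loud constant signal
```

Downstream modules use the VAD decision to, for example, skip encoding silent frames or freeze AGC adaptation during pauses.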

4.6 Network Transmission and flow control

Like video, WebRTC uses mature RTP/RTCP technology.

5. Several streaming video formats

  • HTTP-FLV: HTTP transport; FLV packaging; low latency; continuous stream; playable in HTML5 after demuxing (flv.js).
  • RTMP: TCP transport; FLV tag packaging; low latency; continuous stream; not supported by HTML5.
  • HLS: HTTP transport; M3U8 playlist with TS segment files; high latency; sliced files; playable in HTML5 after demuxing (hls.js).
  • DASH: HTTP transport; MP4/3GP/WebM packaging; high latency; sliced files; plays directly in HTML5 if the DASH file list contains MP4/WebM files.
  • WebSocket-FLV: WebSocket transport; FLV packaging; low latency; continuous stream; playable in HTML5 after demuxing (flv.js).
  • WebRTC: WebSocket transport; SDP session description; low latency.
  • RTSP: HTTP transport.

6.