This post comes from the RTC developer community. Visit the community to share your experience with other real-time audio and video developers and to take part in more technical activities.

There are many open source projects for learning real-time audio and video development. A real-time audio and video application is a chain of stages: capture, encoding, pre- and post-processing, transmission, decoding, buffering, rendering, and more. Each stage breaks down further into technical modules. For example, pre- and post-processing covers beauty effects, filters, echo cancellation, and noise suppression; capture covers microphone arrays; and codecs include VP8, VP9, H.264, and H.265.

Today we have rounded up open source projects that can help developers who are learning or building real-time audio and video, along with several commercial services that also contribute to the open source community. The projects fall into several categories: audio and video codecs, pre- and post-processing, and server side.

Audio and video codec open source projects

A video codec compresses the images that the device camera has captured and pre-processing has prepared, encoding them digitally for transmission. Codecs are judged on three things: compression efficiency, speed, and power consumption.

At present, mainstream video codecs fall into three series: VPx (VP8, VP9), H.26x (H.264, H.265), and AVS (AVS1.0, AVS2.0). The VPx series is an open source video codec family created by Google; at the same quality, VP9 needs about 50% less bit rate than VP8. The H.26x series is widely supported by hardware. H.265 improves coding efficiency by 30-50% over the previous generation (H.264), but its complexity and power consumption are much higher, so pure software encoding hits bottlenecks and, with existing technology, still depends on hardware codecs. AVS is a source coding standard with independent intellectual property rights developed in China, now in its second generation.
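To make the percentages above concrete, here is a minimal sketch of the arithmetic. The 2000 kbps baseline and the function name are hypothetical; only the savings figures (about 50% for VP9 over VP8, 30-50% for H.265 over H.264) come from the text, and real savings depend heavily on content and encoder settings.

```python
def estimated_bitrate_kbps(baseline_kbps: float, savings: float) -> float:
    """Bit rate needed for the same quality after a fractional saving.

    `savings` is a fraction in [0, 1), e.g. 0.5 for "about 50% lower".
    """
    if not 0.0 <= savings < 1.0:
        raise ValueError("savings must be in [0, 1)")
    return baseline_kbps * (1.0 - savings)


# Hypothetical baseline: a stream that needs 2000 kbps with the older codec.
baseline = 2000.0

# VP8 -> VP9: about 50% lower bit rate at the same quality.
vp9_kbps = estimated_bitrate_kbps(baseline, 0.50)

# H.264 -> H.265: 30-50% better efficiency gives a range, not one number.
h265_best_kbps = estimated_bitrate_kbps(baseline, 0.50)
h265_worst_kbps = estimated_bitrate_kbps(baseline, 0.30)

print(f"VP9:   ~{vp9_kbps:.0f} kbps")
print(f"H.265: ~{h265_best_kbps:.0f} to {h265_worst_kbps:.0f} kbps")
```

The lower bit rate is not free: as noted above, the newer codecs pay for it with higher encoding complexity and power consumption.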

WebRTC

The first project to mention is WebRTC, an open source project that enables real-time voice and video conversations in a Web browser. It provides audio and video capture, encoding and decoding, network transmission, rendering, and other functions. If you want to build real-time audio and video applications on WebRTC, be aware that it does not include server-side design or deployment, so you will need to combine it with open source projects such as Janus.

Official website: webrtc.org/

x264

H.264 is the most widely used bitstream standard, and x264 is an encoder that produces bitstreams conforming to it, encoding video into the H.264/MPEG-4 AVC format. It provides a command-line interface, which graphical front ends such as StaxRip and MeGUI build on, and an API that is called by FFmpeg, HandBrake, and others. Alongside x264 there is also x265 for HEVC/H.265.

Official website: www.videolan.org/developers/…

FFmpeg

FFmpeg provides encoding, decoding, format conversion, muxing, and other functions, as well as post-processing such as cropping, scaling, and color gamut conversion. It supports almost all current audio and video coding standards (there are too many formats to list here; you can find them on Wikipedia).

FFmpeg also spawned the Libav project and the LAV video decoder. Many playback programs call LAV for decoding, and LAV itself supports hardware decoding on the graphics card. Many mainstream video players use FFmpeg as their playback core, and it is not just players: browsers such as Chrome, which play web video, benefit from FFmpeg too. Many developers have built on FFmpeg and open sourced their work, such as the well-known developer Lei Xiaohua (his code is available on his SourceForge page).

Official website: ffmpeg.org/

ijkplayer

Before introducing ijkplayer, we need to mention ffplay, a portable media player built on the FFmpeg and SDL libraries. ijkplayer is Bilibili's open source, lightweight iOS/Android video player based on ffplay.c. Its API is easy to integrate, and the build configuration can be trimmed to control the size of the installation package.

In terms of codecs, ijkplayer supports both software and hardware video decoding. The mode can be configured before playback starts but cannot be switched during playback. Hardware video decoding on iOS and Android uses the familiar VideoToolbox and MediaCodec, respectively. For audio, however, ijkplayer supports software decoding only.

GitHub address: Github.com/Bilibili/ij…

JSMpeg

JSMpeg is a JavaScript-based MPEG1 video decoder. If you want to do live audio and video streaming in HTML5 (H5), particularly on mobile, you can use JSMpeg for video decoding. This is also the mainstream approach behind the recently popular H5 remote claw machine apps.

GitHub address: Github.com/phoboslab/j…

Opus

Opus is a highly flexible audio codec written in C. It is specially optimized for ARM and x86 and has a fixed-point implementation. It supports both voice and music encoding at bit rates from 6 kbps to 510 kbps, and it combines two coding methods, SILK and CELT. SILK was originally used in Skype; it is based on linear prediction (LPC) of speech signals and does not handle music well. CELT suits full-bandwidth audio but is inefficient for low-bit-rate speech, so the two complement each other in Opus.
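As a rough illustration of what the 6 kbps to 510 kbps range means per packet, the sketch below computes the average encoded frame size for a given bit rate. The 20 ms frame duration is an assumption made for the example (the text does not specify one), and the function name is hypothetical; this is arithmetic only and does not call any Opus API.

```python
OPUS_MIN_BPS = 6_000     # lower bound of the bit rate range quoted above
OPUS_MAX_BPS = 510_000   # upper bound of the bit rate range quoted above


def opus_bytes_per_frame(bitrate_bps: int, frame_ms: float = 20.0) -> float:
    """Average encoded bytes per frame at a constant bit rate.

    frame_ms defaults to 20 ms, an assumed frame duration for this example.
    """
    if not OPUS_MIN_BPS <= bitrate_bps <= OPUS_MAX_BPS:
        raise ValueError("bit rate outside the 6-510 kbps range")
    if frame_ms <= 0:
        raise ValueError("frame_ms must be positive")
    bits_per_frame = bitrate_bps * frame_ms / 1000.0
    return bits_per_frame / 8.0


for bps in (OPUS_MIN_BPS, 32_000, OPUS_MAX_BPS):
    size = opus_bytes_per_frame(bps)
    print(f"{bps / 1000:.0f} kbps -> {size:.0f} bytes per 20 ms frame")
```

Even at the top of the range a 20 ms frame stays small enough to fit in a single typical network packet, which is one reason this kind of codec suits real-time transport.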

Opus is “replacing” Speex, but Speex has features that Opus lacks, such as echo cancellation, which Opus leaves outside the codec. If you want good echo cancellation, you can build on the AEC and AECM modules of WebRTC.

Official website: opus-codec.org/

live555

Live555 is an open source C++ streaming media project. It includes transport protocols (SIP, RTP), audio and video codec support (H.264, MPEG4), and example streaming media servers. It is a first choice for streaming media projects, and its transport module is a valuable reference for video conferencing development.

Official website: www.live555.com/

Audio and video pre- and post-processing open source projects

Pre- and post-processing involves a number of specialized techniques that, applied correctly, can improve video quality to varying degrees. However, every additional processing step increases computation and delay, so you have to weigh the trade-offs when choosing.

Seetaface

Seetaface is a complete face detection, face alignment, and face verification solution open sourced by Shan Shiguang of the Chinese Academy of Sciences. The code is written in C++ and released under the BSD-2 license, free for both academic and industrial use. It does not depend on any third-party libraries. On aligned LFW images, with detection and alignment both done by the open source modules, it reaches 97.1%.

GitHub address: Github.com/seetaface/S…

GPUImage

On iOS, beauty effects and watermarks are now mostly built with GPUImage. It has 125 built-in rendering effects and also supports script customization. The project implements both still-image filters and real-time camera filters. Its advantage is that processing runs on the GPU, which performs better than CPU processing.

GitHub address: Github.com/BradLarson/…

Open NSFW Model

Open NSFW Model (Open Not Suitable for Work Model) is a Yahoo open source project that identifies images not suitable for viewing at work. It is a model trained with the Caffe framework and can be used in audio and video post-processing. However, it cannot yet identify gruesome or bloody images.

GitHub address: Github.com/yahoo/open_…

SoundTouch

SoundTouch is an open source audio processing library whose main functions are changing audio speed and pitch to achieve voice-changing effects. It can also process media streams in real time. It works with 32-bit floating point or 16-bit fixed point samples, supports mono or stereo, and handles sampling rates from 8 kHz to 48 kHz.
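The relationships behind speed and pitch changes can be sketched with a little arithmetic. This is not the SoundTouch API: the function names below are hypothetical, and the only figures taken from the text are the supported formats (mono or stereo, 8 kHz to 48 kHz). The semitone formula is the standard equal-temperament relation.

```python
def pitch_ratio(semitones: float) -> float:
    """Frequency ratio for a pitch shift in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)


def stretched_duration(duration_s: float, tempo_factor: float) -> float:
    """Playback duration after a tempo change (2.0 = twice as fast)."""
    if tempo_factor <= 0:
        raise ValueError("tempo_factor must be positive")
    return duration_s / tempo_factor


def is_supported_format(sample_rate_hz: int, channels: int) -> bool:
    """Check input against the limits quoted above: 8-48 kHz, mono or stereo."""
    return 8_000 <= sample_rate_hz <= 48_000 and channels in (1, 2)


print(pitch_ratio(12))                  # one octave up doubles the frequency
print(stretched_duration(10.0, 2.0))    # a 10 s clip at double tempo
print(is_supported_format(44_100, 2))   # CD-style stereo input
```

A voice-changing effect usually shifts pitch without changing duration, which is why pitch and tempo are treated as separate controls here.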

Official website: www.surina.net/soundtouch/

Server-side open source projects

As we said at the beginning, WebRTC does not include server-side design or deployment. Implementing multi-party chat with an MCU or SFU and improving transmission quality are left to developers. The following open source projects can help.
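To see why multi-party chat pushes developers toward an MCU or SFU, here is a simplified stream-count model. It assumes each participant publishes exactly one stream and ignores simulcast and audio/video separation; the "mesh" case (every client connected directly to every other) is added here only as a baseline for comparison, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamLoad:
    client_uplinks: int    # streams each client sends
    client_downlinks: int  # streams each client receives
    server_streams: int    # streams the server receives plus sends


def stream_load(topology: str, participants: int) -> StreamLoad:
    """Count streams for one call under a simplified one-stream-per-user model."""
    n = participants
    if n < 2:
        raise ValueError("a call needs at least two participants")
    if topology == "mesh":
        # No server: every client sends to and receives from every other client.
        return StreamLoad(n - 1, n - 1, 0)
    if topology == "sfu":
        # The server receives n streams and forwards each one to the other n - 1.
        return StreamLoad(1, n - 1, n + n * (n - 1))
    if topology == "mcu":
        # The server receives n streams and sends one mixed stream to each client.
        return StreamLoad(1, 1, 2 * n)
    raise ValueError(f"unknown topology: {topology}")


for topology in ("mesh", "sfu", "mcu"):
    print(topology, stream_load(topology, 4))
```

The counts show the trade-off: an SFU keeps client uplink at one stream but still delivers several downlinks, while an MCU minimizes client bandwidth at the cost of decoding and mixing on the server.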

Jitsi

Jitsi is an open source video conferencing system that provides online video conferencing, document sharing, and instant messaging. It uses the SFU model to act as a video router and is written in Java. It supports SIP account registration for phone calls, and it can be installed on a local PC or on a cloud platform.

Official website: jitsi.org/

JsSIP

JsSIP is a JavaScript SIP library built on WebRTC that runs in browsers and Node.js. It works with OverSIP, Kamailio, Asterisk, OfficeSIP, and other SIP servers.

GitHub address: Github.com/versatica/J…

SRS

SRS is a simple RTMP/HLS live streaming server developed in China and released under the MIT license. The latest version also supports an FLV mode, which combines the real-time behavior of RTMP with the adaptability of HTTP (as used by HLS) to various network environments, and it is supported by more players. It is similar to nginx-rtmp-module in that it can distribute RTMP/HLS.

GitHub address: github.com/ossrs/srs

JRTPLIB

JRTPLIB is an open source RTP library for Windows and Unix platforms. It supports multi-threading and performs well. It also supports RFC 3550, UDP over IPv6, and custom transport protocol extensions. However, it does not support TCP transport, which developers have to implement themselves. It also does not packetize audio and video (splitting frames into RTP packets), so you have to write that code yourself.
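The packetization work left to the developer can be sketched as follows. This is not JRTPLIB code: it is a minimal Python illustration that builds the 12-byte fixed RTP header defined in RFC 3550 and splits one frame into payload-sized chunks. The function names, the payload type 96, and the 1200-byte payload limit are assumptions for the example, and real payload formats (for example the H.264 RTP payload format) add their own fragmentation headers, which this naive split omits.

```python
import struct

RTP_VERSION = 2
RTP_HEADER_LEN = 12  # fixed header size with no CSRC entries or extension


def build_rtp_header(payload_type: int, seq: int, timestamp: int,
                     ssrc: int, marker: bool = False) -> bytes:
    """Pack the RFC 3550 fixed header (padding, extension and CSRC count all 0)."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def packetize(frame: bytes, max_payload: int, payload_type: int,
              first_seq: int, timestamp: int, ssrc: int) -> list[bytes]:
    """Split one frame into RTP packets that share a timestamp.

    The marker bit is set on the last packet of the frame, and sequence
    numbers increase by one per packet, wrapping at 16 bits.
    """
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    chunks = [frame[i:i + max_payload] for i in range(0, len(frame), max_payload)]
    packets = []
    for index, chunk in enumerate(chunks):
        is_last = index == len(chunks) - 1
        header = build_rtp_header(payload_type, first_seq + index,
                                  timestamp, ssrc, marker=is_last)
        packets.append(header + chunk)
    return packets


# Hypothetical 2500-byte encoded frame, dynamic payload type 96.
packets = packetize(b"\x00" * 2500, 1200, 96, 65535, 90_000, 0x1234ABCD)
print([len(p) for p in packets])
```

The receiver reverses the process: it orders packets by sequence number and treats the marker bit as the end of the frame.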

GitHub address: Github.com/j0r1/JRTPLI…

OPAL

OPAL is the successor to OpenH323. It inherits the OpenH323 protocol stack and adds a SIP stack, making it a first choice for implementing SIP. Its drawback is that there are few reference examples.

Code address: Sourceforge.net/projects/op…

Kurento

Kurento is a WebRTC-based media server that includes a set of APIs to simplify the development of real-time video applications on the Web and on mobile.

GitHub address: github.com/Kurento

Janus

Janus is a WebRTC media gateway. Streaming media, video conferencing, recording, and gateway functions can all be built on Janus.

GitHub address: github.com/meetecho/janus-gateway

Other Business Services

Callstats.io

During real-time communication, quality problems such as delay, packet loss, connection rate, and disconnection rate all affect the user experience, and commercial projects need to pay particular attention to them. Callstats is a service provider that helps users collect communication data and improve call quality through professional monitoring of WebRTC calls.

Callstats is also on GitHub, with code for developers using Jitsi Videobridge, TURN servers, and JsSIP.

GitHub address: Github.com/callstats-i…
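Two of the quality indicators mentioned above, packet loss and delay variation, can be computed from basic packet records. The sketch below is not how Callstats works internally; it is a minimal illustration using the interarrival jitter estimator from RFC 3550, with hypothetical function names and timestamps in milliseconds.

```python
from typing import Sequence


def packet_loss_rate(expected: int, received: int) -> float:
    """Fraction of expected packets that never arrived."""
    if expected <= 0:
        return 0.0
    lost = max(expected - received, 0)
    return lost / expected


def interarrival_jitter(send_times: Sequence[float],
                        recv_times: Sequence[float]) -> float:
    """Running jitter estimate in the style of RFC 3550.

    For each consecutive packet pair, D is the change in transit time;
    the estimate moves 1/16 of the way toward |D| on every packet.
    """
    if len(send_times) != len(recv_times):
        raise ValueError("send_times and recv_times must have equal length")
    jitter = 0.0
    for i in range(1, len(send_times)):
        d = (recv_times[i] - recv_times[i - 1]) - (send_times[i] - send_times[i - 1])
        jitter += (abs(d) - jitter) / 16.0
    return jitter


# Hypothetical call: 100 packets expected, 95 received.
print(packet_loss_rate(100, 95))

# Packets sent every 20 ms; the second one arrives 16 ms later than expected.
print(interarrival_jitter([0.0, 20.0], [5.0, 41.0]))
```

A constant network delay produces zero jitter in this model; only changes in transit time between packets move the estimate.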

Meetecho

Meetecho is the developer of Janus, the well-known open source WebRTC gateway project. The company also provides technical consulting and deployment services for development based on Janus, and builds live video conferencing and recording services.

GitHub address: Github.com/carlhuda/ja…

Agora

Agora provides a full set of services, from codecs to end-to-end transmission. Developers can plug in the open source pre- and post-processing projects mentioned above and use the Agora SDK to build high-quality real-time audio and video applications. On the Web, the Agora Web SDK helps WebRTC developers solve problems encountered in server-side transmission, such as stuttering, delay, echo, and unstable multi-party video. The SDK also provides real-time audio and video communication for applications on multiple platforms.

Agora's GitHub hosts a lot of demo source code for developers to reference and practice with, covering the Web, iOS, and Android, and real-time interactive scenarios such as live audio and video streaming, in-game voice chat, enterprise conferencing, AR, live quiz shows, and mini programs.

GitHub address: Github.com/AgoraIO-Com…

Here we have listed 18 open source projects and three services that can effectively help ensure the quality of real-time audio and video delivery. There are many more we did not cover, such as Xiph.org's Speex and FLAC, as well as Xvid, libvpx, Lagarith, Daala, and Thor. You are welcome to add more.

Finally, for anyone who wants to develop a real-time audio and video app or learn WebRTC, here are some recommended posts and materials: rtcdeveloper.com/t/topic/435