There are many technical articles about live streaming, but few are systematic. In this seven-part series we will systematically introduce the key technologies behind today's popular live video streaming, to help live-streaming entrepreneurs gain a more comprehensive, in-depth understanding of live video technology and make better technology choices.

Video coding is the third article in this series on live video technology. It is a very important part of the series and required background for mobile developers. This article covers mainstream encoders from theory to practice.

If streaming media as a whole is compared to a logistics system, then encoding and decoding are the sorting and loading steps, which are critical: their speed and compression ratio affect the overall speed and cost of the whole system. Likewise, encoding is crucial for streaming media transmission: encoding performance, encoding speed, and compression ratio directly affect both the user experience and the transmission cost.

The outline of the series is as follows, in case you want to jump back to a previous article:

(1) Collection

(2) Processing

(3) Coding and packaging

(4) Push stream and transmission

(5) Principles of modern player

(6) Delay optimization

(7) SDK performance test model

The significance of video coding

  • Raw video data takes enormous storage: a 7 s 1080P clip needs 817 MB
  • Raw video data takes enormous bandwidth: transmitting that 7 s clip over a 10 Mbps link takes about 11 minutes

After H.264 compression, the same video is only 708 KB and needs only about 500 ms over the same 10 Mbps link, which meets the needs of real-time transmission. So the raw video captured from the sensor must go through video encoding.

The basic principles of video coding

So how can huge raw video be encoded into something tiny? What technology is behind this? The core idea is to remove redundant information:

  • Spatial redundancy: there is strong correlation between adjacent pixels of the image
  • Temporal redundancy: Similar content between adjacent images of a video sequence
  • Coding redundancy: Different pixel values have different probabilities
  • Visual redundancy: The human visual system is insensitive to certain details
  • Knowledge redundancy: regular structures can be derived from prior knowledge and background knowledge

Video is essentially a series of pictures played in rapid succession. The simplest compression method is to compress each frame individually. The older MJPEG codec works this way: it uses only intra-frame coding, with spatial prediction inside each frame. Each frame is treated as a standalone picture and compressed in JPEG image format, so only the redundancy within a single image is removed. As shown in Figure 1, the green part is the region currently being encoded and the gray part has not yet been encoded; the green region can be predicted from the already-encoded areas around it (left, top, top-left, and so on).


Figure 1

But because consecutive frames are correlated in time, later and more advanced encoders added inter-frame coding. Put simply, a search algorithm selects matching regions in a reference frame, and the encoder then encodes motion vectors together with the difference between the current frame and the reference frame. In the two consecutive frames shown in Figure 2, we can see the skier moving forward while the snow background in effect moves backward. A P-frame can be encoded relative to a reference frame (an I-frame or another P-frame); the encoded result is very small and the compression ratio is very high.




Figure 2


If you are curious how these two images were produced, they were made with two FFmpeg commands; see the FFmpeg chapter below for more details:

  • The first line generates the video with the motion vector
  • The second line outputs each frame as a picture
ffmpeg  -flags2 +export_mvs -i tutu.mp4 -vf codecview=mv=pf+bf+bb tutudebug2.mp4

ffmpeg -i tutudebug2.mp4 'tutunormal-%03d.bmp'

Besides compressing spatial and temporal redundancy, encoders also apply entropy (coding) compression and visual (perceptual) compression. The figures below show the main flow of an encoder:




Figure 3





Figure 4


Figures 3 and 4 show intra-frame coding and inter-frame coding respectively. The main visible difference is the first step; in practice the two processes are combined inside one encoder.
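To see the effect of intra-frame versus inter-frame coding for yourself, you can encode the same clip twice with FFmpeg: once with every frame forced to be an I-frame (conceptually what MJPEG does) and once with the default GOP structure. This is a sketch assuming your FFmpeg build includes libx264; the clip is generated synthetically so no input file is needed:

```shell
# Generate a short synthetic clip so the experiment is self-contained.
ffmpeg -y -loglevel error -f lavfi -i testsrc=duration=2:size=320x240:rate=30 -pix_fmt yuv420p input.mp4
# Intra-only: -g 1 makes every frame an I-frame, so no temporal prediction is used.
ffmpeg -y -loglevel error -i input.mp4 -c:v libx264 -g 1 -pix_fmt yuv420p intra_only.mp4
# Normal encode: the default GOP lets P/B-frames exploit temporal redundancy.
ffmpeg -y -loglevel error -i input.mp4 -c:v libx264 -pix_fmt yuv420p inter.mp4
ls -l intra_only.mp4 inter.mp4   # the intra-only file is usually noticeably larger
```

The size gap between the two outputs is the temporal redundancy that inter-frame coding removes.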

Selection of encoder

Encoders have gone through decades of development, evolving from supporting only intra-frame coding to the current generation represented by H.265 and VP9. Below we analyze some common encoders and take you on a tour of the encoder landscape.

H.264

Introduction

The H.264/AVC project aimed to create a video standard that provides high-quality video at substantially lower bit rates than older standards (half or less of the bit rate of MPEG-2, H.263, or MPEG-4 Part 2), without adding so much design complexity that implementation becomes impractical or too expensive. Another goal was enough flexibility to serve a wide variety of applications, networks, and systems: high and low bit rates, high and low resolutions, broadcasting, DVD storage, RTP/IP networks, and ITU-T multimedia telephony systems.

H.264/AVC includes a number of new features that let it encode far more efficiently than previous codecs and be used flexibly in many kinds of network applications. This foundation made H.264 the primary codec adopted by online video companies, including YouTube. It is not free to use, however: in principle it requires substantial patent royalties.

Patent license

As with MPEG-2 Parts I and II and MPEG-4 Part II, product manufacturers and service providers that use H.264/AVC are required to pay patent licensing fees to the holders of the patents their products use. The primary source of these licenses is a private organization called MPEG LA, which is not affiliated with the MPEG standardization body but also manages patent licensing for MPEG-2 Part I systems, Part II video, MPEG-4 Part II video, and other technologies.

Other patents are licensed through another private organization called Via Licensing, which also manages licensing for audio compression standards such as MPEG-2 AAC and MPEG-4 Audio.

Open source implementations of H.264

OpenH264 is Cisco's open source H.264 implementation. Although H.264 requires patent fees, there is an annual cap on the fees; since Cisco pays the annual patent fee for the OpenH264 implementation, OpenH264 is effectively free to use.

x264 is a GPL-licensed free software video encoder. Its main function is to encode H.264/MPEG-4 AVC video; it is not intended to be used as a decoder.

Besides cost, the two compare as follows:

  • OpenH264's CPU usage is much lower than x264's
  • OpenH264 supports only the baseline profile; x264 supports more profiles
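The profile difference is easy to observe through FFmpeg's libx264 wrapper, which exposes a `-profile:v` option. This is a sketch assuming libx264 is compiled into your FFmpeg build; the input clip is synthetic:

```shell
# Synthetic input clip (a placeholder for any real video).
ffmpeg -y -loglevel error -f lavfi -i testsrc=duration=1:size=320x240:rate=30 -pix_fmt yuv420p clip.mp4
# Constrain x264 to baseline profile -- the only profile OpenH264 supports.
ffmpeg -y -loglevel error -i clip.mp4 -c:v libx264 -profile:v baseline -pix_fmt yuv420p base.mp4
# High profile additionally enables B-frames, CABAC, 8x8 transforms, and more.
ffmpeg -y -loglevel error -i clip.mp4 -c:v libx264 -profile:v high -pix_fmt yuv420p high.mp4
# Confirm which profile was actually written into each stream.
ffprobe -v error -select_streams v:0 -show_entries stream=profile -of csv=p=0 base.mp4
ffprobe -v error -select_streams v:0 -show_entries stream=profile -of csv=p=0 high.mp4
```

The extra tools enabled by higher profiles generally buy better compression at the cost of encoder and decoder complexity.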

HEVC/H.265

Introduction

High Efficiency Video Coding (HEVC) is a video compression standard regarded as the successor to ITU-T H.264/MPEG-4 AVC. Starting in 2004, the ISO/IEC Moving Picture Experts Group (MPEG) and the ITU-T Video Coding Experts Group (VCEG) developed it as ISO/IEC 23008-2 MPEG-H Part 2, also known as ITU-T H.265. The first version of the HEVC/H.265 video compression standard was accepted as an official standard of the International Telecommunication Union (ITU-T) on 13 April 2013. HEVC is intended not only to improve video quality but also to double the compression ratio of H.264/MPEG-4 AVC (equivalent to a 50% bit rate reduction at the same picture quality), supporting 4K resolution and even ultra-high-definition TV (UHDTV) with resolutions up to 8192×4320 (8K).

Open source implementations of H.265

libde265 is an HEVC implementation provided by Struktur under the GNU Lesser General Public License (LGPL), delivering the highest possible image quality even over slow Internet connections. Compared with earlier H.264-based decoders, the libde265 HEVC decoder can deliver full HD content to up to twice as many viewers, or cut the bandwidth needed for streaming by 50%. It targets HD and 4K/8K ultra-HD streaming, low-latency/low-bandwidth video conferencing, and full mobile device coverage, and its stable "congestion-aware" video coding makes it well suited to 3G/4G and LTE networks.

x265 is developed by MulticoreWare and is open source under the GPL, but several companies funding the project have formed an alliance that may use the software under a non-GPL license.
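x265 is also exposed through FFmpeg as the libx265 encoder, which makes a side-by-side H.264/HEVC experiment a one-liner each. This is a sketch assuming your FFmpeg build includes both libx264 and libx265; the default CRF values of the two encoders (23 for x264, 28 for x265) are commonly treated as roughly comparable quality operating points, which is an assumption rather than a guarantee:

```shell
# Synthetic input clip so the comparison is self-contained.
ffmpeg -y -loglevel error -f lavfi -i testsrc=duration=1:size=320x240:rate=30 -pix_fmt yuv420p clip.mp4
# Encode the same clip with H.264 and with HEVC at their default-ish CRF points.
ffmpeg -y -loglevel error -i clip.mp4 -c:v libx264 -crf 23 -pix_fmt yuv420p out_h264.mp4
ffmpeg -y -loglevel error -i clip.mp4 -c:v libx265 -crf 28 -pix_fmt yuv420p out_h265.mp4
ls -l out_h264.mp4 out_h265.mp4
```

On real footage the HEVC output is typically markedly smaller at similar quality, at the cost of much longer encoding time.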

VP8

Introduction

VP8 is an open video compression format, first developed by On2 Technologies and later released by Google. Google also released libvpx, an implementation of the VP8 codec, under a BSD license with an accompanying patent grant. After some debate, the VP8 license was eventually confirmed to be an open source license.

Web browsers that currently support VP8 include Opera, Firefox, and Chrome.

Patent license

In March 2013, Google reached an agreement with MPEG LA and 11 patent holders under which Google obtained licenses to patents that VP8 and its predecessor VPx codecs might infringe, and may in turn sublicense those patents to VP8 users free of charge. The agreement also applies to the next generation of VPx coding. MPEG LA thereupon abandoned its plan to form a VP8 patent pool, so VP8 users can use the codec freely without worrying about patent licensing fees.

Open source implementation of VP8

libvpx is the only open source implementation of VP8. Originally developed by On2 Technologies, it was opened to the public under a very permissive license after Google's acquisition.

VP9

Introduction

Development of VP9 began in the third quarter of 2011 with two goals: reduce file size by 50% compared with VP8 at the same picture quality, and surpass HEVC in coding efficiency.

On December 13, 2012, Chromium browser added support for VP9 encoding. Chrome began supporting VP9-encoded video playback on February 21, 2013.

Google announced that the VP9 codec was finalized on June 17, 2013, after which Chrome enabled VP9 by default. On March 18, 2014, Mozilla added VP9 support to Firefox.

On April 3, 2015, Google released libvpx 1.4.0, adding support for 10-bit and 12-bit bit depths, 4:2:2 and 4:4:4 chroma subsampling, and multi-core VP9 encoding/decoding.

Patent license

VP9 is an open, royalty-free video coding format.

Open source implementation of VP9

libvpx is the only open source implementation of VP9 as well. It is developed and maintained by Google, and part of the libvpx code base is shared between VP8 and VP9.
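Through FFmpeg, libvpx's VP9 encoder appears as libvpx-vp9. A useful detail of its interface is constant-quality mode: passing `-b:v 0` together with `-crf` switches it from bitrate-targeted to pure quality-targeted encoding. A sketch, assuming libvpx-vp9 is compiled into your FFmpeg build:

```shell
# Synthetic input clip; replace with your own footage for a real test.
ffmpeg -y -loglevel error -f lavfi -i testsrc=duration=1:size=320x240:rate=30 -pix_fmt yuv420p clip.mp4
# Constant-quality VP9: -b:v 0 plus -crf selects pure quality-targeted encoding.
ffmpeg -y -loglevel error -i clip.mp4 -c:v libvpx-vp9 -b:v 0 -crf 30 -pix_fmt yuv420p out_vp9.webm
# Verify the codec actually written into the output container.
ffprobe -v error -select_streams v:0 -show_entries stream=codec_name -of csv=p=0 out_vp9.webm
```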

Comparison of VP9, H.264, and HEVC

Average bit rate difference at the same PSNR (row codec vs. column codec):

Codec   HEVC     x264     vp9
HEVC    -        42.2%    32.6%
x264    75.8%    -        18.5%
vp9     48.3%    14.6%    -

Relative encoding time:

Codec           HEVC vs. VP9 (in %)    VP9 vs. x264 (in %)
Total Average   612                    39399

These figures come from a relatively recent paper with test results for low-delay video encoding: "Comparative Assessment of H.265/MPEG-HEVC, VP9, and H.264/MPEG-AVC Encoders for Low-Delay Video Applications".

Comparison of HEVC and H.264 at different resolutions

Compared with H.264/MPEG-4 AVC, the average bit rate reduction of HEVC is:

Resolution   480P   720P   1080P   4K UHD
HEVC         52%    56%    62%     64%

At 4K UHD, the bit rate reduction exceeds 60%.

  • HEVC (H.265) is superior to VP9 and H.264 in bit rate savings: at the same PSNR it saves 48.3% versus VP9 and 75.8% versus H.264.
  • H.264 has a huge advantage in encoding time: HEVC takes about 6 times as long as VP9, and VP9 takes nearly 40 times as long as H.264.
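You can get a rough, unscientific feel for the encoding-time gap on your own machine by timing the three encoders on the same synthetic clip. A sketch, assuming your FFmpeg build includes libx264, libx265, and libvpx-vp9 (absolute numbers will differ wildly by hardware and settings; only the ordering is instructive):

```shell
# Generate one shared test clip.
ffmpeg -y -loglevel error -f lavfi -i testsrc=duration=3:size=640x360:rate=30 -pix_fmt yuv420p clip.mp4
# Time each encoder on the identical input at its default settings.
for enc in libx264 libx265 libvpx-vp9; do
  ext=mp4; [ "$enc" = "libvpx-vp9" ] && ext=webm
  echo "== $enc =="
  time ffmpeg -y -loglevel error -i clip.mp4 -c:v "$enc" "out_$enc.$ext"
done
```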

FFmpeg

When it comes to video coding, there’s a great software package called FFmpeg.

FFmpeg is free software that can record, convert, and stream audio and video in many formats. It includes libavcodec, an audio/video codec library used by many projects, and libavformat, an audio/video container (mux/demux) library.

The FF in FFmpeg stands for Fast Forward. Some novices wrote to the FFmpeg project leader asking if FF stood for Fast Free or Fast Fourier. The FFmpeg project leader wrote back: “Just for the record, the original meaning of FF in FFmpeg is Fast Forward…”

The project was originally started by Fabrice Bellard and is now maintained by Michael Niedermayer. Many FFmpeg developers are also members of the MPlayer project, and FFmpeg is hosted on the MPlayer project's servers.

FFmpeg can be obtained from the official FFmpeg Download page:

  • Builds can be downloaded from a browser for Linux, Mac OS, and Windows, and FFmpeg can also be compiled for Android or iOS.
  • On Mac OS, install it via Homebrew: brew install ffmpeg --with-libvpx --with-libvorbis --with-ffplay

What useful and fun things can we do with FFmpeg? A series of small experiments below will show you how magical and powerful FFmpeg is.

FFmpeg screen recording

Here’s a quick example of how to record a screen using FFmpeg on Mac OS:

Input:

ffmpeg -f avfoundation -list_devices true -i ""

Output:

[AVFoundation input device @ 0x7fbec0c10940] AVFoundation video devices:
[AVFoundation input device @ 0x7fbec0c10940] [0] FaceTime HD Camera
[AVFoundation input device @ 0x7fbec0c10940] [1] Capture screen 0
[AVFoundation input device @ 0x7fbec0c10940] [2] Capture screen 1
[AVFoundation input device @ 0x7fbec0c10940] AVFoundation audio devices:
[AVFoundation input device @ 0x7fbec0c10940] [0] Built-in Microphone

This lists all input devices supported by the current machine, with their indices. I have two monitors, so devices 1 and 2 are my screens; pick either one for screen recording.

To view the current H.264 codec:

Input:

ffmpeg -codecs | grep 264

Output:

 DEV.LS h264                 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 (decoders: h264 h264_vda ) (encoders: libx264 libx264rgb )

To view the current VP8 codec:

Input:

ffmpeg -codecs | grep vp8

Output:

  DEV.L. vp8                  On2 VP8 (decoders: vp8 libvpx ) (encoders: libvpx )

You can choose VP8 or H.264 as the encoder:

ffmpeg -r 30 -f avfoundation -i 1 -vcodec vp8 -quality realtime screen2.webm  # without -quality realtime I only get about 2 fps on my MacBook Air
ffmpeg -r 30 -f avfoundation -i 1 -vcodec h264 screen.mp4

Then play the recordings with ffplay:

ffplay screen.mp4
ffplay screen2.webm

FFmpeg video to GIF

As an IT practitioner, my first thought was not to download a converter or visit an online conversion site, but to use the tool at hand, FFmpeg, to finish the transcoding in an instant:

ffmpeg -ss 10 -t 10 -i tutu.mp4 -s 80x60 tutu.gif  # -ss: start converting at 10 s, -t: convert 10 s of video, -s: output size 80x60
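A refinement worth knowing: GIF is limited to a 256-color palette, and FFmpeg's palettegen/paletteuse filters usually give far better-looking GIFs than the single-pass conversion above. A sketch using a synthetic clip (swap in your own video, e.g. tutu.mp4):

```shell
# Self-contained sample clip; replace with your own video.
ffmpeg -y -loglevel error -f lavfi -i testsrc=duration=3:size=320x240:rate=30 clip.mp4
# Pass 1: derive an optimal 256-color palette from the clip.
ffmpeg -y -loglevel error -i clip.mp4 -vf "fps=10,scale=160:-1:flags=lanczos,palettegen" palette.png
# Pass 2: map the frames through that palette for a much cleaner GIF.
ffmpeg -y -loglevel error -i clip.mp4 -i palette.png -filter_complex "fps=10,scale=160:-1:flags=lanczos[x];[x][1:v]paletteuse" out.gif
```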

FFmpeg screen recording with live streaming

We can extend example 1 and set up a test live-streaming service with a few lines of commands:

Step 1: Install Docker: visit the Docker download page and install the version for your operating system.

Step 2: Download the nginx-rtmp image.

docker pull chakkritte/docker-nginx-rtmp

Step 3: Create the nginx HTML path and start docker-nginx-rtmp:

mkdir ~/rtmp

docker run -d -p 80:80 -p 1935:1935 -v ~/rtmp:/usr/local/nginx/html chakkritte/docker-nginx-rtmp

Step 4: Push the screen recording to nginx-rtmp:

ffmpeg -y -loglevel warning -f avfoundation -i 2 -r 30 -s 480x320 -threads 2 -vcodec libx264 -f flv rtmp://127.0.0.1/live/test

Step 5: Play the stream with ffplay:

ffplay rtmp://127.0.0.1/live/test

To sum up, FFmpeg is an excellent tool that can handle a great deal of daily work and experimentation, but much more is needed to provide a truly production-ready streaming and live-broadcast service. For that, you can look at the live streaming cloud service released by Qiniu Cloud.

Encapsulation

With video coding covered, let's talk about encapsulation. Continuing the earlier metaphor, encapsulation determines what kind of truck is used to transport the media, namely the container.

A so-called container is a standard for mixing and packaging the multimedia content produced by encoders (video, audio, subtitles, chapter information, and so on). A container makes it simple to play different media streams in sync, and it also provides an index into the content. Without a container, a movie could only be watched from beginning to end; you could not drag the progress bar (although some players build a temporary index for such files during long sessions), and without manually loading the audio there would be no sound. Here are some common container formats and their pros and cons:

  1. AVI format (suffix .avi): Audio Video Interleaved, introduced by Microsoft in 1992. Its advantage is good image quality; since lossless AVI can preserve an alpha channel, it is still often used. The most common problem is codec mismatch: older versions of Windows Media Player cannot play AVI files that use newer codecs, while newer versions can have trouble with AVI files that use very old codecs. So when playing AVI videos we often hit puzzling problems: the video will not play because of a codec issue, or it plays but the progress cannot be adjusted, or there is only sound and no image.

  2. DV-AVI format (suffix .avi): DV stands for Digital Video, a home digital video format jointly proposed by Sony, Panasonic, JVC, and other manufacturers. Digital camcorders record video data in this format. Video can be transferred to a computer through the computer's IEEE 1394 port, and video edited on the computer can be written back to the camcorder. The file extension of this format is also .avi. Television stations use videotapes to record analog signals, and video captured from tape into EDIUS via an IEEE 1394 capture card is in this format.

  3. QuickTime File Format (suffix .mov): a video format developed by Apple, whose default player is Apple's QuickTime. It offers a high compression ratio and excellent video clarity, and it can preserve an alpha channel.

  4. MPEG format (suffixes include .mpg, .mpeg, .mpe, .dat, .vob, .asf, .3gp, .mp4, etc.): The Moving Picture Experts Group was established in 1988 to define video and audio standards for CDs; its members are technical experts in video, audio, and systems. The MPEG file formats follow the international standards for moving-image compression. There are currently three main MPEG compression standards: MPEG-1, MPEG-2, and MPEG-4. MPEG-4, developed in 1998, is designed for high-quality streaming video, aiming to obtain the best image quality from the least data. Its main attraction today is the ability to store near-DVD-quality video in small files.

  5. WMV format (suffixes .wmv, .asf): Windows Media Video, a compression format launched by Microsoft using its own independent codec, which allows video programs to be watched in real time over the Internet. Its main advantages include local or network playback, rich relationships between streams, and extensibility. Playing WMV on websites requires installing Windows Media Player (WMP), which is very inconvenient, and few websites use it now.

  6. RealVideo format (suffixes .rm, .rmvb): the audio/video compression specification developed by RealNetworks, called RealMedia. With RealPlayer, different compression ratios can be chosen for different network transmission rates, enabling real-time transmission and playback of video even at low network speeds. RMVB is a newer format extended and upgraded from RM, with significant improvements: a DVD movie of about 700 MB transcoded to RMVB at the same quality is at most about 400 MB. You may remember that movies and videos downloaded from the Internet were often in RMVB format; with time, it has been replaced by more and better formats, and in 2013 the well-known YYeTs subtitle group announced it would no longer release videos in RMVB.

  7. Flash Video format (suffix .flv): a popular web video container extended from Adobe Flash. With the abundance of video sites, this format has become very popular.

  8. Matroska format (suffix .mkv): a newer multimedia container that can package multiple encoded video streams, 16 or more audio tracks, and subtitles in multiple languages in a single Matroska media file. It is one of the open-source multimedia container formats, and it also offers very good interactivity, more convenient and powerful than MPEG.

  9. MPEG2-TS format (suffix .ts, also called MTS or TS): a standard container for transmitting and storing audio, video, and communication-protocol data, used in digital television broadcast systems such as DVB, ATSC, and IPTV. MPEG2-TS is defined in MPEG-2 Part 1, Systems (ISO/IEC 13818-1, also ITU-T Rec. H.222.0). Software such as Media Player Classic and VLC media player can play MPEG-TS files directly.

At present, streaming media transmission, and live broadcasting in particular, mainly uses the FLV container with the RTMP/HTTP-FLV protocols, and the MPEG2-TS container with the HLS protocol.
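Because the container only wraps already-encoded streams, switching containers does not require re-encoding. FFmpeg's stream copy (`-c copy`) makes this visible; a sketch with a synthetic clip, assuming libx264 is available in your FFmpeg build:

```shell
# Create a short H.264 clip, then remux it (no re-encoding) into FLV and TS.
ffmpeg -y -loglevel error -f lavfi -i testsrc=duration=2:size=320x240:rate=30 -c:v libx264 -pix_fmt yuv420p clip.mp4
ffmpeg -y -loglevel error -i clip.mp4 -c copy clip.flv   # same H.264 bitstream, FLV wrapper (RTMP/HTTP-FLV)
ffmpeg -y -loglevel error -i clip.mp4 -c copy clip.ts    # same bitstream, MPEG2-TS wrapper (HLS segments)
ls -l clip.mp4 clip.flv clip.ts   # sizes differ only by container overhead
```

Remuxing is near-instant, which is why live pipelines can serve the same encoded stream over RTMP/HTTP-FLV and HLS simultaneously.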

Next time we will systematically cover push streaming and transmission for live video. Stay tuned.