The topic

First, let’s calculate the file size of the following 1080p 60fps video.

According to the figure above, one second of this video should take up 342MB, but the videos we actually watch are nowhere near that big. That is because they are compressed, so one second really takes only 1-2MB. We will explain how this is achieved later.
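As a rough check of that calculation, here is a minimal sketch that multiplies out the raw size of uncompressed video. It assumes 3 bytes (24 bits) per pixel, which is an assumption rather than something stated above, and the function name is made up for illustration. The exact result depends on the bit depth and on whether megabytes are counted in powers of 1000 or 1024, so it may not match the figure exactly, but it lands in the same range of a few hundred megabytes per second.

```python
def raw_video_bytes(width, height, fps, seconds=1, bytes_per_pixel=3):
    """Size of uncompressed video: every pixel of every frame is stored.

    bytes_per_pixel=3 assumes 8 bits each for three color channels.
    """
    return width * height * bytes_per_pixel * fps * seconds


size = raw_video_bytes(1920, 1080, 60)
print(f"{size} bytes per second")
print(f"{size / 1000**2:.1f} MB (decimal megabytes)")
print(f"{size / 1024**2:.1f} MiB (binary megabytes)")
```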

Resolution

We mentioned 1080p above, but what is it? And what do the terms below mean?

FullHD: 1920x1080
2K: 2560x1440

We have all heard and seen these terms in daily life. What are they? Yes, they are resolutions.

Resolution = horizontal pixels × vertical pixels.

We often see or hear people say 1920×1080. That is a resolution: 1920 pixels horizontally by 1080 pixels vertically.
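To make the formula concrete, the short sketch below multiplies out the two resolutions listed earlier. The dictionary and function names are made up for illustration.

```python
# Resolutions named in this section, as (horizontal, vertical) pixels.
RESOLUTIONS = {
    "FullHD": (1920, 1080),
    "2K": (2560, 1440),
}


def pixel_count(width, height):
    """Total pixels in one frame: horizontal pixels times vertical pixels."""
    return width * height


for name, (width, height) in RESOLUTIONS.items():
    print(f"{name}: {width}x{height} = {pixel_count(width, height):,} pixels")
```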

That sounds easy enough, so the next question is: what is 1080p? You might say 1920×1080, but then what is 1080i? It may look confusing, but it is really quite simple:

  • p: stands for progressive scanning
  • i: stands for interlaced scanning

Progressive scanning scans line by line to generate an image, and each frame is one complete image.

Interlaced scanning first scans the odd-numbered lines into one field, then scans the even-numbered lines into the next field, and finally combines the two fields to form a complete picture.

Because each field carries only half of the lines, interlaced scanning needs about half the bandwidth of progressive scanning, which means a higher resolution can be sent over the same bandwidth.
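The odd/even split can be sketched in a few lines. This is only an illustration: it treats a frame as a list of rows, counts lines from 1, and recombines the two fields by simple weaving. The function names are hypothetical, and real deinterlacers also have to deal with motion between the two fields.

```python
def split_fields(frame):
    """Split a frame (a list of rows) into its odd-line and even-line fields.

    Lines are counted from 1, so the odd field holds rows 1, 3, 5, ...
    """
    odd_field = frame[0::2]
    even_field = frame[1::2]
    return odd_field, even_field


def weave(odd_field, even_field):
    """Interleave the two fields back into one complete frame."""
    frame = []
    for i in range(len(odd_field) + len(even_field)):
        source = odd_field if i % 2 == 0 else even_field
        frame.append(source[i // 2])
    return frame


frame = ["line1", "line2", "line3", "line4", "line5"]
odd, even = split_fields(frame)
print(odd)   # each field holds roughly half of the lines
print(even)
print(weave(odd, even) == frame)
```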

Frame rate

What is 60fps? It is the frame rate: fps stands for frames per second, so 60fps means 60 frames are shown every second.
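One way to get a feel for frame rate is to turn it into the time each frame stays on screen. A minimal sketch, with a made-up function name:

```python
def frame_interval_ms(fps):
    """Time between two consecutive frames, in milliseconds."""
    return 1000 / fps


for fps in (24, 30, 60):
    print(f"{fps}fps: one frame every {frame_interval_ms(fps):.2f} ms")
```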

Video format

Encapsulation format

What is an encapsulation format?

It is a container that defines the outward form of the video file and stores the video, audio, media information and subtitle information. For example, MP4, MKV and MOV are encapsulation formats we see all the time.

Coding format

What is an encoding format? H264 and H265 are encoding formats. Common encoding formats for each container:

MP4: H.264, H.265, MPEG-4...
WebM: VP8, VP9...
AVI: MPEG-2, VC-1, H.264, DivX, Xvid...
RM/RMVB: RV, RM...
MOV: MPEG-2, Xvid, H.264...
TS/PS: MPEG-2, H.264, MPEG-4...
WMV: WMV, VC-1...
MKV: can encapsulate all video coding formats

Bit rate

When transcoding video, you will often see an option such as -b:v 5000k (5000Kbps). What is it? That’s right, it is the video bit rate. The audio bit rate is set with -b:a.

Bit rate: the amount of data a video contains per second. Bit rate directly determines the final size and quality of the video.
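Since bit rate is data per second, the file size follows directly from bit rate times duration. The sketch below is an estimate only: it ignores container overhead, and the function and parameter names are made up for illustration.

```python
def estimated_size_bytes(video_kbps, seconds, audio_kbps=0):
    """Estimate file size from bit rate: kilobits per second times seconds.

    Dividing by 8 converts bits to bytes. Container overhead is ignored.
    """
    total_bits = (video_kbps + audio_kbps) * 1000 * seconds
    return total_bits / 8


# One minute of video at 5000Kbps, without audio.
size = estimated_size_bytes(5000, 60)
print(f"{size / 1000**2:.1f} MB")
```

Read this way, the 5000Kbps example works out to a little under 40MB per minute of video.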

Methods of controlling bit rate

CBR: constant bit rate. The bit rate stays the same throughout, so the file size is predictable and the encoding pressure is low. It is common in live streaming. Quality is good in simple scenes and poor in complex scenes, and it makes the least efficient use of space.

VBR: variable bit rate. Bits are allocated on demand: a low bit rate in simple scenes and a high bit rate in complex scenes.

CRF: constant rate factor, a fixed-quality mode. The lower the CRF value, the higher the video quality looks, and vice versa. The target is visual quality rather than bit rate, so the file size is unpredictable.
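The three modes map onto different encoder options. The sketch below only builds command lines and does not run anything. It assumes FFmpeg with the libx264 encoder, which this article does not name explicitly. The file names, the default CRF of 23 and the buffer size of twice the bit rate are illustrative choices, and pinning -minrate and -maxrate to the target only approximates CBR.

```python
def rate_control_args(mode, bitrate_kbps=5000, crf=23):
    """Return encoder options (as an argument list) for one rate-control mode."""
    rate = f"{bitrate_kbps}k"
    if mode == "cbr":
        # Pin minimum, target and maximum to the same value.
        # The buffer size of 2x the bit rate is an illustrative assumption.
        return ["-b:v", rate, "-minrate", rate, "-maxrate", rate,
                "-bufsize", f"{2 * bitrate_kbps}k"]
    if mode == "vbr":
        # Only an average target: the encoder may spend more on complex scenes.
        return ["-b:v", rate]
    if mode == "crf":
        # Quality target: no bit rate given, so the file size is not known ahead.
        return ["-crf", str(crf)]
    raise ValueError(f"unknown mode: {mode}")


def build_command(src, dst, mode, **kwargs):
    """Assemble a full command line; src and dst are hypothetical file names."""
    return ["ffmpeg", "-i", src, "-c:v", "libx264",
            *rate_control_args(mode, **kwargs), dst]


for mode in ("cbr", "vbr", "crf"):
    print(" ".join(build_command("input.mp4", f"out_{mode}.mp4", mode)))
```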

Video compression

As we said at the beginning, uncompressed video is very large and takes up a lot of space whether it is stored or transmitted. Video compression reduces the original video to the size of the videos we normally watch. It can be divided into intra-frame compression and inter-frame compression.

Intra-frame compression

Intra-frame compression: each frame is compressed on its own as a lossy image, much like a JPEG. The principle is to preserve the brightness information as much as possible and compress the color information, especially complex color information. In a video stream, this is how the key frames are recorded, while the rest of the video is predicted from motion. This can save up to 90% of the space.

Intra-frame compression is the compression of I frames in the GOP.
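The idea of keeping brightness and compressing color can be illustrated with chroma subsampling: the brightness (luma) plane stays at full resolution while each color (chroma) plane is averaged over 2×2 blocks. This is a simplified sketch of one ingredient only. It assumes even frame dimensions and made-up sample values, and real encoders add further steps such as transforms and quantization.

```python
def subsample_2x2(plane):
    """Average each 2x2 block of a plane (a list of rows) into one sample."""
    out = []
    for y in range(0, len(plane), 2):
        row = []
        for x in range(0, len(plane[0]), 2):
            block = [plane[y][x], plane[y][x + 1],
                     plane[y + 1][x], plane[y + 1][x + 1]]
            row.append(sum(block) // 4)
        out.append(row)
    return out


def sample_count(plane):
    """Number of stored samples in a plane."""
    return len(plane) * len(plane[0])


# A hypothetical 4x4 frame: one brightness plane and two color planes.
luma = [[16, 32, 48, 64]] * 4
chroma_b = [[10, 20, 30, 40], [10, 20, 30, 40],
            [50, 60, 70, 80], [50, 60, 70, 80]]
chroma_r = [[128] * 4 for _ in range(4)]

before = sample_count(luma) + sample_count(chroma_b) + sample_count(chroma_r)
small_b, small_r = subsample_2x2(chroma_b), subsample_2x2(chroma_r)
after = sample_count(luma) + sample_count(small_b) + sample_count(small_r)

print(small_b)
print(f"samples: {before} -> {after}")
```

Brightness is left untouched, yet the number of stored samples is halved before any other compression step is applied.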

Inter-frame compression

Inter-frame compression, also known as temporal compression, relies on the strong correlation between consecutive frames in many videos and animations (that is, adjacent frames of continuous video contain redundant information). The compression ratio is improved further by comparing data between different frames along the time axis.

Inter-frame compression is the compression of B frames and P frames in the GOP.
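The redundancy between adjacent frames can be shown with a toy example that stores only the pixels that changed since the previous frame. This is a deliberate simplification: frames are flat lists of made-up pixel values, the function names are hypothetical, and real codecs predict blocks with motion compensation instead of comparing individual pixels.

```python
def encode_delta(previous, current):
    """Record only the pixels that changed, as (index, new_value) pairs."""
    return [(i, new)
            for i, (old, new) in enumerate(zip(previous, current))
            if old != new]


def apply_delta(previous, delta):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(previous)
    for index, value in delta:
        frame[index] = value
    return frame


# Two consecutive frames: a bright object (9) moves one pixel to the right.
frame1 = [5, 5, 5, 5, 9, 9, 5, 5]
frame2 = [5, 5, 5, 5, 5, 9, 9, 5]

delta = encode_delta(frame1, frame2)
print(delta)
print(apply_delta(frame1, delta) == frame2)
```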

GOP (Group of Pictures)

Group of pictures: the number of frames contained in each I/P/B sequence, that is, the number of frames from one I frame until the next I frame appears.

At the same bit rate, the larger the GOP value, the more P and B frames there are. Because these frames take less space than I frames, more bits are left for each frame, so the video looks clearer and the picture quality is higher.
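To picture what a GOP looks like, the sketch below writes out the frame types of one group in display order. It assumes a fixed layout in which a P frame follows every run of B frames. The function name and the b_frames parameter are made up for illustration, and real encoders choose frame types adaptively.

```python
def gop_pattern(gop_size, b_frames=2):
    """Frame types of one GOP in display order, for example 'IBBPBBP...'.

    gop_size: number of frames from one I frame up to the next I frame.
    b_frames: number of B frames placed before each P frame (assumed fixed).
    """
    if gop_size < 1 or b_frames < 0:
        raise ValueError("gop_size must be >= 1 and b_frames >= 0")
    types = ["I"]
    for i in range(1, gop_size):
        types.append("P" if i % (b_frames + 1) == 0 else "B")
    return "".join(types)


for size in (6, 12):
    pattern = gop_pattern(size)
    print(size, pattern, "I frames:", pattern.count("I"))
```

Whatever the GOP value, each group contains a single I frame, so a larger value means a larger share of P and B frames.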

I frame

Key frame. It is encoded using only the information in the frame itself, without reference to any other frame.

P frame

Predicted frame. It represents the difference from the previous frame and is predicted from a preceding I frame or P frame, using motion prediction for inter-frame predictive coding. It takes about half the space of an I frame.

B frame

Bidirectionally predicted frame. It represents the difference from both the preceding and the following frames, predicting again on the basis of I frames and P frames. It requires both an earlier frame (an I frame or P frame) and a later frame (typically a P frame), and uses motion prediction for bidirectional inter-frame coding. Compared with a P frame, it saves about half the space again and provides the highest compression ratio.
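Taking the rule-of-thumb ratios above at face value (a P frame is about half an I frame, and a B frame about half a P frame), the average cost of a frame in a given sequence can be estimated. The ratios are this article's approximations and not measurements, and the names and the sample sequence are made up for illustration.

```python
# Relative sizes from the rules of thumb above, with an I frame as 1.0.
RELATIVE_SIZE = {"I": 1.0, "P": 0.5, "B": 0.25}


def average_frame_size(pattern):
    """Average relative size per frame for a string of frame types."""
    return sum(RELATIVE_SIZE[frame_type] for frame_type in pattern) / len(pattern)


all_intra = "IIIIIIIIIIII"
mixed = "IBBPBBPBBPBB"
print(f"all I frames: {average_frame_size(all_intra):.3f}")
print(f"mixed GOP:    {average_frame_size(mixed):.3f}")
```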

H264 encoding mode

The MP4 files we commonly see are usually encoded with H264.

Preset
  • faster: low
  • fast: low
  • medium: medium
  • slow: high
  • veryslow: ultra-high

CPU encoding and GPU encoding

Video encoded by the CPU is generally of higher quality than video encoded by the GPU, but CPU encoding is much slower than GPU encoding.