Over the years, many different algorithms have been devised to compress video. Video compression sounds like a very modern term, but it has a long history that starts with analog video. In this article, I'll walk you through the milestones that led to today's video compression and how various methods evolved from early concepts into the standards we use now. Many of those standards are still in use, and new ones continue to be developed and refined all the time.

1929: First interframe compression

Interframe compression means keeping only one key image and the differences in the frames that follow it. This key image is called a keyframe. Surprisingly, the discussion of interframe compression dates back to 1929, when R.D. Kell in the United Kingdom proposed it for analog video, a concept that is still applied to digital video today.

1952: Differential pulse code modulation

The next milestone in video compression came in 1952. B.M. Oliver and C.W. Harrison of Bell Laboratories proposed that differential pulse code modulation (DPCM) could be used in video coding. Until then, DPCM had been used for audio (and still is today). In DPCM, previously seen samples of an image are used to predict the next sample values, and only the prediction error is stored. Because the image can be reconstructed from those small differences, far less data needs to be stored.
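
To make the idea concrete, here is a toy Python sketch of the DPCM principle (an illustration of the general idea, not the exact scheme Oliver and Harrison described): each sample is predicted from the previously reconstructed one, and only a quantized difference is stored.

```python
def dpcm_encode(samples, step=4):
    """Store quantized differences from a running prediction (toy example)."""
    prediction = 0
    residuals = []
    for s in samples:
        diff = s - prediction
        q = round(diff / step)       # coarse quantization of the prediction error
        residuals.append(q)
        prediction += q * step       # track what the decoder will reconstruct
    return residuals


def dpcm_decode(residuals, step=4):
    """Rebuild the samples by accumulating the stored differences."""
    prediction = 0
    samples = []
    for q in residuals:
        prediction += q * step
        samples.append(prediction)
    return samples


row = [100, 102, 105, 110, 111, 109, 108]   # one row of pixel values
encoded = dpcm_encode(row)                   # small numbers that compress well
print(encoded, dpcm_decode(encoded))
```

Because neighboring pixels are usually similar, the stored differences are small numbers that take far fewer bits than the raw samples.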

1959: Interframe predictive coding using time compression

Predictive interframe video coding using time compression was first proposed in 1959. Time compression refers to selecting keyframes at intervals in a video and encoding only the changes relative to those keyframes. A keyframe is a fully recorded frame that serves as a reference point for the other frames. The concept was developed by researchers Y. Taki, M. Hatori, and S. Tanaka at NHK, Japan's public broadcaster.

1967: Run-length encoding

Run-length encoding (RLE) stores a run of identical consecutive data values as a single value plus a count of how many times it occurs. For example, given the input data stream "AAABBCCCC", the output is the run-count sequence "3A2B4C". That information is enough to reconstruct exactly the same data! The concept, developed by University of London researchers A.H. Robinson and C. Cherry, was originally used to reduce the transmission bandwidth of analog television signals. Run-length encoding is still used in digital video today.
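
Here is a minimal Python sketch that reproduces the example above (a simple character-based RLE, for illustration only):

```python
def rle_encode(data: str) -> str:
    """Encode runs of identical characters as <count><value> pairs."""
    out = []
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1                     # advance to the end of the current run
        out.append(f"{j - i}{data[i]}")
        i = j
    return "".join(out)


def rle_decode(encoded: str) -> str:
    """Expand <count><value> pairs back into the original string."""
    out, count = [], ""
    for ch in encoded:
        if ch.isdigit():
            count += ch
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)


print(rle_encode("AAABBCCCC"))   # -> 3A2B4C
print(rle_decode("3A2B4C"))      # -> AAABBCCCC
```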

1970s: Early digital video algorithms

Digital video emerged in the 1970s. Video was sent using the same technology used in telecommunications: PCM (pulse code modulation). Does that look familiar? DPCM, mentioned above, is derived from PCM. PCM represents a sampled analog signal in digital form. Originally an audio technique, it was applied to digital video in the 1970s. Video could be transmitted this way, but it required very high bit rates and was inefficient.

1972: First compression of digital video

Nasir Ahmed is an Indian-American electrical engineer and computer scientist.

Around 1972, Nasir Ahmed of Kansas State University proposed using DCT coding to compress images. DCT stands for discrete cosine transform; it divides an image into small blocks and expresses each block as a combination of different frequencies. In the quantization step, the high-frequency components are discarded, and the remaining low-frequency components are kept and used to reconstruct the image later. The resulting image is not exactly the same because some frequencies are dropped, but most of the time the difference goes unnoticed.
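
The following toy Python sketch shows the principle (an orthonormal 8×8 DCT plus a crude "keep only the low frequencies" step; real codecs add quantization tables and entropy coding on top of this):

```python
import numpy as np

N = 8  # block size

def dct_2d(block):
    """Forward 2D DCT-II of an NxN block (straightforward, unoptimized)."""
    out = np.zeros((N, N))
    for u in range(N):
        for v in range(N):
            au = np.sqrt(1 / N) if u == 0 else np.sqrt(2 / N)
            av = np.sqrt(1 / N) if v == 0 else np.sqrt(2 / N)
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x, y]
                          * np.cos((2 * x + 1) * u * np.pi / (2 * N))
                          * np.cos((2 * y + 1) * v * np.pi / (2 * N)))
            out[u, v] = au * av * s
    return out

def idct_2d(coeffs):
    """Inverse 2D DCT, rebuilding the pixel block from frequency coefficients."""
    out = np.zeros((N, N))
    for x in range(N):
        for y in range(N):
            s = 0.0
            for u in range(N):
                for v in range(N):
                    au = np.sqrt(1 / N) if u == 0 else np.sqrt(2 / N)
                    av = np.sqrt(1 / N) if v == 0 else np.sqrt(2 / N)
                    s += (au * av * coeffs[u, v]
                          * np.cos((2 * x + 1) * u * np.pi / (2 * N))
                          * np.cos((2 * y + 1) * v * np.pi / (2 * N)))
            out[x, y] = s
    return out

block = np.tile(np.linspace(0, 255, N), (N, 1))  # a smooth 8x8 gradient block
coeffs = dct_2d(block)
coeffs[4:, :] = 0          # discard the higher vertical frequencies
coeffs[:, 4:] = 0          # discard the higher horizontal frequencies
restored = idct_2d(coeffs)
print(np.max(np.abs(block - restored)))  # small error for smooth content
```

For smooth image content, most of the energy sits in the low-frequency coefficients, so dropping the high frequencies loses very little that the eye would notice.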

1973: DCT is implemented as an image compression algorithm


Nasir Ahmed, in collaboration with T. Natarajan and K.R. Rao of the University of Texas, implemented a DCT image compression algorithm. They published their work in 1974.

1974: Development of hybrid coding

In 1974, Ali Habibi of the University of Southern California combined predictive coding with DCT coding. As mentioned above, predictive coding means estimating sample values of the current image from previously coded values. Habibi's algorithm could only be applied within a single frame (intra-frame), not between frames (inter-frame).

1975: Further development of hybrid coding

John A. Roese and Guner S. Robinson extended Habibi's algorithm so that it could be applied between frames. They tried various approaches and eventually found that Ahmed's DCT technique was most efficient when combined with predictive coding.

1977: Faster DCT algorithm

Wen-Hsiung Chen, C.H. Smith, and S.C. Fralick worked together to develop a faster DCT algorithm, and they founded Compression Labs to commercialize DCT technology.

1979-1981: Motion-compensated DCT video compression


Anil K. Jain and Jaswant R. Jain continued to develop motion-compensated DCT video compression. Chen used their results to create a video compression algorithm that combined all of this research. Continued work on motion-compensated DCT eventually made it the standard compression technique used from the 1980s to the present day.

1984: First digital video compression standard, H.120

All the previous research finally came to fruition: the first video compression standard, H.120, was published. H.120 is the first international video compression standard and was used mainly for video conferencing. It worked well for individual images but was poor at maintaining image quality between frames, and it was revised in 1988. It was a great achievement, but because H.120 was inefficient in many respects, many companies experimented with ways to improve on it.

1988: Video conferencing with H.261

H.261 is probably the first codec in this series that you have actually seen or used. It was the first digital video compression standard to make effective use of both intra- and inter-frame compression techniques, and it was also the first commercially successful digital video coding standard. It was used for video conferencing around the world and introduced hybrid block-based video coding, an approach still found in many later video standards (MPEG-1 Part 2, H.262/MPEG-2 Part 2, H.263, MPEG-4 Part 2, H.264/MPEG-4 Part 10, and HEVC). The methods used to create the H.261 standard are still widely used today. It supports a maximum resolution of 352×288.

Although the standard was well received internationally, it was incomplete when first released, and it was revised in 1990 and 1993. H.261 specifies only how to decode video; the details of the encoding process are left to implementers.

1992: PC multimedia applications using Motion JPEG (MJPEG)

In 1992, Motion JPEG was created for multimedia applications on computers. This video compression technique compresses each frame of the video individually into a JPEG image.

1993: Video CD using MPEG-1

MPEG stands for Moving Picture Experts Group, a working group jointly established by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) to develop international standards for media coding. Around 1988, they began work on the video coding standard known today as MPEG-1. Like H.261, MPEG-1 does not specify how video must be encoded, although it provides a sample implementation. As a result, MPEG-1 can perform very differently depending on the encoder used.

MPEG-1 was designed to compress VHS-quality raw digital video, audio, and metadata for Video CDs, digital cable television, satellite television, and file sharing for reference, archiving, and transcription. Its maximum resolution is 352×288. You probably know MPEG-1 best from audio: it gave us the MP3 format.

1994: Television broadcasts and DVDs using H.262/MPEG-2

MPEG-2 and H.262 are different names for the same video standard, which was developed by a number of companies. The standard supports interlaced scanning (a technique used by NTSC, PAL, and SECAM TV systems) and uses several interesting coding techniques. Here are two of them:

Image sampling

MPEG-2 uses image sampling to reduce data. One method is to divide each frame into two fields that are scanned alternately: one field contains all the odd lines, the other all the even lines. When displayed, the lines of the first field are shown first, and then the second field is shown to fill in the remaining lines and complete the picture. This method is called interlaced scanning, and it reduces the amount of data while maintaining the frame rate.
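
As a rough illustration, here is how a frame can be split into, and rebuilt from, two fields (a toy Python/NumPy sketch; real MPEG-2 field coding is more involved):

```python
import numpy as np

frame = np.arange(8 * 4).reshape(8, 4)   # a tiny 8-line "frame"

top_field = frame[0::2]      # display lines 1, 3, 5, ... (one field)
bottom_field = frame[1::2]   # display lines 2, 4, 6, ... (the other field)

# Re-interleave the two fields to rebuild the full frame.
rebuilt = np.empty_like(frame)
rebuilt[0::2] = top_field
rebuilt[1::2] = bottom_field
assert (rebuilt == frame).all()
```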

Another strategy takes advantage of the fact that the human eye perceives brightness better than color. MPEG-2 uses chroma subsampling: a coding method that stores chroma (color) information at a lower resolution than luma (brightness) information. Because humans are not very sensitive to fine color detail, the information lost in this process does not noticeably affect viewing. The goal is to reduce the amount of data used to store color without visibly degrading image quality.
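
Here is a toy NumPy sketch of the idea, using 4:2:0-style averaging of each 2×2 chroma block (one common subsampling scheme, simplified here):

```python
import numpy as np

h, w = 4, 4
luma = np.random.randint(0, 256, (h, w))   # full-resolution brightness
cb = np.random.randint(0, 256, (h, w))     # blue-difference chroma
cr = np.random.randint(0, 256, (h, w))     # red-difference chroma

# Keep luma as-is, but average each 2x2 block of chroma down to one sample.
cb_sub = cb.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
cr_sub = cr.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

before = luma.size + cb.size + cr.size
after = luma.size + cb_sub.size + cr_sub.size
print(before, after)   # 48 vs 24: half the samples, with brightness untouched
```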

I, P, and B frames

MPEG-2 uses different kinds of frames to compress data. An I frame is an intra-coded frame: it contains a complete picture, including the background and the moving subjects, and can serve as a reference for P and B frames. A P frame, also known as a predicted frame, stores only the differences between itself and the preceding I or P frame. A B frame is similar to a P frame, but it references both the preceding and the following I or P frames to reconstruct a complete picture.
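
To make the dependencies concrete, here is a small hypothetical display-order sequence and a toy Python helper that lists which frames each one references (an illustration only; real GOP structures and reference rules vary by encoder):

```python
gop = ["I", "B", "B", "P", "B", "B", "P"]   # frame types in display order

def references(index):
    """Return the display-order indices this frame depends on (toy model)."""
    kind = gop[index]
    if kind == "I":
        return []                                        # self-contained
    anchors = [i for i, k in enumerate(gop) if k in ("I", "P")]
    past = max(i for i in anchors if i < index)          # previous I or P frame
    if kind == "P":
        return [past]
    future = min((i for i in anchors if i > index), default=None)
    return [past] + ([future] if future is not None else [])

for i, kind in enumerate(gop):
    print(i, kind, "references", references(i))
```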

1995: Storing digital video using DV

The first DV specification, called the Blue Book, defined common characteristics such as the videotape format, recording modulation methods, magnetization, and basic system data. DV uses DCT to compress video frame by frame. Like MPEG-2, it uses chroma subsampling for further compression.

DV was designed by Sony and Panasonic for professional and broadcast users. With the arrival of memory cards and solid-state drives, this tape-based method of storage has long since become obsolete.

1996: New generation of video conferencing standards using H.263

H.263 builds on H.261. It uses DCT to create low-bit-rate compressed video suitable for video conferencing. The standard was widely used for Flash video content on the Internet, including on YouTube and MySpace, and it remained in common use until H.264 came along.

1999: Internet video using MPEG-4 Part 2

MPEG-4 Part 2 (also known as MPEG-4 Visual) is an H.263-compatible standard commonly used for surveillance cameras as well as HDTV broadcasts and DVDs. It uses more efficient algorithms than MPEG-2 and compresses faster. Its limitations, however, led to the later development of MPEG-4 AVC (Advanced Video Coding).

2003: Blu-ray, DVD, live video, and broadcast television using H.264/MPEG-4 AVC

H.264/MPEG-4 AVC (sometimes referred to as MPEG-4 Part 10) was released in 2003. The goal of this compression technology was to create high-definition digital video flexible enough to be used across different systems, networks, and devices. It is by far the most popular compression standard. H.264 is used not only in all kinds of decoders, browsers, and mobile devices, but also over satellite, the Internet, telecommunications networks, and cable. It is on Blu-ray discs, Netflix, Hulu, Amazon Prime Video, Vimeo, YouTube, and almost every video you see on the Internet. The maximum resolution it supports is 4096×2048.

The standard is based on motion-compensated integer DCT coding. The integer DCT is an integer approximation of the cosine transform that can be computed very quickly. H.264 supports both lossless and lossy compression, making it very flexible compared with earlier compression standards. Another advantage is that the technology is free to use for streaming content over the Internet.

2013: 360° immersive video, AR, and VR with H.265/HEVC

H.265/HEVC (High Efficiency Video Coding) does everything H.264 can do, and does it better. It reduces file sizes by about 50% compared with H.264 and supports very high video resolutions, up to 8K (a maximum of 8192×4320). While you usually don't need 8K, or can't get it through today's devices and networks, H.265 is very useful for immersive experiences such as AR, VR, and 360° video. High licensing costs are the main reason it is not more widely used: apart from big companies such as Netflix and Amazon Prime Video that can afford it, many companies still choose H.264.

2013: VP9

VP9 was developed by Google as a competitor to H.265. Unlike H.265, it is free. H.265 performs better at high bit rates. Both H.265 and VP9 take a while to encode video, which adds latency, and it is this problem that keeps H.264 in use. VP9 is becoming more and more popular because it is free, but whether it will be adopted more widely remains to be seen.

2018: High quality web video with AV1


Google, Amazon, Cisco, Intel, Microsoft, Mozilla, and Netflix decided to work together to create a new video format standard called AV1. It is the next-generation video standard after VP9, and it is open source and free. The format is designed for real-time applications such as WebRTC and supports higher resolutions, with the goal of handling 8K video. It builds on the block-based DCT transform coding mentioned above, achieving more accurate inter-frame prediction through more precise ways of dividing images into blocks and through improved filtering.

2020: Commercially viable 4K and 8K using H.266/VVC

H.266/VVC (Versatile Video Coding) mainly targets 4K and 8K video services. Released in July 2020, it is the latest video compression standard to date. H.266 focuses on further optimizing compression rather than introducing entirely new concepts, saving roughly 50% of the bitrate compared with its predecessor while maintaining the same visual quality. It uses a block-based hybrid video coding approach, and the idea is to optimize and improve existing algorithms and compression techniques. H.266 encoding is still slow, but the standard offers good quality improvements at lower bit rates.

AVS3: China's third-generation source coding standard with independent intellectual property rights

The AVS3 video coding standard is the third-generation standard developed by the AVS Working Group in China. It is designed to serve a variety of application scenarios, such as ultra-HD television broadcasting, virtual reality, and video surveillance. The development of AVS3 is divided into two phases. The first phase was completed in March 2019 and saves about 30% of the bit rate for 4K ultra-high-definition video compared with AVS2. Phase 2 of AVS3 is developing more efficient coding tools to further improve performance, especially for surveillance video and screen-content video.

That brings the history of video compression standards from 1929 to 2020 to a close. Looking back over nearly 100 years, it is the unremitting efforts of generations of individuals and organizations that have led to today's video compression standards. What will happen to video compression next? Let's wait and see.

Translation/Alex

Technical review / Invited technical reviewers from LiveVideoStack

Original link:

api.video/blog/video-…

