Chapter 6. Transform coding

1. Transform coding

The purpose of transform coding

  • Remove the correlation of the spatial signal
  • The energy of the spatial signal is concentrated in a small number of low-frequency coefficients in the frequency domain
  • Coefficients with low energy can be removed by quantization without seriously affecting the quality of the reconstructed image

Block transformation and global transformation

  • Block Transform: 4×4, 8×8, 16×16
  • Global Transform: Wavelet Transform

Energy concentration characteristics of transformation

DCT coding

2. Transform types

  • K-L (Karhunen-Loève) transform
  • Fourier transform
  • Cosine transform
  • Wavelet transform

3. K-L transform

  • The optimal transform in terms of energy compaction
  • The basis functions depend on the image statistics
  • There is no fast algorithm
  • Rarely used in practice; its complexity is extremely high







  • The K-L transform is complicated and impractical: the covariance matrix must be computed, its eigenvectors must be derived, and the resulting basis must be sent to the decoder
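A minimal sketch of these steps (assuming numpy, with a toy `blocks` array standing in for image vectors): the basis is the set of eigenvectors of the sample covariance, and the decoder needs this same basis to invert the transform.

    import numpy as np

    def klt_basis(blocks):
        """KLT basis: eigenvectors of the sample covariance, ordered by energy."""
        cov = np.cov(blocks, rowvar=False)        # covariance matrix of the source
        eigvals, eigvecs = np.linalg.eigh(cov)    # symmetric matrix -> real eigen-decomposition
        order = np.argsort(eigvals)[::-1]         # sort by decreasing variance (energy)
        return eigvecs[:, order]                  # columns are the KLT basis vectors

    blocks = np.random.randn(1000, 8)             # toy source: 1000 vectors of length 8
    basis = klt_basis(blocks)
    coeffs = blocks @ basis                       # decorrelated transform coefficients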

4. Discrete Fourier transform

5. Properties of Discrete Fourier transform



6. DCT

  • Lower complexity than the K-L transform
  • Energy-compaction performance is second only to the K-L transform
  • There are fast algorithms to speed up the transform
  • Integer transforms can be used to further reduce complexity
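A minimal numpy sketch of the 2-D DCT of an 8×8 block (real codecs use fast or integer implementations rather than explicit matrix products):

    import numpy as np

    def dct_matrix(n=8):
        """Orthonormal DCT-II matrix: entry (k, i) is c_k * cos((2i+1) k pi / 2n)."""
        c = np.array([np.sqrt(1.0 / n)] + [np.sqrt(2.0 / n)] * (n - 1))
        k = np.arange(n).reshape(-1, 1)
        i = np.arange(n).reshape(1, -1)
        return c.reshape(-1, 1) * np.cos((2 * i + 1) * k * np.pi / (2 * n))

    C = dct_matrix(8)
    block = np.random.randint(0, 256, (8, 8)).astype(float)   # toy 8x8 image block
    coeffs = C @ block @ C.T        # 2-D DCT: energy concentrates in the low-frequency corner
    recon = C.T @ coeffs @ C        # inverse transform (C is orthonormal)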







7. Relationship between DCT and DFT

8. Important properties of discrete cosine transform





9. Fast DCT transform








10. Integer discrete cosine transform

The DCT is a floating-point operation

  • 64-bit accuracy is required
  • Floating-point computation is complex
  • The transform accuracy is high

Integer transform: an integer approximation of the discrete cosine transform

  • Requires less bit width
  • Integer computation has low complexity
  • A good integer transform has accuracy close to the floating-point transform
  • Obtained by approximating the floating-point transform





11. H.264’s 4×4 integer transform
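A minimal sketch of the forward 4×4 core transform. The matrix below is the integer approximation of the DCT used by H.264; its normalizing scale factors are folded into the quantization stage, which is omitted here.

    import numpy as np

    # H.264 4x4 forward core transform matrix (integer approximation of the DCT)
    Cf = np.array([[1,  1,  1,  1],
                   [2,  1, -1, -2],
                   [1, -1, -1,  1],
                   [1, -2,  2, -1]])

    X = np.random.randint(-128, 128, (4, 4))   # toy residual block
    Y = Cf @ X @ Cf.T                          # integer-only arithmetic; the normalizing
                                               # scale factors are absorbed into quantization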













12. Wavelet transform

  • A newer transform method
  • Avoids the blocking artifacts caused by block-based coding
  • Better suited to spatially scalable video coding

Chapter 7. Quantization

1. Quantization

The process of representing a larger set by a smaller set

  • A finite approximation of the signal source
  • A lossy process
  • Applied in A/D conversion and in compression
  • Quantization method: Scalar quantization or Vector quantization

2. The basic idea of quantization

  • Map an input interval to an integer
  • Reduce the number of bits needed to code the source
  • In general, the reconstructed value is different from the input value
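A minimal sketch of this mapping, assuming a plain uniform (midtread) quantizer with a free step-size parameter:

    def quantize(x, step):
        """Map an input interval of width `step` to an integer level (midtread)."""
        return int(round(x / step))

    def dequantize(level, step):
        """Reconstruction value represented by the level."""
        return level * step

    level = quantize(3.7, step=1.0)      # -> 4
    x_hat = dequantize(level, 1.0)       # -> 4.0, which differs from the input 3.7 (lossy)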

3. Quantization model

4. Rate-distortion optimization of quantization

Quantizer design issues

  • The number of quantization levels, i.e. the number of bins
  • The decision boundaries, i.e. the bin boundaries
  • The reconstruction levels

Quantizer design is a rate-distortion optimization

  • To reduce the bit rate, the number of bins must be reduced
  • Reducing the number of bins increases the reconstruction error, i.e. the distortion

5. Distortion measurement

6. Quantizer design

Two aspects of quantizer design: (1) given the number of quantization levels M, find the decision boundaries xi and the reconstruction levels yi that minimize the MSE; (2) given a distortion limit D, find the number of quantization levels M, the decision boundaries xi and the reconstruction levels yi such that MSE <= D

7. Uniform Quantization



8. Quantization and peak SNR



9. Midrise Quantizer

10. Midtread Quantizer

11. Deadzone Quantizer

12. Non-uniform Quantization

  • If the source is not uniformly distributed, uniform quantization is not optimal
  • For non-uniform quantization, to reduce the MSE, the bin's quantization step is made smaller where the probability density function fX(x) is high, and larger where fX(x) is low.

13. Optimal scalar quantization
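One standard construction for the minimum-MSE scalar quantizer is the Lloyd-Max algorithm; a minimal sketch (assuming numpy and a toy Laplacian source) alternates between the two optimality conditions: boundaries at the midpoints of the reconstruction levels, and each level at the centroid of its bin.

    import numpy as np

    def lloyd_max(samples, M, iters=50):
        """Lloyd-Max iteration: alternately update decision boundaries and
        reconstruction levels to (locally) minimize the MSE for M levels."""
        y = np.linspace(samples.min(), samples.max(), M)   # initial reconstruction levels
        for _ in range(iters):
            b = (y[:-1] + y[1:]) / 2                       # boundaries: midpoints of the levels
            idx = np.digitize(samples, b)                  # assign each sample to a bin
            for k in range(M):
                members = samples[idx == k]
                if members.size:                           # centroid condition
                    y[k] = members.mean()
        return y

    levels = lloyd_max(np.random.laplace(size=10000), M=8)   # smaller steps where density is high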





14. Coding of quantization levels

Fixed-length coding of quantization levels

  • Each quantization level is encoded with a code word of equal length; the code word length is ⌈log2 M⌉

Entropy coding of quantization levels

  • Each quantization level is encoded with a variable-length code word according to the probability distribution of the levels
  • Average code word length <= ⌈log2 M⌉
  • More efficient than fixed-length coding of the quantization levels
  • Widely used in image and video coding

15. Vector quantization

  • Scalar quantization: the data are quantized one value at a time.
  • Vector quantization: the data are grouped; each group of K values forms a K-dimensional vector, and quantization is carried out with the vector as the processing unit. Vector quantization is the multidimensional extension of scalar quantization, and scalar quantization is a special case of vector quantization
  • Vector quantization process
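A minimal sketch of the vector quantization process, assuming a toy random codebook (real codebooks are trained, for example with the LBG algorithm): each vector is replaced by the index of its nearest code word, and the decoder reconstructs by looking that index up in the shared codebook.

    import numpy as np

    def vq_encode(vectors, codebook):
        """Encode each K-dimensional vector as the index of its nearest code word."""
        d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        return d.argmin(axis=1)                  # only these indices are transmitted

    def vq_decode(indices, codebook):
        """Reconstruct by looking the indices up in the shared codebook."""
        return codebook[indices]

    codebook = np.random.randn(16, 2)            # 16 code words of dimension K = 2
    data = np.random.randn(100, 2)
    recon = vq_decode(vq_encode(data, codebook), codebook)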





Two dimensional vector quantization

  • Advantages of vector quantization
  • Only the index of the code word is transmitted, so coding efficiency is high
  • At the same bit rate, the distortion is smaller than scalar quantization
  • At the same distortion, the bit rate is lower than scalar quantization

Disadvantages of vector quantization:

  • The complexity increases exponentially with the number of dimensions

Chapter 8. Entropy coding

1. Entropy coding

  • Entropy: the average amount of information in a source; more precisely, the average number of information bits per source symbol
  • Entropy coding: lossless variable-length coding that, based on the probability model of the source messages, makes the average code length approach the entropy

2. Entropy

Information (self-information) formula:

I(xi) = -log2 p(xi)

Unit: bit (when the logarithm base is 2)

Entropy formula:

H(X) = -Σ p(xi) log2 p(xi)

Unit: bit/symbol
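A minimal sketch that evaluates the entropy formula; the probabilities below are the ones reused in the Huffman example later in this chapter:

    import math

    def entropy(probs):
        """Average information per symbol, in bits/symbol."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(entropy([0.0625, 0.9375]))   # ~0.337 bits/symbol, as quoted in the Huffman section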



3. Fixed-length coding

4. Variable length coding

  • Variable-length coding: each symbol is represented with a different number of bits; short code words are assigned to symbols that occur frequently and long code words to symbols that occur infrequently, which is more efficient than fixed-length coding
  • Commonly used variable-length codes: Huffman coding and arithmetic coding

5. Huffman encoding

  • Prefix code: no code word is the prefix of another code word. If 011 is a valid code word, then 0 and 01 must not be valid code words; otherwise decoding would be ambiguous
  • Huffman coding builds a binary tree. Node: a symbol or a combination of symbols. Branches: two per node, one representing “0” and the other “1”



  • Huffman codes are not unique

    Each branch has two labelling options: 0 or 1

Equal probabilities can lead to different merge combinations

Disadvantages:

  • The changing probabilities of the data are difficult to track in real time
  • The Huffman tree must be encoded and transmitted to the decoder
  • It is optimal only if every p(xi) = 2^-ki, i.e. a negative power of two
  • The minimum code word length is 1 bit/symbol

If a binary source has two symbols with very different probabilities

  • For example, p(1) = 0.0625 and p(0) = 0.9375; then H = 0.3373 bits/symbol, while the average Huffman code length = 1 bit/symbol
  • Jointly coding several symbols is more efficient
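A minimal sketch of the tree construction itself, repeatedly merging the two least probable nodes; the symbols and probabilities are made up for illustration:

    import heapq

    def huffman_code(freqs):
        """Build a prefix code by repeatedly merging the two least probable nodes."""
        heap = [[w, i, {s: ""}] for i, (s, w) in enumerate(freqs.items())]
        heapq.heapify(heap)
        count = len(heap)
        while len(heap) > 1:
            w1, _, c1 = heapq.heappop(heap)
            w2, _, c2 = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in c1.items()}        # one branch gets "0"
            merged.update({s: "1" + c for s, c in c2.items()})  # the other gets "1"
            heapq.heappush(heap, [w1 + w2, count, merged])
            count += 1
        return heap[0][2]

    print(huffman_code({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}))
    # {'a': '0', 'b': '10', 'c': '110', 'd': '111'}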

6. Extended Huffman coding

7. Canonical Huffman coding

  • Rules for building the canonical Huffman tree

    Set the left branch of each node to 0 and the right branch to 1

    Tree depth increases from left to right

    Each symbol is placed at the first leaf node that satisfies these rules

Features

  • The first code word is a string of zeros
  • Code words of the same length have consecutive values
  • If all code words are padded with 0s in the low-order bits to the same length, then 0000<0100<1000<1010<1100<1110<1111
  • The relationship between code word lengths n and n+1 is: C(n+1,1)=(C(n,last)+1)<<1
  • From code word length n to n+2: C(n+2,1)=(C(n,last)+1)<<2
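A minimal sketch that assigns canonical code words from a list of code lengths (assumed to be given in non-decreasing order), applying the length-transition rule above as a left shift:

    def canonical_codes(lengths):
        """Assign canonical Huffman code words given each symbol's code length,
        with symbols listed in non-decreasing length order."""
        codes, code, prev_len = [], 0, lengths[0]
        for n in lengths:
            code <<= (n - prev_len)          # C(n+1,1) = (C(n,last)+1) << 1, generalized
            codes.append(format(code, "0{}b".format(n)))
            code += 1
            prev_len = n
        return codes

    print(canonical_codes([2, 2, 3, 3, 3, 4]))
    # ['00', '01', '100', '101', '110', '1110']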



8. Unary code

Encodes a non-negative integer n as n 1s followed by one 0

No code table is required

It can be represented as a Huffman tree

The code length grows too fast: for n = 100 the code length is 101
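A minimal sketch:

    def unary(n):
        """n ones followed by a terminating zero."""
        return "1" * n + "0"

    print(unary(4))            # '11110'
    print(len(unary(100)))     # 101 bits: the length grows linearly with n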

9. Golomb code

The source symbols are divided into equal-sized groups, and each group is given a group number

Groups with small numbers are assigned short code words; groups with large numbers are assigned long code words

Symbols within the same group have code words of equal length, which grow more slowly than unary code words (see the sketch below)

Code word allocation
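A minimal sketch of such an allocation, assuming a Golomb-Rice code whose group size is 2^k: the group number is sent in unary and the remainder as a k-bit fixed-length suffix.

    def golomb_rice(n, k):
        """Golomb-Rice code with group size 2**k:
        group number (n >> k) in unary + k-bit fixed-length remainder."""
        q, r = n >> k, n & ((1 << k) - 1)
        return "1" * q + "0" + format(r, "0{}b".format(k))

    for n in range(6):
        print(n, golomb_rice(n, k=2))
    # 0 000 | 1 001 | 2 010 | 3 011 | 4 1000 | 5 1001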



10. Exponential Golomb (Exp-Golomb) coding

In Golomb coding all groups of source symbols have the same size, whereas in exponential Golomb coding the group size grows exponentially. Exp-Golomb code words still take the form of a unary prefix plus a fixed-length suffix.
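A minimal sketch of the order-0 Exp-Golomb code (the form used for ue(v) syntax elements in H.264): a unary prefix of leading zeros followed by the binary representation of n + 1.

    def exp_golomb(n):
        """Order-0 Exp-Golomb: unary prefix of leading zeros + binary suffix of n + 1."""
        b = bin(n + 1)[2:]                 # binary representation of n + 1
        return "0" * (len(b) - 1) + b

    for n in range(5):
        print(n, exp_golomb(n))
    # 0 1 | 1 010 | 2 011 | 3 00100 | 4 00101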





11. CAVLC (Context-based Adaptive Variable Length Code)

The coefficient distribution of the current block is correlated with the coefficient distribution of its neighbouring blocks

NX denotes the number of non-zero coefficients of block X; the code table for the first coefficient of the current block C is determined by NC.

NC = (NA + NB) / 2

There is also correlation between the coefficient currently being encoded and the previously encoded coefficient

The code table for the remaining coefficients of the current block C is determined by the amplitude of the previous coefficient: cofN-1 => GolombTab_x, and cofN is encoded with GolombTab_x

12. Arithmetic coding

  • The number of bits used to code a symbol is at least its amount of information

  • Huffman coding assigns a code word to each symbol, which means the compression limit of Huffman coding is 1 bit/symbol
  • With arithmetic coding, several symbols can be encoded into 1 bit
  • Arithmetic coding represents the source as the [0,1] interval on the real number line, and each source symbol shortens this interval; a single real number in the resulting [0,1] interval then represents the whole string of encoded symbols, which is more efficient than Huffman coding
  • The encoder entropy-encodes a string of symbols into a real number in [0,1] and transmits a binary representation of that real number to the decoder; the decoder entropy-decodes it back into the string of symbols
  • Binary representation of the decimal

  • Source symbol probability distribution

  • String: X2 X2 X3 X3 X6 X5 X7
  • Huffman code: 01 01 100 100 00 11 1011, 18 bits



  • Arithmetic coding gets closer to the entropy
  • Finite-precision arithmetic coding is a near-optimal code that sends an integer number of bits to the decoder
  • Arithmetic coding does not output code words until the last symbol has been encoded
  • Its coding complexity is also high
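A minimal sketch of the interval-shrinking idea, in floating point for clarity (practical coders use finite-precision integer arithmetic, as noted above); the symbols and probabilities are made up for illustration:

    def arithmetic_interval(symbols, probs):
        """Shrink the [0, 1) interval once per symbol; any number inside the
        final interval identifies the whole string."""
        cdf, acc = {}, 0.0
        for s, p in probs.items():            # cumulative range of each symbol
            cdf[s] = (acc, acc + p)
            acc += p
        low, high = 0.0, 1.0
        for s in symbols:
            rng = high - low
            s_low, s_high = cdf[s]
            low, high = low + rng * s_low, low + rng * s_high
        return low, high

    print(arithmetic_interval("aaab", {"a": 0.8, "b": 0.2}))   # (0.4096, 0.512)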

13. Binary arithmetic coding





  1. Adaptive binary arithmetic coding

Since the probabilities of 0 and 1 in the source are constantly changing, the probability intervals for 0 and 1 should also change. After every 0 or 1 is encoded, adaptive binary arithmetic coding re-estimates the probabilities of 0 and 1 and re-divides the [0,1) interval. The encoder and decoder keep consistent probability models, so both obtain the same division of [0,1)



14. CABAC (Context-based Adaptive Binary Arithmetic Coding)

  • The probability distribution of the syntax elements of the current block is correlated with that of the adjacent blocks; the syntax elements SA and SB of the adjacent blocks A and B of the current block C can be used to select a probability model for encoding the syntax element SC of block C
  • Binarization converts syntax element values into binary strings
  • Probability model update

    According to the bits already encoded, the probability of the binary string is re-estimated and the probability model updated; the next bit is encoded with the new probability model

15. Run-length encoding

A technique that exploits the repetition of source characters

Well suited to sources with long runs of repeated characters

A sequence of repeated characters is called a run, and the number of repeated characters is called the run length

Run-length coding can be combined with other entropy codes to compress data
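A minimal sketch of the idea:

    def run_length_encode(data):
        """Replace each run of identical characters with (character, run length)."""
        runs, i = [], 0
        while i < len(data):
            j = i
            while j < len(data) and data[j] == data[i]:
                j += 1                       # extend the current run
            runs.append((data[i], j - i))
            i = j
        return runs

    print(run_length_encode("AAAABBBCCD"))   # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]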



16. Dictionary encoding

  • Dictionary coding: based on the combination characteristics of the source symbols (messages), a dictionary of symbol combinations is built, and the indices of these combinations are encoded. LZ78 => WinZip, LZW => GIF
  • Suitable for data compression in the general sense; removes the statistical redundancy of the data

17. LZW

  • In the output string of the source, each character or string that appears for the first time is represented by an index, and that index is encoded into the code stream
  • Based on the characters decoded from the stream, the decoder builds exactly the same dictionary online as the encoder and restores the source output string
  • This method works when many substrings are repeated many times; the more repetition, the better the compression
  • Individual symbols are assigned values between 0 and 255
  • The code table is initialized to contain the 256 single symbols with values 0-255; values greater than 255 are initially unassigned
  • The encoder assigns character combinations values from 256 to 4095, depending on the symbols being encoded
  • While encoding, the encoder recognizes new character combinations and adds them to the code table
  • The encoder encodes with the values corresponding to the symbol combinations in the code table

  • When decompressing, the LZW decoder reproduces exactly the same code table as the encoder
  • All single characters are initialized as in the encoder, with values between 0 and 255
  • The code table is updated while decoding every code word except the first
  • Code words are read and decoded into the corresponding character or character combination according to the values in the code table
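A minimal sketch of the encoder side (the 4095 table limit mentioned above is not enforced here):

    def lzw_encode(data):
        """LZW: grow a dictionary of substrings online and emit their indices."""
        table = {chr(i): i for i in range(256)}    # single characters: values 0-255
        next_code, w, out = 256, "", []
        for ch in data:
            if w + ch in table:
                w += ch                            # extend the longest current match
            else:
                out.append(table[w])               # emit the index of the longest match
                table[w + ch] = next_code          # new combination gets the next value
                next_code += 1
                w = ch
        if w:
            out.append(table[w])
        return out

    print(lzw_encode("ABABABA"))   # [65, 66, 256, 258] with 256 = 'AB', 258 = 'ABA'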

18. LZ78





References:

LZW: www.cs.usyd.edu.au/~loki/cs2cs…
LZ78: www.cs.usyd.edu.au/~loki/cs2cs…