Chapter 6. Transform coding

1. Transform coding

The purpose of transform coding

  • Remove the correlation of the spatial signal
  • The energy of the spatial signal is concentrated in a small number of low-frequency coefficients in the frequency domain
  • Coefficients with low energy can be removed by quantization without seriously affecting the quality of the reconstructed image

Block transformation and global transformation

  • Block Transform: 4×4, 8×8, 16×16
  • Global Transform: Wavelet Transform

Energy concentration characteristics of transformation

DCT coding

2. Transform types

  • K-L (Karhunen-Loève) transform
  • Fourier transform
  • Cosine transform
  • Wavelet transform

3. K-L transform

  • The optimal transform in terms of energy compaction
  • The basis functions depend on the image statistics
  • There is no fast algorithm
  • Rarely used in practice; its complexity is extremely high







  • The K-L transform is complicated and impractical: the covariance matrix must be computed, its eigenvectors must be derived, and the resulting basis must be sent to the decoder
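A minimal sketch of these steps (assuming numpy, with a toy `blocks` array standing in for image vectors): the basis is the set of eigenvectors of the sample covariance, and the decoder needs this same basis to invert the transform.

    import numpy as np

    def klt_basis(blocks):
        """KLT basis: eigenvectors of the sample covariance, ordered by energy."""
        cov = np.cov(blocks, rowvar=False)        # covariance matrix of the source
        eigvals, eigvecs = np.linalg.eigh(cov)    # symmetric matrix -> real eigen-decomposition
        order = np.argsort(eigvals)[::-1]         # sort by decreasing variance (energy)
        return eigvecs[:, order]                  # columns are the KLT basis vectors

    blocks = np.random.randn(1000, 8)             # toy source: 1000 vectors of length 8
    basis = klt_basis(blocks)
    coeffs = blocks @ basis                       # decorrelated transform coefficients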

4. Discrete Fourier transform

5. Properties of Discrete Fourier transform



6. DCT

  • Lower complexity than the K-L transform
  • Energy-compaction performance is second only to the K-L transform
  • There are fast algorithms to speed up the transform
  • Integer transforms can be used to further reduce complexity
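A minimal numpy sketch of the 2-D DCT of an 8×8 block (real codecs use fast or integer implementations rather than explicit matrix products):

    import numpy as np

    def dct_matrix(n=8):
        """Orthonormal DCT-II matrix: entry (k, i) is c_k * cos((2i+1) k pi / 2n)."""
        c = np.array([np.sqrt(1.0 / n)] + [np.sqrt(2.0 / n)] * (n - 1))
        k = np.arange(n).reshape(-1, 1)
        i = np.arange(n).reshape(1, -1)
        return c.reshape(-1, 1) * np.cos((2 * i + 1) * k * np.pi / (2 * n))

    C = dct_matrix(8)
    block = np.random.randint(0, 256, (8, 8)).astype(float)   # toy 8x8 image block
    coeffs = C @ block @ C.T        # 2-D DCT: energy concentrates in the low-frequency corner
    recon = C.T @ coeffs @ C        # inverse transform (C is orthonormal)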







7. Relationship between DCT and DFT

8. Important properties of discrete cosine transform





9. Fast DCT transform








10. Integer discrete cosine transform

The DCT is a floating-point operation

  • 64-bit accuracy is required
  • Floating-point computation is complex
  • The transform accuracy is high

Integer transform: an integer approximation of the discrete cosine transform

  • Requires less bit width
  • Integer computation has low complexity
  • A good integer transform has accuracy close to the floating-point transform
  • Obtained by approximating the floating-point transform





11. H.264’s 4×4 integer transform
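A minimal sketch of the forward 4×4 core transform. The matrix below is the integer approximation of the DCT used by H.264; its normalizing scale factors are folded into the quantization stage, which is omitted here.

    import numpy as np

    # H.264 4x4 forward core transform matrix (integer approximation of the DCT)
    Cf = np.array([[1,  1,  1,  1],
                   [2,  1, -1, -2],
                   [1, -1, -1,  1],
                   [1, -2,  2, -1]])

    X = np.random.randint(-128, 128, (4, 4))   # toy residual block
    Y = Cf @ X @ Cf.T                          # integer-only arithmetic; the normalizing
                                               # scale factors are absorbed into quantization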













12. Wavelet transform

  • A newer transform method
  • Avoids the blocking artifacts caused by block-based coding
  • Better suited to spatially scalable video coding

Chapter 7. Quantization

1. Quantization

The process of representing a larger set by a smaller set

  • A finite approximation of the signal source
  • A lossy process
  • Applied in A/D conversion and in compression
  • Quantization method: Scalar quantization or Vector quantization

2. The basic idea of quantization

  • Map an input interval to an integer
  • Reduce the number of bits needed to code the source
  • In general, the reconstructed value is different from the input value
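A minimal sketch of this mapping, assuming a plain uniform (midtread) quantizer with a free step-size parameter:

    def quantize(x, step):
        """Map an input interval of width `step` to an integer level (midtread)."""
        return int(round(x / step))

    def dequantize(level, step):
        """Reconstruction value represented by the level."""
        return level * step

    level = quantize(3.7, step=1.0)      # -> 4
    x_hat = dequantize(level, 1.0)       # -> 4.0, which differs from the input 3.7 (lossy)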

3. Quantization model

4. Rate-distortion optimization of quantization

Quantizer design issues

  • The number of quantization levels, i.e. the number of bins
  • The decision boundaries, i.e. the bin boundaries
  • The reconstruction levels

Quantizer design is a rate-distortion optimization

  • To reduce the bit rate, the number of bins must be reduced
  • Reducing the number of bins increases the reconstruction error, i.e. the distortion

5. Distortion measurement

6. Quantizer design

Two aspects of quantizer design: (1) given the number of quantization levels M, find the decision boundaries xi and the reconstruction levels yi that minimize the MSE; (2) given a distortion limit D, find the number of quantization levels M, the decision boundaries xi and the reconstruction levels yi such that MSE <= D

7. Uniform Quantization



8. Quantization and peak SNR



9. Midrise Quantizer

10. Midtread Quantizer

11. Deadzone Quantizer

12. Non-uniform Quantization

  • If the source is not uniformly distributed, uniform quantization is not optimal
  • For non-uniform quantization, to reduce the MSE, the bin's quantization step is made smaller where the probability density function fX(x) is high, and larger where fX(x) is low.

13. Optimal scalar quantization
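One standard construction for the minimum-MSE scalar quantizer is the Lloyd-Max algorithm; a minimal sketch (assuming numpy and a toy Laplacian source) alternates between the two optimality conditions: boundaries at the midpoints of the reconstruction levels, and each level at the centroid of its bin.

    import numpy as np

    def lloyd_max(samples, M, iters=50):
        """Lloyd-Max iteration: alternately update decision boundaries and
        reconstruction levels to (locally) minimize the MSE for M levels."""
        y = np.linspace(samples.min(), samples.max(), M)   # initial reconstruction levels
        for _ in range(iters):
            b = (y[:-1] + y[1:]) / 2                       # boundaries: midpoints of the levels
            idx = np.digitize(samples, b)                  # assign each sample to a bin
            for k in range(M):
                members = samples[idx == k]
                if members.size:                           # centroid condition
                    y[k] = members.mean()
        return y

    levels = lloyd_max(np.random.laplace(size=10000), M=8)   # smaller steps where density is high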





14. Coding of quantization levels

Fixed-length coding of quantization levels

  • Each quantization level is encoded with a code word of equal length; the code word length is ⌈log2 M⌉

Entropy coding of quantization levels

  • Each quantization level is encoded with a variable-length code word according to the probability distribution of the levels
  • Average code word length <= ⌈log2 M⌉
  • More efficient than fixed-length coding of the quantization levels
  • Widely used in image and video coding

15. Vector quantization

  • Scalar quantization: the data are quantized one value at a time.
  • Vector quantization: the data are grouped; each group of K values forms a K-dimensional vector, and quantization is carried out with the vector as the processing unit. Vector quantization is the multidimensional extension of scalar quantization, and scalar quantization is a special case of vector quantization
  • Vector quantization process
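A minimal sketch of the vector quantization process, assuming a toy random codebook (real codebooks are trained, for example with the LBG algorithm): each vector is replaced by the index of its nearest code word, and the decoder reconstructs by looking that index up in the shared codebook.

    import numpy as np

    def vq_encode(vectors, codebook):
        """Encode each K-dimensional vector as the index of its nearest code word."""
        d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        return d.argmin(axis=1)                  # only these indices are transmitted

    def vq_decode(indices, codebook):
        """Reconstruct by looking the indices up in the shared codebook."""
        return codebook[indices]

    codebook = np.random.randn(16, 2)            # 16 code words of dimension K = 2
    data = np.random.randn(100, 2)
    recon = vq_decode(vq_encode(data, codebook), codebook)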





Two dimensional vector quantization

  • Advantages of vector quantization
  • Only the index of the code word is transmitted, so coding efficiency is high
  • At the same bit rate, the distortion is smaller than scalar quantization
  • At the same distortion, the bit rate is lower than scalar quantization

Disadvantages of vector quantization:

  • The complexity increases exponentially with the number of dimensions

Chapter 8. Entropy coding

1. Entropy coding

  • Entropy: the average amount of information in a source; more precisely, the average number of information bits per source symbol
  • Entropy coding: lossless variable-length coding that, based on the probability model of the source messages, makes the average code length approach the entropy

2. Entropy

Information (self-information) formula:

I(xi) = -log2 p(xi)

Unit: bit (when the logarithm base is 2)

Entropy formula:

H(X) = -Σ p(xi) log2 p(xi)

Unit: bit/symbol
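A minimal sketch that evaluates the entropy formula; the probabilities below are the ones reused in the Huffman example later in this chapter:

    import math

    def entropy(probs):
        """Average information per symbol, in bits/symbol."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(entropy([0.0625, 0.9375]))   # ~0.337 bits/symbol, as quoted in the Huffman section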



3. Fixed-length coding

4. Variable length coding

  • Variable-length coding: each symbol is represented with a different number of bits; short code words are assigned to symbols that occur frequently and long code words to symbols that occur infrequently, which is more efficient than fixed-length coding
  • Commonly used variable-length codes: Huffman coding and arithmetic coding

5. Huffman encoding

  • Prefix code: no code word is the prefix of another code word. If 011 is a valid code word, then 0 and 01 must not be valid code words; otherwise decoding would be ambiguous
  • Huffman coding builds a binary tree. Node: a symbol or a combination of symbols. Branches: two per node, one representing “0” and the other “1”



  • Huffman codes are not unique

    Each branch has two labelling options: 0 or 1

Equal probabilities can lead to different merge combinations

Disadvantages:

  • The changing probabilities of the data are difficult to track in real time
  • The Huffman tree must be encoded and transmitted to the decoder
  • It is optimal only if every p(xi) = 2^-ki, i.e. a negative power of two
  • The minimum code word length is 1 bit/symbol

If a binary source has two symbols with very different probabilities

  • For example, p(1) = 0.0625 and p(0) = 0.9375; then H = 0.3373 bits/symbol, while the average Huffman code length = 1 bit/symbol
  • Jointly coding several symbols is more efficient
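A minimal sketch of the tree construction itself, repeatedly merging the two least probable nodes; the symbols and probabilities are made up for illustration:

    import heapq

    def huffman_code(freqs):
        """Build a prefix code by repeatedly merging the two least probable nodes."""
        heap = [[w, i, {s: ""}] for i, (s, w) in enumerate(freqs.items())]
        heapq.heapify(heap)
        count = len(heap)
        while len(heap) > 1:
            w1, _, c1 = heapq.heappop(heap)
            w2, _, c2 = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in c1.items()}        # one branch gets "0"
            merged.update({s: "1" + c for s, c in c2.items()})  # the other gets "1"
            heapq.heappush(heap, [w1 + w2, count, merged])
            count += 1
        return heap[0][2]

    print(huffman_code({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}))
    # {'a': '0', 'b': '10', 'c': '110', 'd': '111'}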

6. Extended Huffman coding

7. Canonical Huffman coding

  • Rules for building the canonical Huffman tree

    Set the left branch of each node to 0 and the right branch to 1

    Tree depth increases from left to right

    Each symbol is placed at the first leaf node that satisfies these rules

Features

  • The first code word is a string of zeros
  • Code words of the same length have consecutive values
  • If all code words are padded with 0s in the low-order bits to the same length, then 0000<0100<1000<1010<1100<1110<1111
  • The relationship between code word lengths n and n+1 is: C(n+1,1)=(C(n,last)+1)<<1
  • From code word length n to n+2: C(n+2,1)=(C(n,last)+1)<<2
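A minimal sketch that assigns canonical code words from a list of code lengths (assumed to be given in non-decreasing order), applying the length-transition rule above as a left shift:

    def canonical_codes(lengths):
        """Assign canonical Huffman code words given each symbol's code length,
        with symbols listed in non-decreasing length order."""
        codes, code, prev_len = [], 0, lengths[0]
        for n in lengths:
            code <<= (n - prev_len)          # C(n+1,1) = (C(n,last)+1) << 1, generalized
            codes.append(format(code, "0{}b".format(n)))
            code += 1
            prev_len = n
        return codes

    print(canonical_codes([2, 2, 3, 3, 3, 4]))
    # ['00', '01', '100', '101', '110', '1110']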



8. Unary code

Encodes a non-negative integer n as n 1s followed by one 0

No code table is required

It can be represented as a Huffman tree

The code length grows too fast: for n = 100 the code length is 101
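A minimal sketch:

    def unary(n):
        """n ones followed by a terminating zero."""
        return "1" * n + "0"

    print(unary(4))            # '11110'
    print(len(unary(100)))     # 101 bits: the length grows linearly with n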

9. Golomb code

The source symbols are divided into equal-sized groups, and each group is given a group number

Groups with small numbers are assigned short code words; groups with large numbers are assigned long code words

Symbols within the same group have code words of equal length, which grow more slowly than unary code words (see the sketch below)

Code word allocation
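A minimal sketch of such an allocation, assuming a Golomb-Rice code whose group size is 2^k: the group number is sent in unary and the remainder as a k-bit fixed-length suffix.

    def golomb_rice(n, k):
        """Golomb-Rice code with group size 2**k:
        group number (n >> k) in unary + k-bit fixed-length remainder."""
        q, r = n >> k, n & ((1 << k) - 1)
        return "1" * q + "0" + format(r, "0{}b".format(k))

    for n in range(6):
        print(n, golomb_rice(n, k=2))
    # 0 000 | 1 001 | 2 010 | 3 011 | 4 1000 | 5 1001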



10. Exponential Golomb (Exp-Golomb) coding

In Golomb coding all groups of source symbols have the same size, whereas in exponential Golomb coding the group size grows exponentially. Exp-Golomb code words still take the form of a unary prefix plus a fixed-length suffix.
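A minimal sketch of the order-0 Exp-Golomb code (the form used for ue(v) syntax elements in H.264): a unary prefix of leading zeros followed by the binary representation of n + 1.

    def exp_golomb(n):
        """Order-0 Exp-Golomb: unary prefix of leading zeros + binary suffix of n + 1."""
        b = bin(n + 1)[2:]                 # binary representation of n + 1
        return "0" * (len(b) - 1) + b

    for n in range(5):
        print(n, exp_golomb(n))
    # 0 1 | 1 010 | 2 011 | 3 00100 | 4 00101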





11. CAVLC (Context-based Adaptive Variable Length Code)

The coefficient distribution of the current block is correlated with the coefficient distribution of its neighbouring blocks

NX denotes the number of non-zero coefficients of block X; the code table for the first coefficient of the current block C is determined by NC.

NC = (NA + NB) / 2

There is also correlation between the coefficient currently being encoded and the previously encoded coefficient

The code table for the remaining coefficients of the current block C is determined by the amplitude of the previous coefficient: cofN-1 => GolombTab_x, and cofN is encoded with GolombTab_x

12. Arithmetic coding

  • The number of bits used to code a symbol is at least its amount of information

  • Huffman coding assigns a code word to each symbol, which means the compression limit of Huffman coding is 1 bit/symbol
  • With arithmetic coding, several symbols can be encoded into 1 bit
  • Arithmetic coding represents the source as the [0,1] interval on the real number line, and each source symbol shortens this interval; a single real number in the resulting [0,1] interval then represents the whole string of encoded symbols, which is more efficient than Huffman coding
  • The encoder entropy-encodes a string of symbols into a real number in [0,1] and transmits a binary representation of that real number to the decoder; the decoder entropy-decodes it back into the string of symbols
  • Binary representation of the decimal

  • Source symbol probability distribution

  • String: X2 X2 X3 X3 X6 X5 X7
  • Huffman code: 01 01 100 100 00 11 1011, 18 bits



  • Arithmetic coding gets closer to the entropy
  • Finite-precision arithmetic coding is a near-optimal code that sends an integer number of bits to the decoder
  • Arithmetic coding does not output code words until the last symbol has been encoded
  • Its coding complexity is also high
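A minimal sketch of the interval-shrinking idea, in floating point for clarity (practical coders use finite-precision integer arithmetic, as noted above); the symbols and probabilities are made up for illustration:

    def arithmetic_interval(symbols, probs):
        """Shrink the [0, 1) interval once per symbol; any number inside the
        final interval identifies the whole string."""
        cdf, acc = {}, 0.0
        for s, p in probs.items():            # cumulative range of each symbol
            cdf[s] = (acc, acc + p)
            acc += p
        low, high = 0.0, 1.0
        for s in symbols:
            rng = high - low
            s_low, s_high = cdf[s]
            low, high = low + rng * s_low, low + rng * s_high
        return low, high

    print(arithmetic_interval("aaab", {"a": 0.8, "b": 0.2}))   # (0.4096, 0.512)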

13. Binary arithmetic coding





  1. Adaptive binary arithmetic coding

Since the probabilities of 0 and 1 in the source are constantly changing, the probability intervals for 0 and 1 should also change. After every 0 or 1 is encoded, adaptive binary arithmetic coding re-estimates the probabilities of 0 and 1 and re-divides the [0,1) interval. The encoder and decoder keep consistent probability models, so both obtain the same division of [0,1)



14. CABAC (Context-based Adaptive Binary Arithmetic Coding)

  • The probability distribution of the syntax elements of the current block is correlated with that of the adjacent blocks; the syntax elements SA and SB of the adjacent blocks A and B of the current block C can be used to select a probability model for encoding the syntax element SC of block C
  • Binarization converts syntax element values into binary strings
  • Probability model update

    According to the bits already encoded, the probability of the binary string is re-estimated and the probability model updated; the next bit is encoded with the new probability model

15. Run-length encoding

A technique that exploits the repetition of source characters

Well suited to sources with long runs of repeated characters

A sequence of repeated characters is called a run, and the number of repeated characters is called the run length

Run-length coding can be combined with other entropy codes to compress data
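A minimal sketch of the idea:

    def run_length_encode(data):
        """Replace each run of identical characters with (character, run length)."""
        runs, i = [], 0
        while i < len(data):
            j = i
            while j < len(data) and data[j] == data[i]:
                j += 1                       # extend the current run
            runs.append((data[i], j - i))
            i = j
        return runs

    print(run_length_encode("AAAABBBCCD"))   # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]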



16. Dictionary encoding

  • Dictionary coding: based on the combination characteristics of the source symbols (messages), a dictionary of symbol combinations is built, and the indices of these combinations are encoded. LZ78 => WinZip, LZW => GIF
  • Suitable for data compression in the general sense; removes the statistical redundancy of the data

17. LZW

  • In the output string of the source, each character or string that appears for the first time is represented by an index, and that index is encoded into the code stream
  • Based on the characters decoded from the stream, the decoder builds exactly the same dictionary online as the encoder and restores the source output string
  • This method works when many substrings are repeated many times; the more repetition, the better the compression
  • Individual symbols are assigned values between 0 and 255
  • The code table is initialized to contain the 256 single symbols with values 0-255; values greater than 255 are initially unassigned
  • The encoder assigns character combinations values from 256 to 4095, depending on the symbols being encoded
  • While encoding, the encoder recognizes new character combinations and adds them to the code table
  • The encoder encodes with the values corresponding to the symbol combinations in the code table

  • When decompressing, the LZW decoder reproduces exactly the same code table as the encoder
  • All single characters are initialized as in the encoder, with values between 0 and 255
  • The code table is updated while decoding every code word except the first
  • Code words are read and decoded into the corresponding character or character combination according to the values in the code table
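A minimal sketch of the encoder side (the 4095 table limit mentioned above is not enforced here):

    def lzw_encode(data):
        """LZW: grow a dictionary of substrings online and emit their indices."""
        table = {chr(i): i for i in range(256)}    # single characters: values 0-255
        next_code, w, out = 256, "", []
        for ch in data:
            if w + ch in table:
                w += ch                            # extend the longest current match
            else:
                out.append(table[w])               # emit the index of the longest match
                table[w + ch] = next_code          # new combination gets the next value
                next_code += 1
                w = ch
        if w:
            out.append(table[w])
        return out

    print(lzw_encode("ABABABA"))   # [65, 66, 256, 258] with 256 = 'AB', 258 = 'ABA'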

18. LZ78





References:

LZW: www.cs.usyd.edu.au/~loki/cs2cs…
LZ78: www.cs.usyd.edu.au/~loki/cs2cs…