Authors: Wang Yang, Xu Jizheng

What is quantization

In the field of digital signal processing, quantization refers to the process of approximating the continuous values of a signal by a finite number of discrete values, or a large set of possible discrete values by a smaller set. Quantization is usually divided into scalar quantization and vector quantization. Scalar quantization divides the value range of the sample points in an image or video into several intervals and represents all possible values in each interval by a single value (a scalar); the quantized value of each sample point is a scalar and independent of the quantized values of other sample points. Vector quantization treats N sample points as an N-dimensional vector and divides the value space of N-dimensional vectors into several subspaces, each represented by a single N-dimensional vector. Since vector quantization exploits the correlation between multiple sample points, it generally achieves higher compression efficiency than scalar quantization, at higher computational complexity.

In video coding specifically, transform techniques are generally used to remove the spatial redundancy in the residual signal after prediction, concentrating the energy in the low-frequency region: the transform coefficients are large in the low-frequency region and small in the high-frequency region. Quantization is then applied to the transform coefficients, mapping their wide value range to a narrower range of quantization coefficients. Exploiting the insensitivity of the human eye to high-frequency information, the transform coefficients of the high-frequency region are quantized to zero, which improves compression performance. For some specific video signals (such as screen content), the residual signal sometimes skips the transform, and quantization is applied to the residual directly. Quantization technology usually comprises two processes: quantization and inverse quantization. The quantization process converts transform coefficients into quantization coefficients; the encoder needs to decide the optimal quantization coefficients and entropy-code them. In the inverse quantization process, the quantization coefficients obtained after entropy decoding are converted back into transform coefficients, which are then used through the inverse transform to reconstruct the residual of the coding block.

Quantization techniques in H.266/VVC

The quantization techniques in H.266/VVC include Uniform Reconstruction Quantization (URQ), Sign Data Hiding (SDH), and Trellis-Coded Quantization (TCQ) [1]. URQ is a scalar quantization, while SDH and TCQ jointly quantize multiple coefficients. URQ and SDH are carried over from HEVC without major changes in H.266/VVC. TCQ is a newly introduced quantization technique of H.266/VVC, also known as Dependent Quantization (DQ).

Uniform Reconstruction Quantization (URQ)

Inverse quantization in URQ

In uniform reconstruction quantization, the reconstructed value of a quantization coefficient is determined by a single parameter (the quantization step $\Delta_k$), i.e. $t'_k = \Delta_k \times q_k$, where $t'_k$ represents the reconstructed value, $q_k$ represents the quantization coefficient (an integer), and $k$ is the index of the quantization coefficient. Like HEVC, H.266/VVC also supports a quantization matrix, which can improve the subjective quality of the video by adjusting the quantization weights of different frequency positions: $\Delta_k = \alpha_k \Delta$, where $\alpha_k$ represents the quantization weight coefficient. The quantization matrix is also called the Scaling List in H.266/VVC. $\Delta$ represents the base quantization step, controlled by the quantization parameter (QP), an integer. The relationship between $\Delta$ and QP is exponential:


$$\Delta = 2^{(QP-4)/6} \cdot 2^{B-8}$$

where $B$ represents the bit depth of the sample points. The inverse quantization process that produces the reconstructed value of a quantization coefficient can then be expressed as:


$$t'_k = \alpha_k \cdot 2^{(QP-4)/6} \cdot 2^{B-8} \cdot q_k$$

In order to reduce computational complexity and avoid reconstruction mismatch, H.266/VVC uses pure integer arithmetic for inverse quantization. In H.266/VVC, the inverse transform of a block of size $W \times H$ introduces a scaling factor $\sqrt{WH} \cdot 2^{B-15}$, so this scaling must also be compensated in the inverse quantization process. The inverse quantization formula with the scaling included is:


$$t'_k = \alpha_k \cdot 2^{(QP-4)/6 + B - 8} \cdot 2^{15-B} \cdot (WH)^{-1/2} \cdot q_k$$

Set $p = \lfloor QP/6 \rfloor + B - 8$, $m = QP \bmod 6$, $\beta = \lceil \frac{1}{2} \log_2 WH \rceil$, and $\gamma = 2\beta - \log_2 WH$, where $\lfloor \cdot \rfloor$ and $\lceil \cdot \rceil$ denote rounding down and up, respectively, and $\bmod$ denotes the modulo operation. Because the width $W$ and height $H$ of a transform block are both powers of 2, $\gamma \in \{0, 1\}$. The inverse quantization process can then be expressed as:


$$t'_k = (16\alpha_k) \cdot 2^{(32+3\gamma+m)/6} \cdot 2^{p} \cdot 2^{5-\beta-B} \cdot q_k$$

To implement this with integer operations, the first two factors in parentheses are approximated by integers, and the multiplication by $2^{5-\beta-B}$ is implemented with a shift operation, which yields:


$$t'_k = \left(w_k \cdot (a[\gamma][m] \ll p) \cdot q_k + ((1 \ll b) \gg 1)\right) \gg b$$

Here $\ll$ and $\gg$ denote bitwise left and right shifts, $b = B + \beta - 5$, and the $2 \times 6$ two-dimensional array $a[\gamma][m]$ holds the integer approximation of $2^{(32+3\gamma+m)/6}$: $a = \{\{40, 45, 51, 57, 64, 72\}, \{57, 64, 72, 80, 90, 102\}\}$. $w_k = \mathrm{round}(16\alpha_k)$, with $w_k \in [1{:}255]$, is called the Scaling List, i.e. the quantization matrix. If the Scaling List is not used, the default value of $w_k$ is 16, in which case $\Delta_k = \Delta$.
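The integer inverse quantization above can be sketched in a few lines of Python. The function name and argument list are illustrative, not taken from the standard text:

```python
import math

# The 2x6 table a[gamma][m]: integer approximations of 2^((32 + 3*gamma + m)/6).
a = [[40, 45, 51, 57, 64, 72],
     [57, 64, 72, 80, 90, 102]]

def inverse_quantize(q_k, QP, B, W, H, w_k=16):
    """Reconstruct t'_k from quantization coefficient q_k using integer arithmetic only.

    w_k is the scaling-list weight (default 16, i.e. no weighting)."""
    p = QP // 6 + B - 8                          # Python // floors, matching floor(QP/6)
    m = QP % 6
    beta = math.ceil(math.log2(W * H) / 2)
    gamma = 2 * beta - round(math.log2(W * H))   # 0 or 1, since W and H are powers of 2
    b = B + beta - 5
    # (w_k * (a[gamma][m] << p) * q_k + rounding offset) >> b
    return (w_k * (a[gamma][m] << p) * q_k + ((1 << b) >> 1)) >> b

# Example: QP = 28, 10-bit samples, an 8x8 block, q_k = 5.
print(inverse_quantize(5, 28, 10, 8, 8))  # 1280
```

The printed value matches the floating-point form $\alpha_k \cdot 2^{(QP-4)/6 + B - 8} \cdot 2^{15-B} \cdot (WH)^{-1/2} \cdot q_k = 2^4 \cdot 2^2 \cdot 2^5 \cdot 8^{-1} \cdot 5 = 1280$.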

In transform skip mode there is no inverse transform, so inverse quantization needs no extra scaling and the Scaling List cannot be applied. Then $w_k = 16$, $\gamma = 0$, $b = 10$, and the inverse quantization process of transform skip mode simplifies to:


$$r'_k = \left((a[0][m] \ll (p+4)) \cdot q_k + 512\right) \gg 10$$

In addition, URQ is also used for quantization in the Palette mode of H.266/VVC. The inverse quantization process of Palette mode is the same as that of transform skip mode above; in Palette mode, only the escape values need to be quantized.

URQ quantization process

Quantization dead-zone

As mentioned above, the general inverse quantization of URQ can be written simply as $t'_k = \Delta_k \times q_k$; the corresponding quantization process is $q_k = \mathrm{sgn}(t_k) \cdot \lfloor \frac{|t_k|}{\Delta} \rfloor$, where $\mathrm{sgn}(\cdot)$ is the sign function. This quantizer rounds toward zero and quantizes every transform coefficient with magnitude in $[0, \Delta)$ to 0; the region that quantizes to 0 is called the dead-zone [2]. The residual transform coefficients are distributed more compactly in inter-frame coding than in intra-frame coding, so a quantization offset parameter $f$ is introduced; the quantization formula with $f$ is:


$$q_k = \mathrm{sgn}(t_k) \cdot \left\lfloor \frac{|t_k| + f}{\Delta} \right\rfloor$$

When $f$ increases, the dead-zone shrinks; when $f$ decreases, the dead-zone grows. The dead-zone size directly affects the subjective quality of the video: after the transform, the high-frequency values are usually small and close to 0, and if the dead-zone is large, values near 0 are quantized to 0 and those details are lost. Typically, $f = \Delta/3$ is used for intra-frame prediction and $f = \Delta/6$ for inter-frame prediction.
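The effect of the offset $f$ on the dead-zone can be illustrated with a small sketch (the function names are for illustration only):

```python
def quantize(t_k, delta, f):
    """Dead-zone scalar quantization: q_k = sgn(t_k) * floor((|t_k| + f) / delta)."""
    sgn = (t_k > 0) - (t_k < 0)
    return sgn * int((abs(t_k) + f) // delta)

def dequantize(q_k, delta):
    """Matching URQ reconstruction: t'_k = delta * q_k."""
    return delta * q_k

delta = 12.0
# The intra offset f = delta/3 keeps this small coefficient,
# while the inter offset f = delta/6 lets it fall into the dead-zone:
print(quantize(9.0, delta, delta / 3))  # 1
print(quantize(9.0, delta, delta / 6))  # 0
```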

Rate-distortion Optimized Quantization (RDOQ)

At the encoder, rate-distortion optimization is generally used to select the optimal coding parameters (such as the block partitioning structure, prediction mode, etc.); applying rate-distortion optimization in the quantization stage (i.e. rate-distortion optimized quantization) yields the optimal quantization coefficients [3]. RDOQ can be divided into three steps: quantization of the transform coefficients, elimination of coefficient groups (CG), and decision of the position of the first non-zero quantization coefficient (in decoding order).

  1. In the transform-coefficient quantization step, a uniform quantizer first produces a quantization coefficient for each transform coefficient. Each coefficient is then fine-tuned by comparing rate-distortion costs: the candidates are the current value and the current value minus 1 (with two exceptions: if the current value is 0, only 0 is considered; if the current value is 2, the candidates 2, 1 and 0 are all considered). The candidate with the smallest rate-distortion cost is selected as the quantization result for the current position.
  2. In the coefficient-group elimination step, the encoder computes the rate-distortion cost of setting all quantization coefficients of an entire coefficient group to 0 and compares it with the optimal result of the first step. If this cost is smaller, all quantization coefficients in the group are set to 0. Encoding only the flag of an all-zero coefficient group, without encoding its quantization coefficients, can save a considerable bit rate, at the cost of some additional distortion.
  3. In the last step, after all coefficient groups of the transform block have been traversed, the position of the first non-zero quantization coefficient (in decoding order) is decided by computing the rate-distortion cost for the different candidate positions and selecting the optimal one.
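The per-coefficient candidate test of step 1 can be sketched as follows. The rate model passed in as `rate_of` is a placeholder assumption; the real encoder estimates bits from its entropy-coding contexts:

```python
def best_level(t_abs, delta, lam, rate_of):
    """Pick the RDOQ candidate with the minimum cost J = D + lambda * R.

    t_abs:   absolute value of the transform coefficient
    delta:   quantization step
    lam:     Lagrange multiplier
    rate_of: rate_of(level) -> estimated number of bits (placeholder model)
    """
    level = int(t_abs / delta + 0.5)      # plain uniform quantization with rounding
    if level == 0:
        candidates = [0]                  # current value 0: only 0 is considered
    elif level == 2:
        candidates = [2, 1, 0]            # current value 2: 2, 1 and 0 are considered
    else:
        candidates = [level, level - 1]   # general case: current value and value - 1
    cost = lambda l: (t_abs - l * delta) ** 2 + lam * rate_of(l)
    return min(candidates, key=cost)

# With lambda = 0 the pure-distortion choice wins; a larger lambda favors smaller levels:
print(best_level(26.0, 10.0, 0.0, lambda l: l))   # 3
print(best_level(26.0, 10.0, 30.0, lambda l: l))  # 2
```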

Sign Data Hiding (SDH)

The sign data hiding technique in H.266/VVC is unchanged from HEVC. The basic idea of SDH is to improve compression efficiency by deriving the sign of the last non-zero quantization coefficient (in decoding order) of a coefficient group at the decoder, instead of encoding that sign. Compared with scalar quantization at the same quantization step, SDH can save one bit per coefficient group.

SDH decoding process

From the decoding perspective, consider the consecutive quantization coefficients of a coefficient group (CG). If the difference between the scan indexes of the last non-zero coefficient and the first non-zero coefficient of the CG is greater than 3, the sign bit of the last non-zero coefficient is not decoded but derived from the parity of the sum of the absolute values of all quantization coefficients: if the sum is odd, the sign of the last non-zero coefficient is negative; otherwise it is positive.
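A sketch of this decoder-side derivation, with the CG represented as a plain list in scan order (an illustration, not the decoder's actual data structures):

```python
def sdh_applies(cg):
    """SDH condition: scan-index distance between first and last non-zero coefficient > 3."""
    nz = [i for i, c in enumerate(cg) if c != 0]
    return len(nz) > 0 and nz[-1] - nz[0] > 3

def hidden_sign(cg):
    """Odd sum of absolute values -> negative sign; even sum -> positive sign."""
    return -1 if sum(abs(c) for c in cg) % 2 == 1 else +1

cg = [0, 3, 0, 1, 0, 0, 2, 0]   # toy coefficient group in scan order
print(sdh_applies(cg))           # True  (non-zero coefficients at scan indexes 1 and 6)
print(hidden_sign(cg))           # 1     (3 + 1 + 2 = 6 is even -> positive)
```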

SDH coding process

From the encoder's point of view, if the parity of the sum of the absolute values of all quantization coefficients in a CG already matches the SDH condition, the encoder needs no additional operation and saves the one bit for transmitting the sign of the last non-zero coefficient. If the parity does not match the SDH condition, the encoder must change the value of one quantization coefficient to satisfy it.

Literature [4] proposed an encoding optimization algorithm that jointly optimizes SDH and RDOQ. When RDOQ has determined the first and last non-zero coefficients (in decoding order) of a CG, the encoder skips the SDH optimization step if the distance between the scan indexes of the two non-zero coefficients is not greater than 3, or if the SDH condition is already met (the parity of the sum of the absolute values of all quantization coefficients in the CG matches the sign bit of the last non-zero coefficient). Otherwise, the coefficients in the CG need to be fine-tuned to satisfy the SDH condition. During fine-tuning, the rate-distortion cost of all quantization coefficients from the first non-zero coefficient to the last non-zero coefficient is taken into account, and the adjustment with the lowest rate-distortion cost is selected as the optimal quantization result. The following special cases must be considered:

A) The last non-zero coefficient cannot be adjusted to 0;

B) If the first non-zero coefficient is adjusted to 0, the distance between the new first non-zero coefficient and the last non-zero coefficient must still be greater than 3;

C) A quantization coefficient equal to 0 between the first and the last non-zero coefficients can also be adjusted: if its source signal is positive, the quantization coefficient can be increased by 1; if its source signal is negative, it can be decreased by 1;

D) When a quantization coefficient is equal to 1, it can be adjusted by either adding or subtracting 1;

E) The adjusted quantization coefficient must not exceed the maximum or minimum value representable by the entropy coding of quantization coefficients.

Literature [5] proposed an SDH encoding optimization algorithm for the case where the encoder does not use RDOQ. It computes, for each quantization coefficient in a CG, the difference between its original value (the transform coefficient) and its reconstructed value. To limit the coding complexity, only the quantization coefficient with the largest difference is considered for adjustment: if the difference is positive, the quantization coefficient is increased by 1; if it is negative, it is decreased by 1. This adjustment minimizes the impact on coding performance of changing the parity of the sum of the absolute values of the quantization coefficients.
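A sketch of this non-RDOQ adjustment; variable names are illustrative:

```python
def adjust_for_sdh(levels, coeffs, delta):
    """Flip the parity of a CG by tweaking the level with the largest quantization error.

    levels: quantization coefficients of the CG in scan order
    coeffs: the corresponding original transform coefficients
    delta:  quantization step
    Adjusting one level by +/-1 always flips the parity of the sum of absolute
    values, and picking the position with the largest error keeps the added
    distortion as small as possible.
    """
    levels = list(levels)
    errors = [c - l * delta for c, l in zip(coeffs, levels)]
    k = max(range(len(levels)), key=lambda i: abs(errors[i]))
    levels[k] += 1 if errors[k] > 0 else -1
    return levels

# Position 1 has the largest quantization error (4.0), so its level is raised by 1:
print(adjust_for_sdh([2, 0, 1], [21.0, 4.0, 12.0], 10.0))  # [2, 1, 1]
```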

Trellis-coded quantization (TCQ)

Basic concepts of TCQ

The concept of TCQ first appeared in literature [6]; trellis-coded quantization builds on Trellis-Coded Modulation (TCM). TCQ improves the mean-square-error performance of the quantizer by expanding the signal space, so it can achieve better quantization performance at moderate computational complexity. It is a particularly efficient quantization method for memoryless and Gauss-Markov sources.

TCQ contains several scalar quantizers, and in most cases the reconstruction values of each scalar quantizer are integer multiples of the quantization step $\Delta$. As shown in Figure 1, the TCQ of H.266/VVC contains two scalar quantizers, $Q_0$ and $Q_1$: $Q_0$ can represent all reconstruction values that are even multiples of $\Delta$, and $Q_1$ all reconstruction values that are odd multiples of $\Delta$. In addition, both quantizers contain the zero reconstruction value (i.e. a reconstruction value equal to 0), which improves coding performance in low-bit-rate scenarios. The values of the two quantizers are addressed by quantization indexes; for example, in the $Q_0$ quantizer, the quantization index 3 indicates the reconstruction value $6\Delta$, while in the $Q_1$ quantizer, the quantization index $-2$ indicates the reconstruction value $-3\Delta$.

Figure 1. Schematic diagram of two scalar quantizers in TCQ [7]

In scalar quantization, the reconstructed value of a quantization coefficient can be obtained from the quantization index alone, but TCQ has two scalar quantizers, so besides the quantization index it is also necessary to know which quantizer ($Q_0$ or $Q_1$) produced the quantization coefficient. In TCQ, the choice of scalar quantizer is determined by state transitions in a state machine. Assuming the state machine has $2^K$ ($K \geq 2$) states, each state is assigned one quantizer (several states may share a quantizer), and the current state is determined jointly by the previous state and the previous quantization index. In the field of video coding, most transform blocks contain only a small number of non-zero transform coefficients, so H.266/VVC uses the simplest four-state state machine ($K = 2$) to represent the transitions between the two scalar quantizers, which keeps the coding complexity as low as possible.

Figure 2. State transition process for two quantizer selection [8]

As shown in Figure 2, quantizer $Q_0$ is used for states 0 and 1, and quantizer $Q_1$ for states 2 and 3. The state transitions of the state machine follow the coefficient scanning order, so when using TCQ the coefficients of a coding block must be processed in a fixed scanning order, indexed by $k = 0, 1, \ldots$. The initial state $S_0$ of the state machine is set to 0. Given the current state $S_k$ and the current quantization index $q_k$, the next state $S_{k+1}$ is uniquely determined (in Figure 2 it is determined by the parity $p_k$ of $q_k$). For a transform coefficient $t_k$, when its state $S_k$ equals 0 or 1, quantizer $Q_0$ is used to quantize $t_k$; when $S_k$ equals 2 or 3, quantizer $Q_1$ is used. This state transition process implicitly splits the quantization indexes of each quantizer into two subsets: the even quantization indexes ($p_k = 0$) and the odd quantization indexes ($p_k = 1$).

TCQ decoding process

Unlike scalar quantization, when using TCQ the quantization coefficients within a block can only be reconstructed serially, because the choice of quantizer depends on the previous state and the parity of the previous quantization coefficient. Given the $N$ quantization indexes $q_k$ of a block, where $k$ denotes the coding/decoding order, the corresponding reconstructed transform coefficients $t'_k$ can be obtained by the following algorithm:


    s_0 = 0
    for k = 0 to N-1 do
        t'_k = (2 · q_k - (s_k >> 1) · sgn(q_k)) · Δ
        s_{k+1} = stateTransTable[s_k][q_k & 1]
    end for

Here $\mathrm{sgn}(\cdot)$ is the sign function, $\Delta$ is the quantization step, stateTransTable is the state transition table (shown in Figure 3), and the operator $\gg$ denotes a right shift, so $s_k \gg 1$ indicates whether quantizer $Q_0$ or $Q_1$ is used; the operator $\&$ denotes bitwise AND, so $q_k \,\&\, 1$ gives the parity $p_k$ of the quantization index $q_k$.

Figure 3. State transition table [8]

Suppose a block contains the 5 quantization indexes $\{3, 2, 4, 0, -1\}$ and the quantization step is $\Delta$. According to the TCQ decoding process above, the reconstructed values of the quantization coefficients are $\{6\Delta, 3\Delta, 8\Delta, 0, -2\Delta\}$, obtained as follows:

The initial state $S_0$ is 0, so quantizer $Q_0$ is used; the quantization index $q_0 = 3$ reconstructs to $6\Delta$; $p_0 = q_0 \,\&\, 1 = 1$, so the next state $S_1$ is 2;

State $S_1$ equals 2, so quantizer $Q_1$ is used; the quantization index $q_1 = 2$ reconstructs to $3\Delta$; $p_1 = q_1 \,\&\, 1 = 0$, so the next state $S_2$ is 1;

State $S_2$ equals 1, so quantizer $Q_0$ is used; the quantization index $q_2 = 4$ reconstructs to $8\Delta$; $p_2 = q_2 \,\&\, 1 = 0$, so the next state $S_3$ is 2;

State $S_3$ equals 2, so quantizer $Q_1$ is used; the quantization index $q_3 = 0$ reconstructs to 0; $p_3 = q_3 \,\&\, 1 = 0$, so the next state $S_4$ is 1;

State $S_4$ equals 1, so quantizer $Q_0$ is used; the quantization index $q_4 = -1$ reconstructs to $-2\Delta$; $p_4 = q_4 \,\&\, 1 = 1$, so the next state $S_5$ is 0;

All quantization indexes of the block have now been reconstructed and the process is complete.
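The decoding loop and the worked example above can be sketched directly in Python. The concrete state transition table below reproduces the transitions used in the example (cf. Figure 3) and is written out here for illustration:

```python
def tcq_dequantize(q, delta):
    """Serial TCQ reconstruction following the decoding loop above."""
    state_trans_table = [[0, 2], [2, 0], [1, 3], [3, 1]]  # next state = table[s_k][q_k & 1]
    s = 0                                                 # initial state s_0 = 0
    out = []
    for qk in q:
        sgn = (qk > 0) - (qk < 0)
        out.append((2 * qk - (s >> 1) * sgn) * delta)     # states 0,1 -> Q0; states 2,3 -> Q1
        s = state_trans_table[s][qk & 1]
    return out

# Worked example from the text, with delta = 1:
print(tcq_dequantize([3, 2, 4, 0, -1], 1))  # [6, 3, 8, 0, -2], i.e. {6Δ, 3Δ, 8Δ, 0, -2Δ}
```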

TCQ coding process

From the decoder's point of view, the state transitions of the two scalar quantizers are unique because the sequence of quantization indexes is unique. From the encoder's point of view, obtaining maximum compression performance at moderate computational complexity requires deciding the optimal sequence of quantization indexes. As shown in Figure 4, the possible state transitions of the two scalar quantizers $Q_0$ and $Q_1$ can be represented by a trellis with $2^K$ ($K = 2$) states, which is the origin of the name trellis-coded quantization.

Figure 4. Trellis structure for the quantization index decision at the encoder [7]

The encoder decides the quantization indexes by minimizing the rate-distortion cost $J = D + \lambda \cdot R$, where $D$ represents the quantization distortion, $R$ represents the number of bits required to encode the quantization indexes, and the Lagrange multiplier $\lambda$ usually depends on the quantization step $\Delta$. The distortion of a block can be expressed as the sum of the distortions of the individual transform coefficients; the quantization distortion of a single transform coefficient $t_k$ is:


$$D_k(q_k, s_k) = \left(t_k - \Delta \cdot (2 q_k - (s_k \gg 1) \cdot \mathrm{sgn}(q_k))\right)^2$$

where $t_k$ represents the original transform coefficient, $q_k$ the corresponding quantization index, and $s_k$ the TCQ state.

In addition to the four states of the trellis shown in Figure 4, the trellis in the H.266/VVC encoder contains an "uncoded" state, which represents positions preceding the first non-zero quantization index where the quantization index equals zero. Ideally, ignoring the entropy-coding dependencies between coefficients, the Viterbi algorithm [9] could be used to find the shortest path through the trellis and thus determine an optimal set of quantization indexes. However, because of the complex dependencies between coefficients in entropy coding (for example, the number of bits required to encode a quantization index depends on the values of the previous quantization indexes, so an exact value would require actual encoding), the Viterbi algorithm cannot deliver the true optimum. Nevertheless, using estimated bit counts together with the Viterbi algorithm still achieves a good balance between coding performance and computational complexity.

Figure 5. Trellis structure with the "uncoded" state for the quantization index decision at the encoder [8]

The initial rate-distortion cost is set to $J_{-1} = 0$. The transform coefficients are processed through the trellis in coding order, and each stage of the trellis is indexed by the scan index of the transform coefficient, $k = 0, 1, \ldots, N-1$, where $N$ represents the number of transform coefficients. For the step from scan index $k-1$ to $k$, the rate-distortion costs $J_k$ of all connections between the trellis nodes of stage $k-1$ and stage $k$ are computed. Three cases need to be distinguished: the transition from the "uncoded" state to the "uncoded" state; the transition from the "uncoded" state to state 0 or 2; and the transitions between the other states.


$$\text{uncoded} \longmapsto \text{uncoded}: \quad J_k = J_{k-1} + D_k(0, 0)$$


$$\text{uncoded} \longmapsto (0, 2): \quad J_k = J_{k-1} + D_k(q_k, s_k) + \lambda \cdot \left(R_{\text{first}}(x_k, y_k) + R_k(q_k \mid \cdots)\right)$$


$$(0, 1, 2, 3) \longmapsto (0, 1, 2, 3): \quad J_k = J_{k-1} + D_k(q_k, s_k) + \lambda \cdot R_k(q_k \mid \cdots)$$

Here $J_{k-1}$ denotes the rate-distortion cost of the source node. The rate term $R_k(q_k \mid \cdots)$ is an estimate of the number of bits required to encode the quantization index $q_k$ given the quantization indexes $q_{k-1}, q_{k-2}, \ldots$ along the path from the start node to the source node. The rate term $R_{\text{first}}(x_k, y_k)$ denotes the number of bits required when the coded block flag changes from 0 to 1 and the current scan position $k$ is transmitted as the position $(x_k, y_k)$ of the first non-zero quantization coefficient. Note that the transition from the "uncoded" state to state 0 or state 2 occurs only if the quantization index $q_k \neq 0$. For each connection, only the quantization index $q_k$ that minimizes the distortion $D_k(q_k, s_k)$ needs to be checked. In addition to the trellis connections shown in Figure 5, the H.266/VVC encoder also needs to check connections with coded subblock flag = 0, which occur only if the target stage $k$ is the last scan index within a subblock.

Once the rate-distortion costs of all nodes of stage $k$ have been estimated, only the connection with the smallest rate-distortion cost $J_k$ is retained for each target node. This process is repeated until $k = N-1$, at which point five surviving paths remain in the trellis. The path with the minimum rate-distortion cost $J_{N-1}$ is the final path, and the quantization indexes $q_0, q_1, \ldots, q_{N-1}$ along this path form the final set of quantization indexes to be encoded. In addition, literature [10] proposed a low-complexity trellis search algorithm to reduce the coding complexity of TCQ.
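A toy Viterbi search over the four-state trellis can illustrate the encoder-side decision. This sketch omits the "uncoded" state and the subblock flags and replaces the context-based bit estimate $R_k(q_k \mid \cdots)$ with a crude stand-in rate model, so it is a simplification, not the H.266/VVC algorithm:

```python
def _best_index(t, delta, s, parity):
    """Quantization index of the given parity minimizing distortion for state s."""
    def dist(q):
        if (q & 1) != parity:
            return float("inf")
        sgn = (q > 0) - (q < 0)
        return (t - (2 * q - (s >> 1) * sgn) * delta) ** 2
    return min(range(-8, 9), key=dist)    # small search window around zero

def tcq_encode(coeffs, delta, lam, rate_bits=lambda q: abs(q) + 1):
    """Toy Viterbi search: J = D + lambda * R accumulated along trellis paths."""
    table = [[0, 2], [2, 0], [1, 3], [3, 1]]
    INF = float("inf")
    cost = [0.0, INF, INF, INF]           # only the initial state 0 is reachable
    paths = [[], [], [], []]
    for t in coeffs:
        new_cost, new_paths = [INF] * 4, [None] * 4
        for s in range(4):
            if cost[s] == INF:
                continue
            for parity in (0, 1):         # each parity class is one branch out of state s
                q = _best_index(t, delta, s, parity)
                sgn = (q > 0) - (q < 0)
                d = (t - (2 * q - (s >> 1) * sgn) * delta) ** 2
                j = cost[s] + d + lam * rate_bits(q)
                ns = table[s][parity]
                if j < new_cost[ns]:      # keep only the cheapest connection per target node
                    new_cost[ns], new_paths[ns] = j, paths[s] + [q]
        cost, paths = new_cost, new_paths
    return paths[min(range(4), key=lambda s: cost[s])]

# Coefficients 6Δ and 3Δ recover the indexes of the earlier decoding example:
print(tcq_encode([6.0, 3.0], 1.0, 0.0))  # [3, 2]
```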

Quantization parameter control

As mentioned above, H.266/VVC supports three quantization technologies: URQ, SDH, and TCQ. These technologies achieve different trade-offs between coding efficiency and coding complexity, so an H.266/VVC encoder can choose the one or the combination best suited to its application scenario. The chosen quantization techniques are signalled in the slice header; note that SDH and TCQ cannot be enabled at the same time. The conditions for using the different quantization and inverse quantization methods are shown in the table below:

In order to support block-based rate control algorithms and perceptually optimized encoding, the quantization parameter (QP) can be adjusted per block. The blocks sharing a QP are called quantization groups (QG), whose size is determined by a syntax element in the picture header. The QP of the luma component is differentially encoded: for each QG containing non-zero coefficients, the difference between the QP actually used in encoding and the predicted QP is transmitted. The QP of a chroma component is obtained from the QP of the corresponding luma block through a lookup table. H.266/VVC has three lookup tables for chroma QP: one for the Cb component, one for the Cr component, and one for the Joint Coding of Chroma Residuals (JCCR) mode. To support a wider range of transfer functions and color formats, the encoder is free to choose appropriate lookup tables; they are defined as piecewise linear functions transmitted in the sequence parameter set. The QP values supported by H.266/VVC range from $-6(B-8)$ to 63, where $B$ denotes the bit depth of the corresponding color component.

H.266/VVC uses Scaling Lists to represent quantization matrices. For regular coding, the default weight coefficient in a Scaling List is 16. For transform skip mode, the quantization process is not affected by Scaling Lists because there is no inverse transform. H.266/VVC defines 28 Scaling Lists, each specifying a weighting matrix of size 2×2, 4×4 or 8×8. Scaling Lists can be transmitted in the high-level syntax structure Adaptation Parameter Set (APS), which allows direct reuse of a previous Scaling List as well as differential encoding of the individual weighting coefficients of a Scaling List. The Scaling List selection is determined by the color component, the prediction mode, and the maximum of the width and height of the transform block. For block sizes other than 2×2, 4×4 and 8×8, the weighting matrix is resampled by nearest-neighbor interpolation.
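The nearest-neighbor resampling of a weighting matrix can be sketched as follows (the function name and the example weights are illustrative):

```python
def resample_scaling_matrix(base, w, h):
    """Nearest-neighbor resampling of a base weighting matrix to a w x h block."""
    bh, bw = len(base), len(base[0])
    return [[base[y * bh // h][x * bw // w] for x in range(w)] for y in range(h)]

# Upsampling a 2x2 base matrix to 4x4 replicates each weight over a 2x2 area:
for row in resample_scaling_matrix([[16, 20], [24, 28]], 4, 4):
    print(row)
# [16, 16, 20, 20]
# [16, 16, 20, 20]
# [24, 24, 28, 28]
# [24, 24, 28, 28]
```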

References

[1] H. Schwarz, M. Coban, M. Karczewicz, T.-D. Chuang, F. Bossen, A. Alshin, J. Lainema, C. R. Helmrich and T. Wiegand, “Quantization and Entropy Coding in the Versatile Video Coding (VVC) Standard,” in IEEE Transactions on Circuits and Systems for Video Technology, doi: 10.1109/TCSVT.2021.3072202.

[2] G.J. Sullivan, “On embedded scalar quantization”, IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2004, vol.4, pp. 605-608, Montreal, Canada, May 2004.

[3] J.-R. Ohm, G.J. Sullivan, H. Schwarz, T.K. Tan, T. Wiegand, “Comparison of the Coding Efficiency of Video Coding Standards — Including High Efficiency Video Coding (HEVC),” IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1669-1684, Dec. 2012.

[4] G. Clare, F. Henry, and J. Jung, Sign Data Hiding, JCTVC-G271, 7th Joint Collaborative Team on Video Coding (JCT-VC) Meeting, Geneva, Switzerland, Nov. 2011.

[5] X. Yu, J. Wang, D. He, G. Martin-Cocher, and S. Campbell, Multiple Sign Bits Hiding, JCTVC-H0481, 8th Joint Collaborative Team on Video Coding (JCT-VC) Meeting, San Jose, CA, Feb. 2012.

[6] M. W. Marcellin and T. R. Fischer, “Trellis Coded Quantization of Memoryless and Gauss-Markov Sources,” IEEE Transactions on Communications, vol. 38, no. 1, pp. 82-93, Jan. 1990.

[7] H. Schwarz, T. Nguyen, D. Marpe and T. Wiegand, “CE7: Transform Coefficient Coding and Dependent Quantization (Tests 7.1.2, 7.2.1),” JVET-K0071, 2018.

[8] H. Schwarz, T. Nguyen, D. Marpe, and T. Wiegand, “Video Coding with Trellis-Coded Quantization,” in 2019 Data Compression Conference (DCC), March 2019, pp. 182-191.

[9] G. D. Forney, Jr., “The Viterbi Algorithm,” Proceedings of the IEEE, vol. 61, no. 3, pp. 268-278, Mar. 1973.

[10] M. Wang, S. Wang, J. Li, L. Zhang, Y. Wang, S. Ma and S. Kwong, “Low Complexity Trellis-Coded Quantization in Versatile Video Coding,” in IEEE Transactions on Image Processing, vol. 30, pp. 2378-2393, 2021, doi: 10.1109/TIP.2021.3051460.