As mentioned above, images and videos both exhibit strong spatial correlation: the value of a pixel is usually very close to the values of its neighboring pixels. Predictive coding exploits this correlation to remove spatial redundancy and compress the image or video.

The intra-frame prediction process is roughly as follows:

1. Reference pixel preparation

Because of this spatial correlation, the encoded pixels closest to the current PU are chosen as its reference pixels. In HEVC these are the row above the current PU and the column to its left. More precisely, for an N×N PU, N pixels are taken from above, N from the above-right, N from the left, N from the below-left, and 1 from the above-left corner, for a total of 4N+1 reference pixels.

However, some reference pixels may be unavailable, for example at image, tile, or slice boundaries; in other cases the A and E regions have not yet been encoded, so their pixels are unavailable. These positions must be filled with substitute pixels.

If area A has no pixels, it is filled with the bottom pixel of area B; if area E has no pixels, it is filled with the rightmost pixel of area D, as shown in the figure above.

If no reference pixels are available at all, every reference pixel is set to R = 1 << (BitDepth − 1), i.e. 128 for 8-bit and 512 for 10-bit pixels.
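As a runnable sketch of this substitution (the scan order, list layout, and function name are my own simplification of the standard's procedure):

```python
# Sketch of HEVC reference-sample substitution (simplified).
# refs: list of 4N+1 reference pixel values, scanned from below-left up
# through the corner and across the top; avail: parallel availability flags.

def substitute_refs(refs, avail, bit_depth=8):
    n = len(refs)
    if not any(avail):
        # No neighbor available: use the mid-gray default 1 << (BitDepth - 1).
        return [1 << (bit_depth - 1)] * n
    out = list(refs)
    # Propagate the first available sample backwards to the start.
    first = next(i for i in range(n) if avail[i])
    for i in range(first):
        out[i] = out[first]
    # Scan forward: each unavailable sample copies its predecessor.
    for i in range(first + 1, n):
        if not avail[i]:
            out[i] = out[i - 1]
    return out
```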

Now all the reference pixels are ready. Next they are filtered to reduce the impact of noise; because the filtering method depends closely on the PU size and prediction mode, it is described in section 3.

2. Intra-frame prediction

HEVC provides 35 intra-frame prediction modes: DC mode, Planar mode, and 33 angular modes. The mode numbers and corresponding names are as follows:

Mode 0 is Planar, mode 1 is DC, and modes 2 to 34 are the 33 angular modes.

Among the angular modes, 2–17 form the horizontal class and 18–34 the vertical class; mode 10 is the pure horizontal direction and mode 26 the pure vertical direction. The angular modes are distributed more densely near the horizontal and vertical directions, because edges in natural images tend to lie close to horizontal or vertical, so finer angular resolution in those directions yields more accurate prediction.

So how are predicted values computed from the reference pixels in each of these modes?

(1) Planar mode

For a pixel P at position (x,y), Planar mode computes its predicted value from four reference pixels: the one directly above, the one directly to the left, the first pixel above-right of the current PU, and the first pixel below-left of the current PU. The calculation method is as follows:

Applying this method to each pixel in the PU yields the Planar-mode predicted value for every pixel.
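The computation can be sketched as follows; the function name and argument layout are illustrative, while the arithmetic is the standard Planar interpolation (the average of a horizontal and a vertical linear interpolation):

```python
# Sketch of HEVC Planar prediction for an N x N PU (N a power of two).
# top:  N+1 samples above the PU; top[N] is the first above-right sample.
# left: N+1 samples left of the PU; left[N] is the first below-left sample.

def planar_predict(top, left, n):
    shift = n.bit_length()  # equals log2(n) + 1 for a power of two
    pred = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            hor = (n - 1 - x) * left[y] + (x + 1) * top[n]   # horizontal interpolation
            ver = (n - 1 - y) * top[x] + (y + 1) * left[n]   # vertical interpolation
            pred[y][x] = (hor + ver + n) >> shift
    return pred
```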

(2) DC mode

DC mode first computes the average dcValue of the reference pixels directly above and directly to the left of the current PU (excluding the above-left, above-right, and below-left pixels).

If the current PU is a chroma PU or a luma PU larger than 16×16, every pixel in the PU takes dcValue as its predicted value. Otherwise (a luma PU of size 16×16 or smaller), the predicted values are calculated as follows.
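A sketch of DC prediction including the boundary smoothing for small luma PUs (the function name and array layout are my own; the boundary blend weights follow the standard):

```python
# Sketch of HEVC DC prediction. top/left each hold the N reference samples
# directly above / directly left of the PU. For a luma PU of size <= 16x16
# the first row and column are blended with the reference samples.

def dc_predict(top, left, n, smooth_edges=True):
    # dcValue = (sum of 2N references + N) >> (log2(N) + 1)
    dc = (sum(top) + sum(left) + n) >> n.bit_length()
    pred = [[dc] * n for _ in range(n)]
    if smooth_edges:
        # Corner uses a [1,2,1]/4 blend; the rest of the first row/column
        # use a [1,3]/4 blend of reference sample and dcValue.
        pred[0][0] = (left[0] + 2 * dc + top[0] + 2) >> 2
        for x in range(1, n):
            pred[0][x] = (top[x] + 3 * dc + 2) >> 2
        for y in range(1, n):
            pred[y][0] = (left[y] + 3 * dc + 2) >> 2
    return pred
```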

(3) Angular modes

For an angular mode, the prediction direction may involve both the above and the left reference pixels. It is more convenient to project all of them onto a single row or column; for example, for a vertical-class angular mode, the left pixels are projected onto the top row.

Because each angular mode points in a different direction, each needs different reference pixels for the projection. Every angular mode has an offset value relative to the pure vertical direction (mode 26) or the pure horizontal direction (mode 10), as shown below.

Taking a vertical-class mode M with offset offset[M] as an example, its one-dimensional reference pixel list Ref[] is constructed as follows.

① offset[M] < 0, i.e. M is 18 to 25

② offset[M] ≥ 0, i.e. M is 26 to 34

The one-dimensional reference list for horizontal-class modes is constructed similarly. Below is an example of the pixel projection for a 4×4 PU.

Through the above steps, the one-dimensional reference pixel list Ref[] is obtained.

For a pixel at (x,y), its corresponding reference position pos in Ref[] is:

pos = (y*offset[M])>>5

The weight factor w corresponding to pixel (x,y) is:

w = (y*offset[M]) & 31, where & denotes the bitwise AND operation

The predicted value of pixel (x,y) is then:
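Putting the projection and interpolation together, vertical-class angular prediction can be sketched as follows. The function assumes a non-negative offset (so no left-pixel projection is needed) and a 1-based row index, i.e. (y+1) in place of the y above; names and the Ref[] layout are illustrative:

```python
# Sketch of vertical-class angular prediction for an N x N PU.
# ref: one-dimensional list Ref[]; ref[0] is the above-left corner sample,
# ref[1..] the top-row samples. Needs at least 2*N + 2 samples for
# offsets up to 32.

def angular_predict_vertical(ref, n, offset):
    pred = [[0] * n for _ in range(n)]
    for y in range(n):
        pos = ((y + 1) * offset) >> 5   # integer step into Ref[]
        w = ((y + 1) * offset) & 31     # fractional weight in 1/32 units
        for x in range(n):
            # Linear interpolation between two neighboring reference samples.
            pred[y][x] = ((32 - w) * ref[x + pos + 1]
                          + w * ref[x + pos + 2] + 16) >> 5
    return pred
```

With offset = 0 this reduces to pure vertical prediction (mode 26): every row simply copies the top reference row.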

3. Reference pixel filtering

In the prediction above, the adjacent pixels to the left or above are used directly as reference pixels. To reduce noise and improve prediction accuracy, however, the reference pixels are smoothed for certain prediction modes, usually with a 3-tap low-pass filter. The details are as follows:

  1. No smoothing is applied to 4×4 PUs. For all other PU sizes, DC mode never uses smoothing, while Planar mode always does.

  2. For the angular modes:

    • For an 8×8 PU, conventional smoothing is applied only for modes 2, 18, and 34.
    • For a 16×16 PU, all angular modes except 9, 10, 11, 25, 26, and 27 (i.e. the other 27 angular modes) use conventional smoothing.
    • For a 32×32 PU, all angular modes except 10 and 26 (i.e. the other 31 angular modes) use conventional smoothing or strong filtering.

Conventional filtering: the tap coefficients are [1, 2, 1]/4.

Strong filtering: applied only to 32×32 PUs.

The calculation method of strong filtering is as follows:
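Both filters can be sketched as follows (function names are illustrative; the strong filter replaces the top reference samples with a straight line between the corner sample and the last above-right sample, and the left side is handled symmetrically):

```python
# Sketches of the two HEVC reference-sample filters.

def smooth_121(ref):
    """Conventional [1,2,1]/4 low-pass filter; end samples stay unfiltered."""
    out = list(ref)
    for i in range(1, len(ref) - 1):
        out[i] = (ref[i - 1] + 2 * ref[i] + ref[i + 1] + 2) >> 2
    return out

def strong_filter_top(corner, top_right, length=65):
    """Strong (bilinear) filtering for a 32x32 PU: the 65 top samples are
    replaced by linear interpolation between the above-left corner sample
    and the last above-right sample."""
    return [((length - 1 - i) * corner + i * top_right + 32) >> 6
            for i in range(length)]
```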

4. Smoothing PU boundary values

To remove discontinuities at the PU boundary, for PUs smaller than 32×32 the first row and first column of the prediction are filtered when mode 1 (DC), 10 (horizontal), or 26 (vertical) is used.
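For example, the vertical-mode (26) case can be sketched as follows: the first column is adjusted by half the gradient of the left reference column, then clipped to the sample range (names are illustrative; the horizontal and DC cases are analogous):

```python
# Sketch of boundary smoothing for vertical mode (26) on a small luma PU.
# pred: N x N prediction, left: N left reference samples,
# corner: the above-left corner reference sample.

def smooth_vertical_boundary(pred, left, corner, bit_depth=8):
    max_val = (1 << bit_depth) - 1
    for y in range(len(pred)):
        # Add half the vertical gradient of the left reference column.
        v = pred[y][0] + ((left[y] - corner) >> 1)
        pred[y][0] = min(max(v, 0), max_val)  # clip to valid sample range
    return pred
```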

5. Intra-frame mode coding

After a prediction mode is selected, it must be encoded and sent to the decoder. Since there are 35 intra-frame prediction modes, a fixed-length code would need 6 bits. Instead, HEVC defines three Most Probable Modes (MPM[0], MPM[1], and MPM[2]) for the current PU. If the current mode is one of the MPMs, only its index needs to be encoded; otherwise, the remaining 32 modes can be encoded with a fixed 5 bits.

Constructing the MPM list requires the mode information of the adjacent encoded PUs to the left (ModeA) and above (ModeB) of the current PU.

The MPM construction process is as follows:

(1) ModeA and ModeB are the same

① ModeA and ModeB are both DC mode or both Planar mode

MPM[0] = Planar

MPM[1] = DC

MPM[2] = 26

② ModeA and ModeB are both angular modes

MPM[0] = ModeA

MPM[1] and MPM[2] are the two angular modes adjacent to ModeA

(2) ModeA and ModeB are different

MPM[0] = ModeA

MPM[1] = ModeB

① If neither ModeA nor ModeB is Planar, MPM[2] = Planar

② Otherwise, if neither ModeA nor ModeB is DC, MPM[2] = DC

③ When neither ① nor ② is satisfied, MPM[2] = 26
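The whole MPM construction can be sketched as follows (constants and the function name are my own; the wrap-around expressions for the neighboring angular modes match the HEVC design, which keeps the neighbors within the angular range 2–34):

```python
# Sketch of the HEVC three-MPM construction.
# Mode numbering: Planar = 0, DC = 1, angular = 2..34, vertical = 26.

PLANAR, DC, VER = 0, 1, 26

def build_mpm(mode_a, mode_b):
    if mode_a == mode_b:
        if mode_a < 2:                          # both Planar or both DC
            return [PLANAR, DC, VER]
        # Both angular: ModeA plus its two neighboring angular modes.
        return [mode_a,
                2 + ((mode_a + 29) % 32),       # ModeA - 1, wrapping around
                2 + ((mode_a - 2 + 1) % 32)]    # ModeA + 1, wrapping around
    # Different modes: take both, then pick the third by priority.
    mpm2 = (PLANAR if PLANAR not in (mode_a, mode_b)
            else DC if DC not in (mode_a, mode_b)
            else VER)
    return [mode_a, mode_b, mpm2]
```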

 

If you are interested, please follow the WeChat public account (WeChat ID: Video Coding).