1. Introduction to JPEG

JPEG (Joint Photographic Experts Group) is a digital image compression standard (ISO/IEC 10918), published in 1992. JPEG is a lossy image compression technique whose core algorithm is the discrete cosine transform (DCT).

2. JPEG compression technology

Understanding how JPEG encoding works requires some image-processing background, so it is strongly recommended to first read the article on images and filtering.

The JPEG encoding process is shown below:

2.1 Block splitting

The JPEG standard first divides the image into blocks of 8×8 pixels. The DCT, quantization, and entropy coding all operate on a single block, and the encoded output is the compressed data of these blocks. On decoding, the compressed data is restored into pixel data block by block, the blocks are pieced together into a complete picture, and the picture is sent to the graphics card for display.
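The block splitting described above can be sketched as follows. This is a minimal illustration, not the project's code: the class name `BlockSplitter`, the row-major `int[]` grayscale layout, and the requirement that width and height be multiples of 8 are all assumptions made for the example (a real encoder pads the edges instead).

```java
class BlockSplitter {
    static final int BLOCK = 8;

    // Split a width x height grayscale image (row-major) into 8x8 blocks,
    // ordered left to right, top to bottom.
    // Assumption: width and height are multiples of 8.
    static int[][][] split(int[] pixels, int width, int height) {
        if (width % BLOCK != 0 || height % BLOCK != 0) {
            throw new IllegalArgumentException("width and height must be multiples of 8");
        }
        int cols = width / BLOCK;
        int rows = height / BLOCK;
        int[][][] blocks = new int[rows * cols][BLOCK][BLOCK];
        for (int by = 0; by < rows; by++) {
            for (int bx = 0; bx < cols; bx++) {
                for (int y = 0; y < BLOCK; y++) {
                    for (int x = 0; x < BLOCK; x++) {
                        int srcIndex = (by * BLOCK + y) * width + bx * BLOCK + x;
                        blocks[by * cols + bx][y][x] = pixels[srcIndex];
                    }
                }
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        int[] pixels = new int[16 * 8]; // a 16x8 image holds two blocks side by side
        for (int i = 0; i < pixels.length; i++) pixels[i] = i;
        int[][][] blocks = split(pixels, 16, 8);
        System.out.println("blocks: " + blocks.length);
    }
}
```

Each block can then be passed through the DCT, quantization, and entropy coding independently.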

2.2 DCT

The DCT transforms a time-domain signal (for an image, a spatial-domain signal) into a frequency-domain signal. After the transform, the low-frequency components are concentrated in the upper-left corner of the matrix and the high-frequency components in the lower-right corner. The DCT is a variant of the Fourier transform. Unlike the Fourier transform, it uses only real numbers and the algorithm is simple, so it is widely used in image compression. JPEG uses the two-dimensional DCT.

The human eye is more sensitive to low-frequency information in a picture (where the color changes gradually, such as the overall tone and the outline of objects) and less sensitive to high-frequency information (where the color changes sharply, such as object edges or small spots on a face). We can therefore use the DCT to separate the high-frequency and low-frequency parts of a picture and then compress the high-frequency part, which is how the picture is compressed.

In the formula of the two-dimensional DCT:

  • F(u, v) is the frequency coefficient at coordinates (u, v) after the DCT
  • C(u) and C(v) can be regarded as compensation coefficients, which make the DCT transformation matrix an orthogonal matrix
  • f(i, j) is the pixel data at coordinates (i, j)
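For reference, the two-dimensional DCT of an N×N block can be written as below. This is a reconstruction rather than the article's original figure: it uses the orthonormal form, chosen because its coefficients match the `getDCTMatrix` code in section 3.1 (a factor of sqrt(1/N) for index 0 and sqrt(2/N) otherwise).

```latex
F(u, v) = C(u)\, C(v) \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} f(i, j)\,
          \cos\!\left[\frac{(2i+1)\, u\, \pi}{2N}\right]
          \cos\!\left[\frac{(2j+1)\, v\, \pi}{2N}\right],
\qquad
C(u) = \begin{cases} \sqrt{1/N}, & u = 0 \\ \sqrt{2/N}, & u \neq 0 \end{cases}
```

With N = 8 this is the transform applied to each JPEG block.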


2.3 Quantization

Below we use the JPEG quantization table for 50% quality to quantize the frequency data:

  • B(i, j) = round(G(i, j) / Q(i, j))

For example: B(0, 0) = round(−415.38 / 16) = −26 and B(0, 1) = round(30.19 / 11) = 3

Quantization simply divides each frequency coefficient by its quantization step and rounds the result, so any precision finer than the step is lost. Notice that the values in the upper-left corner of the table are small and those in the lower-right corner are large, so the effect of this quantization table is to filter out high-frequency information. The final quantized result is as follows:
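The rounding rule B(i, j) = round(G(i, j) / Q(i, j)) can be tried out with a small sketch. It assumes that the table meant here is the standard luminance table from Annex K of the JPEG specification (its first two entries, 16 and 11, match the divisors in the example above). The class and method names are illustrative only.

```java
class Quantizer {
    // Assumed: standard JPEG luminance quantization table (quality 50).
    static final int[][] Q50 = {
        {16, 11, 10, 16,  24,  40,  51,  61},
        {12, 12, 14, 19,  26,  58,  60,  55},
        {14, 13, 16, 24,  40,  57,  69,  56},
        {14, 17, 22, 29,  51,  87,  80,  62},
        {18, 22, 37, 56,  68, 109, 103,  77},
        {24, 35, 55, 64,  81, 104, 113,  92},
        {49, 64, 78, 87, 103, 121, 120, 101},
        {72, 92, 95, 98, 112, 100, 103,  99}
    };

    // B(i, j) = round(G(i, j) / Q(i, j))
    static int[][] quantize(double[][] g) {
        int[][] b = new int[8][8];
        for (int i = 0; i < 8; i++) {
            for (int j = 0; j < 8; j++) {
                b[i][j] = (int) Math.round(g[i][j] / Q50[i][j]);
            }
        }
        return b;
    }

    // Inverse step used by the decoder: multiply by the step size again.
    static double[][] dequantize(int[][] b) {
        double[][] g = new double[8][8];
        for (int i = 0; i < 8; i++) {
            for (int j = 0; j < 8; j++) {
                g[i][j] = (double) b[i][j] * Q50[i][j];
            }
        }
        return g;
    }

    public static void main(String[] args) {
        double[][] g = new double[8][8];
        g[0][0] = -415.38;
        g[0][1] = 30.19;
        int[][] b = quantize(g);
        System.out.println(b[0][0] + " " + b[0][1]);
    }
}
```

Dequantizing −26 gives −416 rather than the original −415.38. That difference is exactly the precision lost within one quantization step.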

2.4 Entropy coding

Entropy coding is a lossless form of coding: following the entropy principle, no information is lost during encoding. Common entropy codes include Shannon coding, Huffman coding, and arithmetic coding. JPEG first arranges the quantized data of each block in a “zigzag” order, as shown below, so that as many zero coefficients as possible are stored next to each other. It then applies RLE (run-length encoding): a run of N consecutive zeros can be represented by a zero and the length N, which compresses well. Finally, the resulting data is Huffman coded.
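The zigzag scan and the run-length idea can be sketched as follows. This is a simplified illustration: the run-length format (a 0 followed by the run length N) follows the description above and is not the exact run/size symbol scheme of the JPEG standard, and the Huffman stage is left out. The class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

class ZigzagRle {
    // Read an n x n block in zigzag order: walk the anti-diagonals
    // (row + column = s), alternating direction on each diagonal.
    static int[] zigzag(int[][] block) {
        int n = block.length;
        int[] out = new int[n * n];
        int k = 0;
        for (int s = 0; s <= 2 * (n - 1); s++) {
            int lo = Math.max(0, s - (n - 1)); // smallest valid row on this diagonal
            int hi = Math.min(s, n - 1);       // largest valid row on this diagonal
            if (s % 2 == 0) {
                // even diagonals are read upwards (row index decreasing)
                for (int i = hi; i >= lo; i--) out[k++] = block[i][s - i];
            } else {
                // odd diagonals are read downwards (row index increasing)
                for (int i = lo; i <= hi; i++) out[k++] = block[i][s - i];
            }
        }
        return out;
    }

    // Simplified RLE: a run of N zeros becomes the pair (0, N);
    // non-zero values are copied unchanged.
    static List<Integer> runLength(int[] data) {
        List<Integer> out = new ArrayList<>();
        int i = 0;
        while (i < data.length) {
            if (data[i] != 0) {
                out.add(data[i]);
                i++;
            } else {
                int run = 0;
                while (i < data.length && data[i] == 0) {
                    run++;
                    i++;
                }
                out.add(0);
                out.add(run);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] block = new int[8][8];
        block[0][0] = -26;
        block[0][1] = 3;
        System.out.println(runLength(zigzag(block)));
    }
}
```

Because quantization zeroes most high-frequency coefficients, the tail of the zigzag sequence is usually one long run of zeros, which is what makes this step effective.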

These are the key steps of JPEG encoding. Decoding is essentially the reverse of the steps above, so it is not covered in detail. Only the IDCT is introduced here.

2.5 IDCT

IDCT stands for inverse DCT. It reverses the DCT and converts the frequency data of the image back into pixel data. In its formula:

  • f(i, j) is the pixel data at coordinates (i, j)
  • F(u, v) is the frequency coefficient at coordinates (u, v)
  • C(u) and C(v) can be regarded as compensation coefficients, which make the DCT transformation matrix an orthogonal matrix
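For reference, the inverse transform of an N×N block can be written as below. As with the forward transform, this is a reconstruction in the orthonormal form that matches the coefficients used in the code of section 3, not the article's original figure.

```latex
f(i, j) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} C(u)\, C(v)\, F(u, v)\,
          \cos\!\left[\frac{(2i+1)\, u\, \pi}{2N}\right]
          \cos\!\left[\frac{(2j+1)\, v\, \pi}{2N}\right],
\qquad
C(u) = \begin{cases} \sqrt{1/N}, & u = 0 \\ \sqrt{2/N}, & u \neq 0 \end{cases}
```

The cosine terms are the same as in the forward DCT. Only the roles of the sums change: here the sum runs over the frequencies (u, v) instead of the pixel positions (i, j).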

In the example above, we first dequantize, that is, multiply each value by its quantization step:

Then we substitute the frequency data in the figure above into the IDCT formula and obtain the restored pixel data:

3. Android example

Let’s look at the effect of the DCT through an Android example. To make it easier to follow, I use only the Y component of a YUV picture to demonstrate the effect. The original picture is as follows:

The two-dimensional DCT is actually a matrix transformation. The formula above is its summation form, which is relatively inefficient, so in development matrix operations are generally used directly: the frequency-domain matrix Y is obtained by multiplying the DCT coefficient matrix by X and then by the transpose of the DCT coefficient matrix.

Here X is the YUV pixel matrix and Y is the frequency-domain signal matrix.
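Written out, the matrix form looks like this. The symbol A for the DCT coefficient matrix is introduced here for illustration. Its entries are taken from the `getDCTMatrix` code in section 3.1.

```latex
Y = A\, X\, A^{\mathsf{T}},
\qquad
A_{ij} = C(i)\, \cos\!\left[\frac{(2j+1)\, i\, \pi}{2N}\right],
\qquad
C(i) = \begin{cases} \sqrt{1/N}, & i = 0 \\ \sqrt{2/N}, & i \neq 0 \end{cases}
```

Because A is orthogonal, so that A^T A = I, the inverse transform follows directly as X = A^T Y A.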

3.1 DCT transform

Apache’s Commons Math3 library (commons-math3) is used for the matrix calculations.

    import org.apache.commons.math3.linear.Array2DRowRealMatrix;
    import org.apache.commons.math3.linear.RealMatrix;

    // DCT transform
    public static RealMatrix dct2(byte[] yuv, int N) {
        // Copy the N*N Y component
        byte[] y = new byte[N * N];
        System.arraycopy(yuv, 0, y, 0, y.length);
        // Convert the one-dimensional array to a two-dimensional array
        // (toMatrixData appears to be the project's own helper; Commons Math's MatrixUtils has no such method)
        double[][] matrixData = MatrixUtils.toMatrixData(y, N);
        // Construct the YUV matrix
        RealMatrix signalMatrix = new Array2DRowRealMatrix(matrixData);
        // Get the DCT coefficient matrix
        RealMatrix dctMatrix = getDCTMatrix(N);
        // Frequency domain matrix = DCT coefficient matrix * YUV matrix * DCT coefficient transpose matrix
        RealMatrix frequencyMatrix = dctMatrix.multiply(signalMatrix).multiply(dctMatrix.transpose());
        return frequencyMatrix;
    }

    // Get the DCT coefficient matrix
    private static RealMatrix getDCTMatrix(int N) {
        double[][] matrixData = new double[N][N];
        for (int i = 0; i < N; i++) {
            // Row 0 is scaled by sqrt(1/N), every other row by sqrt(2/N), which makes the matrix orthogonal
            double factor = i == 0 ? Math.sqrt(1d / N) : Math.sqrt(2d / N);
            for (int j = 0; j < N; j++) {
                matrixData[i][j] = factor * Math.cos((2 * j + 1) * i * Math.PI * 0.5d / N);
            }
        }
        return new Array2DRowRealMatrix(matrixData);
    }

After transformation, the effect is as follows:

3.2 IDCT transform

The inverse DCT is also a matrix product, this time with the transposed coefficient matrix on the left and the coefficient matrix on the right:

    // DCT inverse transform
    public static double[][] idct2(RealMatrix frequencyMatrix, int N) {
        // Get the DCT coefficient matrix
        RealMatrix dctMatrix = getDCTMatrix(N);
        // YUV matrix = DCT coefficient transpose matrix * frequency domain matrix * DCT coefficient matrix
        RealMatrix signalMatrix = dctMatrix.transpose().multiply(frequencyMatrix).multiply(dctMatrix);
        // Return the restored pixel data as a two-dimensional array
        return signalMatrix.getData();
    }

The inverse transform converts the frequency-domain signal above back into the YUV signal, and the result looks the same as the original picture (the DCT itself is lossless, ignoring floating-point rounding errors in the computation).
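The lossless round trip can be checked without Android or Commons Math. The sketch below uses plain arrays and the same coefficient definition as `getDCTMatrix`. The class name, the helper methods, and the sample data are assumptions made for the example.

```java
class DctRoundTrip {
    // n x n DCT coefficient matrix A, defined like getDCTMatrix
    static double[][] dctMatrix(int n) {
        double[][] a = new double[n][n];
        for (int i = 0; i < n; i++) {
            double factor = (i == 0) ? Math.sqrt(1.0 / n) : Math.sqrt(2.0 / n);
            for (int j = 0; j < n; j++) {
                a[i][j] = factor * Math.cos((2 * j + 1) * i * Math.PI / (2.0 * n));
            }
        }
        return a;
    }

    // Plain matrix product a * b
    static double[][] multiply(double[][] a, double[][] b) {
        int rows = a.length, inner = b.length, cols = b[0].length;
        double[][] c = new double[rows][cols];
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
                for (int t = 0; t < inner; t++)
                    c[i][j] += a[i][t] * b[t][j];
        return c;
    }

    static double[][] transpose(double[][] a) {
        double[][] t = new double[a[0].length][a.length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[0].length; j++)
                t[j][i] = a[i][j];
        return t;
    }

    // Forward transform of a square matrix: Y = A * X * A^T
    static double[][] dct2(double[][] x) {
        double[][] a = dctMatrix(x.length);
        return multiply(multiply(a, x), transpose(a));
    }

    // Inverse transform of a square matrix: X = A^T * Y * A
    static double[][] idct2(double[][] y) {
        double[][] a = dctMatrix(y.length);
        return multiply(multiply(transpose(a), y), a);
    }

    static double maxAbsDiff(double[][] p, double[][] q) {
        double max = 0;
        for (int i = 0; i < p.length; i++)
            for (int j = 0; j < p[0].length; j++)
                max = Math.max(max, Math.abs(p[i][j] - q[i][j]));
        return max;
    }

    public static void main(String[] args) {
        int n = 8;
        double[][] x = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                x[i][j] = (i * 31 + j * 17) % 256; // arbitrary sample "pixels"
        double[][] restored = idct2(dct2(x));
        System.out.println("max round-trip error: " + maxAbsDiff(x, restored));
    }
}
```

The printed error should be on the order of floating-point rounding noise, which is the sense in which the DCT itself loses nothing. The loss in JPEG comes from quantization.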

3.3 Block DCT transform

Divide the picture into 8×8 blocks and apply the DCT to each block separately. The code is not shown here. The result is as follows:
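Since the project's code for this step is not shown, here is one way the block-wise transform might look. It is a sketch under stated assumptions, not the author's implementation: plain arrays instead of Commons Math, image dimensions that are multiples of 8, and a hypothetical class name `BlockDct`.

```java
class BlockDct {
    static final int N = 8;
    // 8x8 DCT coefficient matrix, same definition as getDCTMatrix
    static final double[][] A = new double[N][N];

    static {
        for (int i = 0; i < N; i++) {
            double factor = (i == 0) ? Math.sqrt(1.0 / N) : Math.sqrt(2.0 / N);
            for (int j = 0; j < N; j++) {
                A[i][j] = factor * Math.cos((2 * j + 1) * i * Math.PI / (2.0 * N));
            }
        }
    }

    // Apply the 8x8 DCT to every block and return a coefficient plane of the
    // same size, with each block's coefficients stored in that block's position.
    // Assumption: image height and width are multiples of 8.
    static double[][] blockDct(double[][] image) {
        int h = image.length, w = image[0].length;
        double[][] out = new double[h][w];
        for (int by = 0; by + N <= h; by += N) {
            for (int bx = 0; bx + N <= w; bx += N) {
                for (int u = 0; u < N; u++) {
                    for (int v = 0; v < N; v++) {
                        // F(u, v) = sum over (i, j) of A[u][i] * f(i, j) * A[v][j]
                        double sum = 0;
                        for (int i = 0; i < N; i++)
                            for (int j = 0; j < N; j++)
                                sum += A[u][i] * image[by + i][bx + j] * A[v][j];
                        out[by + u][bx + v] = sum;
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] image = new double[16][16];
        for (double[] row : image) java.util.Arrays.fill(row, 10.0);
        double[][] coeffs = blockDct(image);
        System.out.println("DC of first block: " + coeffs[0][0]);
    }
}
```

For a flat block, all the energy ends up in the top-left (DC) coefficient of that block and the other 63 coefficients are zero up to rounding. That is why block-transformed pictures show a bright dot in the corner of every 8×8 tile.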

3.4 Quantization

The JPEG quantization table for 50% quality is used here again. The result looks like this:

3.5 Block inverse DCT transform

The quantized 8×8 blocks are transformed back by the IDCT and assembled into a complete picture:

Project address: Gitee: gitee.com/huaisu2020/… GitHub: github.com/xh2009cn/An…

References: www.ruanyifeng.com/blog/2017/1… en.wikipedia.org/wiki/JPEG