📖 Blog: Node.js Module Studies – Zlib

🐱 GitHub: Github.com/dongyuanxin…

The zlib module in Node.js provides resource compression. For example, gzip, which is commonly used in HTTP transmission, can greatly reduce the amount of data sent over the network and improve speed. This article introduces the zlib module and related topics from the following aspects:

  • File compression/decompression
  • Compression/decompression in HTTP
  • Compression algorithm: RLE
  • Compression algorithm: Huffman tree

File compression/decompression

Take gzip compression as an example. The compression code is as follows:

const zlib = require("zlib");
const fs = require("fs");

const gzip = zlib.createGzip();

const rs = fs.createReadStream("./db.json");
const ws = fs.createWriteStream("./db.json.gz");

// Pipe the source through the gzip transform stream into the destination
rs.pipe(gzip).pipe(ws);

In a test run, a 4.7 MB file was compressed to 575 KB.

Decompress the compressed file as follows:

const zlib = require("zlib");
const fs = require("fs");

const gunzip = zlib.createGunzip();

const rs = fs.createReadStream("./db.json.gz");
const ws = fs.createWriteStream("./db.json");

// Pipe the compressed source through the gunzip transform stream
rs.pipe(gunzip).pipe(ws);

Compression/decompression in HTTP

During transmission between the server and the client, the browser (client) uses the Accept-Encoding request header to tell the server which encodings it accepts, and the server uses the Content-Encoding response header to tell the browser (client) which encoding was actually applied.

The following is an example of server code:

const zlib = require("zlib");
const fs = require("fs");
const http = require("http");

const server = http.createServer((req, res) => {
    const rs = fs.createReadStream("./index.html");
    // Prevent caches from serving a response with the wrong encoding
    res.setHeader("Vary", "Accept-Encoding");
    // Get the encodings supported by the client
    let acceptEncoding = req.headers["accept-encoding"];
    if (!acceptEncoding) {
        acceptEncoding = "";
    }
    // Match a supported compression format
    if (/\bdeflate\b/.test(acceptEncoding)) {
        res.writeHead(200, { "Content-Encoding": "deflate" });
        rs.pipe(zlib.createDeflate()).pipe(res);
    } else if (/\bgzip\b/.test(acceptEncoding)) {
        res.writeHead(200, { "Content-Encoding": "gzip" });
        rs.pipe(zlib.createGzip()).pipe(res);
    } else if (/\bbr\b/.test(acceptEncoding)) {
        res.writeHead(200, { "Content-Encoding": "br" });
        rs.pipe(zlib.createBrotliCompress()).pipe(res);
    } else {
        res.writeHead(200, {});
        rs.pipe(res);
    }
});

server.listen(4000);

The client code simply checks the Content-Encoding field of the response and decompresses accordingly:

const zlib = require("zlib");
const http = require("http");
const fs = require("fs");

const request = http.get({
    host: "localhost",
    path: "/index.html",
    port: 4000,
    headers: { "Accept-Encoding": "br,gzip,deflate" }
});

request.on("response", response => {
    const output = fs.createWriteStream("example.com_index.html");

    switch (response.headers["content-encoding"]) {
        case "br":
            response.pipe(zlib.createBrotliDecompress()).pipe(output);
            break;
        // Alternatively, zlib.createUnzip() can handle both of the following cases:
        case "gzip":
            response.pipe(zlib.createGunzip()).pipe(output);
            break;
        case "deflate":
            response.pipe(zlib.createInflate()).pipe(output);
            break;
    }
});

As you can see from the example above, there are three pairs of compression/decompression APIs:

  • zlib.createDeflate() / zlib.createInflate()
  • zlib.createGzip() / zlib.createGunzip()
  • zlib.createBrotliCompress() / zlib.createBrotliDecompress()

Compression algorithm: RLE

RLE stands for Run-Length Encoding. It works by recording how many times each piece of data repeats consecutively: each run is written as the character followed by its number of occurrences.

For example, the original data is AAAAACCPPPPPPPPERRPPP, a total of 21 bytes. Under the RLE rule, the compressed result is A5C2P8E1R2P3, a total of 12 bytes. The compression ratio is 12/21 ≈ 57.1%.

The advantage of RLE is that compression and decompression are very fast, and the compression ratio is high when the same character occurs many times in a row. However, for data like ABCDE, with no repeated runs, the compressed output (A1B1C1D1E1) is larger than the original.
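The rule above can be sketched in a few lines of JavaScript (rleEncode is a hypothetical helper name, not part of any library):

```javascript
// Minimal RLE sketch: each run becomes "<char><count>"
function rleEncode(str) {
    let out = "";
    let i = 0;
    while (i < str.length) {
        // Advance j to the end of the current run of identical characters
        let j = i;
        while (j < str.length && str[j] === str[i]) j++;
        out += str[i] + (j - i);
        i = j;
    }
    return out;
}

console.log(rleEncode("AAAAACCPPPPPPPPERRPPP")); // "A5C2P8E1R2P3"
console.log(rleEncode("ABCDE")); // "A1B1C1D1E1" — larger than the input
```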

Compression algorithm: Huffman tree

The idea behind the Huffman tree is to represent characters that occur more frequently with shorter codes. Following this principle, take the data ABBCCCDDDD as an example:


Character    Code (binary)
D            0
C            1
B            10
A            11

The original data is 10 bytes. The encoded data is 1110101110000, a total of 13 bits, which takes 2 bytes to store on a computer. The compression ratio is 2/10 = 20%.
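The 13-bit result can be verified by encoding the data with the table above:

```javascript
// Encode ABBCCCDDDD with the (non-prefix-free) code table from the text
const table = { D: "0", C: "1", B: "10", A: "11" };

const encoded = "ABBCCCDDDD".split("").map(ch => table[ch]).join("");
console.log(encoded, encoded.length); // "1110101110000" 13
```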

However, data encoded only according to this principle cannot be restored correctly. For example, 1110 can be interpreted as:

  • 11 + 10
  • 1 + 1 + 1 + 0
  • 1 + 1 + 10
  • …

Huffman trees are cleverly constructed so that the encoded data can be restored correctly. The tree is built bottom-up: repeatedly take the two nodes with the lowest frequency, merge them under a new parent node whose weight is the sum of theirs, and put the parent back; repeat until a single root remains. Each character's code is the path from the root to its leaf (left branch = 0, right branch = 1). Because every character sits at a leaf, no code is a prefix of another, so the bit stream can be decoded unambiguously.
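A minimal sketch of this construction (huffmanCodes is a hypothetical helper, not a library API). Note that, unlike the table above, a real Huffman code is prefix-free, so for ABBCCCDDDD it needs 19 bits rather than 13, but it can always be decoded unambiguously:

```javascript
// Build Huffman codes by repeatedly merging the two lowest-weight nodes
function huffmanCodes(str) {
    const freq = {};
    for (const ch of str) freq[ch] = (freq[ch] || 0) + 1;

    // Leaf nodes: { char, weight }; internal nodes: { weight, left, right }
    const nodes = Object.entries(freq).map(([char, weight]) => ({ char, weight }));
    while (nodes.length > 1) {
        nodes.sort((a, b) => a.weight - b.weight);
        const [a, b] = nodes.splice(0, 2);
        nodes.push({ weight: a.weight + b.weight, left: a, right: b });
    }

    // Read each character's code off the root-to-leaf path (left = 0, right = 1)
    const codes = {};
    (function walk(node, prefix) {
        if (node.char !== undefined) { codes[node.char] = prefix || "0"; return; }
        walk(node.left, prefix + "0");
        walk(node.right, prefix + "1");
    })(nodes[0], "");
    return codes;
}

const codes = huffmanCodes("ABBCCCDDDD");
console.log(codes); // D (most frequent) gets a 1-bit code, A a 3-bit code
```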

Any type of data (text files, image files, executable files) can be compressed with a Huffman tree.

References

  • Node.js documentation
  • HTTP Vary in 30 minutes
  • Core knowledge programmers must know

👇 Scan the QR code to follow "Xin Tan Blog" and check out the "front-end atlas" & "algorithm solutions". Keep sharing, grow together 👇