Most people have abilities and opportunities they don’t understand and may do things they don’t dream of. — Dale Carnegie

Developers moving from the front end to Node.js are often unfamiliar with this topic, because simple string operations already cover most front-end business needs, and terms such as Buffer and Stream can feel mysterious. On the server side, if you want to be more than an average Node.js developer, you should study Buffer in depth, demystify it, and take your understanding of Node.js to the next level.

About the author: May Jun, Node.js developer and technology enthusiast born in the 1990s who enjoys sharing. Public account: “Nodejs technology Stack”. GitHub open source project: www.nodejs.red

A first look at Buffer

Before TypedArray was introduced, the JavaScript language had no mechanism for reading or manipulating binary data streams. The Buffer class was introduced as part of the Node.js API to interact with octet streams in TCP streams, file system operations, and other contexts. This lets Node.js process and interact with binary data streams.

As part of the Node.js API, Buffer is used to read or manipulate binary data streams in scenarios that involve large amounts of binary data, such as network protocols, databases, images, and file I/O. The size of a Buffer is fixed at creation time and cannot be changed. Buffer memory is allocated at the C++ level of Node.js rather than on the V8 heap.

This may all sound simple so far. But what about the keywords mentioned above: binary, Stream, and Buffer? Let’s briefly introduce each of them.

What is binary data?

When we think of binary, we might think of sequences of digits such as 010101.

Binary data is data represented with two digits, 0 and 1. To store or display data, a computer first converts it into binary. For example, to store the number 66, the computer first converts it to the binary value 01000010. I first came across this in a C language course at university. The conversion works as follows, with each column holding a power of two:

128 64 32 16 8 4 2 1
0 1 0 0 0 0 1 0

Numbers are just one kind of data; others include strings, images, and files. For example, to convert the English letter M to binary, you can call 'M'.charCodeAt() in JavaScript to get its ASCII code, 77, and then convert 77 to its binary representation, 01001101.
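As a minimal sketch of the two conversions described above, the snippet below renders a number and a character code as 8-digit binary strings. The helper name toBinary8 is made up for illustration and is not part of any Node.js API.

```javascript
// Hypothetical helper: format a byte-sized number as an 8-digit binary string.
function toBinary8(n) {
  return n.toString(2).padStart(8, '0');
}

const sixtySix = toBinary8(66);     // 64 + 2
const codeOfM = 'M'.charCodeAt(0);  // ASCII code of 'M'
const binaryOfM = toBinary8(codeOfM);

console.log(sixtySix);  // 01000010
console.log(codeOfM);   // 77
console.log(binaryOfM); // 01001101
```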

What is Stream?

A stream is an abstraction of an input/output device; the device can be a file, the network, memory, and so on.

Streams are directional. When a program reads data from a data source, such as a file or the network, it opens an input stream. Conversely, when a program needs to write data to a destination (a file, the network, etc.), it opens an output stream. For operations on large files, we need a Stream that works like a pipe and sends the data through piece by piece.

An example

Suppose we have a large tank of water for watering a vegetable field. If we tried to pour all the water onto the field at once, think how much strength it would take just to move the tank (strength here corresponds to the hardware capacity of the computer). If we instead run a water pipe and let the water flow into the field little by little, the job takes far less effort.

The explanation above should make it clearer what a Stream is. So what is the relationship between Stream and Buffer? Read on to find out. There is much more to say about Stream itself, which will be covered separately on the public account “Nodejs technology Stack”.

What is a Buffer?

With Stream, we have seen that data flows from one end to the other. So how does it flow?

Typically, data is moved so that it can be processed or read, and so that decisions can be made based on it. Any process can only handle some minimum or maximum amount of data in a given period of time. If data arrives faster than the process can consume it, the data that arrives early has to wait in a waiting area until it can be processed. Conversely, if data arrives more slowly than the process consumes it, the data that arrived earlier has to wait until enough data has accumulated before it is processed.

The waiting area here is the Buffer: a small physical region in the computer, usually located in RAM. These concepts can be hard to grasp, so the example below illustrates them.

Bus stop example

Take a bus stop as an example. A bus usually departs every few minutes. Before the scheduled time, the bus will not leave early even if it is already full, so passengers who arrive early have to wait at the stop. And if too many passengers arrive, those who come later have to wait at the stop for the next bus.

The waiting area at the bus stop corresponds to the Buffer in Node.js. We cannot control how fast passengers arrive; the only thing we can control is when the bus departs. In the same way, our program cannot control when data arrives in a stream; all it can decide is when to send the data on.

Basic Use of Buffer

With the concept of Buffer in mind, let’s look at some of its basic uses. This is not a complete list of the API, just a few common operations. For more details, see the official Node.js documentation.

Create a Buffer

In versions of Node.js prior to 6.0.0, Buffer instances were created with the Buffer constructor, new Buffer(), which allocated the returned Buffer in different ways depending on the arguments provided.

Buffers can now be created with Buffer.from(), Buffer.alloc(), and Buffer.allocUnsafe().

Buffer.from()

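const b1 = Buffer.from('10');         // from a string, UTF-8 by default
const b2 = Buffer.from('10', 'utf8'); // from a string with an explicit encoding
const b3 = Buffer.from([10]);         // from an array of byte values
const b4 = Buffer.from(b3);           // from another Buffer (copies the data)

console.log(b1, b2, b3, b4); // <Buffer 31 30> <Buffer 31 30> <Buffer 0a> <Buffer 0a>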
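One detail worth seeing in code is whether Buffer.from() copies its input or shares memory with it. The sketch below relies on the documented behavior that Buffer.from(buffer) copies the bytes, while Buffer.from(arrayBuffer) creates a view over the same memory. The variable names are illustrative only.

```javascript
const source = Buffer.from([1, 2, 3]);

// Buffer.from(buffer) copies the bytes into a new Buffer.
const copy = Buffer.from(source);
copy[0] = 9;
console.log(source[0]); // 1 (the original is untouched)

// Buffer.from(arrayBuffer) creates a view over the same memory.
const shared = new Uint8Array([1, 2, 3]);
const view = Buffer.from(shared.buffer);
view[0] = 9;
console.log(shared[0]); // 9 (both see the same bytes)
```

If you need an independent copy of data that lives in an ArrayBuffer, copy it explicitly (for example with Buffer.from(view)) before modifying it.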

Buffer.alloc

Returns an initialized Buffer to ensure that new buffers will never contain old data.

const bAlloc1 = Buffer.alloc(10); // Create a buffer of 10 bytes

console.log(bAlloc1); // <Buffer 00 00 00 00 00 00 00 00 00 00>

Buffer.allocUnsafe

Creates a new uninitialized Buffer of size bytes. Because the Buffer is uninitialized, the allocated memory may contain sensitive old data. If that memory is read before it has been overwritten, the old data can leak, so Buffer.allocUnsafe() should be used with caution.

const bAllocUnsafe1 = Buffer.allocUnsafe(10);

console.log(bAllocUnsafe1); // <Buffer 49 ae c9 cd 49 1d 00 00 11 4f>
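To make the difference between the two allocation methods concrete, here is a minimal sketch that shows the zero-filled result of Buffer.alloc() and one way to make a Buffer.allocUnsafe() result safe, by overwriting it with fill() before use. The variable names are illustrative.

```javascript
// Buffer.alloc zero-fills the memory it returns.
const safe = Buffer.alloc(8);
console.log(safe.every((byte) => byte === 0)); // true

// Buffer.allocUnsafe skips the zero-fill, so overwrite it before use.
const fast = Buffer.allocUnsafe(8).fill(0);
console.log(fast.equals(safe)); // true

// Buffer.alloc also accepts a fill value as its second argument.
const ones = Buffer.alloc(4, 1);
console.log(ones); // <Buffer 01 01 01 01>
```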

Buffer character encoding

A character encoding determines how a Buffer instance is converted to and from a JavaScript string. The following encodings are currently supported:

  • ‘ascii’ – For 7-bit ASCII data only. This encoding is fast and strips the high bit if it is set.
  • ‘utf8’ – Multi-byte encoded Unicode characters. Many web pages and other document formats use UTF-8.
  • ‘utf16le’ – 2- or 4-byte, little-endian encoded Unicode characters. Surrogate pairs (U+10000 to U+10FFFF) are supported.
  • ‘ucs2’ – Alias of ‘utf16le’.
  • ‘base64’ – Base64 encoding. When creating a Buffer from a string, this encoding also correctly accepts the “URL and Filename Safe Alphabet” specified in RFC 4648, Section 5.
  • ‘latin1’ – Encodes the Buffer into a single-byte encoded string (as defined by IANA in RFC 1345, page 63, to be the Latin-1 supplement block and C0/C1 control codes).
  • ‘binary’ – Alias of ‘latin1’.
  • ‘hex’ – Encodes each byte as two hexadecimal characters.

const buf = Buffer.from('hello world', 'ascii');
console.log(buf.toString('hex')); // 68656c6c6f20776f726c64
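The same bytes can be rendered in several encodings and decoded back again. The following sketch round-trips a string through ‘hex’ and ‘base64’ using only Buffer.from() and toString(); the variable names are illustrative.

```javascript
const text = 'hello world';
const bytes = Buffer.from(text, 'utf8');

const hex = bytes.toString('hex');
const base64 = bytes.toString('base64');
console.log(hex);    // 68656c6c6f20776f726c64
console.log(base64); // aGVsbG8gd29ybGQ=

// Decoding with the matching encoding restores the original text.
console.log(Buffer.from(hex, 'hex').toString('utf8'));       // hello world
console.log(Buffer.from(base64, 'base64').toString('utf8')); // hello world
```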

Converting between strings and Buffers

String to Buffer

If no encoding is passed, UTF-8 is used by default.

const buf = Buffer.from('Node.js 技术栈', 'UTF-8');

console.log(buf); // <Buffer 4e 6f 64 65 2e 6a 73 20 e6 8a 80 e6 9c af e6 a0 88>
console.log(buf.length); // 17
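Note that the length of a Buffer is counted in bytes, not characters. The sketch below uses the string whose UTF-8 bytes appear in the output above (eight ASCII characters followed by three Chinese characters) to compare the two counts; Buffer.byteLength() is the built-in way to get the byte count without creating a Buffer.

```javascript
const str = 'Node.js 技术栈';

const charCount = str.length;                     // UTF-16 code units
const byteCount = Buffer.byteLength(str, 'utf8'); // bytes in UTF-8

console.log(charCount);               // 11
console.log(byteCount);               // 17
console.log(Buffer.from(str).length); // 17
```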

Converting a Buffer to a string

buf.toString([encoding], [start], [end]) converts a Buffer to a string. The default encoding is again UTF-8. Without start and end, the whole Buffer is converted; with start and end, only that byte range is converted (be careful here).

const buf = Buffer.from('Node.js 技术栈', 'UTF-8');

console.log(buf); // <Buffer 4e 6f 64 65 2e 6a 73 20 e6 8a 80 e6 9c af e6 a0 88>
console.log(buf.length); // 17
console.log(buf.toString('UTF-8', 0, 9)); // Node.js �

The output ends with a garbled character: �. Why?

Why do garbled characters appear during conversion?

The example above uses the default UTF-8 encoding. The problem is that a Chinese character takes 3 bytes in UTF-8. The first Chinese character, 技, corresponds to the bytes e6 8a 80 in buf, but the range 0 to 9 includes only the first of them, e6. The character is therefore truncated, and it appears garbled.

Let’s change the range in our example so that it ends on a character boundary:

const buf = Buffer.from('Node.js 技术栈', 'UTF-8');

console.log(buf); // <Buffer 4e 6f 64 65 2e 6a 73 20 e6 8a 80 e6 9c af e6 a0 88>
console.log(buf.length); // 17
console.log(buf.toString('UTF-8', 0, 11)); // Node.js 技

You can see that the output is now correct.
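When the split points are not under your control (for example, when data arrives in chunks), one option is the StringDecoder class from the core string_decoder module, which holds back incomplete multi-byte sequences until the remaining bytes arrive. The sketch below is an addition to the article’s example, not something the original covers; it splits the same bytes at offset 9 on purpose.

```javascript
const { StringDecoder } = require('string_decoder');

const buf = Buffer.from('Node.js 技术栈', 'utf8');
const decoder = new StringDecoder('utf8');

// The first chunk ends after the first byte of 技 (e6).
const first = decoder.write(buf.subarray(0, 9));
// The second chunk supplies the remaining bytes.
const second = decoder.write(buf.subarray(9));

console.log(JSON.stringify(first));          // "Node.js "
console.log(second);                         // 技术栈
console.log(first + second + decoder.end()); // Node.js 技术栈
```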

Buffer memory mechanism

The section on memory management and V8 garbage collection in Node.js explains that Node.js relies on V8 for garbage collection, but it does not mention how Buffer memory is reclaimed. Let’s look at that now.

Buffer has to handle large amounts of binary data. If memory were requested from the system a little at a time as needed, the result would be frequent system calls for memory. For this reason, the memory occupied by a Buffer is not allocated by V8; it is requested at the C++ level of Node.js, while the allocation strategy is implemented in JavaScript. This memory is called off-heap memory.

Note: The buffer.js source code used below is from Node.js v10.x, at github.com/nodejs/node…

Buffer Memory allocation principle

Node.js uses a slab mechanism that requests memory in advance and allocates it afterwards; it is a dynamic memory management mechanism.

When a small Buffer is requested (for example with Buffer.allocUnsafe(size)), it is carved out of a fixed-size memory region called a slab. A slab has three states:

  • Full: fully allocated
  • Partial: partially allocated
  • Empty: not allocated

8 KB limit

Node.js uses the 8KB boundary to distinguish between small and large objects. You can see the following code in buffer.js

Buffer.poolSize = 8 * 1024; // In line 102, node.js version is v10.x

It was mentioned in the Buffer Introduction section that buffers are sized at creation time and cannot be resized.

Allocating Buffer objects

In the following code, createPool() is called directly when the module loads, which initializes an 8 KB memory space up front and makes the first allocation more efficient. A variable poolOffset = 0 is also initialized to record how many bytes have been used.

Buffer.poolSize = 8 * 1024;
var poolSize, poolOffset, allocPool;

// ... intermediate code omitted

function createPool() {
  poolSize = Buffer.poolSize;
  allocPool = createUnsafeArrayBuffer(poolSize);
  poolOffset = 0;
}
createPool(); // line 129

In this case, the newly constructed slab looks like this:

Now let’s try to allocate a Buffer of size 2048. Buffer.allocUnsafe() is used here because Buffer.alloc() never draws on the pool:

Buffer.allocUnsafe(2 * 1024)

Now let’s look at what the current slab memory looks like. As follows:

So what does this allocation process look like? Let’s look at another core method of buffer.js: allocate(size)

// https://github.com/nodejs/node/blob/v10.x/lib/buffer.js#L318
function allocate(size) {
  if (size <= 0) {
    return new FastBuffer();
  }

  // Use the pool when the requested size is smaller than Buffer.poolSize >>> 1,
  // that is, poolSize shifted right by one bit: 4 KB
  if (size < (Buffer.poolSize >>> 1)) {
    if (size > (poolSize - poolOffset))
      createPool();
    var b = new FastBuffer(allocPool, poolOffset, size);
    poolOffset += size; // Record the space that has been used
    alignPool(); // Align the offset to 8 bytes
    return b;
  } else { // Requested at the C++ level
    return createUnsafeBuffer(size);
  }
}

After reading the above code, it is clear when a small Buffer is allocated and when a large Buffer is allocated.
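The effect of this branch can be observed from the outside through the buf.buffer property, which exposes the underlying ArrayBuffer. The sketch below is an observation, not part of buffer.js, and assumes the default Buffer.poolSize and a Node.js version whose pooling behaves as in the v10.x source above: a small Buffer.allocUnsafe() result is a slice of the pool, while a large one has memory of its own.

```javascript
const half = Buffer.poolSize >>> 1; // 4096 with the default 8 KB pool

// A small allocUnsafe() Buffer is a slice of the shared pool.
const small = Buffer.allocUnsafe(16);
// A large allocUnsafe() Buffer gets memory of its own.
const large = Buffer.allocUnsafe(half);
// Buffer.alloc() never uses the pool, whatever the size.
const zeroed = Buffer.alloc(16);

console.log(small.buffer.byteLength === Buffer.poolSize); // true
console.log(large.buffer.byteLength === half);            // true
console.log(zeroed.buffer.byteLength === 16);             // true
```

Because pooled Buffers share one ArrayBuffer, always honor buf.byteOffset and buf.length when handing buf.buffer to other APIs.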

Summary of Buffer memory allocation

This part is genuinely hard to understand. I have read several books on Node.js, and I recommend the section “Buffer of Node.js” written by Park Ling.

  1. An 8 KB memory space is initialized when the module is first loaded, as shown in the buffer.js source
  2. Depending on the requested size, a Buffer is treated as a small or a large Buffer object
  3. For a small Buffer, Node.js then checks whether the current slab has enough space left
    • If it does, the remaining space is used and the slab allocation state is updated, that is, the offset increases
    • If it does not, a new slab is created and the Buffer is allocated from it
  4. For a large Buffer, createUnsafeBuffer(size) is called directly
  5. For both small and large Buffers, memory is requested at the C++ level and managed at the JavaScript level, and it is eventually reclaimed by V8’s garbage collector
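The steps above can be modeled with a few lines of plain JavaScript. This is a simplified, hypothetical model (the class SlabAllocator and its fields are made up for illustration); it only tracks offsets and does not allocate real memory the way Node.js does internally.

```javascript
// Hypothetical model of the pooling logic; not the real Node.js internals.
const POOL_SIZE = 8 * 1024;

class SlabAllocator {
  constructor() {
    this.slabs = 0;
    this.newSlab();
  }

  newSlab() {
    this.slabs += 1;
    this.offset = 0; // bytes already handed out from the current slab
  }

  // Returns a description of where the requested bytes would come from.
  allocate(size) {
    if (size >= POOL_SIZE >>> 1) {
      return { source: 'dedicated', slab: null, offset: 0 };
    }
    if (size > POOL_SIZE - this.offset) {
      this.newSlab(); // not enough room left, start a fresh slab
    }
    const result = { source: 'slab', slab: this.slabs, offset: this.offset };
    this.offset += size;
    this.offset = (this.offset + 7) & ~7; // keep offsets 8-byte aligned
    return result;
  }
}

const allocator = new SlabAllocator();
const a = allocator.allocate(2 * 1024); // first 2 KB of slab 1
const b = allocator.allocate(3 * 1024); // next 3 KB of slab 1
const c = allocator.allocate(4000);     // only 3 KB left, so slab 2 is created
const d = allocator.allocate(5 * 1024); // large request, bypasses the slab

console.log(a, b, c, d);
```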

Application Scenarios of Buffer

Here are some examples of how buffers can be used in real business. Please add more in the comments section!

I/O operations

I/O here can be either file I/O or network I/O. The following example uses streams to read input.txt and write it to output.txt. For background, see the “What is Stream?” and “What is a Buffer?” sections above.

const fs = require('fs');

const inputStream = fs.createReadStream('input.txt'); // Create a readable stream
const outputStream = fs.createWriteStream('output.txt'); // Create a writable stream

inputStream.pipe(outputStream); // Pipe read and write

We don’t need to manually create our own buffers in the Stream, they will be created automatically in the Node.js Stream.
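To see those automatically created buffers, you can listen for the data event on a readable stream. The sketch below is a self-contained illustration: it writes a small temporary file (the file name and the helper readChunks are made up for this example) and reads it back with a deliberately tiny highWaterMark so that several chunks are emitted.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: collect the chunks a readable file stream emits.
function readChunks(file, chunkSize) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    fs.createReadStream(file, { highWaterMark: chunkSize })
      .on('data', (chunk) => chunks.push(chunk))
      .on('end', () => resolve(chunks))
      .on('error', reject);
  });
}

// Write a 10-byte temporary file, then read it back 4 bytes at a time.
const file = path.join(os.tmpdir(), `buffer-demo-${process.pid}.txt`);
fs.writeFileSync(file, '0123456789');

const done = readChunks(file, 4).then((chunks) => {
  fs.unlinkSync(file);
  console.log(chunks.every((c) => Buffer.isBuffer(c))); // true
  console.log(chunks.map((c) => c.length));             // [ 4, 4, 2 ]
  return chunks;
});
```

Each chunk is a Buffer because no encoding was set on the stream; calling setEncoding('utf8') on the stream would deliver strings instead.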

zlib.js

zlib.js is one of the core libraries of Node.js. It uses Buffer to operate on binary data streams and provides compression and decompression. See the zlib.js source code for details.
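As a minimal sketch of how Buffers appear in the zlib API, the synchronous helpers zlib.gzipSync() and zlib.gunzipSync() both take a Buffer and return a Buffer. The input data here is made up for illustration.

```javascript
const zlib = require('zlib');

const input = Buffer.from('a'.repeat(1024));

// gzipSync accepts a Buffer (or string) and returns a compressed Buffer.
const compressed = zlib.gzipSync(input);
// gunzipSync reverses the operation and also returns a Buffer.
const restored = zlib.gunzipSync(compressed);

console.log(Buffer.isBuffer(compressed));      // true
console.log(compressed.length < input.length); // true
console.log(restored.equals(input));           // true
```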

Encryption

In crypto.createCipheriv(), the second parameter, key, can be a String or a Buffer. The example here uses a Buffer. The following is a simple encryption example; the point to focus on is that the key is initialized with Buffer.alloc() (described above) and then filled with the fill method:

buf.fill(value[, offset[, end]][, encoding])

  • value: the content to fill the Buffer with
  • offset: the position at which to start filling
  • end: the position at which to stop filling (not inclusive)
  • encoding: the character encoding of value when it is a string
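A few small calls illustrate how these parameters interact. This is a minimal sketch with made-up values, using only the documented buf.fill() signature.

```javascript
// A string value is repeated until the target range is full.
const repeated = Buffer.alloc(5).fill('ab');
console.log(repeated.toString()); // ababa

// offset and end restrict the fill to bytes [offset, end).
const partial = Buffer.alloc(6).fill('x', 2, 4);
console.log(partial); // <Buffer 00 00 78 78 00 00>

// With an encoding, the string is decoded before filling.
const fromHex = Buffer.alloc(4).fill('ff', 'hex');
console.log(fromHex); // <Buffer ff ff ff ff>
```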

The following is a symmetric encryption demo using Cipher:

const crypto = require('crypto');
const [key, iv, algorithm, encoding, cipherEncoding] = [
    'a123456789', '', 'aes-128-ecb', 'utf8', 'base64'
];

const handleKey = key => {
    const bytes = Buffer.alloc(16); // Initialize a Buffer instance with every byte set to 00
    console.log(bytes); // <Buffer 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00>
    bytes.fill(key, 0, 10); // Fill the first 10 bytes with the key
    console.log(bytes); // <Buffer 61 31 32 33 34 35 36 37 38 39 00 00 00 00 00 00>

    return bytes;
};

let cipher = crypto.createCipheriv(algorithm, handleKey(key), iv);
let crypted = cipher.update('Node.js 技术栈', encoding, cipherEncoding);
crypted += cipher.final(cipherEncoding);

console.log(crypted); // jE0ODwuKN6iaKFKqd3RF4xFZkOpasy8WfIDl8tRC5t0=
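Decryption follows the same pattern with crypto.createDecipheriv(). The sketch below is a hedged companion to the demo above, not the author’s code: the function names encrypt and decrypt are made up, null is passed as the IV because ECB mode does not use one, and the data is kept as Buffers throughout rather than base64 strings. ECB is used only to mirror the demo; it is generally not recommended for real data.

```javascript
const crypto = require('crypto');

const algorithm = 'aes-128-ecb';
// Same key preparation as above: 10 key bytes padded with zeros to 16 bytes.
const key = Buffer.alloc(16).fill('a123456789', 0, 10);

// Hypothetical helper: encrypt a UTF-8 string and return the raw bytes.
function encrypt(plainText) {
  const cipher = crypto.createCipheriv(algorithm, key, null); // ECB takes no IV
  return Buffer.concat([cipher.update(plainText, 'utf8'), cipher.final()]);
}

// Hypothetical helper: decrypt raw bytes back into a UTF-8 string.
function decrypt(encrypted) {
  const decipher = crypto.createDecipheriv(algorithm, key, null);
  return Buffer.concat([decipher.update(encrypted), decipher.final()]).toString('utf8');
}

const encrypted = encrypt('Node.js 技术栈');
console.log(encrypted.length);   // 32 (17 bytes padded to two 16-byte blocks)
console.log(decrypt(encrypted)); // Node.js 技术栈
```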

Buffer VS Cache

What is the difference between Buffer and Cache?

Buffer

Buffer is used to process binary stream data by buffering it temporarily. With streaming data, a buffer holds the data for a while; for example, data may be accumulated until the buffer reaches a certain size and then written to the hard disk. A classic example is the video player: when you see the buffering icon, the buffer has not yet been filled. Once enough data has arrived to fill the buffer and be processed, the icon disappears and you can see the picture.

Cache

A Cache can be seen as an intermediate layer that keeps hot data around permanently to make access faster. For example, data fetched from the hard disk or from third-party APIs can be stored in memory, Redis, and so on. Later requests for the same resource are then served faster, which makes caching an important means of performance optimization.

For more, see the related discussion on Zhihu.

Buffer VS String

The following example stress tests String and Buffer responses with a simple HTTP server:

const http = require('http');

// Build a 10 KB string of the letter 'a'
let s = '';
for (let i = 0; i < 1024 * 10; i++) {
    s += 'a';
}

const str = s;
const bufStr = Buffer.from(s); // Convert to a Buffer once, ahead of time
const server = http.createServer((req, res) => {
    console.log(req.url);

    if (req.url === '/buffer') {
        res.end(bufStr);
    } else if (req.url === '/string') {
        res.end(str);
    }
});

server.listen(3000);

I ran the example above in a virtual machine; you can also run it on your local computer. The test uses the ab (ApacheBench) tool.

Test string

Take a look at the following important parameters, which are compared with the Buffer results afterwards:

  • Complete requests: 21815
  • Requests per second [#/sec] (mean)
  • Transfer rate: 3662.39 [Kbytes/sec] received

$ ab -c 200 -t 60 http://192.168.6.131:3000/string

Test the buffer

As the results show, the total number of completed requests with Buffer was 50000, QPS more than doubled, and 9138.82 KB were transferred per second. Converting the data to a Buffer ahead of time more than doubled performance in this test.

  • Complete requests: 50000
  • Requests per second [#/sec] (mean)
  • Transfer rate: 9138.82 [Kbytes/sec] received

$ ab -c 200 -t 60 http://192.168.6.131:3000/buffer

The /string endpoint in the example above returns a string, so HTTP has to convert the string to a Buffer before every transfer; the Buffer is then sent to the client as a binary stream. Returning a Buffer directly skips that conversion on every request, which is where the performance gain comes from.

In some web applications, static data can be converted to Buffers in advance to avoid repeated CPU work (the repeated string-to-Buffer conversions).

Reference

  • nodejs.cn/api/buffer…
  • The Node.js Buffer is simple
  • Do you want a better understanding of Buffer in Node.js? Check this out.
  • A cartoon intro to ArrayBuffers and SharedArrayBuffers
  • buffer.js v10.x
