What is a stream

Definition

A stream is an abstract interface for streaming data, implemented by many objects in Node.js. Streams are instances of EventEmitter that emit data (delivered as buffers) or absorb data. It may be more intuitive to picture a bucket of water connected by a pipe:

Note: streams are not unique to Node.js. Streaming is a basic operating-system operation; Node.js simply provides APIs that support it. The Linux pipe operator | is a stream.

Koala is dedicated to sharing the complete Node.js technology stack, from JavaScript to Node.js to back-end databases, and hopes to help you become an excellent senior Node.js engineer. From the author of the public account [Programmer growth refers to north] and the open-source GitHub blog github.com/koala-codin…

Why learn streams

A video-playing example

The source is the video on the server side, and the dest is your own player (or the Flash or H5 video element in your browser). If you think about it, watching a movie works exactly like the water flowing through the pipe above: the video streams from the server to the local player chunk by chunk.

If we did not use pipes and streams in this video example, but instead loaded the entire video file from the server and only then played it, it would cause serious problems:

  1. Excessive memory usage would make the system stall or crash
  2. Network speed, memory, and CPU power are all limited and shared with many other programs, and a video file can be several gigabytes in size

Example: reading a large file

Suppose we need to read a large file named data.txt.

Reading with fs.readFile

const http = require('http');
const fs = require('fs');
const path = require('path');

const server = http.createServer(function (req, res) {
    const fileName = path.resolve(__dirname, 'data.txt');
    fs.readFile(fileName, function (err, data) {
        res.end(data);
    });
});
server.listen(8000);

There is nothing syntactically wrong with this file-read version, but if data.txt is very large, say several hundred megabytes, then under a large number of concurrent requests the program reads the whole file into memory for every request. Memory consumption balloons, users may run into connection problems, and the server's memory overhead grows with concurrency. Now look at the stream implementation.

const http = require('http');
const fs = require('fs');
const path = require('path');

const server = http.createServer(function (req, res) {
    const fileName = path.resolve(__dirname, 'data.txt');
    let stream = fs.createReadStream(fileName);  // This line has been changed
    stream.pipe(res); // This line has been changed
});
server.listen(8000);

With a stream, you do not need to read the entire file before responding: data is read and returned piece by piece, flowing through the pipe to the client, which really takes the strain off the server.

These two examples show why streams are worth using: reading and shipping a large file in one go costs too much memory and network at once, so keep the data flowing and handle it bit by bit.

Stream process

Look at this bucket pipe flow diagram again


Where does a stream come from? – Source

There are three common sources of stream:

  1. Input from the console
  2. The request object in an HTTP request
  3. Reading a file

Cases 2 and 3 are explained in detail in the section on stream application scenarios below; here is case 1, input from the console.

Look at the code for process.stdin

process.stdin.on('data', function (chunk) {
    console.log('stream by stdin', chunk)
    console.log('stream by stdin', chunk.toString())
})
// Type koalakoala into the console and the output is:
stream by stdin <Buffer 6b 6f 61 6c 61 6b 6f 61 6c 61 0a>
stream by stdin koalakoala

Run the code above: anything entered in the console is picked up by the data event. process.stdin is a stream object, and the function passed to on('data', ...) is a listener the stream calls whenever new data arrives.

A stream object can listen for events such as "data", "end", "open", "close", and "error". In Node.js you attach listeners with the .on method, e.g. process.stdin.on('data', ...) or req.on('data', ...), so you can directly observe both the arrival and the end of a stream's data.

The pipe that connects the buckets

The pipe connecting the two buckets is the stream's pipe method: source.pipe(dest) connects the source to the dest, and the data flows from the source into the dest automatically.

Where does stream go – Dest

There are three common outputs of stream:

  1. Output to the console
  2. The response object in an HTTP request
  3. Writing to a file

Stream application scenarios

Streams are for I/O, and both HTTP requests and file operations are I/O operations. Performing a huge I/O operation in one shot is too expensive in hardware resources and hurts the software's efficiency, so the operation is split into batches and the data flows like water through a pipe until the transfer is complete. The following sections describe common application scenarios.

A stress-testing tool

Apache bench (ab) is a tool for stress-testing network requests. It ships with Apache, so Apache must be installed to use ab; macOS comes with Apache, and Windows users can install it as needed. Start Apache before running ab; on macOS, run sudo apachectl start.


The purpose of this little tool is to test the following scenarios directly and see the performance gains from using streams.

Applying streams to GET requests

The requirement:

Implement an HTTP service in Node.js that listens on port 8000, reads the data.txt file, and returns it to the client. Compare an ordinary file read with a stream for the GET request; see the examples below.

  • The version that returns the response to the client with an ordinary file read, named getTest1.js
// getTest1.js
const http = require('http');
const fs = require('fs');
const path = require('path');

const server = http.createServer(function (req, res) {
    const method = req.method; // Get the request method
    if (method === 'GET') { // GET request check
        const fileName = path.resolve(__dirname, 'data.txt');
        fs.readFile(fileName, function (err, data) {
            res.end(data);
        });
    }
});
server.listen(8000);
  • The same code modified to return the response with a stream, named getTest2.js
// getTest2.js
// Show the changes
const server = http.createServer(function (req, res) {
    const method = req.method; // Get the request method
    if (method === 'GET') { // GET request
        const fileName = path.resolve(__dirname, 'data.txt');
        let stream = fs.createReadStream(fileName);
        stream.pipe(res); // set res as the dest of the stream
    }
});
server.listen(8000);

response is also a stream object. Yes, in the bucket-pipe diagram, response is a dest.

Streams can be used in GET requests, but what is the advantage over reading the file and calling res.end(data) directly? This is where the stress-test tool just introduced comes in. Run ab -n 100 -c 100 http://localhost:8000/: -n 100 means 100 requests in total, and -c 100 means 100 requests sent concurrently. With the stream version there is a very large performance improvement; try the comparison yourself.

Using streams in POST requests

The requirement: generate a WeChat mini-program QR code by sending a POST request to the WeChat address and saving the returned image to a local file.

/**
 * params src: image request url (or another resource link)
 * params localFilePath: local path to save to
 * params data: WeChat request parameters
 */
// Assumes the `request` npm package and a project `logger` are in scope
const downloadFile = async (src, localFilePath, data) => {
    try {
        const ws = fs.createWriteStream(localFilePath);
        return new Promise((resolve, reject) => {
            ws.on('finish', () => {
                resolve(localFilePath);
            });
            if (data) {
                request({
                    method: 'POST',
                    uri: src,
                    json: true,
                    body: data
                }).pipe(ws);
            } else {
                request(src).pipe(ws);
            }
        });
    } catch (e) {
        logger.error('wxdownloadFile error: ', e);
        throw e;
    }
};

Create a write stream for the local file path and pipe the POST request's response into it with pipe(ws). This kind of stream usage is common in Node back-end development.

Summary: streams in POST and GET

request, like response, is a stream object, and both support the stream API. The difference, looking back at the bucket-pipe diagram, is the direction: request acts as a source that data flows out of, while response acts as a dest that data flows into.

Using streams in file operations

An example of file copying

const fs = require('fs')
const path = require('path')

// Two file names
const fileName1 = path.resolve(__dirname, 'data.txt')
const fileName2 = path.resolve(__dirname, 'data-bak.txt')
// Read the stream object of the file
const readStream = fs.createReadStream(fileName1)
// Write the stream object to the file
const writeStream = fs.createWriteStream(fileName2)
// Execute copy and data flow through pipe
readStream.pipe(writeStream)
// Copy is complete
readStream.on('end', function () {
    console.log('Copy done')
})

Looking at the code, copying seems easy: create a readStream and a writeStream and let the data flow across through pipe. Compared with reading the whole file into memory and writing it back out, the stream copy performs much better, so whenever you meet a file-handling requirement, evaluate whether a stream will do the job.

The underlying implementation of front-end build tools

The popular front-end bundlers and build tools today are written in Node.js, and bundling is inevitably a process of frequent file operations, a natural fit for streams. gulp, which is quite popular now, is one example; interested readers can look at its source code.

Types of streams

  • Readable Stream: a readable data stream
  • Writeable Stream: a writable data stream
  • Duplex Stream: a two-way data stream that can be read and written at the same time
  • Transform Stream: a duplex stream that transforms (processes) the data as it is read and written (uncommon)

The earlier sections focused on the first two types, readable and writable streams. The fourth type is not commonly used, so look it up online if you need it. Next is the third type, the Duplex Stream.

A Duplex Stream is a duplex: it can be both read from and written to, implementing the Readable and Writable interfaces at the same time. Examples of Duplex streams include:

  • tcp sockets
  • zlib streams
  • crypto streams

I haven't used Duplex streams in my own projects; for more on them, see the article NodeJS Stream: a duplex Stream.

What are the pitfalls of streams

  • Writing a file with rs.pipe(ws) does not append the contents of rs to ws; it overwrites the contents of ws with the contents of rs
  • A stream that has ended or closed cannot be reused; the data stream must be recreated
  • The pipe method returns the target stream: a.pipe(b) returns b, so make sure the stream you attach event listeners to is the one you mean
  • If you listen to multiple streams and use pipe to chain them, you would write:
data
    .on('end', function () {
        console.log('data end');
    })
    .pipe(a)
    .on('end', function () {
        console.log('a end');
    })
    .pipe(b)
    .on('end', function () {
        console.log('b end');
    });

Common stream libraries

  • event-stream, which brings a functional-programming flavor to working with streams

  • awesome-nodejs#streams, a curated list of excellent third-party stream libraries to check out on GitHub

conclusion

After reading this article you should have a basic feel for streams and know that Node has a complete solution for handling files. The bucket-pipe diagram was shown three times in this article, and I hope seeing it three times helps you remember it. Beyond the content above, you may be left with some questions, such as:

  1. What exactly flows through a stream? Binary data, strings, or some other type, and what does that type bring to streams?
  2. When does the water pipe in the diagram, the pipe function, fire? What triggers the flow, and what is the underlying mechanism? These questions (split in two because the first ran long) will be explained in detail in my second stream article.
