This article analyzes how single-threaded Node.js achieves high concurrency

Node is not really single-threaded: only its main thread is “single-threaded”, and it uses an event-driven model to separate I/O from computation.

It also has a thread pool (provided by Libuv, a library implemented in C/C++) that performs time-consuming I/O operations (file reads and writes, DNS lookups, etc.) and notifies the main thread when each task completes; network I/O, by contrast, is handled with the operating system’s non-blocking socket mechanisms rather than the thread pool.

For CPU-bound computation, the main thread does all the work itself.

A key advantage of Node is that it moves I/O operations off the main thread, freeing the main thread to handle more requests.

Therefore, Node is good at performing I/O intensive tasks, but not CPU intensive tasks.

Have you ever considered the following questions when approaching Node?

  • Is Node really single-threaded?
  • If it is single-threaded, how does it handle high concurrency requests?
  • How is Node’s event-driven mechanism implemented?
  • Why can Javascript, which normally runs in a browser, interact with the operating system at a low level?

Node architecture and operation mechanism

Before answering the above questions, let’s take a look at the NodeJS architecture overview diagram:

  • Node Standard Library: the standard library (API) available to applications, written in Javascript.
  • Node Bindings: This layer contains C/C++ Bindings (Glue code), encapsulates V8 and Libuv interfaces down, and provides the base API interface up, which is a bridge between Javascript and C++.
  • V8: Google’s Javascript VM, which provides an environment for Javascript to run outside the browser.
  • Libuv: a wrapped library developed specifically for Node.js that provides cross-platform asynchronous I/O capabilities and is responsible for thread pool scheduling at Node runtime.
  • C-ares: provides asynchronous DNS-related processing capabilities.
  • http_parser, OpenSSL, zlib: provide HTTP parsing, SSL, data compression, and other capabilities.

NodeJS works like this:

  1. The V8 engine parses the application Javascript script code
  2. Call C/C++ libraries via Node Bindings
  3. Events to be executed are placed on the call stack and processed in order
  4. Any I/O requests on the stack are handed over to Libuv, which maintains a thread pool; worker threads from the pool carry out the tasks by calling the underlying C/C++ libraries
  5. When a request has been processed, Libuv places the result in the event queue for the main thread to execute
  6. Meanwhile, the main thread continues to perform other tasks

Cross-operating system interaction

As a simple example, suppose we want to open a file and perform some operations; we can write the following code:

var fs = require('fs');
fs.open('./test.txt', 'w', function (err, fd) {
  // ... do something
});

lib/fs.js → src/node_file.cc → uv_fs

lib/fs.js

async function open(path, flags, mode) {
  mode = modeNum(mode, 0o666);
  path = getPathFromURL(path);
  validatePath(path);
  validateUint32(mode, 'mode');
  return new FileHandle(
    await binding.openFileHandle(pathModule.toNamespacedPath(path),
                                 stringToFlags(flags), mode, kUsePromises));
}

src/node_file.cc

static void Open(const FunctionCallbackInfo<Value>& args) {
  Environment* env = Environment::GetCurrent(args);
  const int argc = args.Length();
  // ... (argument parsing omitted)
  if (req_wrap_async != nullptr) {
    AsyncCall(env, req_wrap_async, args, "open", UTF8, AfterInteger,
              uv_fs_open, *path, flags, mode);
  } else {
    CHECK_EQ(argc, 5);
    FSReqWrapSync req_wrap_sync;
    FS_SYNC_TRACE_BEGIN(open);
    int result = SyncCall(env, args[4], &req_wrap_sync, "open",
                          uv_fs_open, *path, flags, mode);
    FS_SYNC_TRACE_END(open);
    args.GetReturnValue().Set(result);
  }
}

uv_fs

dstfd = uv_fs_open(NULL, &fs_req, req->new_path, dst_flags,
                   statsbuf.st_mode, NULL);
uv_fs_req_cleanup(&fs_req);

The general process is as follows:

When we call fs.open, Node invokes the C/C++-layer Open function via process.binding, which in turn calls the Libuv function uv_fs_open. Once the operation finishes, the result is passed back through a callback, completing the chain.

Main thread “single thread”

In the traditional Web service model, most servers use multithreading to solve the concurrency problem: because I/O calls block, a single thread would force users to wait, which is clearly unacceptable, so additional threads are created to respond to user requests.

The single thread of Node means that the main thread is “single thread”. The main thread executes the program code step by step in the order in which it was coded.

We can create a simple Web server to verify this:

var http = require('http');

function sleep (time) {
  var _exit = Date.now() + time * 1000;
  while (Date.now() < _exit) {
  }
}

http.createServer(function (req, res) {
  sleep(10);
  res.end('server sleep 10s');
}).listen(8080);

If the browser sends two requests to http://localhost:8080 in quick succession, the first request is answered after about 10 seconds and the second after about 20 seconds.

This is because Javascript code runs on a single call stack, pushed and executed in the order it was written. When the main thread receives the first request, it enters the synchronous sleep block (simulating business processing) and can do nothing else until it returns. If a second request arrives within those 10 seconds, its handler is queued and must wait for the first to finish before it is processed. As shown in the following stack diagram:

This also verifies that the main thread in Node is “single-threaded”.

Event-driven mechanism

Since the main thread of Node is “single-threaded”, how can it handle tens of thousands of concurrent requests without blocking? This is where Node’s event-driven mechanism comes in.

Each Node.js process has only one main thread that executes program code, forming an Execution Context Stack.

In addition to the main thread, Node maintains an “Event Queue”. When a user’s network request or another asynchronous I/O operation arrives, Node places it in the Event Queue instead of executing it immediately, so the code does not block; the queued events wait until the main-thread code has finished executing.

After the main-thread code has finished, the Event Loop fetches events from the head of the Event Queue, allocating a thread from the thread pool to execute each one, until all events in the Event Queue have been processed.

When an event completes, the main thread is notified to execute the corresponding callback, and the worker thread is returned to the thread pool.

Node.js is therefore asynchronous at its core: it hands all blocking work to the internal thread pool, while the main thread only performs the round-trip scheduling. This is how it achieves asynchronous non-blocking I/O, which is the essence of single-threaded, event-driven Node.
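
A minimal sketch of this scheduling: even a 0 ms timer callback waits in the Event Queue until the synchronous main-thread code has finished.

```javascript
var log = [];

setTimeout(function () {
  log.push('timer callback'); // waits in the event queue
}, 0);

log.push('main code'); // runs first, even though the timer delay is 0 ms
```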

Event Loop execution sequence

Each Node.js event loop iteration contains six phases, corresponding to the implementation in the Libuv source code, as shown below:

  • Timers phase: executes callbacks scheduled by setTimeout() and setInterval() whose timers have expired.
  • I/O callbacks phase: executes a few I/O callbacks deferred from the previous loop iteration.
  • Idle, prepare phase: queue movement, used internally only.
  • Poll phase: the most important phase; executes I/O callbacks and, under the appropriate conditions, blocks waiting for new events.
  • Check phase: executes setImmediate() callbacks.
  • Close callbacks phase: executes close event callbacks, such as those for socket close.

**Core function uv_run source code**

int uv_run(uv_loop_t* loop, uv_run_mode mode) {
  int timeout;
  int r;
  int ran_pending;

  // Check if there are asynchronous tasks in the loop; if not, terminate directly
  r = uv__loop_alive(loop);
  if (!r)
    uv__update_time(loop);

  // The event loop is just a big while loop
  while (r != 0 && loop->stop_flag == 0) {
    // Update the loop's notion of the current time
    uv__update_time(loop);
    // Timers phase: handle expired timer callbacks
    uv__run_timers(loop);
    // Pending phase: handle deferred I/O callbacks
    ran_pending = uv__run_pending(loop);
    // Idle and prepare phases: Node-internal processing
    uv__run_idle(loop);
    uv__run_prepare(loop);

    // timeout is a duration: uv_backend_timeout computes it and it is
    // passed to uv__io_poll; if timeout == 0, uv__io_poll returns immediately
    timeout = 0;
    if ((mode == UV_RUN_ONCE && !ran_pending) || mode == UV_RUN_DEFAULT)
      timeout = uv_backend_timeout(loop);

    // Poll phase: wait for and process I/O events
    uv__io_poll(loop, timeout);
    // Check phase: run the setImmediate callbacks
    uv__run_check(loop);
    // Close callbacks phase: close operations such as file descriptors
    uv__run_closing_handles(loop);

    if (mode == UV_RUN_ONCE) {
      uv__update_time(loop);
      uv__run_timers(loop);
    }

    r = uv__loop_alive(loop);
    if (mode == UV_RUN_ONCE || mode == UV_RUN_NOWAIT)
      break;
  }

  if (loop->stop_flag != 0)
    loop->stop_flag = 0;

  return r;
}

The Event Loop repeatedly reads events from the Event Queue and drives the execution of all asynchronous callback functions. It has six phases in total, each with its own task queue; when all phases have executed in order, the Event Loop completes one tick.