This article was first published on GitHub.

Node.js is single-threaded and based on an event-driven, non-blocking I/O programming model. This allows us to keep executing code without waiting for the result of an asynchronous operation. When an asynchronous event fires, the main thread is notified and executes the callback registered for that event.

All of that is well known. Today we start from the source code and analyze the Node.js event loop mechanism.

Nodejs architecture

First, let’s take a look at the NodeJS architecture, as shown below:

  • User code (JS code)

User code covers the application code we write, npm packages, the JS modules built into Node.js, and so on; we spend most of our daily work writing code at this level.

  • Binding code or third-party plugins (JS or C/C++ code)

Glue code lets JS call C/C++ code. You can think of it as a bridge, with JS at one end and C/C++ at the other; JS calls into C/C++ across this bridge. In Node.js, the main role of the glue code is to expose the underlying C/C++ libraries to Node.js. A third-party plugin is a C/C++ library we write ourselves, and in that case we also have to implement the glue code ourselves to bridge JS and C/C++.

  • The underlying library

Node.js depends on a number of libraries, the best known being V8 and libuv. V8, as we all know, is the efficient JavaScript engine developed by Google, and it is a big reason Node.js can execute JS code so fast. Libuv is a library of asynchronous I/O primitives implemented in C; Node.js's efficient asynchronous programming model is largely owed to libuv, and libuv is the focus of today's analysis. Other dependencies include http-parser (parses HTTP requests and responses), OpenSSL (encryption and decryption), c-ares (asynchronous DNS resolution), npm (the Node.js package manager), and so on.

That is enough of an introduction to Node.js itself; you can consult other material to learn more. Next, our focus is on analyzing libuv.

Libuv architecture

As we know, libuv is the core of Node.js's asynchronous mechanism. It acts as a bridge between Node.js and asynchronous tasks such as file and network operations. Here is an overview of libuv:

This is a diagram from the libuv website. It shows clearly that Node.js network I/O, file I/O, DNS operations, and some user code all go through libuv. Since we're talking about asynchrony, let's first summarize the asynchronous events in Node.js:

  • Non-I/O:
    • Timers (setTimeout, setInterval)
    • Microtasks (Promise)
    • process.nextTick
    • setImmediate
  • I/O:
    • Network I/O
    • File I/O
    • Some DNS operations (such as dns.lookup)

Network I/O

For network I/O, different platforms have different implementation mechanisms: Linux uses epoll, BSD-like systems such as macOS use kqueue, Windows uses the efficient IOCP (completion ports), and SunOS uses event ports. Libuv encapsulates these platform-specific network I/O models behind a unified interface.
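
At the JS level all of this is hidden behind a unified API. As a rough sketch (assuming port 3000 is free on your machine), the connection callbacks of a small TCP echo server all run on the main thread, while libuv watches the underlying sockets with epoll/kqueue/IOCP:

const net = require('net');

// libuv registers each socket with the platform's I/O multiplexer
// (epoll / kqueue / IOCP). When a socket becomes readable, the event
// loop invokes this callback on the main JS thread.
const server = net.createServer((socket) => {
  socket.on('data', (chunk) => socket.write(chunk)); // simple echo
});

server.listen(3000, () => console.log('echo server listening on port 3000'));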

File I/O and asynchronous DNS operations

Libuv also maintains an internal thread pool, four threads by default, which handles file I/O operations, DNS operations, and user-submitted asynchronous work. When the JS layer hands libuv such a task, libuv puts it into a queue. Then there are two cases:

  • 1. When all threads in the thread pool are occupied, tasks wait in the queue for an idle thread.
  • 2. When there is an available thread in the pool, the task is taken off the queue and executed on that thread. When it finishes, the thread is returned to the pool to wait for the next task, and libuv notifies the event loop with an event; on receiving the event, the event loop executes the callback registered for it.

Of course, if four threads are not enough, you can set the UV_THREADPOOL_SIZE environment variable when Node.js starts. For the sake of system performance, libuv caps the thread pool at 128 threads.
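
As a small illustration (just a sketch; the file name pool.js is only an example): crypto.pbkdf2 runs on the libuv thread pool, so with the default four threads only four jobs can run at once and the rest wait in the queue. Starting Node.js with a larger UV_THREADPOOL_SIZE changes that:

// Run with the default pool, then with e.g. `UV_THREADPOOL_SIZE=8 node pool.js`,
// and compare the completion times of jobs 5 and 6.
const crypto = require('crypto');

const start = Date.now();

for (let i = 1; i <= 6; i++) {
  // Each pbkdf2 call is queued by libuv and executed on a thread-pool thread;
  // with 4 threads, jobs 5 and 6 must wait for a thread to become free.
  crypto.pbkdf2('secret', 'salt', 500000, 64, 'sha512', () => {
    console.log(`job ${i} finished after ${Date.now() - start} ms`);
  });
}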

Nodejs source

Starting Node.js goes roughly like this:

  • 1. Call platformInit to initialize the Node.js platform layer.
  • 2. Call the performance_node_start method to record Node.js performance statistics.
  • 3. Process the openSSL settings.
  • 4. Call v8_platform.Initialize to initialize the libuv thread pool.
  • 5. Call V8::Initialize to initialize the V8 environment.
  • 6. Create a Node.js run instance.
  • 7. Start the instance created in the previous step.
  • 8. Start executing the JS file; once the synchronous code has finished, enter the event loop.
  • 9. When there are no more events to listen for, destroy the Node.js instance and the program exits.

This is roughly how Node.js executes a JS file. Let's focus on step 8, the event loop.

Let's look at a few key pieces of source code:

  • 1. core.c, the core file in which the event loop runs:
int uv_run(uv_loop_t* loop, uv_run_mode mode) {
  int timeout;
  int r;
  int ran_pending;

  // Determine whether the event loop is alive.
  r = uv__loop_alive(loop);
  // If not, update the timestamp.
  if (!r)
    uv__update_time(loop);

  // While the event loop is alive and has not been stopped:
  while (r != 0 && loop->stop_flag == 0) {
    // Update the current timestamp.
    uv__update_time(loop);
    // Run the timers queue.
    uv__run_timers(loop);
    // Run I/O callbacks that were deferred to this iteration because they
    // were not finished in the previous one.
    ran_pending = uv__run_pending(loop);
    // Internal use only; users do not need to care about these two phases.
    uv__run_idle(loop);
    uv__run_prepare(loop);

    timeout = 0;
    // Calculate how long until the next timer is due.
    if ((mode == UV_RUN_ONCE && !ran_pending) || mode == UV_RUN_DEFAULT)
      timeout = uv_backend_timeout(loop);

    // Poll for I/O: if events are ready, run their callbacks; otherwise
    // block until an event arrives or the timeout expires.
    uv__io_poll(loop, timeout);
    // Enter the check phase and run the setImmediate callbacks.
    uv__run_check(loop);
    // Enter the close phase, which mainly runs 'close' event callbacks.
    uv__run_closing_handles(loop);

    if (mode == UV_RUN_ONCE) {
      // Update the current timestamp.
      uv__update_time(loop);
      // Run the timers callbacks again.
      uv__run_timers(loop);
    }

    // Determine whether the event loop is still alive.
    r = uv__loop_alive(loop);
    if (mode == UV_RUN_ONCE || mode == UV_RUN_NOWAIT)
      break;
  }

  /* The if statement lets gcc compile it to a conditional store.
   * Avoids dirtying a cache line. */
  if (loop->stop_flag != 0)
    loop->stop_flag = 0;

  return r;
}
  • 2. The timers phase, source file: timers.c:
void uv__run_timers(uv_loop_t* loop) {
  struct heap_node* heap_node;
  uv_timer_t* handle;

  for (;;) {
    // Fetch the timer handle with the smallest timeout from the timer heap.
    heap_node = heap_min((struct heap*) &loop->timer_heap);
    if (heap_node == NULL)
      break;

    handle = container_of(heap_node, uv_timer_t, heap_node);
    // If the earliest timer's timeout is later than the current loop time,
    // it has not expired yet, so break out of the loop.
    if (handle->timeout > loop->time)
      break;

    // Stop this timer handle.
    uv_timer_stop(handle);
    // If the handle is a repeating timer, schedule it again.
    uv_timer_again(handle);
    // Execute the callback bound to the timer handle.
    handle->timer_cb(handle);
  }
}
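
At the JS level, the effect of this min-heap is simply that timers fire in order of their due time, regardless of registration order. A small sketch:

const start = Date.now();

// Both timers live in libuv's timer heap; uv__run_timers() always pops the
// handle with the smallest timeout first, so the 50 ms timer fires before
// the 100 ms one even though it was registered later.
setTimeout(() => console.log('100 ms timer, elapsed ~', Date.now() - start), 100);
setTimeout(() => console.log('50 ms timer, elapsed ~', Date.now() - start), 50);
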
  • 3. The poll phase, source file: kqueue.c:
void uv__io_poll(uv_loop_t* loop, int timeout) {
  /* A series of variable declarations and initializations. */

  // If no file descriptors are being watched, there is nothing to poll.
  if (loop->nfds == 0) {
    // The watcher queue must be empty in that case; just return.
    assert(QUEUE_EMPTY(&loop->watcher_queue));
    return;
  }

  nevents = 0;

  // The watcher queue is not empty.
  while (!QUEUE_EMPTY(&loop->watcher_queue)) {
    /* Take the watcher at the head of the queue and register the events it
     * is interested in with the kernel. */
    ...
    w->events = w->pevents;
  }

  assert(timeout >= -1);
  // Remember the loop time at the start of polling.
  base = loop->time;
  // The maximum number of events to process in this round of polling.
  count = 48; /* Benchmarks suggest this gives the best throughput. */

  // Enter the polling loop.
  for (;; nevents = 0) {
    // If there is a timeout, initialize the timespec passed to kevent().
    if (timeout != -1) {
      spec.tv_sec = timeout / 1000;
      spec.tv_nsec = (timeout % 1000) * 1000000;
    }

    if (pset != NULL)
      pthread_sigmask(SIG_BLOCK, pset, NULL);

    // Wait for kernel events; kevent() returns the number of events that
    // arrived. `timeout` is the maximum time to wait; once it elapses, the
    // call returns. Since `timeout` is the interval until the next timer is
    // due, the event loop blocks here until either the timeout expires or a
    // kernel event fires.
    nfds = kevent(loop->backend_fd,
                  events,
                  nevents,
                  events,
                  ARRAY_SIZE(events),
                  timeout == -1 ? NULL : &spec);

    if (pset != NULL)
      pthread_sigmask(SIG_UNBLOCK, pset, NULL);

    /* Update loop->time unconditionally. It's tempting to skip the update when
     * timeout == 0 (i.e. non-blocking poll) but there is no guarantee that the
     * operating system didn't reschedule our process while in the syscall. */
    SAVE_ERRNO(uv__update_time(loop));

    // If the kernel reported no events and this poll had a timeout, return.
    if (nfds == 0) {
      assert(timeout != -1);
      return;
    }

    if (nfds == -1) {
      if (errno != EINTR)
        abort();

      if (timeout == 0)
        return;

      if (timeout == -1)
        continue;

      /* Interrupted by a signal. Update timeout and poll again. */
      goto update_timeout;
    }

    ...

    // The event loop's watcher list must not be NULL.
    assert(loop->watchers != NULL);
    loop->watchers[loop->nwatchers] = (void*) events;
    loop->watchers[loop->nwatchers + 1] = (void*) (uintptr_t) nfds;

    // Loop over the events returned by the kernel and execute the callback
    // bound to each event.
    for (i = 0; i < nfds; i++) {
      ...
    }
    ...
  }
}

The uv__io_poll phase has the longest source code and the most complex logic. It can be summarized like this: when none of the events registered by the JS layer are ready, the event loop blocks in the poll phase. Looking at that, you might wonder: will it block there forever?

No. First, when the poll phase runs, a timeout is passed in, and that timeout is the maximum time the poll phase may block. Second, if an event arrives before the timeout expires, the callback registered for that event is executed; once the timeout expires, the poll phase exits and the next phase runs.

So we don't have to worry about the event loop blocking in the poll phase forever.
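
A small JS-level illustration of that bound (just a sketch, not part of the libuv source): with no I/O ready, the poll phase can only block until the next timer is due.

const start = Date.now();

// There is no pending I/O, so uv_backend_timeout() is roughly 100 ms and
// uv__io_poll() blocks for at most that long; the loop then wraps around
// to the timers phase and runs this callback.
setTimeout(() => {
  console.log(`timer fired after ~${Date.now() - start} ms`);
}, 100);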

These are the two core phases of the event loop. For reasons of space, the source for the other phases and the code involved in setImmediate and process.nextTick is not listed here; interested readers can dig into the source themselves.

Finally, the event loop can be summarized as follows. Even if you did not follow the source analysis above, remember this summary.

Event loop principle

  • Initialization of Node.js
    • Initialize the Node.js environment.
    • Execute the input code.
    • Execute the process.nextTick callbacks.
    • Execute the microtasks.
  • Enter the event loop
    • Enter the timers phase
      • Check whether there are timer callbacks in the timer queue. If so, execute them in ascending order of timerId.
      • Check whether there are process.nextTick tasks. If so, execute them all.
      • Check whether there are microtasks. If so, execute them all.
      • Exit the phase.
    • Enter the I/O callbacks phase.
      • Check for pending I/O callbacks. If there are any, execute them. If not, exit the phase.
      • Check whether there are process.nextTick tasks. If so, execute them all.
      • Check whether there are microtasks. If so, execute them all.
      • Exit the phase.
    • Enter the idle, prepare phase:
      • These two stages have little to do with our programming.
    • Enter the poll phase
      • First check whether there are outstanding callbacks; if so, there are two cases.
        • First case:
          • If there are available callbacks (available callbacks include expired timers, some I/O events, etc.), execute all of them.
          • Check whether there are process.nextTick callbacks. If so, execute them all.
          • Check whether there are microtasks. If so, execute them all.
          • Exit the phase.
        • Second case:
          • If no callbacks are available:
          • Check whether there are setImmediate callbacks. If so, exit the poll phase. If not, block in this phase and wait for new event notifications.
      • If there are no outstanding callbacks, exit the poll phase.
    • Enter the check phase.
      • If there are setImmediate callbacks, execute them all.
      • Check whether there are process.nextTick callbacks. If so, execute them all.
      • Check whether there are microtasks. If so, execute them all.
      • Exit the check phase.
    • Enter the closing phase.
      • If there are close callbacks, execute them all.
      • Check whether there are process.nextTick callbacks. If so, execute them all.
      • Check whether there are microtasks. If so, execute them all.
      • Exit the closing phase.
    • Check whether there are active handles (timers, I/O, etc.).
      • If so, continue with the next loop iteration.
      • If not, end the event loop and exit the program.

Careful observation shows that the following steps are executed in sequence before each sub-phase of the event loop exits:

  • Check whether there are process.nextTick callbacks. If so, execute them all.
  • Check whether there are microtasks. If so, execute them all.
  • Exit the current phase.

Keep that in mind.
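
As a quick self-test of that rule (note: with no surrounding I/O, the relative order of the 0 ms timer and setImmediate can vary between runs):

setTimeout(() => console.log('timeout'), 0);
setImmediate(() => console.log('immediate'));
process.nextTick(() => console.log('nextTick'));
Promise.resolve().then(() => console.log('promise'));
console.log('sync');

// Prints: sync, nextTick, promise, then timeout/immediate.
// The synchronous code runs first; before the loop moves on, all
// process.nextTick callbacks run, then all microtasks; only after that do
// the timers and check phases execute their callbacks.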

So, armed with the rules above, plug in the various Node.js event-loop quizzes floating around the web and you should be able to explain why they produce the results they do. If not, send me a personal message.