I originally planned to introduce polling in Node.js, but before doing so it helps to understand how operating system I/O models have evolved. An operating system typically handles I/O in two steps:

  • The application makes a system call and waits for the data to become ready
  • The kernel copies the data from the kernel buffer to the application buffer

GitHub tech blog: Node.js Technology Stack

Quick navigation

  • Synchronous blocking I/O
  • Synchronous non-blocking I/O
  • I/O multiplexing (select, poll, epoll, kqueue)
  • Signal-driven I/O
  • Asynchronous I/O model
  • Plain-language version (Xiao Ming's date with a girl)

Interview guide

  • What is the difference between select and epoll in I/O multiplexing? See the section on the differences between select and epoll.

Synchronous blocking I/O

From the moment the application issues the system call, through the data becoming ready, until the copy finishes, the application stays in a waiting state and can do nothing else. The call returns only once the data has been copied to user space or an error occurs. This is the blocking I/O model.
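The blocking behaviour described above can be sketched in Python. This is a minimal illustration, not a benchmark: the socket pair stands in for a real network connection, and the `blocking_read` helper and its `delay` parameter are made up for this example.

```python
import socket
import threading
import time

def blocking_read(delay=0.2):
    """Return (data, seconds_blocked) for a single blocking recv() call."""
    reader, writer = socket.socketpair()    # stand-in for a network connection

    def late_writer():
        time.sleep(delay)                   # the data is "not ready" for a while
        writer.sendall(b"hello")

    t = threading.Thread(target=late_writer)
    t.start()

    start = time.monotonic()
    data = reader.recv(1024)                # blocks: wait for data, then copy it
    waited = time.monotonic() - start

    t.join()
    reader.close()
    writer.close()
    return data, waited

data, waited = blocking_read()
print(data, f"blocked for about {waited:.2f}s")
```

The calling thread spends the whole delay inside `recv()`, which is exactly the time it cannot use for anything else.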

Synchronous non-blocking I/O

Compared with synchronous blocking I/O, a synchronous non-blocking call returns immediately if the data is not ready. The application then has to call again and again to check whether the I/O operation has completed, which wastes CPU time. Once the data is ready, it is copied from the kernel into user space and the call returns success to the application.

In short: this model works by repeatedly polling the I/O call.
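A rough Python sketch of this polling loop follows. The socket pair, the `polling_read` helper, and the `delay` and `interval` values are assumptions made for illustration; the short sleep between attempts only keeps the demo from spinning at full speed.

```python
import socket
import threading
import time

def polling_read(delay=0.2, interval=0.01):
    """Return (data, failed_attempts) using a non-blocking socket."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)               # recv() now returns at once

    # The peer sends its data only after `delay` seconds
    timer = threading.Timer(delay, writer.sendall, args=(b"hello",))
    timer.start()

    attempts = 0
    while True:
        try:
            data = reader.recv(1024)        # data ready: copied to user space
            break
        except BlockingIOError:             # not ready yet, ask again later
            attempts += 1
            time.sleep(interval)            # a tight loop here would burn CPU

    timer.join()
    reader.close()
    writer.close()
    return data, attempts

data, attempts = polling_read()
print(data, "after", attempts, "failed attempts")
```

Every failed attempt is a system call that achieved nothing, which is the waste the text refers to.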

I/O multiplexing

I/O multiplexing splits the work into two steps: first call select and wait until data is ready, then call recvfrom to perform the actual read or write. Its advantage is that a single thread can handle multiple sockets at the same time.

I/O multiplexing has several implementations: select, poll, epoll, and kqueue.

  • select

select decides readiness by polling the flag bits set for each file descriptor. This is like searching a database table that has no index: every socket is traversed on every call, which wastes CPU. In addition, select can only watch a limited number of file descriptors per process (FD_SETSIZE, typically 1024), so select-based polling copes well with only about 1,000 concurrent connections.
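The two steps (wait with select, then read) might look like this in Python. The three socket pairs are hypothetical connections invented for the example.

```python
import select
import socket

# Three hypothetical connections, simulated with socket pairs
pairs = [socket.socketpair() for _ in range(3)]
readers = [r for r, _ in pairs]

pairs[1][1].sendall(b"from connection 1")   # only one peer has sent data

# Step 1: hand ALL watched sockets to select() and wait until some are readable
readable, _, _ = select.select(readers, [], [], 1.0)

# Step 2: perform the real read on the sockets reported as ready
messages = {readers.index(s): s.recv(1024) for s in readable}
print(messages)

for r, w in pairs:
    r.close()
    w.close()
```

Note that the full list of sockets is passed in on every call, which is where the linear scanning cost comes from.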

  • poll

poll does not differ from select in any essential way. Instead of a fixed-size bit set it takes a caller-supplied array of pollfd structures (which the Linux kernel keeps internally as a linked list), so it has no 1024-descriptor limit. However, when there are many file descriptors, each call still traverses all of them linearly, so performance remains poor.
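For comparison, here is a small sketch using Python's `select.poll`, which wraps the poll system call and is not available on Windows. The socket pairs are again stand-ins for real connections.

```python
import select
import socket

pairs = [socket.socketpair() for _ in range(3)]
by_fd = {r.fileno(): r for r, _ in pairs}   # map descriptors back to sockets

poller = select.poll()
for fd in by_fd:
    poller.register(fd, select.POLLIN)      # interested in "readable" events

pairs[0][1].sendall(b"a")
pairs[2][1].sendall(b"c")

events = poller.poll(1000)                  # timeout in milliseconds
received = sorted(
    by_fd[fd].recv(1024) for fd, event in events if event & select.POLLIN
)
print(received)

for r, w in pairs:
    r.close()
    w.close()
```

The interface differs from select, but the kernel still checks every registered descriptor on each call.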

  • epoll

This is the most efficient I/O event notification mechanism on Linux. It has no maximum connection limit, and because it relies on a callback notification mechanism instead of linearly traversing every connection on each call, its efficiency does not drop as the number of file descriptors grows.

On a machine with 1 GB of memory, epoll can watch roughly 100,000 connections, far more than select's limit of 1024. To check the system-wide limit, run cat /proc/sys/fs/file-max on the server.
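A minimal Linux-only sketch with Python's `select.epoll` shows the difference in usage: descriptors are registered once, and each wait returns only the ones that are ready. The 100 socket pairs and the choice of connection 42 are arbitrary values for the demo.

```python
import select
import socket

# 100 hypothetical connections; only one of them will become readable
pairs = [socket.socketpair() for _ in range(100)]
by_fd = {r.fileno(): r for r, _ in pairs}

epoll = select.epoll()
for fd in by_fd:
    epoll.register(fd, select.EPOLLIN)      # register once, not on every wait

pairs[42][1].sendall(b"ping")

# poll() returns only the descriptors that are ready (timeout in seconds)
events = epoll.poll(1.0)
ready = [(fd, by_fd[fd].recv(1024)) for fd, mask in events if mask & select.EPOLLIN]
print(ready)

epoll.close()
for r, w in pairs:
    r.close()
    w.close()
```

The application never loops over the 99 idle sockets; it only touches the one the kernel reported.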

  • kqueue

Similar to epoll, but available only on BSD-derived systems such as FreeBSD and macOS.

Signal-driven I/O

Available only on Unix-like systems. Compared with I/O multiplexing, it avoids select's blocking poll: the application makes a system call to register a signal handler, which returns immediately, and then carries on with other work. When the data is ready, the kernel sends a SIGIO signal to the application, and the application starts reading the data.
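The following is a rough Linux-oriented sketch of signal-driven I/O in Python, assuming the platform exposes `os.O_ASYNC` and delivers SIGIO for the socket type used. The socket pair, the `on_sigio` handler name, and the two-second deadline are choices made for this illustration.

```python
import fcntl
import os
import signal
import socket
import time

reader, writer = socket.socketpair()
reader.setblocking(False)
received = []

def on_sigio(signum, frame):
    # The kernel only announces readiness; the read itself is still our job
    try:
        data = reader.recv(1024)
    except OSError:
        return                               # nothing to read after all
    if data:
        received.append(data)

signal.signal(signal.SIGIO, on_sigio)
fcntl.fcntl(reader.fileno(), fcntl.F_SETOWN, os.getpid())   # who gets SIGIO
flags = fcntl.fcntl(reader.fileno(), fcntl.F_GETFL)
fcntl.fcntl(reader.fileno(), fcntl.F_SETFL, flags | os.O_ASYNC)

writer.sendall(b"ready")                     # the data becomes available

deadline = time.monotonic() + 2
while not received and time.monotonic() < deadline:
    time.sleep(0.01)                         # stands in for "other work"

print(received)
reader.close()                               # close the signalling side first
writer.close()
```

Between registering the handler and receiving the signal, the main flow never asks the kernel whether data has arrived.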

Asynchronous I/O model

The asynchronous I/O model is currently the ideal one: the application makes a system call, which returns immediately without waiting for the result, and carries on with its subsequent tasks. When the I/O operation completes, the application is notified of the result through a callback, with no blocking in between.

Linux has provided AIO, an asynchronous I/O implementation, since kernel 2.6, but few systems support it well.
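The programming model (start the operation, keep working, get the result through a callback) can be illustrated with Python's asyncio. This is only an analogy under a stated assumption: on Linux asyncio is built on I/O multiplexing rather than kernel AIO, so the sketch shows how the application code is shaped, not how the kernel delivers the result. The socket pair and the `log` list exist only for the demo.

```python
import asyncio
import socket

async def main():
    loop = asyncio.get_running_loop()
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    writer.setblocking(False)
    log = []

    # Start the read and return immediately; the result arrives via a callback
    read_task = asyncio.create_task(loop.sock_recv(reader, 1024))
    read_task.add_done_callback(lambda t: log.append(("callback", t.result())))

    log.append(("main", "free to do other work"))
    await loop.sock_sendall(writer, b"done")    # the peer eventually sends data
    await read_task                             # only so the demo can finish

    reader.close()
    writer.close()
    return log

log = asyncio.run(main())
print(log)
```

The code that starts the read never waits for it; the callback records the result whenever it arrives.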

Differences between select and epoll

Interview questions about how polling is implemented usually come down to the difference between select and epoll.

  • In terms of how they operate, select uses linear traversal. With many connections, imagine walking a large array on every call just to locate the ready ones, and the performance cost that implies. epoll needs no traversal: it uses a callback mechanism, which you can think of as a hash table that locates the right object quickly. select limits the number of file descriptors (the maximum number of connections) to 1024, whereas epoll has no such limit and typically supports around 100,000 connections on a machine with 1 GB of memory.
  • In terms of adoption, the popular high-performance web server Nginx achieves its high concurrency on top of epoll. Of course, with only a few connections the difference is small and select is good enough; for heavy traffic and high concurrency, epoll remains the preferred model.

Plain-language version

The I/O models and the evolution of polling, told as a story.

Story title: Xiao Ming's date with a girl
Plot: Xiao Ming meets a girl at a campus variety show and learns only her name and phone number. After several days of determined pursuit, against all odds, he finally wins her over!
Cast: male lead, Xiao Ming; female lead, the girl; supporting role, the gatekeeper

  1. Synchronous blocking I/O

Xiao Ming phones the girl and arranges to meet at the school gate. He is single-minded: he will not go home until he sees her, and he does nothing else in the meantime. He just keeps waiting!

  2. Synchronous non-blocking I/O

Xiao Ming phones the girl and arranges to meet at the school gate, but she is not ready yet. Xiao Ming is persistent: every so often he sends her a message to ask, until she is ready.

  3. I/O multiplexing

    1. select: Xiao Ming phones the girl and arranges to meet at the school gate, then asks gatekeeper Select for help. Select is very dedicated and questions every person who walks out, but he has a limit: he can only ask 1024 people.
    2. poll: Gatekeeper Poll works much like Select, except that he has no limit of 1024 and can keep asking indefinitely. Past 1024 people, though, the growing number of questions wears him out.
    3. epoll: Xiao Ming phones the girl and arranges to meet at the school gate, then asks gatekeeper Epoll for help. Epoll does not question everyone. Instead he requires everyone entering or leaving the school to carry a student ID card, so he knows which one is Xiao Ming's goddess. Once he spots her, he phones Xiao Ming.

  4. Signal-driven I/O

Xiao Ming phones the girl to meet at the school gate, and she replies that she is not ready (a few hours of makeup before going out…). This time Xiao Ming does not go and wait; he gets on with other things. When the girl is ready, she phones him to say so, and only then does he head to the school gate to wait for her.

  5. Asynchronous I/O

Xiao Ming tells the girl they will meet at the school gate. He does not stand there waiting; he goes back to the dormitory to rest, or plays ball with friends. When the girl arrives at the school gate, she phones Xiao Ming: "I'm here."

The next section covers the Event Loop in Node.js.

Author: you may
Link: www.imooc.com/article/285…
GitHub: Node.js Technology Stack
