Author: Conscience still lives


I/O Concept Differentiation

Four related concepts:

  • Synchronous

  • Asynchronous

  • Blocking

  • Non-blocking

Blocking I/O

Blocking means: when I am called, I do not return until I have received the data or the result.

In Linux, all sockets are blocking by default. A typical read flow looks something like this:

When a user process invokes a system call such as read()/recvfrom(), execution enters kernel space. If no network data has arrived yet, the kernel waits for it, and on the user side the whole process blocks. Once the kernel's data is ready, the kernel copies it from kernel space to user space, and only then is the user process released from the blocked state and resumed.

So blocking I/O is characterized by blocking in both phases of I/O execution: waiting for the data, and copying it from kernel space to user space.
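To make this concrete, here is a minimal sketch in C, using standard input as a stand-in for a socket (any descriptor in the default blocking mode behaves the same way):

```c
/* Minimal blocking-I/O sketch: read() on a descriptor in the default
 * (blocking) mode does not return until data arrives. */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[1024];

    /* Blocks here: the process sleeps until the kernel has data ready
     * and has copied it into buf. */
    ssize_t n = read(STDIN_FILENO, buf, sizeof(buf));
    if (n < 0) perror("read");
    else       printf("got %zd bytes\n", n);
    return 0;
}
```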

Non-blocking I/O

Non-blocking means: when I am called, I return immediately. A blocking call, by contrast, suspends the current thread until the result of the call is returned (the thread enters a non-runnable state in which the CPU allocates it no time slices, i.e. the thread is suspended); the function returns only after the result is obtained.

One might equate blocking calls with synchronous calls, but they are different. In a synchronous call the current thread is often still active; logically the current function simply has not returned yet, and the thread may still occupy the CPU to run other logic, for example proactively checking whether the I/O is ready.

The execution model is as follows: the user process issues the system call repeatedly; whenever the kernel's data is not ready, the call returns an error immediately instead of blocking, and the process tries again later.

As you can see, the hallmark of non-blocking I/O is that the user process must keep actively asking whether the data in kernel space is ready.
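Here is a minimal sketch of that polling loop in C, again assuming standard input stands in for the socket; setting O_NONBLOCK with fcntl() is what puts the descriptor into non-blocking mode:

```c
/* Minimal non-blocking sketch: set O_NONBLOCK, then keep asking. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[1024];
    int flags = fcntl(STDIN_FILENO, F_GETFL, 0);
    fcntl(STDIN_FILENO, F_SETFL, flags | O_NONBLOCK);

    for (;;) {
        ssize_t n = read(STDIN_FILENO, buf, sizeof(buf));
        if (n >= 0) {                      /* data (or EOF) has arrived */
            printf("got %zd bytes\n", n);
            break;
        }
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* Kernel data not ready: read() returned immediately.
             * Do other work here, then ask again. */
            usleep(100 * 1000);
            continue;
        }
        perror("read");                    /* some other error */
        break;
    }
    return 0;
}
```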

Synchronous I/O

In an operating system, the space a program runs in is divided into kernel space and user space. Code in user space cannot perform I/O operations (file reads and writes, socket sends and receives, etc.) directly; it must enter kernel space through system calls to carry out the actual operations.

And we all know that CPUs are much faster than hard disks, network I/O, and so on. Within a thread, the CPU executes code very quickly; however, once the code performs an I/O operation, such as reading or writing a file or sending network data, it must wait for that operation to complete before it can proceed to the next step. This situation is called synchronous I/O.

In fact, synchronous means that when a function call is made, the call does not return until the result is available. In other words, you do one thing at a time and must wait for the first thing to finish before starting the next.

In practice we seldom use plain synchronous I/O, because while a read or write is waiting for data, the system suspends the currently executing thread, and other code that needs the CPU cannot run on that thread. This is the drawback of synchronous I/O: one I/O operation blocks the current thread and prevents other code from executing, so naturally we turn to multiple threads or processes to run code concurrently.

However, multi-threading and multi-processing cannot eliminate the blocking problem, because the system cannot add threads and processes indefinitely: system memory is limited, and too many threads and processes make the overhead of switching between them so large that the time spent actually running code shrinks, seriously degrading system performance.

Asynchronous I/O

In simple terms, the user process does not need to wait for the kernel to finish the I/O read or write.

When an asynchronous procedure call is made, the caller does not get the result immediately. The component that actually handles the call notifies the caller through status, notifications, or callbacks after it completes.

An I/O operation is divided into two phases:

1. Preparing the data

2. Copying the data from kernel space into the user process's buffer

Whether blocking I/O or non-blocking I/O, both are synchronous I/O models: the call returns only once the first step has completed, and the second step is carried out by the current process. Asynchronous I/O returns as soon as the first step is started, and delivers a message only after the second step has finished. That is to say, asynchrony lets you go do other things from the first step onward.

The difference between synchronous I/O and asynchronous I/O is whether the process is blocked during the data copy.

The difference between blocking I/O and non-blocking I/O is whether the application's call returns immediately.

Asynchronous I/O hands the I/O operations over to the kernel and lets the kernel handle them. Synchronous I/O must wait for the data to be copied from the kernel-mode buffer to the user-mode buffer, so synchronous I/O blocks during that copy.
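As one concrete form of this on Linux, here is a minimal sketch using the POSIX AIO API (<aio.h>); "data.txt" is a made-up placeholder and error handling is abbreviated:

```c
/* Minimal POSIX AIO sketch: start a read, do other work, collect the
 * result. Compile with -lrt on older glibc. */
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    static char buf[4096];
    int fd = open("data.txt", O_RDONLY);   /* placeholder input file */
    if (fd < 0) { perror("open"); return 1; }

    struct aiocb cb;
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf    = buf;
    cb.aio_nbytes = sizeof(buf);
    cb.aio_offset = 0;

    aio_read(&cb);      /* returns immediately; the read proceeds in the background */

    while (aio_error(&cb) == EINPROGRESS) {
        /* both steps happen without us: free to do other work here */
    }
    printf("read %zd bytes\n", aio_return(&cb));
    close(fd);
    return 0;
}
```

(Note that glibc implements POSIX AIO with helper threads rather than in the kernel, which echoes the "pseudo-AIO" debate discussed below.)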

I/O Multiplexing

Multiplexed I/O is what we usually mean by select, poll, epoll, and so on. Its advantage is that a single process can handle I/O on multiple network connections at once. This works because functions such as select, poll, and epoll poll all the sockets they are responsible for and notify the user process when data arrives on any of them.

In general, Linux offers the following ways to read and write a character device; here is a comparison of their use:

1. Query mode: the application keeps querying whether an event has occurred; the whole process occupies the CPU and is very CPU-intensive.

2. Interrupt mode: when an event occurs, execution jumps to the corresponding handler, so little CPU time is used.

3. Poll mode: interrupt mode uses little CPU, but the application still has to call the read function in an endless loop and can do nothing else. The poll mechanism solves this: the read function is executed only when an event occurs, and other handlers can run while no key event has happened (if the specified timeout is exceeded, a "no key" message is returned).

Here we can see the advantage of poll. select, poll, and epoll are all I/O multiplexing mechanisms. I/O multiplexing is a mechanism for monitoring multiple descriptors and notifying the program to read or write once some descriptor is ready (typically ready for reading or writing). But select, poll, and epoll are all synchronous I/O in nature, because the application must do the reading and writing itself after the read/write event is ready; that is, the read/write process blocks. Asynchronous I/O, by contrast, does not do the reading and writing itself: the asynchronous I/O implementation takes care of copying the data from the kernel into user space.

In a server model built on epoll, epoll_wait() blocks until some fds are ready, copies the ready fds into the epoll_events set, and does nothing else (this takes only a short time), so epoll combined with non-blocking I/O is a very efficient and common server development pattern: synchronous non-blocking I/O. Some people call this model synchronous non-blocking (NIO), because the user thread must keep polling and read the data itself, so it looks as if a single thread is doing all the work. Others call it asynchronous non-blocking (AIO), because, after all, a kernel thread is responsible for scanning the fd list and populating the event list. Personally, I think truly ideal asynchronous non-blocking would have the kernel notify the user thread after filling the event list, or invoke a callback function registered by the application to handle the data; if the user thread still has to keep polling for event information, it is not perfect. So many people consider epoll to be pseudo-AIO, and not without reason.

The select function

This function allows a process to instruct the kernel to wait for any one of several events to occur, waking it only after one or more of those events has happened or a specified time has elapsed. select is declared as follows:

int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
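A minimal usage sketch, watching standard input as a stand-in for a socket, with a 5-second timeout:

```c
/* Minimal select() sketch: wait for one descriptor to become readable. */
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

int main(void)
{
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(STDIN_FILENO, &readfds);

    struct timeval tv = { .tv_sec = 5, .tv_usec = 0 };

    /* select() overwrites readfds on return, leaving only the ready
     * descriptors, so a real server re-initializes it before each call. */
    int n = select(STDIN_FILENO + 1, &readfds, NULL, NULL, &tv);
    if (n < 0)       perror("select");
    else if (n == 0) printf("timeout\n");
    else if (FD_ISSET(STDIN_FILENO, &readfds))
        printf("stdin is readable\n");
    return 0;
}
```

That re-initialization before every call is exactly the per-call user-to-kernel copy listed below as disadvantage (1).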

(1) Every call to select copies the fd set from user mode into kernel mode, which is expensive when there are many fds.

(2) Every call to select also requires the kernel to iterate over all the fds passed in, which is likewise expensive when there are many fds.

(3) select supports too few file descriptors; the default limit is 1024.

poll works much like select and is not fundamentally different from it: it manages multiple descriptors by polling and processes them according to their state, but poll places no limit on the maximum number of file descriptors. A drawback shared by poll and select is that the array containing all the file descriptors is copied wholesale between user space and the kernel address space whether or not the descriptors are ready, and this overhead grows linearly with the number of file descriptors.
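For comparison, a minimal poll() sketch under the same stand-in assumption; note there is no 1024-descriptor limit, but the pollfd array is still copied into the kernel on every call:

```c
/* Minimal poll() sketch: wait for one descriptor to become readable. */
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    struct pollfd pfd = { .fd = STDIN_FILENO, .events = POLLIN };

    int n = poll(&pfd, 1, 5000);          /* wait up to 5000 ms */
    if (n < 0)       perror("poll");
    else if (n == 0) printf("timeout, no data\n");
    else if (pfd.revents & POLLIN) {
        char buf[256];
        ssize_t len = read(STDIN_FILENO, buf, sizeof(buf));
        printf("read %zd bytes\n", len);
    }
    return 0;
}
```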

epoll was introduced in the 2.6 kernel as an enhanced version of select and poll. Compared to them, epoll is more flexible and has no descriptor limit. epoll uses one file descriptor to manage many others, storing the events for the file descriptors the user cares about in an event table inside the kernel, so that the copy between user space and kernel space happens only once.

Since epoll is an improvement on select and poll, it should avoid those three disadvantages. So how does epoll work? Before looking at that, note the difference in calling interfaces: select and poll each provide a single function (select or poll), whereas epoll provides three: epoll_create, epoll_ctl, and epoll_wait. epoll_create creates an epoll handle, epoll_ctl registers the event types to listen for, and epoll_wait waits for events to occur.
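A minimal sketch showing the three calls in order, again watching standard input instead of real sockets; a real server would register its listening socket and loop around epoll_wait:

```c
/* Minimal epoll sketch: create a handle, register one fd, wait. */
#include <stdio.h>
#include <sys/epoll.h>
#include <unistd.h>

int main(void)
{
    int epfd = epoll_create1(0);                 /* create an epoll handle */

    struct epoll_event ev = { .events = EPOLLIN, .data.fd = STDIN_FILENO };
    epoll_ctl(epfd, EPOLL_CTL_ADD, STDIN_FILENO, &ev);   /* register interest once */

    struct epoll_event events[16];
    int n = epoll_wait(epfd, events, 16, 5000);  /* block up to 5000 ms for events */
    for (int i = 0; i < n; i++)
        printf("fd %d is ready\n", events[i].data.fd);

    close(epfd);
    return 0;
}
```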

For the first shortcoming, epoll's solution lies in epoll_ctl. Each new event is registered into the epoll handle once (specifying EPOLL_CTL_ADD in epoll_ctl), and the fd is copied into the kernel at that point rather than being copied again on every epoll_wait. epoll thus guarantees that each fd is copied only once during the entire process.

For the second shortcoming, epoll does not add current to the device wait queue for each fd in turn the way select and poll do. Instead, it suspends current only once (which is unavoidable) and registers a callback function for each fd. When a device becomes ready and wakes up the waiters on its queue, the callback is invoked, and it adds the ready fd to a ready list. The job of epoll_wait is then just to check whether the ready list contains any ready fds (using schedule_timeout() to nap briefly, similar to step 7 in the select implementation).

For the third shortcoming, epoll has no such limit. The maximum number of fds it supports is the maximum number of files that can be opened, which is generally far above 2048; on a machine with 1 GB of memory it is about 100,000. You can check the number with cat /proc/sys/fs/file-max; in general it depends on the system's memory.

Conclusion:

select and poll must themselves repeatedly scan the entire fd set until a device is ready, possibly alternating between sleeping and waking several times. epoll, inside epoll_wait, also alternates between sleeping and waking, but when a device becomes ready it invokes the callback, which puts the ready fd on the ready list and wakes the process sleeping in epoll_wait. So although both alternate between sleep and wakefulness, select and poll traverse the entire fd set each time they are awake, whereas epoll merely checks whether the ready list is empty, which saves a great deal of CPU time. This is the performance benefit the callback mechanism brings.

Every call to select or poll copies the fd set from user mode to kernel mode and hangs current on the device wait queues; epoll copies the fds only once (at registration) and hangs current only once per epoll_wait. Note that the wait queue here is not a device wait queue, just a wait queue defined internally by epoll. This too saves considerable overhead.

Choose among select, poll, and epoll according to the specific situation and the characteristics of the three methods.

1. On the surface epoll performs best, but when there are few connections and the connections are very active, select and poll may actually perform better, since epoll's notification mechanism involves many function callbacks.

2. select is inefficient because it has to poll on every call. But inefficiency is relative; depending on the situation, it can be mitigated by good design.

This has been my take on select, poll, and epoll, drawing on many people's articles; if you have better ideas, you are welcome to share and discuss them.



Reference links:

blog.csdn.net/Crazy_Tengt…

tutorial.linux.doc.embedfire.com/zh_CN/lates…

www.zhihu.com/question/19…