When configuring a high-availability cluster, we used Keepalived to keep our Nginx server highly available, so that the whole application would not go down just because the Nginx server did. As one of the most popular web servers today, Nginx did not reach its current status by accident: excellent performance, scalability, a modifiable design, cross-platform support, and a very low failure rate are the key factors behind its popularity. So how is the overall architecture of Nginx designed? In this article (since truly understanding Nginx's core technology requires deep internal skills, which I don't have (cry)), we will gently lift the veil and explore how Nginx works inside.

The characteristics of Nginx

For Nginx to become so popular and so widely adopted by enterprises, it must have plenty of strengths. After consulting the relevant material, I found that earlier authors have summed up six areas in which Nginx does better than other kinds of web server software; these were also the main design goals of Nginx from the beginning. They are:

  • Performance
  • Scalability
  • Simplicity
  • Modifiability
  • Visibility
  • Portability

Performance:

Performance is the key to Nginx’s success. Even if Nginx did well in every other area, if it did not perform as well as other web servers, then in this performance-obsessed era Nginx would likely get the cold shoulder. Nginx also differs from traditional programs: other software, such as games, may need computing power and graphics-rendering capability, while Nginx, as a web server, has to compete in the network domain.

Nginx has done a lot of work on network performance, including an event-driven architecture with multi-stage asynchronous processing of requests and a master/worker process model, which together ensure that Nginx performs excellently in high-concurrency scenarios.

Scalability:

Scalability here can also be understood as extensibility, like plug-ins for the Chrome browser or Firefox (I couldn’t think of a better example). Nginx supports adding modules to enhance the service, and its good modular design also allows us to write custom modules, or use third-party ones, to meet extra business needs.

Simplicity:

Simplicity usually refers to the simplicity of components: the simpler each component is, the easier it is to understand, implement, and verify. Of course, Nginx components cannot be developed arbitrarily; they should follow Nginx’s unified module-development specification. The Nginx module interface is very simple, yet highly flexible.

Modifiability:

Nginx is open source under a BSD license, which means that when certain features of Nginx do not meet our requirements, we can modify its code to fit our business needs. Nginx also allows you to change certain web-server configurations and have them take effect without restarting or stopping the service (a smooth, or graceful, restart).
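In practice, such a smooth restart is usually triggered through the nginx binary or by signaling the master process. This is an ops fragment, not a runnable script; the pid-file path below is a common default and may differ on your system:

```shell
# Check the new configuration first, then reload without dropping
# in-flight requests: the master re-reads nginx.conf, starts new
# workers, and lets the old workers finish their current connections.
nginx -t && nginx -s reload

# Equivalently, send SIGHUP to the master process directly:
# kill -HUP "$(cat /var/run/nginx.pid)"   # pid path may vary
```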

Visibility:

Visibility is how transparent and open we are to the user. Nginx ships with http_stub_status_module for basic visibility, which lets us see how many connections Nginx currently has, how many requests it has handled, and so on. These monitoring values let operations staff better understand the overall health of the Nginx service and make timely adjustments. For example, when the Reading + Writing value is high, it means the application is currently under relatively heavy concurrency.
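A minimal way to expose these counters (assuming your Nginx was built with http_stub_status_module; the location name here is just an example) looks like this:

```nginx
server {
    listen 80;

    # Example status endpoint; restrict access in production
    location /nginx_status {
        stub_status;           # enable the built-in counters
        allow 127.0.0.1;       # only allow local queries
        deny  all;
    }
}
```

Querying the endpoint returns a few plain-text lines: Active connections, the cumulative accepts/handled/requests counters, and the Reading/Writing/Waiting values mentioned above.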

Portability:

Because Nginx is developed in C, it can run on many operating system platforms. Nginx also re-wraps logging, data structures, and other utilities, and its core code is implemented in an OS-independent way; where interaction with the operating system is involved, Nginx provides separate implementations for the different operating systems, much like the Java Virtual Machine does.

Having said all that, how does Nginx actually achieve this? Let’s take a brief look at why Nginx is so efficient from four angles: modular design, event-driven architecture, request handling, and process management.

Excellent modular design:

Every Nginx module follows the same design specification (ngx_module_t), which requires only a few core implementations, such as initialization, exit, and configuration handling. This has the same benefit as a Java interface (in Java, by analogy, everything except a few basic types is an object, yet every object can be constrained by an interface): it gives the module designer plenty of freedom while effectively preventing a module from causing problems for Nginx itself.

The ngx_module_t specification also allows module types to be customized. For example, in the earlier configuration article we mainly discussed the Nginx global block, the events block, and the http block; these correspond to Nginx module types. The HTTP modules are responsible only for processing HTTP requests, while all event processing is handed over to the events module.
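As a reminder, those module-owned blocks look like this in a minimal nginx.conf (the values and paths here are placeholders):

```nginx
worker_processes 2;            # global block: core-module directives

events {                       # owned by the events module
    worker_connections 1024;
}

http {                         # owned by the HTTP modules
    server {
        listen 80;
        location / {
            root /usr/share/nginx/html;   # example document root
        }
    }
}
```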

At the same time, Nginx introduces the concept of core modules. There are currently six core modules in Nginx, which handle the common concerns:

  • Logging (ngx_errlog_module)
  • Events (ngx_events_module)
  • SSL/security (ngx_openssl_module)
  • HTTP (ngx_http_module)
  • Mail (ngx_mail_module)
  • Core (ngx_core_module)

What is the advantage of this? It means that Nginx’s non-modular code, such as its framework code, only needs to care about how these six core modules are invoked, not how they are implemented. Likewise, the Nginx framework does not constrain the interfaces or functionality of the core modules. This simple, flexible design gives Nginx dynamic extensibility and configurability, and that dynamic customizability brings great convenience. How should we understand this?

It doesn’t matter whether the cat is black or white; a cat that catches mice is a good cat. The Nginx core does not care how you implement a core module, as long as you implement it, so core modules can be developed quite freely. Of course, they still have to follow the relevant specification, but the specification covers only a few parts; overall, the room left to a core module is very large.

Event-driven architecture:

Before we look at Nginx’s event-driven architecture, let’s see how a traditional web server works. On with the little theater:

Report! Report, King Tomcat, a request is coming!

Fine, send a thread along with it, in case it gets up to something. And remember: have the thread come back once the request has finished and left.

In a traditional web server, each request tends to be assigned a separate thread or process, which it keeps until the request ends. That is fine in itself, but if the request needs to read a file halfway through, the read can block on I/O, and all that thread can do is wait for it to finish. For the whole duration of the request, the thread occupies system resources, which are only released when the request ends and the thread is destroyed. If requests finish quickly, there is no problem; but if a request takes several minutes to process, every new request that arrives can only open yet another thread. With a little more concurrency, you hit the maximum thread count.

Of course, the above is just an example: Tomcat has supported NIO asynchronous I/O since version 7, and Tomcat 8 enables NIO mode by default on Linux.

Nginx is different. Traditional web servers tend to let an event consumer occupy a process all by itself, whereas Nginx’s event consumers are only called by the event distributor for short periods. In a traditional web server, for example, establishing a TCP connection is an event, and the connection is then handed to one process, which handles everything that follows, such as the read and write operations, consistently in that same process.

What makes Nginx unique is that:

When a TCP connection event arrives, it first goes to the event distributor, which hands it to the consumer responsible only for TCP connections. The read events on that connection have nothing to do with the connection consumer: when a read event arrives, it is distributed to the consumer responsible only for read events. Every event consumer is merely a short-lived call made by the event distributor process. This design improves network performance and perceived latency: each user request can be responded to promptly, and the network throughput of the whole server rises because events are handled in time.

If 200 requests hit a traditional web server, it allocates two hundred threads to handle them; if two hundred is the maximum it can create, later users can only wait for earlier requests to complete. In Nginx, those 200 requests are 200 connections: the connection-event consumer only handles the connection events, and everything else is then handled by other event consumers. So when the 201st request comes in, it still receives a successful connection response, because the TCP connection-event consumer has already finished processing most of the earlier connections.

That’s awesome.

Of course, there is a drawback: our event-consumer code must never block or sleep. If, say, the consumer responsible for connections blocks when a request comes in, the event distributor has to wait for it to finish before it can dispatch the next read event. Or, if the connection consumer is so idle that its process falls asleep, the distributor has to wake it up every time it calls it. So overall, Nginx is much harder to implement than a traditional web server.
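To make the distributor/consumer idea concrete, here is a toy single-process event loop in Python. This is only a sketch of the pattern, not Nginx’s actual code: the selector plays the event distributor, and the registered callbacks are short-lived event consumers.

```python
import selectors
import socket

sel = selectors.DefaultSelector()

def accept_conn(server_sock):
    # Consumer for "new connection" events only.
    conn, _addr = server_sock.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, handle_read)

def handle_read(conn):
    # Consumer for "readable" events only: a short burst of work.
    data = conn.recv(1024)
    if data:
        conn.sendall(data.upper())
    else:
        sel.unregister(conn)
        conn.close()

def dispatch_once(timeout=1.0):
    # The "event distributor": briefly invoke the matching consumer
    # for each ready event, then return to waiting.
    for key, _mask in sel.select(timeout):
        key.data(key.fileobj)
```

Registering a listening socket with accept_conn and calling dispatch_once in a loop serves many connections from a single process; the catch is exactly the one described above — no callback may block.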

Nginx event handling:

Multi-phase asynchronous processing of requests:

Speaking of multiple phases: only an event-driven mechanism lets Nginx split a single request into multiple phases, so the asynchronous multi-phase processing of requests is really built on top of Nginx’s event-driven architecture.

An HTTP request for a static file, for example, can be divided into seven phases:

  1. Establish the TCP connection — triggered by receiving a TCP SYN packet.
  2. Begin receiving the user request — triggered by receiving the TCP ACK that completes the handshake.
  3. Receive the request and check whether it is complete — triggered by receiving data packets from the user.
  4. Begin processing the request once a complete request has been received — triggered by receiving data packets from the user.
  5. Read a portion of the target static file and send it to the user — triggered by receiving a packet from the user, or by a TCP ACK indicating the previous packet was received (the TCP sliding window moves forward).
  6. For non-keep-alive requests, actively close the connection after the file has been sent — triggered by a TCP ACK indicating the user has received all previously sent packets.
  7. End the request because the user closed the connection — triggered by receiving a TCP FIN packet.

If readers with a weaker computer-networking foundation don’t fully follow this, it doesn’t matter: this article does not analyze how Nginx implements these operations concretely, but aims at a macro-level understanding of the ideas behind Nginx’s design and implementation.

Understood this way, every event gets a dedicated event consumer to handle it, and because each consumer has a single task (say, only handling connections, or only closes), its work is relatively easy and fast. The consumer responsible for TCP connection events can put an event down the moment it has processed it. This keeps every event-consumer process working at full speed; under high concurrency, processes rarely go dormant, because each one has so many events to handle that there is no time to sleep. In a traditional web server, once a process goes dormant, the user perceives the request as slow; and under high concurrency, since one request maps to one process (or thread), the system has to create more processes when they run out, and switching between processes consumes a considerable amount of operating-system resources, which degrades network performance.

But how do you divide a request into phases? A common approach is to look for the blocking calls in the request-processing flow.

For example, when send is used to deliver data to a user over a blocking socket handle, the current process has to sleep after send hands the packets to the operating-system kernel, until the data has been sent successfully. Nginx instead divides the send process into two phases based on their different trigger events:


  1. Hand the packets to the operating-system kernel, without waiting for the result.
  2. Receive the result of the send.

So you use a non-blocking socket handle, add it to the event mechanism, and after issuing the send you go and do something else; when the send completes, an event tells you, and you come back and deal with the result.
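Sketched in Python (an illustration of the two-phase idea using the selectors API, not Nginx’s C implementation; the function name start_send is mine):

```python
import selectors
import socket

def start_send(sel, conn, payload):
    # Phase 1: hand as many bytes as possible to the kernel and
    # return immediately instead of sleeping until completion.
    conn.setblocking(False)
    sent = conn.send(payload)
    rest = payload[sent:]
    if rest:
        # Phase 2 is pending: a writability event will tell us when
        # the kernel can accept the remaining bytes.
        sel.register(conn, selectors.EVENT_WRITE, rest)
    return rest
```

If start_send returns leftover bytes, the event loop later sees an EVENT_WRITE on the socket and sends them then; meanwhile the process is free to handle other requests.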

The same applies to large files: a blocking call can be broken down, by amount of work, into multiple method calls. For example, without asynchronous I/O enabled, a 1000 MB file can be processed as 1000 pieces of 1 MB each: after finishing one megabyte, Nginx immediately deals with other things, then comes back and processes the remaining 999 MB in turn. The advantage is that instead of processing 1000 MB in one go and waiting for it to finish, we free our hands to do other work first.

If a blocking operation really cannot be split into phases, Nginx spawns a separate process to handle the blocking call alone and sends a completion-event notification when it finishes. While that call blocks, its impact on other request processing is relatively small, because an additional process is doing it.

Management process + multi-worker process design:

Nginx adopts a master/worker mechanism in which each worker is an independent process, avoiding the extra overhead of locks. With multiple CPUs, the worker processes occupy different CPU cores, which improves network performance and reduces the average request latency; after all, ten processes are faster than one.

The master process does not handle requests; it mainly manages and monitors the other worker processes, so it does not occupy many system resources. Meanwhile, it can balance load among the workers through inter-process communication, for example by preferring the less-stressed worker processes when a request comes in. Likewise, since the processes are independent, a single worker crashing does not affect the others, which improves the reliability of the whole system and reduces the risk of the application failing because one process failed.
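A toy fork-based sketch in Python of this division of labor (illustrative only; names like spawn_workers are mine, and real Nginx workers run event loops rather than a one-shot function):

```python
import os

def spawn_workers(n, work):
    # Master: fork n independent worker processes; each worker runs
    # work(i) and exits, while the master keeps only their pids.
    pids = []
    for i in range(n):
        pid = os.fork()
        if pid == 0:
            os._exit(work(i))   # worker process: do the work, then exit
        pids.append(pid)
    return pids

def wait_workers(pids):
    # Master: reap workers and collect exit statuses; one worker
    # crashing does not affect its siblings.
    return [os.waitpid(pid, 0)[1] for pid in pids]
```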

As shown in the figure:

Now for the technical summary:

This is the last article in the beginner’s backend Nginx tutorial series. Through a simple exploration of Nginx’s architectural design, we have gained a shallow understanding of how Nginx is designed and how it works internally. Overall, the content of this article is foundational, and code-level analysis was barely mentioned. The first reason is that, since this is a beginner’s tutorial, I did not dive deep into the code design, focusing instead on macro-level explanations of the architecture and implementation ideas, so that at least we know how Nginx works even before reading its code. The second reason is that the Nginx source code is simply too complex for a novice like me to analyze thoroughly (this is the main reason).

Finally, thank you very much for reading this article; helping you makes me very happy. If you have any questions or criticism, please leave a comment at the bottom of this article, and I will reply to each one as time allows.

Hanshu’s study notes are all open source on GitHub; please give them a star!

There’s always love on the long march, so how about leaving a star?

Hanshu development notes

Welcome to like, follow me, have good fruit to eat (funny)

More in the backend Nginx series:

Nginx for the backend: The Basics

Nginx for the backend: Hands-on Practice

Nginx for the backend: Configuring a high availability cluster