

Source: huangguisu

Link: http://blog.csdn.net/hguisu/article/details/8930668



1. Nginx module and working principle


Nginx is made up of a kernel and modules. The kernel is deliberately small and compact: its job is simply to map a client request to a location block by looking up the configuration file, and each directive configured in that location starts a different module to do the actual work.


Nginx modules are structurally divided into core modules, basic modules and third-party modules:


Core modules: HTTP module, EVENT module and MAIL module


Basic modules: HTTP Access, HTTP FastCGI, HTTP Proxy, and HTTP Rewrite.


Third-party modules: HTTP Upstream Request Hash module, Notice module, and HTTP Access Key module.


Modules developed by users according to their own needs belong to third-party modules. It is with the support of so many modules that Nginx is so powerful.


Nginx modules are functionally divided into the following three categories.


Handlers. This type of module processes the request directly and does things such as producing output and modifying header information. Generally, only one handler module gets to process a given request.


Filters. This type of module mainly modifies the content output by other modules; the result is ultimately output by Nginx.


Proxies. These are modules such as Nginx’s HTTP Upstream module, which interact with back-end services such as FastCGI to implement service proxying and load balancing.


Figure 1-1 shows the normal HTTP request and response process of the Nginx module.


Nginx itself does very little work. When it receives an HTTP request, it simply maps the request to a location block by looking up the configuration file, and the directives configured in that location start different modules to do the work, so the modules can be seen as the real laborers of Nginx. The directives in a location usually involve one handler module and several filter modules (and of course multiple locations can reuse the same modules). The handler module is responsible for processing the request and generating the response content, while the filter modules post-process that response content.


Nginx modules are compiled directly into Nginx, i.e., statically. When Nginx starts, its modules are loaded automatically, unlike Apache, where modules are first compiled into shared object (.so) files and the configuration file then specifies whether to load them. When a request is being handled, each Nginx module may be a candidate to process it, but any given request is completed by only one module.


2. Nginx process model


In terms of working mode, Nginx has two modes: single worker process and multiple worker processes. In single-worker mode there is, besides the main process, one worker process, and that worker is single-threaded; in multi-worker mode, each worker process contains multiple threads. Nginx defaults to single-worker mode.


After Nginx starts, there will be one master process and multiple worker processes.


The master process


It is mainly used to manage the worker processes, which includes receiving signals from the outside, sending signals to all worker processes, monitoring the workers’ running state, and automatically restarting a new worker process when one exits (under abnormal circumstances).


The master process acts as the interface between the whole process group and the user, and monitors the processes. It does not need to handle network events and is not responsible for executing the service; it only manages the worker processes, providing functions such as service restart, smooth upgrade, log file replacement, and making configuration changes take effect on the fly.


To control nginx, we only need to send signals to the master process with kill. For example, kill -HUP <pid> tells Nginx to restart gracefully; we usually use this signal to restart nginx or reload its configuration. What does the master process do when it receives the HUP signal?


First, after receiving the signal, the master process reloads the configuration file, then starts the new worker processes and sends a signal to all the old worker processes telling them they can retire with honor. The new workers begin receiving new requests as soon as they start, while each old worker stops accepting new requests upon receiving the master’s signal and exits once every request it was already handling has been completed.


Of course, sending signals directly to the master process is the older way of doing things. Since version 0.8, nginx has provided a series of command-line arguments for easier management, such as ./nginx -s reload to reload the configuration and ./nginx -s stop to stop nginx.


How does that work? Take reload as an example. When we execute the command, a new Nginx process is started; after parsing the reload argument, it knows that our purpose is to make Nginx reload its configuration file, so it sends a signal to the master process, and from then on everything proceeds exactly as if we had sent the signal to the master process ourselves.
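In other words, ./nginx -s reload amounts to roughly the following sketch, which reads the master’s pid from its pid file and sends SIGHUP (the pid-file path here is an assumption; it depends on how your nginx was built):

#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

int main(void) {
    /* Assumed default pid-file location; adjust for your build. */
    FILE *f = fopen("/usr/local/nginx/logs/nginx.pid", "r");
    long pid;

    if (f == NULL || fscanf(f, "%ld", &pid) != 1) {
        perror("reading pid file");
        return 1;
    }
    fclose(f);

    /* The equivalent of "kill -HUP <pid>": ask the master to reload. */
    return kill((pid_t) pid, SIGHUP) == 0 ? 0 : 1;
}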


Worker processes:


Basic network events are handled in the worker processes. The multiple worker processes are peers: they compete on equal terms for requests from clients and are independent of one another. A request is processed in only one worker process, and a worker process cannot process another process’s request. The number of worker processes can be set, and it is generally set to match the machine’s number of CPU cores; the reason is inseparable from nginx’s process model and event-handling model.


Worker processes are equal, and each has the same opportunity to process a request. When we provide HTTP service on port 80 and a connection request comes in, every worker can potentially handle the connection. How does that happen? Each worker process is forked from the master process: in the master process, the listening socket (listenfd) is established first, and then the multiple workers are forked.


The listenfd of every worker process becomes readable when a new connection arrives. To ensure that only one process handles the connection, all workers race for accept_mutex before registering the listenfd read event; the process that grabs the mutex registers the event and calls accept() in its read-event handler to accept the connection. Once a worker has accepted the connection, it reads the request, parses it, processes it, generates a response and sends it back to the client, and finally closes the connection. That is a complete request.


We can see that a request is handled entirely by a worker process, and only within that one worker process.
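The whole pattern can be sketched in a few dozen lines of C. Real nginx takes accept_mutex via a shared-memory lock and registers a listenfd read event instead of blocking in accept(); this simplified demo substitutes flock(2) on a file for the mutex, and the port and lock path are invented for illustration:

#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/file.h>
#include <sys/socket.h>
#include <unistd.h>

#define WORKERS 4

int main(void) {
    /* Master: create the listening socket (listenfd) BEFORE forking,
       so every worker inherits the same descriptor. */
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);                /* demo port, not 80 */
    bind(lfd, (struct sockaddr *) &addr, sizeof(addr));
    listen(lfd, SOMAXCONN);

    int lock = open("/tmp/accept.lock", O_CREAT | O_RDWR, 0600);

    for (int i = 0; i < WORKERS; i++) {
        if (fork() == 0) {                      /* worker process */
            for (;;) {
                flock(lock, LOCK_EX);           /* grab the "accept mutex" */
                int cfd = accept(lfd, NULL, NULL);
                flock(lock, LOCK_UN);           /* release it for the peers */
                if (cfd < 0)
                    continue;
                /* Read, parse, process, respond: all inside this worker. */
                const char *resp = "HTTP/1.0 200 OK\r\n\r\nhello\r\n";
                write(cfd, resp, strlen(resp));
                close(cfd);
            }
        }
    }
    for (;;) pause();   /* master: real nginx monitors and respawns workers here */
}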


The nginx process model, then, is one master process supervising a group of peer worker processes.



3. Nginx+FastCGI operation principle


1. What is FastCGI


FastCGI is a scalable, high-speed interface for communicating between HTTP Server and dynamic scripting languages. Most popular HTTP servers support FastCGI, including Apache, Nginx, and Lighttpd. FastCGI is also supported by many scripting languages, including PHP.


FastCGI evolved from CGI. The main drawback of the traditional CGI interface is poor performance: every time the HTTP server encounters a dynamic program, it must start a fresh script-parser process to do the parsing before returning the results to the HTTP server. This is almost unusable under highly concurrent access. The traditional CGI interface is also very insecure and is now rarely used.


The FastCGI interface adopts a client/server structure, separating the HTTP server from the script-parsing server and running one or more script-parsing daemons on the latter. Whenever the HTTP server encounters a dynamic program, it hands it directly to a FastCGI process for execution and then returns the result to the browser. This lets the HTTP server concentrate on static requests, or on relaying the dynamic script server’s results to the client, which greatly improves the overall performance of the application system.


2. Nginx+FastCGI operation principle


Nginx does not support direct invocation or parsing of external programs. All external programs (including PHP) must be called through the FastCGI interface. On Linux, the FastCGI interface is a socket (it can be a file socket or an IP socket).


Wrapper: to invoke CGI programs, a FastCGI wrapper is also needed (a wrapper here is a program used to launch another program), bound to a fixed socket such as a port or a file socket. When Nginx sends a CGI request to that socket, the wrapper receives the request through the FastCGI interface and forks a new thread, which invokes the interpreter or external program to process the script and read back the result data. The wrapper then passes the returned data back to Nginx along the fixed socket through the FastCGI interface. Finally, Nginx sends the returned data (an HTML page or image) to the client. This is how Nginx+FastCGI works, as shown in Figure 1-3.



So, we first need a wrapper that does the following (a minimal sketch follows the list):


  1. Communicate with Nginx through the socket by calling FastCGI library functions (reading and writing the socket is implemented inside the FastCGI library and is opaque to the wrapper)

  2. Schedule threads, forking and killing them

  3. Communicate with the application (PHP)
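To make this concrete, here is a minimal FastCGI responder in C written against libfcgi (the classic fcgi_stdio.h API). It is a sketch, not php-fpm: FCGI_Accept() hides all of the socket reading and writing from point 1, and a process manager such as spawn-fcgi would bind the program to the fixed socket:

#include <unistd.h>
#include "fcgi_stdio.h"   /* libfcgi: redirects stdio to the FastCGI socket */

int main(void) {
    int requests = 0;

    /* Block until the web server sends a request over the FastCGI
       socket, handle it, then loop: the process stays resident. */
    while (FCGI_Accept() >= 0) {
        printf("Content-Type: text/plain\r\n\r\n");
        printf("request #%d served by pid %d\n", ++requests, (int) getpid());
    }
    return 0;
}

Compiled with something like gcc wrapper.c -lfcgi and launched under spawn-fcgi bound to a socket, nginx could then fastcgi_pass requests to it.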

3. spawn-fcgi and PHP-FPM


The FastCGI interface starts one or more daemon processes on the script-parsing server to parse dynamic scripts; these processes are the FastCGI process manager, or FastCGI engine. spawn-fcgi and PHP-FPM are the two FastCGI process managers that support PHP. The HTTP server is thereby completely freed up to deliver better responsiveness and concurrency.


Differences between spawn-fcgi and PHP-FPM:

1) spawn-fcgi was originally part of the lighttpd HTTP server; it is now a separate project and is usually used together with lighttpd to support PHP. However, lighttpd’s spawn-fcgi can leak memory during highly concurrent access and even restart the FastCGI processes automatically. While the PHP script processor is down, users who visit the site may get blank pages (PHP cannot be parsed, or an error occurs).


2) Nginx is a lightweight HTTP server and relies on a third-party FastCGI processor to parse PHP, which makes Nginx very flexible: it can connect to any third-party processor that provides PHP parsing (this is easy to set up in nginx.conf). spawn-fcgi can also be used with nginx (you need to install lighttpd alongside it, taking care to avoid port conflicts with Nginx; some older blogs have tutorials on installing it this way), but because spawn-fcgi has the defects described above, which users keep discovering over time, the nginx + spawn-fcgi combination is now slowly fading.


Because of spawn-fcgi’s defects, there is now a third-party PHP FastCGI processor called PHP-FPM, which has the following advantages over spawn-fcgi:


Since it was developed as a patch for PHP, it has to be compiled together with the PHP source code at installation time, i.e., it is compiled into the PHP core, so it delivers better performance.


It also handles high concurrency better than spawn-fcgi, at the very least without restarting the FastCGI processor automatically. The Nginx + PHP-FPM combination is therefore recommended for parsing PHP.


PHP-FPM also controls CPU and memory better than spawn-fcgi; spawn-fcgi crashes easily and has to be watched with crontab, whereas PHP-FPM needs no such babysitting.


The main advantage of FastCGI is that it separates the dynamic language from the HTTP server, which is why Nginx and PHP/PHP-FPM are often deployed on different servers to share the load on the front-end Nginx server: Nginx then handles static requests exclusively and forwards dynamic ones, while the PHP/PHP-FPM server exclusively parses PHP dynamic requests.


4. Nginx+PHP-FPM



PHP-FPM is a manager for FastCGI that exists as a plugin to PHP. Before PHP 5.3.3, using PHP-FPM meant installing it into PHP as a patch: it had to be patched into the PHP source code, and only after compiling and installing PHP could it be used; this was mandatory.


Since PHP 5.3.3, PHP-FPM has been integrated into PHP and is no longer a third-party package. PHP-FPM provides a better way to manage PHP processes: it controls memory and processes effectively and can reload the PHP configuration smoothly. It has more advantages than spawn-fcgi, which is why PHP adopted it officially. Enable it by passing --enable-fpm to ./configure.


FastCGI itself is already in the PHP 5.3.5 core, so there is no need to pass --enable-fastcgi to configure; older versions such as PHP 5.2 still require it.


Once Nginx and PHP-FPM are installed, configure the following:


The default configuration file for php-fpm is php-fpm.conf:

listen_address 127.0.0.1:9000 # the IP address and port that the PHP FastCGI processes listen on
start_servers                 # number of FastCGI child processes started at launch
min_spare_servers             # minimum number of idle child processes to keep
max_spare_servers             # maximum number of idle child processes to keep


To have Nginx run PHP, edit nginx.conf and add the following:


location ~ \.php$ {
    root html;
    fastcgi_pass 127.0.0.1:9000; # the port the FastCGI process listens on; nginx interacts with PHP through it
    fastcgi_index index.php;
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME /usr/local/nginx/html$fastcgi_script_name;
}


Nginx uses the location directive to hand every file ending in .php over to 127.0.0.1:9000, where the IP address and port are the ones the FastCGI processes listen on.


The overall workflow:

1) The FastCGI process manager php-fpm initializes itself, starts the master php-fpm process, and spawns start_servers CGI child processes. The master php-fpm process manages the FastCGI child processes and listens on port 9000. The FastCGI child processes wait for connections from the web server.

2) When a client request arrives, Nginx uses the location directive to hand every file with the .php suffix over to 127.0.0.1:9000 for processing.

3) The FastCGI process manager php-fpm selects and connects to a child CGI interpreter process. The web server sends the CGI environment variables and standard input to the FastCGI child process.

4) The FastCGI child process returns standard output and error information to the web server over the same connection. The request is finished when the FastCGI child process closes the connection.

5) The FastCGI child process then waits for, and handles, the next connection from the web server.


4. Configuring Nginx+PHP correctly


Web applications generally use a single unified entry point: all PHP requests are sent to the same file, which routes them internally by parsing “REQUEST_URI”.


Nginx configuration files are divided into blocks, commonly, from the outside in, “http”, “server”, “location”, and so on. The default inheritance relationship runs from the outside in: an inner block automatically takes the values of its enclosing block as defaults.


Such as:


server {
    listen 80;
    server_name foo.com;
    root /path;
    location / {
        index index.html index.htm index.php;
        if (!-e $request_filename) {
            rewrite . /index.php last;
        }
    }
    location ~ \.php$ {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME /path$fastcgi_script_name;
        fastcgi_pass 127.0.0.1:9000;
        fastcgi_index index.php;
    }
}


1) index should not be defined in location blocks


Whenever a new “location” is added later, the “index” directive would inevitably be defined yet again, because multiple “location” blocks are parallel to one another and there is no inheritance between them. “index” should therefore be defined in “server”, where it takes effect in every location.


2) Use try_files


Moving on to the “if” directive, it is safe to say that it is the most misunderstood Nginx directive:


if (!-e $request_filename) {
    rewrite . /index.php last;
}


Many people like to do a bunch of checks with the “if” directive, but that is exactly what the “try_files” directive is for:


try_files $uri $uri/ /index.php;


On top of that, beginners tend to think the “if” directive is kernel-level, but it is actually part of the rewrite module. Nginx configuration is declarative, not procedural, so mixing “if” with directives from non-rewrite modules can produce results you don’t expect.


3) The “fastcgi_params” configuration file:


include fastcgi_params;


Nginx ships with two FastCGI configuration files, “fastcgi_params” and “fastcgi.conf”. They differ very little; the only difference is that the latter contains one extra line defining “SCRIPT_FILENAME”:


fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;

Note that there is no slash between $document_root and $fastcgi_script_name: $fastcgi_script_name itself begins with a slash.


Nginx originally had only “fastcgi_params”; when it turned out that many people were defining “SCRIPT_FILENAME” by hard-coding it, “fastcgi.conf” was introduced to standardize usage.


But this raises a question: why introduce a new configuration file instead of modifying the old one? Because the “fastcgi_param” directive is array-valued. Like other array directives, a declaration at an inner level replaces the entire set inherited from the outer level; but unlike ordinary directives, multiple declarations at the same level add entries rather than replace them. In other words, if “SCRIPT_FILENAME” were defined twice at the same level, both values would be sent to the back end, which could cause subtle problems; “fastcgi.conf” was introduced precisely to avoid this.


There is also a security concern: with “cgi.fix_pathinfo” turned on in PHP, PHP may parse files of the wrong type as PHP files. If Nginx and PHP are installed on the same server, the simplest solution is to filter with the “try_files” directive:


try_files $uri =404;


Based on the previous analysis, here is an improved version. Isn’t it much cleaner than the original?


server {
    listen 80;
    server_name foo.com;
    root /path;
    index index.html index.htm index.php;
    location / {
        try_files $uri $uri/ /index.php;
    }
    location ~ \.php$ {
        try_files $uri =404;
        include fastcgi.conf;
        fastcgi_pass 127.0.0.1:9000;
    }
}


5. Why Nginx has high performance: the multi-process IO model


1. Benefits of nginx’s multi-process model


First, each worker is an independent process that needs no locking, so the overhead of locks is saved; this also makes programming and troubleshooting much more convenient.


Second, independent processes cannot affect one another: after one process exits, the others keep working and service is not interrupted, while the master process quickly starts a new worker. An abnormal worker exit necessarily means a bug in the program; it causes all requests on that worker to fail, but it does not affect all requests, so the risk is contained.


2. Nginx multi-process event model: asynchronous non-blocking


Nginx uses a multi-worker approach to handle requests, and each worker has only one main thread. Doesn’t that cap concurrency at the number of workers, and where would high concurrency come from? No, and this is the genius of Nginx: it processes requests asynchronously and without blocking, so a single worker can handle thousands of requests simultaneously.


The number of requests a worker process can handle at the same time is limited only by memory size, and architecturally there is almost no synchronization-lock overhead between worker processes handling concurrent requests. Worker processes normally never sleep. When the number of worker processes equals the number of CPU cores (and, ideally, each worker is bound to a specific core), the cost of inter-process switching is minimal.


Compare Apache’s usual way of working (Apache does have an asynchronous non-blocking variant, but it is rarely used because it conflicts with some of Apache’s own modules): each process handles only one request at a time, so when concurrency reaches the thousands, thousands of processes are handling requests simultaneously. This is a significant burden on the operating system: memory usage per process is large, process context switches impose heavy CPU overhead, performance naturally suffers, and the overhead is completely pointless.



So why can nginx work asynchronously and without blocking, and what does asynchronous non-blocking actually mean?


Let’s go back to the beginning and look at the complete lifecycle of a request: the request arrives, a connection is established, data is received, and data is sent back.


Down at the system level, these are read and write events, and when a read or write event is not ready it cannot be acted upon. If you do not call in a non-blocking way, you must call blockingly: the event is not ready, so you can only wait, and continue once it becomes ready. A blocking call enters the kernel and waits, giving the CPU away to others, which is obviously unsuitable for a single-threaded worker: when there are many network events, everyone is waiting, the CPU sits idle with nobody using it, CPU utilization cannot rise, and high concurrency is out of the question.


Well, you say, just add more processes; but then how is this different from Apache’s threading model? Be careful not to add unnecessary context switches. That is why blocking system calls are the biggest taboo in nginx. Don’t block: if a call is non-blocking and the event is not ready, it returns EAGAIN at once, telling you “the event isn’t ready yet, don’t fret, come back later.”


So a while later you check the event again, doing other things in between, until it is ready. This is no longer blocking, but you have to check the event’s status from time to time; you can get more done, but the overhead is not negligible either.
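The difference is easy to see at the system-call level. A rough C sketch (not nginx source) of the non-blocking pattern:

#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Switch a descriptor to non-blocking mode. */
int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    return flags < 0 ? -1 : fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* On a blocking fd, read() sleeps in the kernel until data arrives.
   On a non-blocking fd it returns immediately, ready or not. */
ssize_t try_read(int fd, char *buf, size_t len) {
    ssize_t n = read(fd, buf, len);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        /* "Not ready, come back later": the caller can do other work
           and retry when an event mechanism reports fd readable. */
    }
    return n;
}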


On the IO model: http://blog.csdn.net/hguisu/article/details/7453390

Nginx supports the following event models (from the nginx wiki):


Nginx supports the following methods of handling connections (I/O multiplexing methods), which can be specified with the use directive.


  • select: the standard method, and the compile-time default when no more efficient method is available on the current platform. It can be enabled or disabled with the configure parameters --with-select_module and --without-select_module.

  • poll: the standard method, and the compile-time default when no more efficient method is available on the current platform. It can be enabled or disabled with the configure parameters --with-poll_module and --without-poll_module.

  • kqueue: an efficient method on FreeBSD 4.1+, OpenBSD 2.9+, NetBSD 2.0 and MacOS X. On dual-processor MacOS X systems, using kqueue can cause a kernel crash.

  • epoll: an efficient method on Linux kernel 2.6 and later. Some distributions, such as SuSE 8.2, have patches that add epoll support to the 2.4 kernel.

  • rtsig: real-time signals, usable on Linux kernel 2.2.19 and later. By default, no more than 1024 POSIX real-time (queued) signals may be pending in the whole system, which is inefficient for heavily loaded servers, so the queue size must be raised via the kernel parameter /proc/sys/kernel/rtsig-max. As of Linux kernel 2.6.6-mm2, however, this parameter is gone and each process has its own signal queue, whose size is adjusted with RLIMIT_SIGPENDING. When the queue becomes too congested, nginx abandons it and falls back to the poll method until things return to normal.

  • /dev/poll: an efficient method on Solaris 7 11/99+, HP/UX 11.22+ (eventport), IRIX 6.5.15+ and Tru64 UNIX 5.1A+.

  • eventport: an efficient method on Solaris 10. To prevent kernel crashes, a security patch must be installed.

Under Linux, epoll is the only efficient method (selected with “use epoll” in the events block).


Now let’s see what makes epoll so efficient. epoll is the Linux kernel’s improvement on poll for handling large numbers of descriptors. Using it requires only three system calls: epoll_create(2), epoll_ctl(2) and epoll_wait(2). epoll(4) is a new API introduced in Linux kernel 2.5.44, and it is used extensively in the 2.6 kernel.
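A minimal event loop built from just those three calls might look like this (a bare-bones sketch with error handling omitted; the port is arbitrary):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_EVENTS 64

int main(void) {
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);
    bind(lfd, (struct sockaddr *) &addr, sizeof(addr));
    listen(lfd, SOMAXCONN);

    int epfd = epoll_create(1);                    /* size hint, ignored on modern kernels */
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = lfd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, lfd, &ev);      /* register interest in listenfd */

    struct epoll_event events[MAX_EVENTS];
    for (;;) {
        /* Returns ONLY the fds that are actually ready: no linear
           scan of the whole set, unlike select/poll. */
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;
            if (fd == lfd) {                       /* new connection */
                int cfd = accept(lfd, NULL, NULL);
                struct epoll_event cev = { .events = EPOLLIN, .data.fd = cfd };
                epoll_ctl(epfd, EPOLL_CTL_ADD, cfd, &cev);
            } else {                               /* readable client socket */
                char buf[4096];
                ssize_t r = read(fd, buf, sizeof(buf));
                if (r <= 0) { close(fd); continue; }
                write(fd, buf, r);                 /* echo the data back */
            }
        }
    }
}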


Advantages of epoll


  • It supports opening a large number of socket descriptors (FDs) in a single process


The most intolerable thing about select is that the number of FDs a process may open is limited, set by FD_SETSIZE (1024 by default on Linux). That is far too few for an IM server that must support tens of thousands of connections. You can change the macro and recompile the kernel, at the cost of reduced network efficiency, or choose a multi-process solution (the traditional Apache approach); but although the cost of creating a process on Linux is relatively small, it is still far from negligible, and inter-process data synchronization is much less efficient than inter-thread synchronization, so neither is a perfect answer. epoll has no such limit: the maximum number of FDs it supports is the maximum number of files that can be opened, generally far above 1024, e.g., around 100,000 on a machine with 1 GB of memory. The exact number can be checked with cat /proc/sys/fs/file-max; generally it depends on system memory.


  • IO efficiency does not decrease linearly as the number of FDs grows


Another fatal weakness of traditional select/poll: when you hold a large set of sockets but, due to network latency, only some of them are “active” at any moment, each call to select/poll still scans the entire set linearly, so efficiency falls linearly. epoll does not have this problem; it only operates on “active” sockets, because in the kernel implementation epoll is driven by a callback function on each FD. Only “active” sockets invoke their callbacks; idle sockets do not. In this respect epoll implements a “pseudo-AIO”, with the driving force in the OS kernel.


In some benchmarks, if all sockets are basically active, as in a high-speed LAN environment, epoll is no more efficient than select/poll; on the contrary, heavy use of epoll_ctl makes it slightly less efficient. But once idle connections are used to simulate a WAN environment, epoll’s efficiency far exceeds that of select/poll.


  • mmap is used to speed up message passing between kernel and user space


This point touches on epoll’s concrete implementation. Whether it is select, poll or epoll, the kernel has to notify user space of FD messages, and avoiding unnecessary memory copies is important; epoll does this by having the kernel mmap the same memory as user space. If, like me, you have been following epoll since the 2.5 kernel, you will not have forgotten the manual mmap step.


  • Kernel fine-tuning


This is not really a strength of epoll but of the Linux platform as a whole. You may be skeptical of the Linux platform, but you cannot deny the ability it gives you to fine-tune the kernel. For example, the kernel TCP/IP stack uses a memory pool to manage sk_buff structures, and the size of this pool (skb_head_pool) can be changed dynamically at runtime via echo XXXX > /proc/sys/net/core/hot_list_length. Likewise, the second parameter of listen() (the length of the queue of TCP connections that have completed the three-way handshake) can be adjusted dynamically according to your platform’s memory. You can even try the newest NAPI NIC driver architecture on a special system where the number of packets is huge but each packet is tiny.


(For more on epoll, see the epoll encyclopedia entry.)


It is recommended to set the number of workers to the number of CPU cores; more workers only make processes compete for CPU resources and cause unnecessary context switches. Moreover, to exploit multi-core, nginx provides a CPU-affinity binding option: binding a process to one core means its cache is not invalidated by process switching.
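On Linux, that binding boils down to sched_setaffinity(2), which nginx wraps in its worker_cpu_affinity option. A small sketch of the underlying call:

#define _GNU_SOURCE
#include <sched.h>

/* Pin the calling process to one CPU core so its working set stays
   warm in that core's cache (what worker_cpu_affinity arranges). */
int bind_to_cpu(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);   /* 0 = this process */
}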

Small optimizations like this are everywhere in Nginx and speak to the care of its authors. For example, when nginx compares 4-byte strings, it converts the 4 characters into an int and compares once, reducing the number of CPU instructions, and so on.
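The 4-byte comparison trick looks roughly like this (nginx’s real version is the ngx_str4cmp macro; this standalone sketch uses memcpy to sidestep alignment issues and assumes a little-endian CPU when building the constant):

#include <stdint.h>
#include <string.h>

/* One 32-bit comparison instead of four byte comparisons:
   does the buffer start with "GET "? */
static int is_get(const unsigned char *p) {
    uint32_t w;
    memcpy(&w, p, 4);   /* safe 4-byte load, aligned or not */
    return w == ((' ' << 24) | ('T' << 16) | ('E' << 8) | 'G');
}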


To summarize nginx’s event handling model:


while (true) {
    for t in run_tasks:
        t.handler();
    update_time(&now);
    timeout = ETERNITY;
    for t in wait_tasks: /* sorted already */
        if (t.time <= now) {
            t.timeout_handler();
        } else {
            timeout = t.time - now;
            break;
        }
    nevents = poll_function(events, timeout);
    for i in nevents:
        task t;
        if (events[i].type == READ) {
            t.handler = read_handler;
        } else { /* events[i].type == WRITE */
            t.handler = write_handler;
        }
        run_tasks_add(t);
}

