Nginx overview

Nginx is a free, open-source, high-performance HTTP server and reverse proxy. It also works as an IMAP/POP3/SMTP proxy server. Nginx can serve websites directly as an HTTP server and act as a reverse proxy for load balancing.

Here is a brief introduction to Nginx through three main aspects:

  • Reverse proxy
  • Load balancing
  • Nginx characteristics

1. Reverse proxy

About proxies

Before talking about proxies, we should first clarify a concept: a proxy is a representative, an intermediary channel.

Two roles are involved: the proxied role and the target role. The process in which the proxied role reaches the target role through the proxy to complete some task is called the proxy operation process. As in everyday life: when a customer goes to an adidas store to buy a pair of shoes, the store is the proxy, the proxied role is the adidas manufacturer, and the target role is the customer.

 

Forward proxy

Before discussing the reverse proxy, let's look at the forward proxy, which is the proxy mode people encounter most often. We will explain what a forward proxy is from two angles: software and daily life.

In today's network environment, if we need to visit certain foreign websites for technical reasons, we may find that those sites cannot be reached directly through a browser. The common workaround is to find a proxy server that can access the foreign sites: we send our request to the proxy server, the proxy server visits the foreign website, and then passes the retrieved data back to us.

This proxy mode is called a forward proxy. The biggest characteristic of a forward proxy is that the client knows exactly which server it wants to access, while the server only knows which proxy server the request came from, not which specific client sent it. The forward proxy mode masks or hides the real client's information.
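As a concrete sketch, nginx itself can act as a simple forward proxy for plain HTTP. The listening port and resolver address below are illustrative assumptions, not part of the article's setup:

```nginx
# Hedged sketch: a minimal plain-HTTP forward proxy.
# Clients point their browser's proxy setting at this server.
server {
    listen 8080;
    resolver 8.8.8.8;        # DNS server used to resolve the host the client asked for
    location / {
        # forward the request to whatever host the client originally wanted
        proxy_pass $scheme://$http_host$request_uri;
    }
}
```

Note that this only covers plain HTTP; proxying HTTPS traffic this way requires additional modules and is outside the scope of this sketch.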

 

Reverse proxy

Now that we understand the forward proxy, let's look at how a reverse proxy works. Take a large Chinese e-commerce site such as Taobao: the number of simultaneous connections to the site every day is enormous, and a single server is far from able to satisfy the people's ever-growing desire to shop. This is where a familiar term comes in: distributed deployment, that is, deploying multiple servers to solve the problem of limited capacity. Most of Taobao's functionality is also delivered directly through Nginx reverse proxying; Taobao's web server, Tengine, was created by packaging Nginx with other components. Interested readers can visit the Tengine website for details. So how, concretely, does a reverse proxy implement distributed cluster operation? Let's first look at a schematic diagram:

 

From the diagram above, you can see clearly that requests sent by multiple clients to the server are received by the Nginx server and then distributed, according to certain rules, to back-end business servers for processing. Here the source of each request (the client) is clear, but it is not clear which server handles the request; Nginx plays the role of a reverse proxy.

A reverse proxy is mainly used to hide server information in distributed deployments of server clusters.
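In nginx terms, this idea boils down to a proxy_pass directive. A minimal sketch, with hypothetical addresses:

```nginx
# Hedged sketch: clients only ever see nginx on port 80;
# the real business server behind it stays hidden.
server {
    listen 80;
    server_name www.example.com;           # hypothetical domain
    location / {
        proxy_pass http://127.0.0.1:8000;  # hypothetical back-end business server
    }
}
```

From the client's point of view, the response simply comes from www.example.com; which machine actually processed the request is invisible.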

Project scenario

In practice, the forward proxy and the reverse proxy are likely to coexist in the same application scenario: the forward proxy forwards client requests to the target server, where the target server is itself a reverse proxy server, and the reverse proxy in turn fronts multiple real business processing servers. The topology diagram is as follows:

2. Load balancing

Now that we’ve defined the concept of a proxy server, what rules does Nginx use to distribute requests as it acts as a reverse proxy server? Can the distribution rules be controlled for different project application scenarios?

The number of requests sent by clients and received by the Nginx reverse proxy is what we call the load.

The rule by which these requests are distributed to different servers is a balancing rule.

So the process of distributing the requests received by the server according to some rule is called load balancing.

In actual projects, load balancing comes in two forms: hardware load balancing and software load balancing. Hardware load balancing, also called hard load, such as F5 load balancers, is relatively expensive, but offers very good guarantees of stability and data security; companies such as China Mobile and China Unicom tend to choose hard load. For cost reasons, more companies opt for software load balancing, which implements a message-queue-style distribution mechanism using existing technology combined with the host hardware.

 

Nginx supports the following load balancing scheduling algorithms:

  1. Weighted round robin (default): received requests are assigned one by one, in order, to different back-end servers. Even if a back-end server goes down, nginx automatically removes it from the queue and request handling is unaffected. In this mode a weight value can be set for each back-end server to adjust the rate at which requests are distributed to each server; the larger the weight, the higher the probability of being assigned a request. The weight is tuned to each back-end server's hardware configuration in the actual working environment.

  2. ip_hash: each request is assigned according to the hash of the requesting client's IP address, so a client with a fixed IP always reaches the same back-end server. This to some extent solves the problem of session sharing in a clustered deployment environment.

  3. fair: a smart scheduling algorithm that dynamically allocates requests according to each back-end server's time from processing to response. Servers with short response times and high efficiency have a higher probability of being assigned requests, while servers with long response times and low efficiency have a lower one. It combines the advantages of the previous two algorithms. Note, however, that Nginx does not support fair by default; to use this scheduling algorithm, install the upstream_fair module.

  4. url_hash: allocates requests based on the hash of the requested URL, so each URL is directed to a fixed back-end server, which improves cache efficiency when Nginx is used as a static server. Note again that nginx does not support this scheduling algorithm by default; to use it, you need to install Nginx's hash software package.
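The four algorithms above map onto upstream blocks roughly as follows. This is a hedged sketch with hypothetical addresses; as noted, fair needs the upstream_fair module, and in recent nginx versions url_hash-style behavior is available through the built-in hash directive:

```nginx
upstream weighted_pool {            # 1. weighted round robin (built in, default)
    server 192.168.1.101 weight=3;  # receives ~3x the requests of the next server
    server 192.168.1.102 weight=1;
}

upstream ip_hash_pool {             # 2. ip_hash (built in)
    ip_hash;
    server 192.168.1.101;
    server 192.168.1.102;
}

upstream fair_pool {                # 3. fair (requires the upstream_fair module)
    fair;
    server 192.168.1.101;
    server 192.168.1.102;
}

upstream url_hash_pool {            # 4. url_hash (hash directive, nginx >= 1.7.2)
    hash $request_uri;
    server 192.168.1.101;
    server 192.168.1.102;
}
```

A server block then refers to one of these pools by name, e.g. `proxy_pass http://weighted_pool;`.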

Nginx installation

1. Windows installation

Official website download address:

https://nginx.org/en/download.html

 

 

As shown in the figure below, download the corresponding nginx package and extract it to the folder on your PC where you keep software.

 

After decompression, the file directory structure is as follows:

 

 

Start the nginx

1) Double-click nginx.exe to start the nginx server

2) Open a command prompt in that folder and run the nginx command; this also starts the nginx server directly

D:/resp_application/nginx-1.13.5> nginx

 

Visit nginx

Open the browser and enter http://localhost to access the web page. If the following page is displayed, the web page is successfully accessed

 

Stop nginx

Go to the nginx root directory and run the following command to stop the server:

D:/resp_application/nginx-1.13.5> nginx -s stop
D:/resp_application/nginx-1.13.5> nginx -s quit

 

2. Ubuntu installation

To install the software, run the following command:

$ sudo apt-get install nginx

The nginx executable lives in the /usr/sbin/ directory, and all nginx configuration files live under /etc/nginx/; the configuration files are used to configure the nginx server and its load balancing information.

 
Check whether the nginx process is started:

$ ps -ef | grep nginx

Nginx automatically creates a number of processes based on the number of CPU cores on the host (the Ubuntu host here has a 2-core, 4-thread configuration).

Note: four service processes are actually started here, because each nginx process starts together with a daemon. The daemon protects the official process from being terminated unexpectedly; if the official process is terminated, the daemon automatically restarts it.

The daemon is generally called the master process, and the business processes are called worker processes.

 
Start the nginx server command

Running nginx directly will start the server with the default configuration file

$ nginx

 
Stop the nginx service command

The execution process is the same as that of the Windows system

$ nginx -s stop
or
$ nginx -s quit

 
Restart loading

You can also reopen nginx or reload the related configuration files with the nginx -s reopen and nginx -s reload commands.

 

3. macOS installation

Install nginx directly with brew, or download the tar.gz package.

Install directly with brew:

brew install nginx

After installation, the commands for starting, viewing, stopping and restarting the server, and for reloading configuration files, are the same as above.

Nginx configuration

Nginx is a very powerful Web server, reverse proxy server, mail server, etc

The three core functions most used in project usage are reverse proxy, load balancing, and static server

How these three functions are used is closely tied to nginx's configuration. The nginx server's configuration information is mainly concentrated in the nginx.conf configuration file, and all configurable options fall roughly into the following sections:

main                        # nginx global configuration
events {                    # nginx working mode configuration
    ....
}
http {                      # http settings
    ....
    server {                # server host configuration
        ....
        location {          # route configuration
            ....
        }
        location path {
            ....
        }
        location otherpath {
            ....
        }
    }
    server {
        ....
        location {
            ....
        }
    }
    upstream name {         # load balancing configuration
        ....
    }
}

 

As shown in the above configuration file, it consists of six parts:

  1. main: configures nginx global information
  2. events: configures the nginx working mode
  3. http: configures http information
  4. server: configures server access information
  5. location: configures access routes
  6. upstream: configures load balancing

The main module

Look at the configuration code below

# user nobody nobody;
worker_processes 2;
# error_log logs/error.log;
# error_log logs/error.log notice;
# error_log logs/error.log info;
# pid logs/nginx.pid;
worker_rlimit_nofile 1024;

The above configuration items are stored in the main global configuration module

  • user: specifies the user and user group for the nginx worker processes; the default is nobody
  • worker_processes: specifies the number of child processes nginx starts. Each process consumes memory while running (usually several MB to tens of MB), so adjust the number to actual conditions; it is usually an integer multiple of the number of CPU cores
  • error_log: defines the location and output level of the error log file
  • pid: specifies where the process id is stored
  • worker_rlimit_nofile: specifies the maximum number of files a process can open

The event module

Straight to the configuration:

events {
    worker_connections 1024;
    multi_accept on;
    use epoll;
}

The above configurations are some of the operational configurations for the operating mode of the Nginx server

  • worker_connections: specifies the maximum number of connections that can be received at the same time; note that the overall maximum is determined together with worker_processes
  • multi_accept: configures nginx to accept as many connections as possible after being notified of a new connection
  • use epoll: configures the event-polling method; use epoll on Linux 2.6+, or kqueue on BSD systems such as macOS
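A rough rule of thumb follows from the two settings together (an assumption about a plain web-server role, not a guarantee):

```nginx
# With worker_processes 2 in the main block and the events block below,
# nginx can hold roughly 2 * 1024 = 2048 concurrent connections.
# Halve that estimate when nginx acts as a reverse proxy, since each
# client connection also needs a matching upstream connection.
events {
    worker_connections 1024;
    multi_accept on;
    use epoll;        # Linux 2.6+; use kqueue on BSD/macOS
}
```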

The HTTP module

As a web server, the http module is the core module of nginx, and it has the most configuration options. In a real project, many of them should be set according to the business scenario and the hardware; under normal circumstances, the default configuration is fine.

##
# Basic Settings
##
sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
# server_tokens off;
# server_names_hash_bucket_size 64;
# server_name_in_redirect off;
include /etc/nginx/mime.types;
default_type application/octet-stream;

##
# SSL Settings
##
ssl_protocols TLSv1 TLSv1.1 TLSv1.2; # Dropping SSLv3, ref: POODLE
ssl_prefer_server_ciphers on;

##
# Logging Settings
##
access_log /var/log/nginx/access.log;
error_log /var/log/nginx/error.log;

##
# Gzip Settings
##
gzip on;
gzip_disable "msie6";
# gzip_vary on;
# gzip_proxied any;
# gzip_comp_level 6;
# gzip_buffers 16 8k;
# gzip_http_version 1.1;
# gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

##
# Virtual Host Configs
##
include /etc/nginx/conf.d/*.conf;
include /etc/nginx/sites-enabled/*;

 

1) Basic configuration

  • sendfile on: lets nginx write files to the client straight from the data buffer instead of going through the application, which gives some performance benefit
  • tcp_nopush on: lets nginx send all headers in one packet, instead of sending them one by one
  • keepalive_timeout 10: assigns a keep-alive timeout to the client; the server closes the connection after this period
  • client_header_timeout 10: sets the timeout for reading the request header
  • client_body_timeout 10: sets the timeout for reading the request body
  • send_timeout 10: sets the timeout for transmitting the response to the client
  • limit_conn_zone $binary_remote_addr zone=addr:5m: defines the shared memory zone used to track connections per client address
  • limit_conn addr 100: limits each client address to at most 100 concurrent connections
  • server_tokens: does not make nginx faster, but turning it off hides the nginx version number on error pages
  • include /etc/nginx/mime.types: includes the mapping from file extensions to MIME types
  • default_type application/octet-stream: files not matched by a MIME type are treated as binary by default
  • types_hash_max_size 2048: the larger the value, the more memory is consumed, the lower the hash key collision rate and the faster the lookup; a smaller value uses less memory but increases collisions and slows lookup

2) Log configuration

  • access_log logs/access.log: sets the log file for access records
  • error_log logs/error.log: sets the log file for error records

3) SSL certificate encryption

  • ssl_protocols TLSv1 TLSv1.1 TLSv1.2: enables TLSv1, TLSv1.1 and TLSv1.2 (TLSv1.1 and TLSv1.2 require OpenSSL >= 1.0.1). SSLv3 is still in use in many places but has a number of vulnerabilities, so it is not enabled here
  • ssl_prefer_server_ciphers on: when negotiating the encryption algorithm, prefer our server's cipher suites over the client browser's

4) Compressed configuration

  • gzip: tells nginx to send data compressed with gzip, which reduces the amount of data we send
  • gzip_disable: disables gzip for the specified clients; we set it to IE6 and below to keep our setup widely compatible
  • gzip_static: tells nginx to look for pre-gzipped resources before compressing them itself. This requires you to pre-compress your files (commented out in this example), allowing you to use the highest compression ratio without nginx having to compress them again
  • gzip_proxied: allows or disallows compression of response streams based on the request and response; we set it to any, meaning all requests are compressed
  • gzip_min_length: sets the minimum number of bytes before compression kicks in. If a request is smaller than 1000 bytes, it is best not to compress it, because compressing such small data slows down all the processes handling the request
  • gzip_comp_level: sets the compression level, any number from 1 to 9, with 9 being the slowest but giving the highest compression ratio; we set it to 4, a compromise setting
  • gzip_types: sets the data formats to be compressed; some are already listed in the example above, and more can be added

5) File cache configuration

  • open_file_cache: specifies the maximum number of cache entries and how long to cache them; we can set a relatively long maximum time and clear entries once they have been inactive for more than 20 seconds
  • open_file_cache_valid: specifies the interval for checking the validity of the information in open_file_cache
  • open_file_cache_min_uses: defines the minimum number of times a file must be used within the directive's inactive period for it to stay cached
  • open_file_cache_errors: specifies whether error information is cached when a file is searched for, including adding files to the configuration again

Finally, the server modules are included; they are defined in separate files. If your server modules are not in those locations, you will have to modify this line to point to the correct location.

The server module

The server module configuration is a submodule of the http module. It defines a virtual access host, that is, the configuration information of one virtual server.

server {
    listen 80;
    server_name localhost 192.168.1.100;
    root /nginx/www;
    index index.php index.html index.htm;
    charset utf-8;
    access_log logs/access.log;
    error_log logs/error.log;
    ......
}

The core configuration information is as follows:

  • server: one virtual host configuration; one http block can contain multiple server blocks

  • server_name: specifies the ip address or domain name; separate multiple entries with spaces

  • root: the root directory of the whole server virtual host, i.e. the root of all web projects on the current host

  • index: the default index page shown when a user visits the website

  • charset: sets the default encoding format of the web pages under the configured www/ path

  • access_log: specifies the path for the virtual host server's access log files

  • error_log: specifies the path for the virtual host server's error log files

The location module

The location module is one of the most common configurations in nginx and is used to configure routing access information

In the configuration of routing access information, it is associated with reverse proxy, load balancing and other functions, so the location module is also a very important configuration module

The basic configuration

location / {
    root    /nginx/www;
    index    index.php index.html index.htm;
}Copy the code

location /: matches access to the root directory

root: specifies the web directory of the virtual host used when the root directory is accessed

index: the list of resource files shown by default when no specific resource is specified
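Building on the basic form above, a common variant routes static assets by file extension; the path and cache lifetime here are illustrative assumptions:

```nginx
# Hedged sketch: serve images/css/js straight from disk with browser caching.
location ~* \.(jpg|jpeg|png|gif|css|js)$ {
    root    /nginx/www;     # hypothetical static file root
    expires 30d;            # let browsers cache these assets for 30 days
}
```

The `~*` modifier makes the pattern a case-insensitive regular expression match, so JPG and jpg are handled alike.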

Reverse proxy configuration mode

In reverse proxy mode, access goes through the proxy server; the proxy_set_header configuration passes the client's information along so the proxying stays transparent.

location / {
    proxy_pass http://localhost:8888;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header Host $http_host;
}Copy the code

Uwsgi configuration

Configure the access mode for a server running in uwsgi mode:

location / {
    include uwsgi_params;
    uwsgi_pass localhost:8888;
}Copy the code

The upstream module

The upstream module is responsible for configuring load balancing and distributing requests to back-end servers using the default round robin scheduling method

The simple configuration is as follows

upstream name {
    ip_hash;
    server 192.168.1.100:8000;
    server 192.168.1.100:8001 down;
    server 192.168.1.100:8002 max_fails=3;
    server 192.168.1.100:8003 fail_timeout=20s;
    server 192.168.1.100:8004 max_fails=3 fail_timeout=20s;
}

The core configuration information is as follows

  • ip_hash: specifies the request scheduling algorithm; the default is weighted round robin (weight), and a different algorithm can be specified here

  • server host:port: an entry in the list of servers requests are distributed to

  • — down: this host's service is temporarily suspended

  • — max_fails: the maximum number of failed requests allowed before the service is suspended

  • — fail_timeout: after max_fails failures, the time to wait before the server is tried again
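Putting the upstream and location modules together, a server block refers to the upstream group by name in proxy_pass. The addresses below are the same hypothetical ones used above:

```nginx
upstream backend {
    server 192.168.1.100:8000 weight=2;
    server 192.168.1.100:8001 max_fails=3 fail_timeout=20s;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;                 # distribute by the rules in the upstream block
        proxy_set_header X-Real-IP $remote_addr;   # pass the real client ip to the back end
        proxy_set_header Host $http_host;
    }
}
```

With this wiring in place, nginx receives every client request on port 80 and balances it across the two back-end servers according to their weights and health.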