This is the 27th day of my participation in the August More Text Challenge

1. Nginx Overview

Developed with performance optimization in mind, Nginx is best known for its stability, low system resource consumption, and high capacity for concurrent connections (a single physical server can support 30,000 to 50,000 concurrent connections, while Tomcat typically handles a few hundred and rarely more than 1,000). It is a high-performance HTTP and reverse proxy server, as well as an IMAP/POP3/SMTP proxy server.

Clustered architectures can typically handle tens of thousands of connections.

Note: the C10K problem refers to 10,000 clients accessing a server at the same time.

Beyond its efficiency and stability, Nginx is most commonly used for reverse proxying, load balancing, and static/dynamic resource separation.

2. Nginx Features

2.1 Nginx Forward Proxy

Usually we access a host on the network directly by IP address (or, more precisely, a specific process on that host via a socket). With a forward proxy, the proxy server is set up deliberately by the user: the proxy is configured in the browser, and traffic to the Internet goes out through that proxy server.

If server B can forward requests from clients to server A, and server B can forward responses from server A to clients, then it’s obvious that we can access server A by accessing server B.

This is similar to a bastion host in operations work, which provides a single controlled point of access to other servers.

In the figure below, the client accesses server A by going through server B; note that server B is merely an intermediary between the client and server A.

Forward proxy uses:

  • Access resources that are otherwise unreachable, such as Google
  • Cache content to speed up access to resources
  • Authorize client access and authenticate users going out to the Internet
  • Record user access (online behavior management) and hide client information from the outside
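As a rough sketch of what this looks like in Nginx configuration (this handles plain HTTP only, and the listen port and resolver address are illustrative assumptions rather than values from this article):

server {
    listen 8888;                                   # port that clients configure as their proxy
    resolver 8.8.8.8;                              # DNS resolver used to look up the requested host

    location / {
        proxy_pass http://$http_host$request_uri;  # forward the request to whatever host the client asked for
    }
}

Clients then point their browser's proxy settings at this server's address and port.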

2.2 Nginx Reverse Proxy

Unlike a forward proxy, a reverse proxy does more than act as an intermediary that can reach the target server. A reverse proxy is set up by the server side, not by the user; the client does not even know a proxy is involved. It is the reverse proxy that chooses which target server handles the request.

A reverse proxy acts as a proxy for the server side (for example, a Web server) rather than for the client. A forward proxy lets clients reach external resources, while a reverse proxy lets many clients access resources on different back-end servers without knowing those servers exist; to the clients, all resources appear to come from the reverse proxy server itself.

In other words, the user sends a request to what it believes is the target server, and the reverse proxy decides which back-end node actually handles it.

Nginx can take on this reverse-proxy role, where servers A and B might be two Tomcat servers.

Reverse proxy functions:

  • Protect intranet security and guard against Web attacks: large websites usually expose the reverse proxy as the public address while the Web servers stay on the intranet.
  • Load balancing: distribute and optimize the site's load across back-end servers through the reverse proxy.
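A minimal reverse-proxy sketch is shown below; the back-end address stands in for a hypothetical Tomcat instance, and the domain name is only an example:

server {
    listen 80;
    server_name www.example.com;                  # public address that clients actually see

    location / {
        proxy_pass http://127.0.0.1:8080;         # hand the request off to a back-end application server
        proxy_set_header Host $host;              # preserve the original Host header
        proxy_set_header X-Real-IP $remote_addr;  # pass the real client IP through to the back end
    }
}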

2.3 Load Balancing

In the figure above, servers A and B provide identical functionality and can substitute for each other. By spreading the concurrent load that a single server would handle across multiple servers, the servers' bandwidth is expanded and throughput improves: this is load balancing.

Nginx provides two types of load balancing: built-in policies and extended policies.

The built-in policies are round robin, weighted round robin, and IP hash; extended policies (such as fair and url_hash, covered later) allow custom load-balancing behaviour.

Round robin

Weighted round robin

IP hash

With IP hash, a hash is computed over the client's IP address, and all requests from the same client IP are dispatched to the same back-end server. As a result, session state does not need to be shared across servers.

2.4 Static/Dynamic Separation

Static resources that rarely change can be placed directly on the Nginx server. When a client requests these static resources, Nginx can return them directly instead of forwarding the request to the server cluster, saving request and response time.

In this way, when a client accesses the Nginx server: if it requests a static resource, Nginx returns it directly; if it requests a dynamic resource, Nginx forwards the request to the cluster. This separates dynamic and static resources and improves overall system throughput.
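A sketch of such a configuration might look like the following; the file paths and back-end address are illustrative assumptions:

server {
    listen 80;
    server_name localhost;

    # static resources: served by Nginx directly from local disk
    location ~* \.(html|css|js|png|jpg|gif)$ {
        root    /data/static;
        expires 7d;                        # allow browsers to cache static files
    }

    # dynamic requests: forwarded to the application cluster
    location / {
        proxy_pass http://127.0.0.1:8080;
    }
}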

CDN

Content Delivery Network.

A CDN is an intelligent virtual network built on top of the existing network. Relying on edge servers deployed in many locations, together with a central platform providing load balancing, content distribution, and scheduling, it lets users fetch content from a nearby node, which reduces network congestion and improves response speed and hit ratio. The key technologies of a CDN are content storage and content distribution.

3. Nginx Principles

See my previous article for details of the installation.

3.1 Nginx process model

After starting Nginx, you can see that there are two processes: one master and one worker.

How do these two processes work?

  • Master process: the leader. It does not handle requests itself; it distributes tasks to and manages the worker processes.
  • Worker process: the worker that actually handles requests on behalf of the master.

By default there is one master process and one worker process, but the number of worker processes can be configured in nginx.conf, typically set to the number of CPU cores minus one.

#user  nobody;
worker_processes  1;

events {
    worker_connections  1024;   # maximum number of connections each worker process can handle; the total must stay below the Linux open-file limit (65535)
}

Conclusion:

  • The master hands every incoming request signal to a worker process for handling, much like a boss who takes on work and assigns it to staff.
  • The master also monitors the workers; if a worker exits abnormally, the master starts a new worker to take over its tasks.

3.2 Worker preemption mechanism of Nginx

1. Suppose the number of worker processes is set to 3 in the configuration file.

2. The principle: the master process forks three worker processes. When a client request reaches the Nginx server, the three workers compete for the accept_mutex lock, and the one that acquires it accepts and handles the request.

Note that the key here is that all worker processes coordinate through shared memory.
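As a sketch, the relevant settings live in the events block of nginx.conf (the default for accept_mutex differs between Nginx versions, so treat this as illustrative):

worker_processes  3;              # three workers competing for new connections

events {
    accept_mutex        on;       # workers take turns acquiring the lock before accepting a connection
    worker_connections  1024;
}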

3.3 Nginx event handling mechanism

1. On Linux, each worker process handles I/O using the operating system's epoll mechanism.

2. Epoll provides an asynchronous, non-blocking processing model, so if client1 blocks, the handling of client2 and client3 is not affected. In general, a single worker can handle on the order of 60,000 to 80,000 connections, which is why Nginx achieves such high concurrency.
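A minimal events block that makes the epoll choice explicit might look like this (on Linux, Nginx normally selects epoll automatically, so the use directive is optional):

events {
    use epoll;                    # event-driven I/O multiplexing on Linux
    worker_connections  10240;    # connections each worker may keep open at once
}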

3.4 nginx.conf

#user  nobody;
worker_processes  1;                 # number of worker processes

events {
    # Linux uses epoll by default
    worker_connections  1024;        # maximum number of connections per worker
}

http {
    include       mime.types;
    default_type  application/octet-stream;

    #log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
    #                  '$status $body_bytes_sent "$http_referer" '
    #                  '"$http_user_agent" "$http_x_forwarded_for"';

    #access_log  logs/access.log  main;

    sendfile        on;
    #tcp_nopush     on;              # used together with sendfile: send once buffered packets reach a certain size

    #keepalive_timeout  0;
    keepalive_timeout  65;           # timeout for client connections to the server; 0 means the connection is not kept alive

    #gzip  on;                       # compresses files and request data to reduce transfer size, at some CPU cost

    server {
        listen       80;
        server_name  localhost;

        #charset koi8-r;

        #access_log  logs/host.access.log  main;

        location / {
            root   html;
            index  index.html index.htm;
        }

        #error_page  404              /404.html;

        # redirect server error pages to the static page /50x.html
        #
        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   html;
        }

        # proxy the PHP scripts to Apache listening on 127.0.0.1:80
        #
        #location ~ \.php$ {
        #    proxy_pass   http://127.0.0.1;
        #}

        # pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000
        #
        #location ~ \.php$ {
        #    root           html;
        #    fastcgi_pass   127.0.0.1:9000;
        #    fastcgi_index  index.php;
        #    fastcgi_param  SCRIPT_FILENAME  /scripts$fastcgi_script_name;
        #    include        fastcgi_params;
        #}

        # deny access to .htaccess files, if Apache's document root
        # concurs with nginx's one
        #
        #location ~ /\.ht {
        #    deny  all;
        #}
    }

    # another virtual host using mix of IP-, name-, and port-based configuration
    #
    #server {
    #    listen       8000;
    #    listen       somename:8080;
    #    server_name  somename  alias  another.alias;

    #    location / {
    #        root   html;
    #        index  index.html index.htm;
    #    }
    #}

    # HTTPS server
    #
    #server {
    #    listen       443 ssl;
    #    server_name  localhost;

    #    ssl_certificate      cert.pem;
    #    ssl_certificate_key  cert.key;

    #    ssl_session_cache    shared:SSL:1m;
    #    ssl_session_timeout  5m;

    #    ssl_ciphers  HIGH:!aNULL:!MD5;
    #    ssl_prefer_server_ciphers  on;

    #    location / {
    #        root   html;
    #        index  index.html index.htm;
    #    }
    #}
}

3.5 Load Balancing Demonstration

1. Round robin (weight=1)

By default, if weight is not specified, all servers have the same weight.

Requests are allocated to the back-end servers one by one in order. If a back-end server goes down, it is removed from rotation automatically.

upstream bakend {
    server 192.168.1.10;
    server 192.168.1.11;
}
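To actually route traffic through this group, a server block points proxy_pass at the upstream name; a sketch using the bakend group defined above:

server {
    listen 80;
    server_name localhost;

    location / {
        proxy_pass http://bakend;   # requests are distributed across the upstream servers
    }
}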

2. Weight (weighted round robin)

Specifies the distribution probability: weight is proportional to the share of requests a server receives, which is useful when back-end servers have uneven performance.

If a back-end server goes down, it is removed automatically.

For example, with the following configuration, server 192.168.1.11 receives twice as many requests as 192.168.1.10.

upstream bakend {
    server 192.168.1.10 weight=1;
    server 192.168.1.11 weight=2;
}

3. ip_hash

Requests are allocated according to a hash of the client IP address, so each visitor always reaches the same back-end server; this solves the problem of sessions not being shareable across servers.

If a back-end server goes down, it must be marked down manually in the upstream configuration.

upstream resinserver{
    ip_hash;
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;
}

4. fair (third-party module)

Requests are allocated based on the response time of the back-end server, with priority given to those with short response times.

upstream resinserver {
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;
    fair;
}

5. url_hash (third-party module)

Requests are allocated according to a hash of the requested URL, so each URL is always directed to the same back-end server; this is most effective when the back-end servers are caches.

Add the hash statement to upstream (hash_method is the hash algorithm to use).

upstream resinserver {
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;
    hash $request_uri;
    hash_method crc32;
}