Today we’ll talk about some of the problems with Nginx load balancing:

  • The session problem
  • File upload and download

The server load problem is usually solved by splitting the load between multiple servers. Common solutions are:

  • Load distribution at the site entrance via sub-site links (Sky Software Station, Huajun Software Park, etc.)
  • DNS round-robin
  • F5 Physical device
  • Lightweight architectures such as Nginx

Nginx's upstream currently supports these allocation methods:

1. Polling (default): each request is allocated to a different backend server in chronological order; if a backend server goes down, it is automatically removed from rotation.
2. weight: specifies the polling probability; weight is proportional to the access ratio, used when backend server performance is uneven.
3. ip_hash: each request is allocated according to the hash of the client IP address, so each visitor consistently reaches the same backend server, which can solve the session problem.
4. fair (third party): allocates requests based on backend server response time, with priority given to those with short response times.
5. url_hash (third party): allocates requests based on the hash of the URL, so each URL is directed to the same backend server, which is more effective when the backend servers are used for caching.
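As a sketch of how the non-default methods above are declared (note: `fair` requires the third-party nginx-upstream-fair module to be compiled in, and the classic `url_hash` also required a third-party module, while newer nginx versions provide the built-in `hash` directive; the addresses below are placeholders):

```nginx
# weight: the heavier server receives proportionally more requests
upstream weighted {
    server 10.0.0.1:8080 weight=3;
    server 10.0.0.2:8080 weight=1;
}

# ip_hash: the same client IP always maps to the same backend
upstream sticky {
    ip_hash;
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
}

# fair: requires the third-party nginx-upstream-fair module
upstream fastest {
    fair;
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
}

# url_hash-style caching pool: the same URL always maps to the
# same cache server (built-in hash directive in modern nginx)
upstream cache_pool {
    hash $request_uri;
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
}
```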

Upstream: How does the load work

```nginx
http {
    upstream www.test1.com {
        ip_hash;
        server 172.16.125.76:8066 weight=10;
        server 172.16.125.76:8077 down;
        server 172.16.0.18:8066 max_fails=3 fail_timeout=30s;
        server 172.16.0.18:8077 backup;
    }

    upstream www.test2.com {
        server 172.16.0.21:8066;
        server 192.168.76.98:8066;
    }

    server {
        listen 80;
        server_name www.test1.com;
        location / {
            proxy_pass http://www.test1.com;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }

    server {
        listen 80;
        server_name www.test2.com;
        location / {
            proxy_pass http://www.test2.com;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
}
```


When a request arrives for www.test1.com/www.test2.c… it is dispatched to the server list defined by the corresponding upstream. Every request for test2 is distributed in turn across the servers, as in the default (polling) case. Requests for test1 are distributed to a fixed server according to the hash of the client IP address.

Different parameters can be set according to the server’s performance and function.

down: the server temporarily does not participate in load balancing.

weight: the larger the weight, the larger the share of requests the server receives.

backup: the backup server receives requests only when all other non-backup servers are down or busy.

max_fails: the number of failed requests allowed; once exceeded, the server is paused and requests are forwarded to other servers.

fail_timeout: the length of the pause after max_fails failures is exceeded.

Above is a simple configuration of Nginx load balancing. To continue our discussion in this section:

Session problems

Once we set up a group of load-balanced servers, requests to our web site are distributed among those servers. With the test2 configuration, where each request can land on any server, a visitor may hit server A and then have the next request sent to server B. The session established with server A then cannot be found on server B, so the request cannot be handled properly. Let's look at common solutions:

  • Cache sessions or credentials on a dedicated server
  • Store sessions or credentials in a database
  • Use Nginx ip_hash so that requests from the same IP address are assigned to a fixed server

The first approach, caching, sounds ideal, and cache reads are relatively efficient. But every request from every web server must then hit the session server. Doesn't that overload the session server?

The second approach, saving sessions to the database, gives us control over session validity but also increases the database's burden, eventually forcing us to load-balance SQL Server as well, which drags in problems of reads, writes, expiration, and synchronization.

The third option, pinning sessions to the same server via Nginx ip_hash, seems the most convenient and lightweight.

Under normal circumstances, ip_hash can solve the session problem when the architecture is simple, but consider the following case:


In this case, all the requests reaching the ip_hash server come from a proxy with a fixed IP address, so they all hash to the same backend. That backend becomes overloaded, and ip_hash loses its load-balancing function.

If caches can be shared and kept in sync, we can use multiple session servers to avoid overloading a single one. Can Memcached be used as the session cache server? A MemcachedProvider offers the session function while persisting sessions to the database. Why not save sessions directly to the database instead of going through Memcached? First, saving sessions directly to the database means checking their validity against the database on every request. Second, even if we put a cache in front of the database, that cache cannot be distributed, so the same cache server still gets overloaded.

There are successful cases online of using Memcached as a session cache, though database-backed implementations are more common, for example the open-source Discuz!NT forum. Small-scale distributed cache implementations are also common, with single sign-on as a special case.

File upload and download

Once load balancing is implemented, besides the session problem we also run into file upload and download problems. A file uploaded through one server lands only on that server, so a later request routed to a different server cannot find the file to download. Let's look at the following solutions:

  • A standalone file server
  • Storing files in a database

Both schemes are commonly used. Regarding the database option: the older approach was to compress binary files into a relational database; now, with the popularity of NoSQL and MongoDB's convenient file handling, a database-backed file store is also a choice. After all, a plain file server can fall short of a database in efficiency, manageability, and security.
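As a minimal sketch of the standalone-file-server option (hostnames, paths, and the size limit below are made up for illustration), upload and download traffic can be proxied to a dedicated upstream so that files always land on, and are served from, the same machines regardless of which web server handled the page:

```nginx
upstream file_servers {
    # hypothetical dedicated file-storage hosts
    server 10.0.1.10:8080;
    server 10.0.1.11:8080 backup;
}

server {
    listen 80;
    server_name www.example.com;

    # route all upload/download traffic to the file servers,
    # bypassing the load-balanced web pool
    location /files/ {
        proxy_pass http://file_servers;
        proxy_set_header Host $host;
        client_max_body_size 50m;   # allow larger uploads
    }
}
```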

This is just a passing remark: it reflects a trend in some applications and offers one more possible solution.

Nginx can be used not only as a powerful web server but also as a reverse proxy server. It can separate dynamic and static pages according to configured rules, load-balance backend servers by polling, IP hash, URL hash, weight, and other methods, and it also supports backend server health checks.

If there is only one server and that server dies, it is a disaster for the site. This is where load balancing comes into its own, automatically weeding out failed servers.


Below is a brief introduction to my experience using Nginx as a load balancer.

Downloading and installing Nginx will not be covered here.

Configuring Nginx load balancing works the same way on Windows and Linux.

Nginx load balancing

Upstream currently supports these allocation methods:

1) Polling (default): each request is allocated to a different backend server in chronological order; if a backend server goes down, it is automatically removed.
2) weight: specifies the polling probability; weight is proportional to the access ratio, used when backend server performance is uneven.
3) ip_hash: each request is allocated according to the hash of the client IP, so each visitor has fixed access to a backend server, which can solve the session problem.
4) fair (third party): allocates requests based on backend server response time, with priority given to those with short response times.
5) url_hash (third party): allocates requests based on the hash of the URL, directing each URL to the same backend server.

Configuration:

```nginx
upstream myServer {
    server 127.0.0.1:9090 down;
    server 127.0.0.1:8080 weight=2;
    server 127.0.0.1:6060;
    server 127.0.0.1:7070 backup;
}
```

Add proxy_pass http://myServer; under the server node that needs to use the load balancing.

The status of each device in upstream:

down: the server does not participate in the current load.

weight: defaults to 1; the larger the weight, the larger the share of the load.

max_fails: the number of failed requests allowed, 1 by default. When the maximum is exceeded, the error defined by the proxy_next_upstream module is returned.

backup: the backup machine is requested only when all other non-backup machines are down or busy, so this machine has the least pressure.

Nginx also supports multi-group load balancing: multiple upstream blocks can be configured to serve different server blocks.

Configuring load balancing is relatively simple, but the most critical problem is how to share sessions among multiple servers.

There are several ways to do this (the following is collected from the web; I have not tried the fourth method in practice).

1) Use cookies instead of session

If you can change sessions to cookies, you avoid some of the drawbacks of sessions. A J2EE book I read also points out that sessions should not be used in a clustered system, otherwise things become difficult. If the system is not complex, consider removing sessions first; if that is too much trouble, use the methods below.

2) The application server implements sharing by itself

ASP.NET can store sessions in a database or in memcached, which effectively builds a session cluster inside the application servers themselves. Sessions are then stable and are not lost even if a node fails. This suits situations that are strict about reliability but not demanding about speed; the approach is not very efficient and is unsuitable where high performance is required.

Both of the methods above have nothing to do with nginx. The following are the nginx-based approaches:

3) ip_hash

Nginx's ip_hash technique can direct requests from a given IP address to the same backend, so that a client and a backend establish a stable session. ip_hash is defined in the upstream configuration:

```nginx
upstream backend {
    server 127.0.0.1:8080;
    server 127.0.0.1:9090;
    ip_hash;
}
```

ip_hash is easy to understand, but because it can only use the IP as the factor for assigning a backend, it has flaws and cannot be used in some scenarios:

1) Nginx is not the front-most server. ip_hash requires nginx to be the front-most server; otherwise, nginx cannot obtain the correct client IP and cannot hash on it. For example, if squid is used as the front end, nginx only sees the squid server's IP address, and hashing on that address is certainly wrong.
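One possible workaround for this case (a sketch, assuming nginx is built with the standard ngx_http_realip_module and the front-end proxy fills in X-Forwarded-For; the addresses are placeholders) is to restore the real client address before ip_hash runs:

```nginx
upstream backend_pool {
    ip_hash;    # hashes on the restored client IP
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
}

server {
    listen 80;

    # trust X-Forwarded-For only when it comes from the
    # front-end proxy (placeholder for your squid host)
    set_real_ip_from 192.168.1.100;
    real_ip_header   X-Forwarded-For;

    location / {
        proxy_pass http://backend_pool;
    }
}
```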

2) There is other load balancing behind the nginx backend. If another load balancer behind nginx diverts requests in a different way, then a given client's requests cannot be directed to the same session application server. The nginx backend should point directly to the application servers, or to another squid that then points to the application servers. The best workaround is to split traffic with location: route the part of the requests that needs sessions through ip_hash, and send the rest to other backends.
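The location-based split just described might be sketched like this (upstream names, addresses, and paths are illustrative):

```nginx
upstream session_pool {
    ip_hash;                 # session-dependent traffic sticks to one backend
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
}

upstream stateless_pool {
    server 10.0.0.3:8080;    # stateless traffic can go anywhere
    server 10.0.0.4:8080;
}

server {
    listen 80;
    server_name www.example.com;

    # requests that need a session go through ip_hash
    location /app/ {
        proxy_pass http://session_pool;
    }

    # everything else is balanced freely
    location / {
        proxy_pass http://stateless_pool;
    }
}
```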

4) upstream_hash

To resolve some of ip_hash's issues, you can use the third-party upstream_hash module. It is mostly used for url_hash, but nothing prevents using it for session sharing:

If the front end is squid, it adds the client IP to the X-Forwarded-For HTTP header, and upstream_hash can use this factor to direct requests to a designated backend:

See this document: www.sudone.com/nginx/nginx…

The document uses $request_uri as the hash factor; changed slightly, it becomes:

hash $http_x_forwarded_for;

Or, if the application keeps its session ID in a cookie, replace x_forwarded_for with the cookie, for example:

hash $cookie_jsessionid;

If PHP sessions are configured to be cookie-free, nginx can generate a cookie of its own with the userid module (Wiki.nginx.org/NginxHttpUs…). For Java backends, see also http://code.google.com/p/nginx-upstream-jvm-route/
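Putting the pieces above together, here is a sketch of hashing on a session cookie (in modern nginx the built-in hash directive covers what the old upstream_hash module did; the cookie name jsessionid and the addresses are examples, not requirements):

```nginx
upstream app_servers {
    # the same JSESSIONID cookie value always maps to the same backend
    hash $cookie_jsessionid;
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://app_servers;
    }
}
```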