Nginx supports load balancing, which makes it convenient to scale an application horizontally. But on what basis does it distribute requests, and which load balancing algorithms can be selected and configured? Let's take a closer look.

Definition of load balancing

What is load balancing? Consider the following situation:

  1. When a client sends a request, it reaches Nginx first, and Nginx then distributes the request to one of several backend servers.
  2. If one of the backend servers goes down, Nginx automatically skips it and stops sending requests to it.
  3. If a new server is added, Nginx distributes requests to that server as well.

My understanding of load balancing is:

Requests from clients are distributed evenly across the backend application servers, which relieves the pressure on any single server.

And the service keeps running properly when a server goes down or when new servers are added.

Load balancing method

Now that you know what load balancing is, how does Nginx implement it?

Upstream and Server usage

The modules in Nginx that are responsible for interacting with upstream servers are collectively referred to as the upstream module.

The upstream and server directives specify the addresses of the upstream services:

When the address of the upstream server is specified, it can be a domain name, IP address, or Unix socket address.

You can add a port to the domain name or IP address. If no port is added, port 80 is used by default.

You can add parameters after the address, such as:

backup: Marks the server as a backup server. Requests are forwarded to it only when the non-backup servers are unavailable.

down: Marks the server as offline, so it no longer receives requests.

Here’s an example:

    upstream upstream-service {
        server 127.0.0.1:17002;
        server 127.0.0.1:17000;
    }
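
As a rough sketch of the address forms and parameters described above, the block below mixes an IP address with a port, a domain name without a port, and a Unix socket, and marks one server as backup and another as down. The host name backend.example.com, the socket path /tmp/backend.sock, the extra port 17001, and the listen port 8080 are placeholders made up for illustration. Both blocks are assumed to sit inside the http context, and proxy_pass is used to route requests to the named upstream group.

    upstream upstream-service {
        server 127.0.0.1:17002;             # IP address with an explicit port
        server backend.example.com;         # domain name, port 80 by default (hypothetical host)
        server unix:/tmp/backend.sock;      # Unix socket (hypothetical path)
        server 127.0.0.1:17000 backup;      # used only when the non-backup servers are unavailable
        server 127.0.0.1:17001 down;        # taken out of service
    }

    server {
        listen 8080;                        # hypothetical port
        location / {
            proxy_pass http://upstream-service;
        }
    }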

round-robin

The upstream module also provides a basic round-robin load balancing algorithm.

Its function is:

To access the upstream services specified by the server directive in weighted round-robin fashion.

This algorithm is built into the Nginx framework by default and cannot be removed. All the algorithms in the following sections are based on it, and each of them falls back to round-robin in certain special cases.

The server parameters involved are:

  1. weight: Weight of the server. The default value is 1.
  2. max_conns: Maximum number of concurrent connections to the server. The default value is 0, meaning no limit. Unless the upstream group resides in shared memory, the limit applies to each worker process separately.
  3. max_fails: Maximum number of failed attempts within the period set by fail_timeout. Once this number is reached, the server is not selected again for the duration of fail_timeout. The default value is 1.
  4. fail_timeout: Unit: seconds. The default value is 10 seconds. It is both the period within which max_fails failures are counted and the length of time the server is treated as unavailable once max_fails is reached.
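
A minimal sketch of how these parameters might be combined. The specific numbers are arbitrary illustrations, not recommendations.

    upstream upstream-service {
        # weight=3 against the default weight of 1: roughly three of every four requests.
        # max_conns=100: at most 100 simultaneous connections to this server.
        server 127.0.0.1:17002 weight=3 max_conns=100;

        # After 2 failed attempts within 30 seconds, the server is skipped for the next 30 seconds.
        server 127.0.0.1:17000 max_fails=2 fail_timeout=30s;
    }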

Simple hash module

Sometimes the plain round-robin algorithm is not enough for our needs.

For example, if the application does not have a dedicated server to manage cookies, we want the same user to always be assigned to the same server.

Another example: our backend application needs requests with the same parameters or URL to be handled by the same server.

For the first case, you can use upstream_ip_hash. For the second case, you can use upstream_hash.

upstream_ip_hash

Function:

The IP address of the client is used as the key of the hash algorithm and mapped to a specific upstream server.

1. For an IPv4 address, the first three bytes are used as the key. For an IPv6 address, the full address is used.

2. The parameters of the round-robin algorithm can be used.

3. The IP address used by the algorithm can be changed with the realip module.

Here’s an example:

    upstream upstream-service {
        ip_hash;
        server 127.0.0.1:17002;
        server 127.0.0.1:17000;
    }
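
When Nginx itself sits behind another proxy, every request would otherwise appear to come from that proxy's address. A possible sketch of the realip approach mentioned above follows. It assumes that Nginx was built with the realip module (--with-http_realip_module), that the front proxy is at the made-up address 10.0.0.1, and that this proxy passes the client address in the X-Forwarded-For header.

    # Inside the http context:
    set_real_ip_from 10.0.0.1;          # trusted front proxy (hypothetical address)
    real_ip_header   X-Forwarded-For;   # header assumed to carry the original client address

    upstream upstream-service {
        ip_hash;                        # hashes the client address after realip has replaced it
        server 127.0.0.1:17002;
        server 127.0.0.1:17000;
    }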

upstream_hash

Function:

You can specify an arbitrary key for the hash algorithm, which maps it to a specific upstream server.

1. The key can contain variables and strings.

2. The parameters of the round-robin algorithm can be used.

Here’s an example (using the username request parameter as the hash key):

    upstream upstream-service {
        hash user_$arg_username;
        server 127.0.0.1:17002;
        server 127.0.0.1:17000;
    }
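
For the URL-based case mentioned earlier, the key could be the built-in $request_uri variable instead of a request argument. This is a sketch of one option, not the only way to key on the URL.

    upstream upstream-service {
        hash $request_uri;              # same URI (including the query string) -> same server
        server 127.0.0.1:17002;
        server 127.0.0.1:17000;
    }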

Consistent hashing algorithm

The hash algorithm meets our business needs to a certain extent. But if an application server goes down or capacity is added, the total number of servers used by the hash changes. Many requests are then likely to be mapped to a different server than before, the existing routing no longer holds, and the application can be seriously affected. In this case, a consistent hashing algorithm can be used.

For an understanding of consistent hashing algorithms, see this article: Understanding and Practice of Consistent Hashing Algorithms

Using consistent hashing in Nginx is simple: add the consistent parameter to the hash directive.

For example (still using the username request parameter as the hash key):

    upstream upstream-service {
        hash user_$arg_username consistent;
        server 127.0.0.1:17002;
        server 127.0.0.1:17000;
    }

Conclusion

These are some of the more common load balancing methods in Nginx. There are others as well, such as the least-connections algorithm. If you have any questions, feel free to leave a comment below.

If you are interested, you can visit my blog or follow my WeChat official account and Toutiao account.

death00.github.io/