This article builds on the Nginx knowledge covered in the previous front-end [Nginx] article.

At present, the complexity of back-end systems comes mainly from three aspects: high availability, high performance, and scalability. Each aspect has a wealth of metrics and solutions; this article introduces some common concepts around high availability as a primer on the topic. Learning about scalability can also help you organize your skills as you move up the ladder. The Nginx-related availability solution (reload) is covered as well.

Understanding high availability

High availability (abbreviated HA) refers to a system's ability to perform its functions without interruption; it represents the degree to which the system is available and is one of the criteria for system design. A highly available system can stay up longer than the individual components that make it up.

High availability is usually achieved by improving the fault tolerance of the system. It can be designed to reduce the time during which the system cannot provide service and to eliminate single points of failure, ensuring long-term continuous and correct operation. If a system always provides service, we say it is 100% available. A common corporate high-availability goal is five nines (99.999%), which allows roughly five minutes of downtime per year.

When do availability problems occur

Downtime mainly falls into two categories:

  • Planned outages: typically the result of maintenance that must interrupt the system during a scheduled window, such as software upgrades or database and server maintenance.
  • Unplanned downtime: usually refers to power outages, CPU failures, or other physical-layer faults such as network failures, middleware failures, and so on.

Looking at these two categories more specifically, downtime is mainly caused by the following problems:

  1. Network problems: faulty network connections, congested network bandwidth

  2. Performance issues: slow database SQL, Java full GC pauses, excessive disk I/O, high CPU usage, insufficient memory

  3. Security issues: network attacks such as DDoS

  4. Operations problems: systems are constantly updated and modified, architectures are constantly adjusted, and problems slip past monitoring

  5. Management issues: key services and their dependencies are not mapped out, and operational information is not kept in sync with the control systems

  6. Hardware problems: damaged hard disks, faulty network adapters

How can Nginx reduce planned outages

Suppose Nginx needs to update its configuration and publish it. There are two options at this point:

  • nginx -s quit to shut down Nginx, then start a new Nginx service

If you do this, in-flight connections are simply cut off and clients get errors; even if a client retries, the request may still fail. There is a short service vacuum between shutdown and startup, and if the business requires five-nines availability, that gap alone is enough to hurt your KPIs. This is the planned downtime mentioned above.

  • nginx -s reload to smoothly reload the updated configuration

With reload, service is not suspended. As we know, a running Nginx has multiple worker processes; during a reload the old workers are only allowed to finish the requests they already have, without taking on new ones. Once a worker has no requests left, Nginx replaces it with a worker running the updated configuration.

Nginx signal processing

Nginx's command-line commands work by sending Linux signals to the master process. nginx -s reload sends a HUP signal to the master process to tell it to reload. The main signals are as follows:

Command line        Signal   Explanation
nginx -s reload     HUP      The master process reloads the configuration file on receiving HUP
nginx -s stop       TERM     The master process stops Nginx immediately on receiving TERM
nginx -s quit       QUIT     The master process stops gracefully on receiving QUIT
nginx -s reopen     USR1     The master process reopens the log files (log rotation)
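
In fact, nginx -s is just a thin wrapper around these signals, so you can also send them to the master process by hand. A minimal sketch follows; the pid file path is an assumption and depends on how your Nginx was built and configured:

# Equivalent to nginx -s reload: send HUP to the master process directly.
# The pid file path below is an assumption; check your nginx.conf or build options.
kill -HUP $(cat /usr/local/nginx/logs/nginx.pid)

# View the master and worker processes
ps -ef | grep nginx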

How the reload instruction is handled

  • First, you type nginx -s reload on the command line. Nginx uses the -s parameter to determine which signal to handle; it finds the master process by its pid and then sends the corresponding HUP signal to it. You can also view the master process with ps -ef | grep nginx.

  • When the master process receives the signal, it parses the new configuration file. After parsing, the master does not immediately send the QUIT signal to the old worker processes; instead it opens any new listening sockets it needs and starts new worker processes. At this point new and old workers coexist, and the old workers continue to process requests using the old configuration.

  • Once the new worker processes are up, the master sends the QUIT signal to the old workers. The old workers then close their listening sockets and stop accepting new network requests; when all in-flight requests have completed, the old workers exit.

If Nginx serves WebSocket or other long-lived connections, repeated reloads can cause old worker processes to pile up, because they keep waiting for those connections to finish. And once the number of worker processes exceeds the number of CPUs, CPU contention can kick in and performance degrades.

This problem can be mitigated: with worker_shutdown_timeout we set an expiration time, so connections that are still open during a reload are forcibly closed once the timeout passes.
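
As a minimal sketch of that directive (the 10s value is just an illustrative choice), it goes in the main, top-level context of nginx.conf:

# Main (top-level) context of nginx.conf:
# workers that are gracefully shutting down after a reload/quit are forced
# to close any remaining connections and exit once 10 seconds have passed.
worker_shutdown_timeout 10s;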

The essence of high availability is redundancy + failover, and both will come up again and again in the approaches proposed below.

Exploring availability solutions layer by layer:

Recall the layered availability diagram above; here is a closer look at how each layer deals with availability issues.

Application service layer high availability scheme:

The service layer, where basic common services or specific business logic are handled, is mostly stateless. A stateless application is one where the application server stores no business context: it simply runs the corresponding business logic on the data submitted with each request. Multiple service instances are completely equivalent, and a request submitted to any of them produces the same result.

Since every instance behaves the same, we can achieve high availability through redundancy + failover.

The service layer can be divided into two parts: the first is from the downstream client to the reverse proxy, and the second is from the reverse proxy to the specific upstream server. Let's start with the reverse-proxy-to-server part, since its handling is relatively mature; the client-to-proxy part comes afterwards.

Reverse proxy to the server

The Nginx reverse proxy layer can fail over between upstream servers. Taking weighted round-robin load balancing as the example (refer to the upstream module for the specific load balancing algorithms), two upstream parameters are responsible for high availability:

  • backup: marks a server as a backup; requests are forwarded to it only when all other non-backup servers have failed

  • down: marks the server as offline; requests will not be sent to it

Now that we can designate a backup server, what counts as a server failure? That, too, can be set manually, and the parameters are very simple ↓

  • fail_timeout: defaults to 10 seconds. If the number of failures within one fail_timeout window reaches max_fails, the server is considered unavailable and no requests are forwarded to it for the next fail_timeout period.

  • max_fails: the number of failures allowed within one fail_timeout window.

In our code we start five services, all of which perform the same function (that is our redundancy) and each of which returns the same image.

# Enable multiple picture server cluster link pools

upstream img_server {
    server 127.0.0.1:3001 weight=1;
    server 127.0.0.1:3002 weight=3 max_fails=3 fail_timeout=3s;
    server 127.0.0.1:3003 weight=1;
    server 127.0.0.1:3004 backup;
    server 127.0.0.1:3005 down;
}
server {
    listen 8091 default_server;
    location / {
        proxy_pass http://img_server;  # Proxy to the cluster connection pool
    }
}

With the Nginx configuration above, requests are distributed by weight across servers 3001, 3002, and 3003:

  1. When server 3002 fails 3 requests within 3s, it is marked unavailable and requests go to 3001 and 3003 instead.

  2. If 3001, 3002, and 3003 are unavailable, all requests are forwarded to standby machine 3004.

  3. Server 3005 is marked down and receives no traffic; to bring it back online, manually remove the down flag and reload Nginx with nginx -s reload.

This ensures high availability from the proxy server to the upstream business servers, which is exactly the redundancy + failover idea mentioned above.

Client to reverse proxy

High availability from downstream clients to the reverse proxy layer is achieved by making the reverse proxy layer itself redundant.

Nginx is the entry point for proxied requests. To prevent a single Nginx server failure from taking down the whole service, we can bind two Nginx servers to the same virtual IP address (VIP). When the primary Nginx server goes down, clients are actually reaching the standby Nginx machine, which takes over the load balancing.

What is a VIP (Virtual IP Address)

A VIP is a virtual IP address. Normally an IP address is bound to a single network interface card, so if that machine fails, client requests that locate the machine by its IP go unanswered.

A VIP is likewise configured on one machine at a time, but when that machine fails, the standby machine takes over the IP address. This takeover is also called drifting. Because the IP stays the same while it drifts, failover to the standby machine happens transparently.

How can a VIP be used with Nginx? This is where the Keepalived solution comes in.

What is Keepalived && How to implement VIP failover with Keepalived

Keepalived started as a state-detection and switchover tool designed for LVS load balancing. It monitors the state of nodes in a cluster, and it later added the Virtual Router Redundancy Protocol (VRRP) to fail over VIPs.

Beyond LVS, Keepalived is now a general high-availability solution that can be used with Nginx and various databases. As front-enders we only need to understand the following points; knowing much more would be doing someone else's job.

VRRP assigns each device either the Master or the Backup role. Each device has a priority, and the healthy device with the highest priority is elected Master. The Master holds the VIP.

If the Master does not respond within the configured time, a new election is held among the Backups, and failover is achieved. Keepalived is the intermediary that decides when the master and the standby machines take over.

  • Master -> Backup mode: when the master goes down, the VIP automatically drifts to the standby machine; once Keepalived detects that the original master has recovered, it moves the VIP back to it.

  • Backup -> Backup mode: when the master goes down, the VIP automatically drifts to the standby machine; even after the original master recovers, the VIP does not move back, and the former standby keeps serving as the master.
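
As a rough sketch of how this looks on the primary Nginx machine, here is a minimal keepalived.conf; the interface name, router id, priority, and VIP are placeholder values, and the standby machine would use state BACKUP with a lower priority:

# /etc/keepalived/keepalived.conf on the primary Nginx machine (illustrative values)
vrrp_script chk_nginx {
    script "pidof nginx"      # consider this node unhealthy if nginx is not running
    interval 2                # check every 2 seconds
    weight -20                # drop the priority by 20 when the check fails
}

vrrp_instance VI_1 {
    state MASTER              # the standby machine uses BACKUP here
    interface eth0            # network interface carrying the VIP (assumed name)
    virtual_router_id 51      # must match on master and backup
    priority 100              # the standby uses a lower value, e.g. 90
    advert_int 1              # VRRP advertisement interval in seconds
    track_script {
        chk_nginx
    }
    virtual_ipaddress {
        192.168.0.100         # the VIP that clients actually connect to
    }
}

When the master's nginx process dies or the machine goes down, its effective priority falls below the standby's, the standby wins the VRRP election, and the VIP drifts over without the client noticing.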

However, under very high concurrency a single Nginx instance cannot cope, and it will fail no matter how many standby nodes there are. In that case we bring in an LVS layer-4 proxy for high availability as well, and move Keepalived up to the LVS tier for such high-concurrency scenarios.

Data storage layer availability scheme:

High availability at the database level is mostly achieved through master/slave architectures and read/write separation, but there are details and alternative solutions worth knowing.

First of all, the essence of storage high availability is copying data to multiple storage devices, that is, achieving availability through data redundancy. Unlike server redundancy, however, database storage has very demanding requirements on read/write synchronization; the complexity lies in handling the data inconsistencies caused by replication delay and replication interruptions.

Transmission between servers on the same line takes a few milliseconds, but between data centers in different regions it can take tens of milliseconds. If reads and writes happen within those tens of milliseconds, the two data centers can fall out of sync, which is why storage-layer availability is comparatively complicated. Two schemes are discussed in detail below: active/standby mode and master/slave mode.

Active/standby mode

Active/standby is a relatively old mode. As the name suggests, one host plus several standby machines provide high availability. The host is responsible for all reads and writes and periodically replicates its data to the standby machines. Once the host goes down, one of the standby machines takes over as the primary node.

In active/standby mode, the question of which machine is the host is settled through negotiation rules: when the servers start, both come up as standby machines; they then establish a connection between them, and one is chosen as the decision-making host while the other remains the standby.

The advantage of this approach is that the client is unaware of a host failure: when the host fails, service continues through the standby machine. In addition, only data replication between host and standby is needed; no distributed read/write handling is required.

But the negotiation also has decision problems, for example when the connection between the two machines is interrupted:

  • If the standby treats a broken connection as a host failure, it promotes itself to host even though the host may actually be fine. The system then has two hosts, which contradicts the original design.

  • If the standby does not treat a broken connection as a host failure, then when the host really has failed, the system is left with no host at all, which also contradicts the original design.

  • To reduce the impact of a disconnection on the role decision, you can add more connections between the two machines. This lessens the effect of any single interruption, but it introduces the problem of reconciling the information carried over those connections: if different connections report different things, which one should prevail?

How to determine the state of the two machines

The root of this problem is that when the host appears to be down, the standby cannot tell whether the host has really failed or only the connection between them has. If the host is not down but the connection is broken, which machine should handle reads and writes?

For communication and state transfer between the two machines, the following dimensions are worth considering:

  • State transfer channel: are the active and standby connected directly, or do they go through a third-party arbitrator?

  • Content of state detection: what should the active/standby communication check? Whether the machine has lost power, whether the process exists, whether responses are slow, and so on.

  • Switchover timing: when should the standby be promoted to host? Only when the machine loses power, or as soon as the host process disappears?

  • Switchover policy: after the original host recovers from a fault, should it be restored as the host, or should the standby that took over remain the host?

  • Degree of automation: is the switchover fully automatic, or does it require manual confirmation?

  • Data conflicts: after the original faulty host recovers, there may be data conflicts between the old and new hosts.

The mediation pattern resolves active/standby conflicts

Among these options, the simplest way to resolve the active/standby state question is the mediation pattern: we add a third-party mediator between the two machines and let it manage the state of both. If you are really interested, look at the switchover architecture of Apache ZooKeeper; I will skip the implementation details here and only introduce the approach.

  • All machines start as standby machines, and any machine that loses its connection to the mediator automatically becomes (or stays) a standby.

  • When the host loses its connection to the mediator, it automatically demotes itself to standby, and the mediator notifies the original standby to promote itself to host.

  • If the host was disconnected from the mediator only because of a network interruption, it is still demoted to standby; after the network recovers, the former host reports its status to the mediator as the new standby.

Master/slave mode

In active/standby replication, the standby machine is used only when the host goes down. In master/slave mode, the slave machines are the ones that serve read operations, and the master replicates data to the slaves to keep the data consistent.

  • First, master/slave mode keeps data readable by serving reads from the slave machines.

  • Second, client reads typically account for more than 80% of traffic, so spreading reads across multiple machines makes better use of the hardware.

In master/slave mode we can also fail over with Keepalived. Of course, with this many machines handling reads and writes together the setup is very sensitive to network latency, and deciding how to route the write operations versus the read operations is a bit of a hassle.
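
One hedged way to illustrate "routing the read traffic" with the tools already covered in this article is Nginx's stream module acting as a TCP proxy in front of the read replicas. This is only a sketch with made-up addresses; it assumes Nginx was built with the stream module and that writes still go straight to the master. In practice read/write splitting is usually done in the application or a database middleware layer, and this snippet merely reuses the upstream idea from earlier:

# Sketch only: spread MySQL read traffic across two slaves (addresses are placeholders).
# Requires nginx built with the stream module; writes still go directly to the master.
stream {
    upstream mysql_read {
        server 192.168.0.11:3306 weight=2;
        server 192.168.0.12:3306 weight=1 max_fails=3 fail_timeout=10s;
    }

    server {
        listen 3307;               # applications read through this port
        proxy_pass mysql_read;     # same redundancy + failover idea as the HTTP example
    }
}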

Cluster/Partition

Clusters:

The concept of a cluster appears when a single machine's performance is no longer enough for the business. The active/standby and master/slave setups above already involve multiple standby or slave machines, so aren't they clusters too? The clusters discussed here specifically mean dispersed clusters with more than one host able to handle reads and writes, where each server is responsible for storing part of the data.

Of course, the cluster has a special host that handles data allocation. A common example is a Hadoop cluster; if you are interested, take a look at your own company's back-end documentation, since most are built this way nowadays.

Partitions:

In fact, partitioning exists because the scope of the high availability schemes above is still not broad enough: what if a power failure in the entire data center takes down both the primary and the standby?

Rather than putting all our eggs in one basket, common partitioning schemes, such as geographic partitioning, can handle this quite simply.

Each partition stores a portion of the data. When an accident happens somewhere, only part of the data and part of the servers are affected rather than everything becoming unavailable. After the fault is fixed, the services in the affected area can be restored quickly from the backup data held in other areas. There are many ways to arrange partition backups; for reference ↓

  1. Independent backup: each partition stores its own backup separately and partitions have nothing to do with each other. This is the expensive option; for example, three partitions with three corresponding backup centers.

  2. Centralized backup: relatively inexpensive; all partitions store their backups in one place, and the design is simple.

  3. Mutual backup: each partition backs up the data of other partitions. This sounds ideal, but the design is complex: how many partitions should back up a given partition's data? Where should a newly added partition's backups go? These are all open questions.

Of course, beyond the cluster and partition solutions, we have one more way to achieve high availability: splitting traffic with DNS.

A lot can be done with DNS. Besides high availability, DNS is also a very important solution in high-performance scenarios: when the business footprint is very large and clients are far away from the servers, the impact of the network grows, and communication delay and packet loss make pages load slowly. DNS together with CDN servers is therefore needed to carry the traffic; if you are interested, look into the various CDN principles and mechanisms.

The DNS server checks the health of the clustered services and returns the most appropriate IP address to each user for a better experience. And when a disaster occurs, the DNS health check can switch between the active and standby IP addresses for disaster recovery.

Conclusion

High availability of systems is all about redundancy and failover.

In addition to the availability solutions discussed above, we also need overload protection policies to keep servers available. Common policies such as rate limiting, circuit breaking, and degradation are all ways of dealing with faults.

But these protection policies are not enough on their own. We also need a fault monitoring system, which in turn requires monitoring management and data collection. Servers such as Apache already ship with capabilities like access log collection and server performance monitoring reports.

Each company may have a different strategy for this. By understanding these basic ideas, we can broaden our engineering perspective. I hope this article is helpful to those who read it.