This article is also included in my GitHub repo; a star would be much appreciated ^_^

I believe we have all heard this classic interview question: "Please describe the entire process from typing a keyword into Taobao to the final rendering of the web page, in as much detail as possible."

This is a very difficult question. It involves HTTP, TCP, gateways, LVS, and a number of related concepts and mechanisms. If you can master each of these, your skill tree will grow substantially and you will have a solid understanding of how networks actually work. Knowing how traffic flows also helps enormously when locating problems; I have used this knowledge to track down many issues myself. To understand the whole process, I read a great deal of material and consulted many people, so I believe I can explain it properly. However, the write-up turned out far too long for a single post, so I have split it into two articles. This one introduces the overall architecture of back-end traffic; the next will dig into the details, such as how LVS works internally, which will also involve the mechanics of switches and routers.

Li Daniu started his own business. Since there was little traffic in the early days, he deployed only a single Tomcat server and had clients send requests directly to it.

At first this deployment worked fine: the business volume was small and a single machine could handle it. But then Li Daniu's business took off and grew rapidly, so the single machine gradually hit a performance bottleneck. Worse, with only one machine deployed, the business dropped to zero whenever that machine died, which was clearly unacceptable. So, to get past the single machine's performance ceiling and remove the single point of failure, Li Daniu decided to deploy several machines (say three) and let the client call any one of them. That way, even if one machine dies the others are still alive, and the client can call a machine that has not gone down.

Now the question is: which of the three machines should the client call? Letting the client pick a specific server is not appropriate, because the client would have to know which servers exist and then connect to one of them by round-robin or some similar scheme. If one of the servers went down, the client would not know in advance and would very likely connect to the failed server. So the job of choosing which machine to talk to is best left on the server side. As the saying goes, there is nothing that cannot be solved by adding a layer of indirection, and if there is, add another layer. So we add a layer in front of the servers and call it the LB (Load Balancer). The LB receives requests from clients and decides which server to forward each one to. The industry generally uses Nginx as this LB.
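As a rough illustration, a minimal Nginx configuration for this kind of load balancing might look like the sketch below (all addresses and ports are hypothetical):

```nginx
# Sketch only: minimal Nginx load balancing; IPs/ports are made up.
upstream tomcat_cluster {
    # the three Tomcat machines; Nginx round-robins across them by default,
    # and a server that keeps failing is temporarily taken out of rotation
    # (the max_fails / fail_timeout mechanism)
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
    server 10.0.0.3:8080;
}

server {
    listen 80;
    location / {
        # Nginx, not the client, decides which server handles the request
        proxy_pass http://tomcat_cluster;
    }
}
```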

This architecture supported the rapid growth of the business for a while, but Li Daniu soon found a problem with it: all traffic reaches the servers directly, which is not secure. Could we add a layer of authentication before traffic reaches the servers, so that only requests that pass authentication are forwarded, while everything else gets an error response back to the client? This layer is called the gateway (and to avoid a single point of failure, the gateway should itself be deployed as a cluster).

Besides authentication, the gateway also takes on risk control (for example, blocking "wool party" freebie hunters who abuse promotions), protocol conversion (such as converting HTTP to Dubbo), and traffic control, ensuring that the traffic forwarded to the servers is safe and controllable.
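The article does not prescribe how the gateway is built, but purely to illustrate the "authenticate before forwarding" idea, Nginx's auth_request module can express the same pattern; a real gateway does much more than this, and the auth endpoint below is hypothetical:

```nginx
# Illustrative sketch: delegate authentication to an auth service
# before proxying. Not how a dedicated gateway is actually implemented.
location /api/ {
    auth_request /auth;                 # subrequest must return 2xx,
    proxy_pass http://tomcat_cluster;   # otherwise the client gets an error
}

location = /auth {
    internal;                               # not reachable from outside
    proxy_pass http://auth_service/verify;  # hypothetical auth endpoint
    proxy_pass_request_body off;            # auth needs headers only
    proxy_set_header Content-Length "";
}
```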

This design lasted a long time, but later Li Daniu found a problem with it, too: both dynamic requests and static resource requests (such as JS and CSS files) were sent to Tomcat, which put Tomcat under great pressure when traffic was heavy. In fact, Tomcat is not as good as Nginx at serving static resources: Tomcat loads the file from disk on every request, which hurts performance, while Nginx has a proxy cache and other features that greatly improve its ability to serve static content.

Voiceover: proxy cache means that Nginx fetches a resource from the static resource server and stores it in local memory and on disk; if a later request hits the cache, Nginx returns the resource directly from the local cache.

So Li Daniu made the following optimization: dynamic requests are routed through the gateway to Tomcat, while static requests go to the static resource server.

This is what we call dynamic/static separation: static requests are split off from dynamic requests, so Tomcat can focus on the dynamic requests it is good at, while static resources benefit from Nginx's proxy cache and related features. Back-end processing capacity takes another step up.
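A hedged sketch of what this dynamic/static split plus proxy cache could look like in Nginx (paths, zone names, and upstream names are made up for illustration):

```nginx
# Sketch: cache static requests, send dynamic ones to the gateway.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=static_cache:10m
                 max_size=1g inactive=60m;

server {
    listen 80;

    # static resources: served from the local cache when possible,
    # otherwise fetched from the static resource server and cached
    location ~* \.(js|css|png|jpg|gif)$ {
        proxy_cache static_cache;
        proxy_cache_valid 200 304 10m;   # keep good responses 10 minutes
        proxy_pass http://static_servers;
    }

    # dynamic requests: forwarded to the gateway (and on to Tomcat)
    location / {
        proxy_pass http://gateway_cluster;
    }
}
```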

Also note that not every dynamic request needs to pass through the gateway. For example, the back office of our operations center is used only by internal employees, so its authentication differs from the gateway's API authentication. We therefore deployed two dedicated servers for the operations center and let Nginx route operations-center requests directly to them, bypassing the gateway.

Of course, to avoid a single point of failure, Nginx itself must also be deployed on at least two machines. So our architecture becomes the following: two Nginx machines in an active/standby setup, where the standby Nginx uses the keepalived mechanism (heartbeat packets) to detect whether the master Nginx is alive, and promotes itself to the master role when it finds the master has gone down.
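For reference, the master side of such a keepalived setup might be configured roughly like this (interface name, VIP, and priorities are hypothetical):

```
# Sketch of the master node's keepalived.conf; the backup node looks the
# same except for "state BACKUP" and a lower priority.
vrrp_instance VI_1 {
    state MASTER
    interface eth0          # NIC that carries the VRRP heartbeats
    virtual_router_id 51
    priority 100            # the backup would use e.g. 90
    advert_int 1            # send a heartbeat (VRRP advert) every second
    virtual_ipaddress {
        10.0.0.100          # the VIP clients actually connect to;
    }                       # it floats to the backup if the master dies
}
```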

This looks like a great architecture, but it is important to note that Nginx is a seven-layer (application-layer) load balancer: to forward traffic it must first establish a TCP connection with the client and another TCP connection with the upstream server. And establishing TCP connections consumes memory (the TCP socket, receive/send buffers, and so on all take space); data sent between the client and the upstream server must be buffered in Nginx before being relayed over the other TCP connection.

So Nginx's load capacity is bounded by machine I/O, CPU, memory, and a number of configuration limits. Once there are very many connections (say, millions), its load capacity drops sharply.

The analysis shows that Nginx's load capacity is limited because, as a seven-layer load balancer, it must establish separate TCP connections both upstream and downstream. If we could design something more like a router, a load balancer that only forwards packets without establishing connections and without maintaining extra TCP state, its load capacity would be far greater. That is exactly what the four-layer load balancer LVS is. A quick comparison of the two:

As you can see, LVS simply forwards packets and establishes no connections with either side. Compared with Nginx, it withstands far more load, offers high performance (reportedly reaching about 60% of F5 hardware), and consumes little memory and CPU.

So how does a four-layer load balancer work?

When it receives the first SYN packet from a client, the load balancer uses its scheduling algorithm to select the best server, rewrites the destination IP address in the packet to that server's IP, and forwards the packet directly. The TCP connection, that is, the three-way handshake, is established directly between the client and the chosen server; the load balancer only performs a router-like forwarding action. In some deployment modes it may also rewrite the packet's source IP during forwarding, to ensure that response packets pass back through the load balancer correctly.
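As a concrete (hypothetical) example, this destination-address-rewriting mode is LVS's NAT mode, which can be configured with the ipvsadm tool roughly like this:

```bash
# Sketch: an LVS virtual service in NAT mode; all IPs are made up.
# Create a virtual service on the VIP, scheduling with round-robin (rr).
ipvsadm -A -t 10.0.0.100:80 -s rr

# Register two real servers behind it; -m selects NAT (masquerading)
# mode, i.e. rewrite the destination IP and forward the packet.
ipvsadm -a -t 10.0.0.100:80 -r 192.168.0.11:80 -m
ipvsadm -a -t 10.0.0.100:80 -r 192.168.0.12:80 -m
```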

To sum up, we add an LVS layer in front of Nginx to take all incoming traffic. Of course, to keep LVS itself available, we also deploy it in active/standby mode. A bonus of this architecture is that if Nginx's capacity becomes insufficient, we can easily scale it out horizontally. So our improved architecture looks like this:

Of course, under heavy enough traffic even a single LVS cannot cope. What then? Add more LVS instances, and use DNS load balancing so that resolving the domain name returns one of them at random.
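Conceptually, the DNS side is just multiple A records for the same name, one per LVS VIP; DNS servers then typically rotate the order of the answers (all values below are made up):

```
; Sketch of a DNS zone fragment: one name, several LVS VIPs.
; Rotating the record order spreads clients across the LVS instances.
www.example.com.  60  IN  A  203.0.113.10   ; VIP of LVS pair 1
www.example.com.  60  IN  A  203.0.113.20   ; VIP of LVS pair 2
```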

With this design, traffic finally flows smoothly and stably. There is one point some readers may still question, so let's take a look:

If LVS can avoid single points of failure by deploying multiple servers, Nginx can do the same, and Nginx has supported four-layer load balancing since version 1.9. So isn't LVS unnecessary?
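For context, Nginx's four-layer load balancing lives in its stream module; a minimal sketch (with hypothetical addresses) looks like this:

```nginx
# Sketch: Nginx as a four-layer (TCP) load balancer via the stream
# module (available since 1.9). No HTTP parsing happens here.
stream {
    upstream backend_tcp {
        server 10.0.0.11:8080;
        server 10.0.0.12:8080;
    }
    server {
        listen 80;
        proxy_pass backend_tcp;   # forward raw TCP streams
    }
}
```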

Without LVS, the architecture diagram would look like this:

LVS is a Linux kernel module that runs in kernel space, while Nginx runs in user space and is comparatively heavyweight, so Nginx cannot match LVS in performance and stability. That is why we chose the LVS + Nginx deployment.

In addition, as you may have noticed, under heavy traffic static resources should be deployed on a CDN, which automatically serves each user from the node nearest to them. So our final architecture improvement looks like this:

Conclusion

Architecture must be designed around the actual business; talking about architecture divorced from the business is just playing games. As you can see, every step of the derivation above is driven by the growth of the business. For a company without large traffic, Nginx alone is actually enough for load balancing; after traffic grows rapidly, consider LVS + Nginx. Of course, even LVS cannot handle traffic on Meituan's scale (tens of Gbps of traffic, tens of millions of concurrent connections), which is why they built their own four-layer load balancer, MGW.

Looking back over the concepts in this article, I believe everyone now has a more thorough appreciation of layering: there is nothing that cannot be solved by adding a layer, and if there is, add another one. Layering decouples modules and functions from one another and makes extension easy. TCP/IP, which everyone knows well, is a perfect example: each layer minds its own business, and the upper layer does not care how the lower layer is implemented.

That is all for this article; I hope you got something out of it ^^. Next time we will follow a request's round trip in depth and analyze how LVS, switches, routers, and so on work. Stay tuned ^^

You are welcome to follow my public account "code sea"; let's make progress together.