Article source: www.ha97.com/5646.html

www.haproxy.org/

nginx.org/

www.linuxvirtualserver.org/

www.keepalived.org/

PS: Nginx, LVS, and HAProxy are currently the three most widely used load balancing software packages. I have deployed all of them in a number of projects, and this article summarizes what I learned, drawing on reference material and my own experience.

Load balancing is usually applied with different techniques at different stages as a site grows. The specific requirements have to be analyzed case by case. For a small or medium-sized web application, say under 10 million page views (PV) per day, Nginx alone is entirely adequate. If there are many machines, DNS round robin can be used, since LVS itself ties up a fair number of machines. For large websites or important services with a large number of servers, consider LVS.

Load balancing can be done in two ways. One is in hardware: the common options are relatively expensive commercial load balancers such as F5 and Array. Their advantage is a professional team that maintains the service; their disadvantage is cost, so small-scale network services do not need them yet. The other is open-source, free, Linux-based load balancing software such as Nginx, LVS, and HAProxy, which is implemented at the software level and is therefore very cheap.

At present, a reasonable and popular website architecture looks like this: the web front end uses Nginx/HAProxy+Keepalived as the load balancer, and the back end uses a MySQL database with one master, multiple slaves, and read/write splitting, fronted by LVS+Keepalived. Of course, the plan should follow the specific needs of the project. The characteristics and suitable scenarios of each tool are discussed below.

First, Nginx

The advantages of Nginx are:

1. It works at layer 7, so it can split HTTP traffic by rules such as domain name or directory structure. Its regular-expression rules are more powerful and flexible than HAProxy's, which is one of the main reasons for its popularity. On this basis alone, Nginx fits far more situations than LVS.
2. It depends very little on network stability. In theory, if a back end can be pinged, Nginx can balance load to it. LVS, by contrast, depends heavily on network stability, as I have learned first-hand.
3. It is relatively simple to install and configure and convenient to test, since it reports errors in its log. LVS takes much longer to configure and test because it depends so heavily on the network.
4. It withstands high load and is stable. On decent hardware it supports tens of thousands of concurrent connections, although its load capacity is somewhat lower than that of LVS.
5. It detects back-end server failures through the port, for example by status code or timeout, and resubmits a request that returned an error to another node. It does not support URL-based detection. For example, if a user is uploading a file and the node handling the upload fails midway, Nginx switches the upload to another server for reprocessing, whereas LVS simply drops it. If the file is large or important, the user may well be annoyed.
6. It is not only an excellent load balancer and reverse proxy but also a powerful web application server. LNMP has been a very popular web stack in recent years and is stable in high-traffic environments.
7. It is increasingly mature as a reverse-proxy cache and is faster than a traditional Squid server, so it is worth considering as a reverse-proxy accelerator.
8. It can serve as a mid-tier reverse proxy, where it has basically no rival. The only comparable option is Lighttpd, but Lighttpd does not yet match Nginx's functionality, its configuration is less clear and readable, and its community is far less active.
9. It can also serve static web pages and images, where its performance is likewise hard to beat. In addition, the Nginx community is very active and there are many third-party modules.
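Point 1 above, splitting HTTP traffic by domain name or directory structure with regular expressions, can be pictured with a small Python sketch. This is not how Nginx is implemented. The rule table, the pool names (`image_pool`, `static_pool`, `api_pool`, `app_pool`), the `example.com` hosts, and the `pick_pool` helper are all hypothetical; they only illustrate what a layer-7 balancer can key on that a layer-4 balancer cannot see.

```python
import re

# Hypothetical routing table: (host pattern, path pattern, backend pool name).
# Rules are checked in order and the first match wins.
RULES = [
    (re.compile(r"^img\.example\.com$"), re.compile(r".*"), "image_pool"),
    (re.compile(r".*"), re.compile(r"\.(css|js|png|jpg|gif)$"), "static_pool"),
    (re.compile(r".*"), re.compile(r"^/api/"), "api_pool"),
]
DEFAULT_POOL = "app_pool"


def pick_pool(host: str, path: str) -> str:
    """Choose a backend pool from the Host header and the request path."""
    for host_re, path_re, pool in RULES:
        if host_re.search(host) and path_re.search(path):
            return pool
    return DEFAULT_POOL


if __name__ == "__main__":
    for host, path in [
        ("img.example.com", "/a/b.png"),
        ("www.example.com", "/css/site.css"),
        ("www.example.com", "/api/users"),
        ("www.example.com", "/index.php"),
    ]:
        print(host, path, "->", pick_pool(host, path))
```

Because the decision uses the Host header and the URL path, it can only be made by something that parses HTTP, which is why LVS at layer 4 cannot offer it.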

Tengine, which runs on Taobao's front end, is a customized version of Nginx.

[Figure: Nginx's general HTTP request and response flowchart]

The disadvantages of Nginx are:

1. It supports only HTTP, HTTPS, and email protocols, which narrows its range of application (Nginx 1.9.0 and later can also proxy generic TCP through the stream module).
2. Back-end server health checks work only by port, not by URL.
3. Session persistence is not supported directly, although ip_hash can work around this.
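The ip_hash workaround mentioned above can be sketched in Python as follows. Nginx's documentation describes ip_hash as keying on the first three octets of a client's IPv4 address; the sketch copies that idea, but the CRC32 hash, the backend addresses, and the `sticky_backend` helper are assumptions made for illustration, not what Nginx uses internally.

```python
import zlib

# Hypothetical backend list; the addresses are placeholders.
BACKENDS = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]


def sticky_backend(client_ip: str, backends=BACKENDS) -> str:
    """Map a client IPv4 address to a backend so repeat visits hit the same node.

    The key is the first three octets, so clients in the same /24 share a
    backend. CRC32 is used only because it is deterministic across runs; it
    is an assumption of this sketch, not the hash Nginx uses.
    """
    key = ".".join(client_ip.split(".")[:3])
    return backends[zlib.crc32(key.encode("ascii")) % len(backends)]


if __name__ == "__main__":
    for ip in ["203.0.113.7", "203.0.113.200", "198.51.100.1"]:
        print(ip, "->", sticky_backend(ip))
```

The session stays on one node only as long as the client's address and the backend list stay the same, and many users behind one NAT address all land on the same backend, so this is a workaround rather than true session persistence.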

Second, LVS

LVS (Linux Virtual Server) uses Linux kernel clustering to implement a high-performance, highly available load balancing server with good scalability, reliability, and manageability.

LVS has the following advantages:

1. Strong load capacity. It works at layer 4 and only distributes requests, without generating traffic of its own. This gives it the best performance among load balancing software, with low memory and CPU consumption.
2. Little is configurable. This is both a drawback and an advantage: with so little to configure, it rarely needs to be touched, which greatly reduces the chance of human error.
3. Stable operation. Its own load capacity is very strong, and it has complete dual-machine hot-standby schemes such as LVS+Keepalived. The one we deploy most in projects is LVS/DR+Keepalived.
4. No traffic through the balancer. LVS only distributes requests, and in DR mode the response traffic does not flow back out through it, so the balancer's I/O performance is not affected by heavy traffic.
5. Wide applicability. Because LVS works at layer 4, it can balance almost any application, including HTTP, databases, online chat rooms, and so on.
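Point 4 in the list above is easier to appreciate with numbers. The following Python sketch compares how many payload bytes cross the balancer when it proxies both directions and when it only forwards requests, as in LVS direct routing. The function name `balancer_bytes` and the workload figures are hypothetical, each payload is counted once, and protocol overhead such as headers and acknowledgements is ignored.

```python
def balancer_bytes(requests: int, req_bytes: int, resp_bytes: int, mode: str) -> int:
    """Payload bytes that cross the load balancer for a batch of requests.

    mode="proxy": requests and responses both pass through the balancer
                  (a layer-7 reverse proxy such as Nginx or HAProxy).
    mode="dr":    only requests pass through; real servers answer the
                  client directly (LVS direct routing).
    """
    if mode == "proxy":
        return requests * (req_bytes + resp_bytes)
    if mode == "dr":
        return requests * req_bytes
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    # Hypothetical workload: 1,000,000 requests of 500 bytes, 50,000-byte responses.
    n, req, resp = 1_000_000, 500, 50_000
    proxy = balancer_bytes(n, req, resp, "proxy")
    dr = balancer_bytes(n, req, resp, "dr")
    print(f"proxy: {proxy / 1e9:.1f} GB, DR: {dr / 1e9:.1f} GB")
```

With these assumed sizes the proxy carries about 50.5 GB against 0.5 GB for direct routing, because web responses are usually much larger than the requests that trigger them.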

[Figure: LVS Direct Routing (DR) network flow chart]

The disadvantages of LVS are as follows:

1. The software itself does not support regular expressions and cannot separate static from dynamic content. Many sites now have a strong need for this, which is where Nginx/HAProxy+Keepalived has the advantage.
2. For a very large website application, LVS/DR+Keepalived is more complex to implement, especially when Windows Server machines sit behind it. Implementation, configuration, and maintenance are all more involved, and Nginx/HAProxy+Keepalived is much simpler by comparison.

Third, HAProxy

The characteristics of HAProxy are as follows:

1. HAProxy also supports virtual hosts.
2. HAProxy makes up for some of Nginx's shortcomings: it supports session persistence and cookie-based routing, and it can check the state of a back-end server by fetching a specified URL.
3. Like LVS, HAProxy is purely load balancing software. In terms of raw efficiency it balances faster than Nginx, and it also handles concurrency better than Nginx.
4. HAProxy supports load balancing and forwarding at the TCP level, so it can balance MySQL reads and run health checks on the back-end MySQL nodes. LVS+Keepalived can likewise be used to balance a MySQL master and its slaves.
5. HAProxy offers many load balancing strategies. Its algorithms currently include the following eight:
   ① roundrobin: simple polling, the basic algorithm every load balancer has.
   ② static-rr: polling according to weight; worth attention.
   ③ leastconn: the server with the fewest connections is served first.
   ④ source: by request source IP address. This is similar to Nginx's ip_hash mechanism, and we use it as one way of solving the session problem; worth attention.
   ⑤ uri: by the request URI.
   ⑥ url_param: by a request URL parameter; 'balance url_param' requires a URL parameter name.
   ⑦ hdr(name): pins each HTTP request according to the named HTTP request header.
   ⑧ rdp-cookie(name): pins and hashes each TCP request according to the named cookie.
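The ideas behind the first four algorithms can be sketched in a few lines of Python. This is a simplified illustration, not HAProxy's code: the server names and weights in `SERVERS`, the helper names, and the CRC32 hash are all assumptions, and real balancers typically interleave weighted servers more evenly than the naive expansion used here.

```python
import itertools
import zlib

# Hypothetical servers: name -> weight.
SERVERS = {"web1": 3, "web2": 1, "web3": 1}


def weighted_cycle(servers):
    """Weighted polling: a server with weight w appears w times per cycle.

    With every weight set to 1 this degenerates to plain round robin.
    """
    expanded = [name for name, weight in servers.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def least_conn(active):
    """leastconn idea: pick the server with the fewest active connections."""
    return min(active, key=active.get)


def by_source(client_ip, servers):
    """source idea: hash the client IP so one client keeps one server."""
    names = sorted(servers)
    return names[zlib.crc32(client_ip.encode("ascii")) % len(names)]


if __name__ == "__main__":
    rr = weighted_cycle(SERVERS)
    print([next(rr) for _ in range(5)])
    print(least_conn({"web1": 12, "web2": 4, "web3": 9}))
    print(by_source("203.0.113.7", SERVERS))
```

The remaining algorithms (uri, url_param, hdr, rdp-cookie) follow the same pattern as `by_source`, with a different piece of the request used as the hash key.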

Fourth, a comparison of Nginx and LVS

1. Nginx works at layer 7, so it can apply traffic-splitting policies to HTTP applications, for example by domain name or directory structure. LVS has no such capability, so Nginx fits far more situations than LVS. The flip side is that these useful features mean Nginx gets adjusted more often than LVS, and the more it is touched, the greater the chance of human error.
2. Nginx depends less on network stability. In theory, as long as ping works and web access is normal, Nginx can connect, which is a major advantage. Nginx can also distinguish between internal and external networks; a node that has both is equivalent to a single machine with a backup line. LVS depends more on the network environment. At present our servers sit on the same network segment and LVS distributes in direct routing mode, so the result can be guaranteed. Note also that LVS requires applying to the hosting provider for at least one extra IP to serve as the virtual IP (VIP); it appears that LVS cannot use its own IP as the VIP. Being a good LVS administrator really does require learning a lot about network communication, not just HTTP.
3. Nginx is easy to install and configure and easy to test, because it reports errors in its log. LVS takes longer to install, configure, and test, and it relies heavily on the network. In most cases a failed setup is caused by a network problem rather than by the configuration, and such problems are much more troublesome to resolve.
4. Nginx can also withstand high load and run stably, but its load capacity and stability are several levels below those of LVS. Nginx handles all the traffic itself and is therefore limited by the machine's I/O and configuration, and its own bugs are hard to avoid.
5. Nginx can detect internal server failures, such as the status code returned by the server when processing a page, or a timeout, and resubmit the request that returned an error to another node. LVS's ldirectord can now also monitor the internal condition of servers, but the way LVS works makes it impossible to resend requests. For example, if a user is uploading a file and the node handling the upload fails midway, Nginx switches the upload to another server, whereas LVS simply drops it. If the file is large or important, the user may well be annoyed.
6. Nginx's asynchronous request handling relieves the load on back-end nodes. If Apache serves clients directly, many slow (narrow-bandwidth) connections tie up a large amount of Apache memory that cannot be released. With Nginx placed in front of Apache as a proxy, these slow connections are absorbed by Nginx and requests do not pile up on Apache, which reduces resource usage considerably. Squid has the same effect: even when configured not to cache, it is a great help to Apache.
7. Nginx supports HTTP, HTTPS, and email (the email feature is little used), and LVS supports more applications than Nginx.

In practice, the front tier should generally be LVS; that is, DNS should point at the LVS balancer, a task that LVS's strengths suit well. Important IP addresses, such as the database IP and web service server IPs, are best managed by LVS. Over time these addresses come to be used more and more widely, and changing one causes failures to follow one after another, so it is safest to hand these important IPs over to LVS. The only drawback is that quite a few VIPs are needed.

Nginx can serve as an LVS node machine, which makes use of both Nginx's features and Nginx's performance. Squid could also be used directly at this layer, but its features are much weaker than Nginx's and its performance is also inferior. Nginx can also serve as a mid-tier proxy, where it has virtually no rival. The only thing that can really challenge it is Lighttpd, but Lighttpd does not yet have Nginx's full functionality, and its configuration is less clear and readable. The IP of the mid-tier proxy is also important, so the mid-tier proxy should have a VIP too, with LVS as the ideal solution.

The specific application still has to be analyzed case by case. For a relatively small website (daily PV under 10 million), Nginx is entirely adequate. If there are many machines, DNS round robin can be used, since LVS itself consumes a fair number of machines. For large sites or important services where machines are not a concern, LVS is worth considering.
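The resend behaviour described in point 5 of this comparison can be modelled with a short Python sketch. It is a toy model under stated assumptions, not Nginx's implementation: backends are plain Python callables, and `forward_with_retry`, `broken`, `hanging`, and `healthy` are hypothetical names introduced for illustration.

```python
from typing import Callable, Sequence, Tuple

# A backend is modelled as a callable that takes a request body and returns
# (status_code, response_body), or raises TimeoutError when it hangs.
Backend = Callable[[bytes], Tuple[int, bytes]]


def forward_with_retry(request: bytes, backends: Sequence[Backend]) -> Tuple[int, bytes]:
    """Try each backend in turn; move on when one times out or returns 5xx.

    This works only because the proxy holds the whole request and can
    replay it. A layer-4 forwarder passes packets along without keeping
    the request, so it has nothing to replay.
    """
    last_error = (502, b"no backend available")
    for backend in backends:
        try:
            status, body = backend(request)
        except TimeoutError:
            last_error = (504, b"backend timed out")
            continue
        if status >= 500:
            last_error = (status, body)
            continue
        return status, body
    return last_error


def broken(_request: bytes) -> Tuple[int, bytes]:
    return 500, b"internal error"


def hanging(_request: bytes) -> Tuple[int, bytes]:
    raise TimeoutError("no response")


def healthy(request: bytes) -> Tuple[int, bytes]:
    return 200, b"stored %d bytes" % len(request)


if __name__ == "__main__":
    print(forward_with_retry(b"file contents", [broken, hanging, healthy]))
```

The cost of this design is that the proxy must buffer the request, which is part of why a layer-7 balancer uses more memory and I/O than LVS.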

Network load balancing today is typically applied with different techniques at different stages as the site grows:

The first stage: use Nginx or HAProxy for single-point load balancing. At this stage the servers have only just moved beyond the single-server, single-database model and some load balancing is needed. However, the scale is still small, there is no dedicated team to handle maintenance, and there is no need for a large-scale deployment. Nginx or HAProxy is the first choice here: both are quick to pick up, easy to configure, and work at layer 7 over HTTP.

The second stage: as the network service expands further, a single Nginx node is no longer enough. LVS or a commercial Array appliance becomes the primary choice, with Nginx running as a node behind LVS or Array. Whether to pick LVS or Array depends on company size and budget. Array's application delivery features are very powerful. I have used it in a project, and its price-performance ratio is much better than F5's, which makes it my first choice among commercial products. In general, though, the available expertise at this stage cannot keep up with the growth of the business, so purchasing a commercial load balancer becomes a necessary step.

The third stage: by now the network service has become a mainstream product, and as the company's profile grows, the skill and number of its engineers grow with it. Whether the goal is building customized products suited to the company's own needs or cutting costs, open-source LVS becomes the first choice. The ideal basic architecture is Array/LVS – Nginx/HAProxy – Squid/Varnish – AppServer.