Further reading:

  • Quickly understand the principles of high-performance HTTP server load balancing
  • A brief introduction to several common load balancing architectures
  • Load balancing in distributed architectures: classification, principles, algorithms, common solutions, etc.

This article describes the principles of load balancing.

1. Introduction

Load balancing means distributing load (work tasks) across multiple operating units so that, together, they complete one or more tasks faster and more efficiently. Built on top of the existing network structure, load balancing provides a transparent, inexpensive, and effective way to expand the bandwidth of servers and network devices, strengthen data processing capacity, increase throughput, and improve the availability and flexibility of the network.

Load balancing has two meanings:

  • ① Split a complex task into multiple sub-tasks, hand them to multiple operating units to process collaboratively, and finally combine the results to complete the task. (Sound a bit like distributed computing?)
  • ② Spread massive, highly concurrent traffic evenly across multiple processing units, so that no unit is overwhelmed while others sit idle.

Why load balancing?

When a website is first established, a single server (usually running the LAMP stack) is enough to serve external traffic. However, as the business expands rapidly, the original architecture can no longer meet demand, and the servers must be scaled out. Simply put, multiple servers are grouped into a cluster to provide service, and users' requests are distributed to the different servers by load balancing technology, thereby achieving the capacity expansion.

Load balancing technology is now widely used across the Internet. Take China's "Double 11 Shopping Festival" as an example: the order volume per second is enormous, and a handful of servers could never handle so many simultaneous visitors. Each Internet giant (Taobao, JD.com, WeChat, Alipay, Douyin, Meituan, etc.) uses a variety of technologies to handle this highly concurrent access and massive data storage. Having met their own business requirements, they package their solutions into cloud platforms and sell them to smaller companies, which lowers those companies' operating costs (no need to build a high-concurrency architecture from scratch).

Most of these solutions now use clustering to achieve optimal resource usage, maximum throughput, and minimum response time while avoiding single points of overload. Using cluster technology inevitably means using load balancing technology; otherwise the advantages of the cluster cannot be fully realized. The following sections introduce load balancing technology in detail.

2. Classification of load balancing

Currently, load balancing can be divided into three categories:

  • DNS-based load balancing
  • Hardware-based load balancing (e.g., F5)
  • Software-based load balancing (e.g., LVS, Nginx, Squid)

Principle: When a user accesses a domain name (for example, www.taobao.com), the DNS server first resolves the IP address corresponding to the domain name. In this case, the DNS server can return different IP addresses according to users in different geographical locations. For example, users in Hangzhou will return the IP address of Taobao’s business server in Hangzhou, and users in Beijing will return the IP address of Taobao’s business server in Beijing.

In this mode, requests are distributed according to the "proximity principle", which both reduces the load on any single cluster and improves users' access speed.

The DNS load balancing solution has natural advantages: simple configuration, low implementation cost, and no additional development or maintenance.
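As a rough illustration, the geo-aware resolution described above can be simulated in a few lines of Python. The region names and IP addresses below are invented placeholders, not real DNS records:

```python
import itertools

# Hypothetical mapping of user regions to server IP pools; the
# addresses are illustrative placeholders, not real Taobao servers.
GEO_POOLS = {
    "hangzhou": ["10.0.1.10", "10.0.1.11"],
    "beijing":  ["10.0.2.10", "10.0.2.11"],
}

# One round-robin cursor per region, mimicking a DNS server that
# rotates the A records it returns for successive queries.
_cursors = {region: itertools.cycle(ips) for region, ips in GEO_POOLS.items()}

def resolve(domain: str, user_region: str) -> str:
    """Return the IP a geo-aware DNS server might hand to this user."""
    pool = _cursors.get(user_region)
    if pool is None:                      # unknown region: fall back to a default pool
        pool = _cursors["hangzhou"]
    return next(pool)
```

Successive lookups from the same region rotate through that region's pool, which is the "proximity principle" plus simple IP rotation mentioned above.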

**But it also has an obvious disadvantage:** configuration changes do not take effect promptly. This is inherent to DNS, which typically has multiple levels of caching: after the DNS configuration is modified, clients keep resolving to the old IP addresses until the caches expire, which disturbs load balancing.

In addition, DNS load balancing works only by region or simple IP rotation, with no advanced routing policies. This is the limitation of the DNS solution.

Dedicated hardware can process traffic more efficiently than software. Among hardware load balancers, the best known is the F5 load balancer.

Hardware load balancers share one trait: they are capable but expensive. For that reason they are less common in the market, and most small and medium-sized companies use software load balancing instead.

Software load balancing is implemented against the OSI network model. The OSI 7-layer model gives a convenient way to classify the existing techniques.

Software load balancing classification:

  • Layer 2 Load Balancing

A load balancing server exposes one public IP address, while the real servers in the cluster share the same internal IP address but have different MAC addresses. When the load balancing server receives a client request, it rewrites the destination MAC address in the frame and forwards it to different machines, achieving load balancing. This technique does not seem to be widely used.

  • Layer 3 load balancing

A load balancing server exposes one public IP address, and the real servers in the cluster use different internal IP addresses. When the load balancing server receives a client request, it rewrites the destination IP address in the packet and forwards it to different machines, achieving load balancing.

  • Layer 4 load balancing

A layer-4 load balancer uses transport-layer (TCP/UDP) port information together with the source and destination IP addresses of the IP layer. On receiving a client request, it rewrites the IP address and port in the packet and forwards it to different machines, achieving load balancing.

  • Typical implementation: LVS (with NAT and IP tunnel modes)

  • Layer 7 load balancing

Layer-7 load balancing works at the application layer of the OSI model. The application layer has many protocol types, and load balancing decisions can be made on the content of packets (such as URL, browser type, language, or even geographic location).

  • Typical implementation: Nginx

The most commonly used software load balancers are Nginx (layer 7) and LVS (layer 4).

Layer-4 load balancing is highly efficient and can typically handle hundreds of thousands of requests per second; layer-7 load balancing typically handles tens of thousands per second.

The appeal of software-based load balancing is obvious: it is cheap. Small and medium-sized companies can port and adapt the open-source code directly and deploy it on ordinary servers, greatly reducing both development and hardware costs, which is why it is so widely used.
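For a concrete flavor of software load balancing, here is a minimal, hypothetical Nginx layer-7 configuration; the upstream name and server addresses are placeholders. Nginx's default strategy for an `upstream` block is weighted round robin:

```nginx
# Hypothetical backend pool; addresses and weights are illustrative.
upstream backend {
    server 192.168.0.11 weight=3;   # receives ~3x the traffic of .12
    server 192.168.0.12 weight=1;
    server 192.168.0.13 backup;     # used only if the others are down
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;  # distribute requests across the pool
    }
}
```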

3. Load balancing algorithms

Load balancing algorithms fall into two categories: static and dynamic.

  • Static algorithms include: round robin, ratio, and priority.

  • Dynamic algorithms include: least connections, fastest response, observed mode, predictive mode, dynamic performance allocation, dynamic server replenishment, quality of service, service type, and rule mode.

  • Round Robin: requests are passed to each server in sequence, looping through the list. If a server fails at any of layers 2 through 7, BIG-IP removes it from the rotation and does not include it again until it recovers.

In practice, servers are usually given weights. This has two advantages: load can be distributed according to differences in server performance, and a node can be removed simply by setting its weight to 0.

  • Advantages: simple, efficient implementation; easy horizontal scaling

  • Disadvantages: which node a request lands on is unpredictable, so it is unsuitable for write scenarios (caches, database writes)

  • Application scenario: database or application-service layers that only read data
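The weighted variant described above can be sketched in a few lines. This is a naive expansion-based sketch (not the smooth weighted round robin that Nginx actually implements), and the class and server names are illustrative:

```python
class WeightedRoundRobin:
    """Naive weighted round robin: expand each server into `weight` slots
    and cycle through the slots in order."""

    def __init__(self, servers):
        # servers: dict of server name -> integer weight.
        # Setting a weight to 0 removes the node from the rotation.
        self.servers = dict(servers)

    def sequence(self, n):
        """Yield the next n target servers, in proportion to their weights."""
        pool = [s for s, w in self.servers.items() for _ in range(w)]
        for i in range(n):
            yield pool[i % len(pool)]
```

With weights {"a": 2, "b": 1}, server "a" receives twice as many requests as "b" over any full cycle.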

  • Random mode: requests are randomly distributed among the nodes; with a large enough volume of requests, a balanced distribution is achieved.

    • Advantages: simple, and easy to scale horizontally
    • Disadvantages: same as round robin, unusable in write scenarios
    • Application scenario: database load balancing, again read-only scenarios
  • Hash method: compute from the key which node a request should land on, guaranteeing that the same key always reaches the same server.

    • Advantages: the same key always lands on the same node, so it works for caches that are both written and read
    • Disadvantages: after a node fails, the keys are redistributed, so scalability is poor
    • Solution: use consistent hashing, or use Keepalived to keep every node highly available so that another node takes over on failure
    • Application scenario: caches, read-and-write workloads
  • Consistent hashing: when a server node fails, only the keys on that node are affected, keeping the disruption to a minimum. TwemProxy's ketama scheme is an example. A production implementation can also hash on a planned sub-key, so that locally related keys land on the same server.
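A minimal consistent-hash ring with virtual nodes might look like the sketch below. It follows the same idea as ketama but is simplified and not wire-compatible with TwemProxy; node names are placeholders:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Toy consistent-hash ring: each physical node is hashed onto the
    ring many times (virtual nodes) to smooth the key distribution."""

    def __init__(self, nodes, vnodes=100):
        self._ring = []                          # sorted (hash, node) pairs
        for node in nodes:
            for i in range(vnodes):              # virtual nodes per server
                self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()
        self._keys = [h for h, _ in self._ring]  # parallel list for bisect

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get(self, key):
        """Map a key to the first node clockwise from its hash position."""
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[idx][1]
```

The key property: removing one node only remaps the keys that were on that node, because its virtual nodes simply disappear from the ring while everyone else's positions stay fixed.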

  • Load by key range: for example, the first 100 million keys are stored on the first server, keys from 100 to 200 million on the second node, and so on.

    • Advantages: easy horizontal scaling; when storage runs short, add a server to hold the newer data
    • Disadvantages: uneven load and uneven data distribution across the databases
    • (The data splits into hot and cold: recently registered users are generally the most active, so the later servers are very busy while the early nodes sit idle.)
    • Application scenario: database shard load balancing
  • Load by key modulo the number of server nodes: for example, with four servers, a key congruent to 0 falls on the first node, a key congruent to 1 on the second, and so on.

    • Advantages: hot and cold data are spread evenly, so the database nodes are load-balanced
    • Disadvantages: difficult to scale horizontally
    • Application scenario: database shard load balancing
  • Pure dynamic node load balancing: the next request is scheduled according to each node's CPU, I/O, and network processing capacity.

    • Advantages: makes full use of server resources and keeps the load balanced across nodes
    • Disadvantages: complex to implement, rarely used in practice
  • Ratio: assign a weighted ratio to each server and allocate user requests to the servers according to those ratios. If a server fails at any of layers 2 through 7, BIG-IP removes it from the server queue and does not assign it further requests until it recovers.

  • Priority: group all servers and assign a priority to each group. BIG-IP allocates user requests to the highest-priority server group (within a group, a round-robin or ratio algorithm distributes the requests). Only when every server in the higher-priority group has failed does BIG-IP send requests to the lower-priority group. This effectively gives users a hot-standby arrangement.

  • Least Connections: pass new connections to the server currently handling the fewest connections. If a server fails at any of layers 2 through 7, BIG-IP removes it from the server queue and does not assign it further requests until it recovers.
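Assuming the balancer tracks open connections per server in a dictionary (a hypothetical bookkeeping structure, not a BIG-IP API), least-connections selection is just a minimum over the counts:

```python
def least_connection(active):
    """Pick the server currently handling the fewest connections.
    `active` maps server name -> number of open connections."""
    return min(active, key=active.get)
```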

  • Fastest mode: pass new connections to the server that responds fastest. If a server fails at any of layers 2 through 7, BIG-IP removes it from the server queue and does not assign it further requests until it recovers.