How to gracefully scale, consistent hashing algorithm

background

At present, the background micro-service architecture is booming, and Docker technology is becoming increasingly mature. The two are just like meeting in boya’s childhood, a match made in heaven. However, service containerization will encounter dynamic expansion and shrinkage in any business background. With the fluctuation of service access magnitude, the automatic increase and reclamation of container resources can reduce the pressure for operation and maintenance.

In addition, before and after expansion, load balancing is required to maintain the load pressure on each node, thus making expansion more “elegant”. The common algorithm model in load balancing technology involves consistent hashing algorithm.

Cohashing algorithm

Each node in a service cluster has a “hash address” as a unique identifier. Its calculation formula is abbreviated as follows: Add =hash(object)mod Nadd=hash(object)modN If a node is added or reduced due to capacity expansion, Add =hash(object)mod(N±1) Add =hash(object)mod(N±1) As a result, data on all nodes needs to be migrated, which costs too much

Consistent hashing algorithm

Consistent hashing designs the hash space as a “ring” with a range of [0,2^32-1], and the domain can be calculated using the IP or host name of the node.

Suppose we have 3 nodes and 4 data, whose addresses are distributed in the hash space as follows:

According to the consistency hash algorithm, each data is bound to the nearest node clockwise. For example, data A is bound to node 01, data D is bound to node 02, and data B and C are bound to node 03.

If node 03 is reduced, data B and C need to be bound to node 01, and other data and nodes do not need to be migrated.

Suppose a node 04 is added, as shown in the following figure. According to the consistent hash algorithm, data B only needs to be migrated to node 04 again.

conclusion

Consistent hashing algorithm, because of its special data structure and data binding algorithm, greatly reduces the amount of data migration that can occur when nodes are added or reduced, which brings the application value is to reduce the time of dynamic expansion and shrinkage.

At present, there are very few “second scale” expansion and shrinkage products on the market, most of which are in “minute scale”. The more sensitive the expansion and shrinkage response is, the probability of service downtime or avalanche will be greatly reduced during large-scale business operations.

How to gracefully scale, consistent hashing algorithm

background

Cohashing algorithm

Consistent hashing algorithm

conclusion

Related Posts

Real combat development, using Spring Session and Spring Security to complete the website login transformation!

How to build the RTMP live server on Debian 10

MyBatisPlus API is commonly used