In Go-Zero’s Distributed Cache System Sharing, Kevin focused on the principles of consistent hashing and its practice in distributed caches. In this article, we discuss in detail the principle of consistent hashing and its implementation in go-zero.

Take storage as an example. In a microservice system, storage can never be just a single node:

  • The first is stability: if a single node goes down, the entire storage service becomes unavailable.
  • The second is data fault tolerance: with a single node, physically damaged data is simply lost; with multiple nodes, each node has backups, and data is lost only if the nodes that back each other up are damaged at the same time.

So the question is, in the case of multiple nodes, which node should the data be written to?

hash

So in essence, we need a way to “compress” a value into a smaller one that is compact and usually unique, such as a uint64:

  • Idempotency: hashing the same value always yields the same result

That’s what the hash algorithm does.
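
As a minimal illustration (using the standard library’s hash/fnv purely as a stand-in for whatever hash function the system actually uses), hashing the same key twice always yields the same uint64:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hashKey compresses an arbitrary string key into a compact uint64.
func hashKey(key string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	return h.Sum64()
}

func main() {
	// Idempotency: the same input always produces the same output.
	fmt.Println(hashKey("user:1001") == hashKey("user:1001")) // true
}
```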

However, when an ordinary hash algorithm is used for routing, such as key % N, a node leaving the cluster because of a crash or an abnormal heartbeat changes N, and a large amount of data is redistributed to different nodes. Each node then has to re-fetch the data for the requests it newly receives; if this is a cache, it can cause a cache avalanche.
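
To make the cost concrete, here is a small sketch (illustrative only; the hash function and node counts are made up) that counts how many keys are remapped when key % N routing shrinks from 5 nodes to 4:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

func hashKey(key string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	return h.Sum64()
}

func main() {
	const keys = 10000
	moved := 0
	for i := 0; i < keys; i++ {
		k := hashKey(fmt.Sprintf("key-%d", i))
		before := k % 5 // routing with 5 nodes
		after := k % 4  // routing after one node drops out
		if before != after {
			moved++
		}
	}
	// For a uniform hash, roughly 80% of the keys change owner.
	fmt.Printf("%d of %d keys were remapped\n", moved, keys)
}
```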

At this point, the Consistent Hash algorithm needs to be introduced.

consistent hash

Let’s take a look at how Consistent Hash addresses these issues:

rehash

First, let’s see how it solves the problem of massive rehashing:

As shown in the figure above, when a new node is added, only key31 is affected; whenever a node is added (or removed), only the data adjacent to that node is affected, while data on other nodes stays where it is. This solves the problem of node changes.

This property is called monotonicity, and it is precisely what the ordinary hash algorithm lacks in distributed scenarios.

Data skew

In fact, as the diagram above also shows, most of the keys are currently concentrated on Node 1. When the number of nodes is small, keys can easily cluster on a single node, and what you see in monitoring is uneven load across the nodes.

To solve this problem, Consistent Hash introduces the concept of Virtual Node.

Since the load is uneven and there are only so many real nodes, we create balance artificially: virtual nodes are used to partition the ring into many smaller ranges, while the node that actually serves the request is still one of the real nodes.
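
Here is a rough, self-contained sketch of that idea (generic code, not go-zero’s implementation; the node names and replica count are made up): each real node is projected onto the ring many times, which breaks up the large contiguous ranges that cause skew and evens out the key counts:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

func hashKey(key string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	return h.Sum64()
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c"}
	const replicas = 100 // virtual nodes per real node

	// Build the ring: each virtual node hash points back to its real node.
	ring := make(map[uint64]string)
	var points []uint64
	for _, n := range nodes {
		for i := 0; i < replicas; i++ {
			p := hashKey(fmt.Sprintf("%s#%d", n, i))
			ring[p] = n
			points = append(points, p)
		}
	}
	sort.Slice(points, func(i, j int) bool { return points[i] < points[j] })

	// Route 30000 keys and count how many land on each real node.
	counts := make(map[string]int)
	for i := 0; i < 30000; i++ {
		k := hashKey(fmt.Sprintf("key-%d", i))
		idx := sort.Search(len(points), func(j int) bool { return points[j] >= k })
		if idx == len(points) { // wrap around the ring
			idx = 0
		}
		counts[ring[points[idx]]]++
	}
	fmt.Println(counts) // roughly 10000 keys per node
}
```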

The specific implementation

Let’s look at Get() first:

Get

Let’s start with how it works (a sketch of the lookup follows the steps below):

  1. Compute the hash of the key
  2. Find the index of the first matching virtual node and map it to the corresponding h.keys[index]: the virtual node’s hash value
  3. Look up the matching actual node in the corresponding ring
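
Here is a hedged sketch of that lookup (it assumes the ConsistentHash struct shown later in the Add Node section and uses fmt.Sprint as a stand-in for the library’s key serialization; it mirrors the steps rather than reproducing go-zero’s exact code):

```go
// Get returns the actual node responsible for v. Sketch only.
func (h *ConsistentHash) Get(v interface{}) (interface{}, bool) {
	h.lock.RLock()
	defer h.lock.RUnlock()

	if len(h.keys) == 0 {
		return nil, false
	}

	// 1. Compute the hash of the key.
	hash := h.hashFunc([]byte(fmt.Sprint(v)))

	// 2. Binary search h.keys (the sorted virtual node hashes) for the
	//    first entry >= hash, wrapping around past the end of the ring.
	index := sort.Search(len(h.keys), func(i int) bool {
		return h.keys[i] >= hash
	}) % len(h.keys)

	// 3. The ring maps that virtual node hash to its actual node(s);
	//    if several nodes collide on the hash, pick one deterministically.
	nodes := h.ring[h.keys[index]]
	switch len(nodes) {
	case 0:
		return nil, false
	case 1:
		return nodes[0], true
	default:
		return nodes[int(hash%uint64(len(nodes)))], true
	}
}
```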

In fact, you can see that the ring lookup yields a []node. This is because hash collisions can occur when computing virtual node hashes: virtual nodes belonging to different actual nodes may end up with the same hash, so a single hash value can map to more than one actual node.

This also means that the relationship between Node and Virtual Node is one-to-many, and internally the ring is designed as a map from each virtual node hash to the list of actual nodes behind it.

This indicates the consistent hash allocation policy:

  1. virtual node is the unit of range partitioning: which node a key is assigned to is decided by which virtual node boundary it falls within
  2. Because virtual node positions are produced by hashing, the keys assigned to different nodes are roughly uniform; in other words, the binding between a node and one contiguous range is broken up
  3. When a new node is added, it is assigned multiple virtual nodes, so it can take over load from several existing nodes, which makes it easier to keep the load balanced globally during capacity expansion

Add Node

Get gives you a rough idea of the entire consistent hash design:

```go
type ConsistentHash struct {
	hashFunc Func                            // hash function
	replicas int                             // virtual node amplification factor
	keys     []uint64                        // stores the virtual node hashes, kept sorted
	ring     map[uint64][]interface{}        // virtual node hash -> actual nodes behind it
	nodes    map[string]lang.PlaceholderType // actual nodes; a map makes existence checks fast
	lock     sync.RWMutex
}
```
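
The Add path is not shown in full here, but based on the struct above and the allocation policy described earlier, a hedged sketch of registering a node looks roughly like this (illustrative, not the library’s exact code; fmt.Sprint and the "nodeRepr + index" naming are stand-ins):

```go
// addWithReplicas sketches what inserting a node involves. Sketch only.
func (h *ConsistentHash) addWithReplicas(node interface{}, replicas int) {
	h.lock.Lock()
	defer h.lock.Unlock()

	nodeRepr := fmt.Sprint(node) // stand-in for the node -> string conversion
	h.nodes[nodeRepr] = lang.Placeholder

	for i := 0; i < replicas; i++ {
		// Each replica becomes one virtual node on the ring.
		hash := h.hashFunc([]byte(nodeRepr + strconv.Itoa(i)))
		h.keys = append(h.keys, hash)
		h.ring[hash] = append(h.ring[hash], node)
	}

	// Keep the virtual node hashes sorted so Get can binary search them.
	sort.Slice(h.keys, func(i, j int) bool { return h.keys[i] < h.keys[j] })
}
```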

All right, so basically a consistent hash is complete.

Specific code:
https://github.com/tal-tech/g…

Usage scenarios

Consistent hashes can be widely used in distributed systems:

  1. Distributed caching. You can build a cache proxy on top of a storage system such as a redis cluster and control the routing freely; this routing rule can then use the consistent hash algorithm
  2. Service discovery
  3. Distributed Scheduling Task

All of the distributed systems above can use consistent hashing in their load-balancing module.
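
As a quick usage sketch with go-zero itself (assuming the core/hash package exposes NewConsistentHash, Add, and Get as in the repository linked below; the node addresses here are made up):

```go
package main

import (
	"fmt"

	"github.com/tal-tech/go-zero/core/hash"
)

func main() {
	// Build a ring and register the cache nodes.
	ch := hash.NewConsistentHash()
	ch.Add("redis-node-1:6379")
	ch.Add("redis-node-2:6379")
	ch.Add("redis-node-3:6379")

	// Route a key: requests for "user:1001" keep going to the same node
	// until the ring membership changes.
	if node, ok := ch.Get("user:1001"); ok {
		fmt.Println("user:1001 ->", node)
	}
}
```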

The project address

https://github.com/tal-tech/go-zero

You are welcome to use go-zero and star the project to show your support!

WeChat communication group

Follow the “micro-service practice” official account and click on “exchange group” to get the QR code of the community group.