A distributed cache service may have many nodes providing the cache. In a stand-alone cache service, data is cached like this:

When a piece of data is queried for the first time, it is fetched from the data source (such as the database) and then stored on the cache server. The next time the same data is queried, it is read directly from the cache server.

But cache servers rarely run as a single machine; they usually have multiple nodes. When you switch to a distributed setup, some problems appear.

Problem 1: Data redundancy

In a stand-alone service, the cache can be implemented with an LRU algorithm, where each key maps to one data value. In a distributed setup, if nodes are simply added without a routing rule, the data for a given key may end up on node A this time and on node B the next time. There is no need to cache the same key on multiple nodes; this is data redundancy.

How to solve it?

Use hashing: first hash the key, then take the result modulo the number of nodes.

h = hash(key) % len(node)

Data for the same key will be cached on only one node. It’s awesome.
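
As a minimal sketch of this idea (my own illustration, not from the article; it assumes crc32 as the hash function and a hypothetical pickNode helper):

// A minimal sketch of mod-based node selection.
package main

import (
	"fmt"
	"hash/crc32"
)

// pickNode hashes the key and takes the result modulo the number of nodes
func pickNode(key string, nodes []string) string {
	h := crc32.ChecksumIEEE([]byte(key))
	return nodes[h%uint32(len(nodes))]
}

func main() {
	nodes := []string{"A", "B", "C"}
	// The same key always maps to the same node
	fmt.Println(pickNode("user:42", nodes))
}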

But the set of nodes won’t stay the same forever. What if some nodes fail, or I need to add nodes?

Problem 2: Fault tolerance and scalability

If a node goes down or a new node is added, len(node) changes, so the computed hash slot differs from before. Adding or removing a single node therefore invalidates almost all existing cache entries! Oh my God!!

This kind of mass invalidation is a cache avalanche.
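
Here is a small sketch of how bad it gets (my own illustration with crc32): assign 1000 keys to 3 nodes and then to 4 nodes, and count how many keys land on a different node; roughly three quarters of them move.

// Sketch: count how many keys change nodes when len(node) goes from 3 to 4
package main

import (
	"fmt"
	"hash/crc32"
)

func nodeFor(key string, n int) int {
	return int(crc32.ChecksumIEEE([]byte(key)) % uint32(n))
}

func main() {
	moved := 0
	for i := 0; i < 1000; i++ {
		key := fmt.Sprintf("key-%d", i)
		if nodeFor(key, 3) != nodeFor(key, 4) {
			moved++
		}
	}
	fmt.Printf("%d of 1000 keys moved to a different node\n", moved)
}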

What to do? Use a consistent hashing algorithm.

The consistent hashing algorithm was first proposed in the paper Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web.

It works by treating the entire hash space (the h calculated by the hash function) as a ring ranging from 0 to 2^32-1. Each server node is mapped onto the ring using the hash function, and each data key is mapped onto the ring in the same way. Going clockwise from the key’s position, the key belongs to the first node encountered.

For example, suppose we have three cache server nodes: Node2, Node4, and Node6. Assuming the hash algorithm outputs its input as is, we hash the nodes and the data keys (1, 3, 7, 9) onto the ring:

Going clockwise, key 1 belongs to Node2 and key 3 belongs to Node4. Keys 7 and 9 find no node with a larger position, so they wrap around the ring and also belong to Node2.

It looks good, but there’s a problem: with so few nodes, data skew occurs. In the example above, everything in the long arc between Node6 and Node2 piles up on a single node.

The solution is to add virtual nodes and use them to balance the load. For each real node, you compute several hashes (one per virtual node), place them on the ring, and all the data that maps to those virtual nodes belongs to the real node.

This way, the data ends up spread much more evenly around the ring.
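
Here is a tiny sketch of the virtual-node idea (my own illustration; names like "0Node2" are just a naming convention, the same one the Add method below uses):

// Sketch: a real node is hashed several times under different virtual-node
// names, so it occupies several positions on the ring.
package main

import (
	"fmt"
	"hash/crc32"
	"strconv"
)

func main() {
	const replicas = 3
	for i := 0; i < replicas; i++ {
		virtual := strconv.Itoa(i) + "Node2" // "0Node2", "1Node2", "2Node2"
		fmt.Printf("%s -> ring position %d\n", virtual, crc32.ChecksumIEEE([]byte(virtual)))
	}
}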

Algorithm implementation

Now that we understand the principle, let’s implement the consistent hashing algorithm. The implementation follows groupcache, the distributed cache library written in Go, which can be thought of as a Go implementation of memcached.

First, define a consistent hash ring structure:

type Hash func(data []byte) uint32

// ConHash is a consistent hash ring
type ConHash struct {
	hash     Hash           // the hash algorithm
	replicas int            // number of virtual nodes per real node
	nodes    []int          // sorted hashes of the virtual nodes on the ring
	hashMap  map[int]string // virtual node hash -> real node name
}

As you can see, Hash is a function type, so the user can plug in a custom hash algorithm.
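
The article does not show a constructor, so here is a minimal sketch of one (the name NewConHash is my own, and the fallback to crc32.ChecksumIEEE mirrors what groupcache’s consistenthash package does; hash/crc32 is assumed to be imported):

// NewConHash is a hypothetical constructor, not part of the original code.
// A nil hash falls back to crc32.ChecksumIEEE.
func NewConHash(replicas int, fn Hash) *ConHash {
	m := &ConHash{
		hash:     fn,
		replicas: replicas,
		hashMap:  make(map[int]string),
	}
	if m.hash == nil {
		m.hash = crc32.ChecksumIEEE
	}
	return m
}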

Next, we need to add nodes to the hash ring, creating the specified number of virtual nodes for each one.

// Add adds nodes to the hash ring
func (m *ConHash) Add(nodes ...string) {
	for _, node := range nodes {
		// Create the configured number of virtual nodes for each real node
		for i := 0; i < m.replicas; i++ {
			// Hash the virtual node name (index + node name) onto the ring
			h := int(m.hash([]byte(strconv.Itoa(i) + node)))
			m.nodes = append(m.nodes, h)
			// Map the virtual node back to its real node
			m.hashMap[h] = node
		}
	}
	// Keep the ring sorted so Get can binary-search it
	sort.Ints(m.nodes)
}

We also need to look up the node responsible for a given key on the ring; once we have the node, the data is fetched from it.

// Get returns the node on the ring that is responsible for the given key
func (m *ConHash) Get(key string) string {
	if len(m.nodes) == 0 {
		return ""
	}
	// Compute the hash of the key
	h := int(m.hash([]byte(key)))
	// Binary-search for the first virtual node clockwise from the key
	idx := sort.Search(len(m.nodes), func(i int) bool {
		return m.nodes[i] >= h
	})

	// idx == len(m.nodes) means the key wraps around to the first node;
	// map the chosen virtual node back to its real node
	return m.hashMap[m.nodes[idx%len(m.nodes)]]
}
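
To tie it together, here is a hypothetical usage sketch. It relies on the NewConHash constructor sketched earlier and on an identity-style hash, so the numbers match the Node2/Node4/Node6 ring example (fmt and strconv are assumed to be imported):

func main() {
	// Identity-style hash: interpret the bytes as a decimal number
	ring := NewConHash(1, func(data []byte) uint32 {
		n, _ := strconv.Atoi(string(data))
		return uint32(n)
	})
	// With replicas = 1, Add hashes "0"+node, so "2", "4", "6" land at 2, 4, 6
	ring.Add("2", "4", "6")

	for _, key := range []string{"1", "3", "7", "9"} {
		fmt.Printf("key %s -> node %s\n", key, ring.Get(key))
	}
	// key 1 -> node 2, key 3 -> node 4,
	// keys 7 and 9 wrap around the ring and also go to node 2
}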

Some of you may ask: wait, why do we only add the server nodes to the ring? Isn’t the data supposed to be placed on the ring as well?

The data itself is not stored on the ring. Groupcache divides data into groups, and each group stores its cached entries internally with a hash map plus a doubly linked list. The cached data lives in the linked list.

The whole lookup works like this: when searching for the data corresponding to a key, the node is determined from the group and key in the request URL. How is the node determined? Exactly as in the code above: compute the hash of the key and see which node on the ring it belongs to.

Then the key is looked up in that node’s doubly linked list. If the node does not have the key, the data is fetched from a user-defined source (such as a database) and stored back into the group.
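
For reference, here is a minimal sketch of that "map plus doubly linked list" structure (a simplified illustration of the idea, not groupcache’s actual lru package; the names Cache, New, and maxEntries are my own):

// Minimal LRU cache sketch: a map gives O(1) lookup, a doubly linked list
// (container/list) tracks recency so the oldest entry can be evicted.
package lru

import "container/list"

type entry struct {
	key   string
	value interface{}
}

type Cache struct {
	maxEntries int
	ll         *list.List
	cache      map[string]*list.Element
}

func New(maxEntries int) *Cache {
	return &Cache{
		maxEntries: maxEntries,
		ll:         list.New(),
		cache:      make(map[string]*list.Element),
	}
}

// Get moves the entry to the front (most recently used) and returns its value
func (c *Cache) Get(key string) (interface{}, bool) {
	if el, ok := c.cache[key]; ok {
		c.ll.MoveToFront(el)
		return el.Value.(*entry).value, true
	}
	return nil, false
}

// Add inserts or updates a key, evicting the least recently used entry if full
func (c *Cache) Add(key string, value interface{}) {
	if el, ok := c.cache[key]; ok {
		c.ll.MoveToFront(el)
		el.Value.(*entry).value = value
		return
	}
	el := c.ll.PushFront(&entry{key, value})
	c.cache[key] = el
	if c.maxEntries > 0 && c.ll.Len() > c.maxEntries {
		if oldest := c.ll.Back(); oldest != nil {
			c.ll.Remove(oldest)
			delete(c.cache, oldest.Value.(*entry).key)
		}
	}
}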

That’s all. I hope it helps.

Reference article:

  1. blog.codinglabs.org/articles/co…

My blog: blog.shiniao.fun/