Interviews often include questions about caching. I knew these problems existed, and there are plenty of well-organized answers on the Internet, but I never had a real feel for them. Recently, while digging into cache handling, I came across the library groupcache and read through its source code to understand how it works. One module, singleflight, only about 65 lines including comments, neatly solves the cache breakdown problem.

Cache breakdown

Under high concurrency, when a large number of requests query the same key at the same moment and that key happens to have just expired, all of those requests fall through to the database simultaneously, which can easily crash it. This phenomenon is called cache breakdown.
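
To make the problem concrete, here is a minimal sketch of the naive read-through pattern that causes the stampede. Everything here is hypothetical: cache is an in-memory sync.Map and queryDB stands in for a real database call; neither is part of groupcache.

// Sketch of the naive pattern; cache and queryDB are made-up placeholders.
var cache sync.Map

func queryDB(key string) (interface{}, error) {
	time.Sleep(200 * time.Millisecond) // pretend this is an expensive query
	return 1000, nil
}

func get(key string) (interface{}, error) {
	if val, ok := cache.Load(key); ok {
		return val, nil
	}
	// When the key has just expired, every concurrent caller misses the
	// cache and reaches this line, so the database sees N identical queries.
	val, err := queryDB(key)
	if err != nil {
		return nil, err
	}
	cache.Store(key, val)
	return val, nil
}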

An example

Let’s take a look at a simple demo to see how it works.

package main

import (
	"fmt"
	"sync"
	"time"

	"github.com/golang/groupcache/singleflight"
)

// search pretends to be a slow database query.
func search() (interface{}, error) {
	fmt.Println("start searching")
	time.Sleep(time.Millisecond * 200)
	return 1000, nil
}

func main() {
	g := singleflight.Group{}
	wg := sync.WaitGroup{}
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			res, err := g.Do("multi search", search)
			fmt.Println(res, err)
			wg.Done()
		}()
	}
	wg.Wait()
}

The search function simulates a time-consuming database query that returns data after 200ms. If all 1000 concurrent queries actually hit the database, they would put considerable pressure on it. Yet running the code above prints "start searching" only once: out of 1000 queries, only one was actually executed. Awesome!
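
One detail worth noting: Do deduplicates per key, so requests for different keys never wait on each other. As a hypothetical variation of the loop above (the key names are invented), spreading the 1000 goroutines across two keys would typically print "start searching" twice, once per key:

// Variation of the demo: two distinct keys mean two independent calls.
for i := 0; i < 1000; i++ {
	wg.Add(1)
	go func(i int) {
		key := fmt.Sprintf("search %d", i%2) // alternates between two keys
		res, err := g.Do(key, search)
		fmt.Println(res, err)
		wg.Done()
	}(i)
}
wg.Wait()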

The source code

Only two structures are involved in the source code. The first is call, the entity representing a single function call: val and err hold the result the function returned, while wg (a sync.WaitGroup) tracks the state of the call, i.e. whether the function has finished running.

type call struct {
	wg  sync.WaitGroup // signals when the call has completed
	val interface{}    // result returned by the function
	err error          // error returned by the function
}

The other is Group, a hash table mapping each key to its call entity, together with a mutex to control concurrent access to m.

type Group struct {
	mu sync.Mutex       // protects m
	m  map[string]*call // lazily initialized
}

The actual query goes through the Do function below.

func (g *Group) Do(key string, fn func() (interface{}, error)) (interface{}, error) {
	g.mu.Lock()
	if g.m == nil {
		g.m = make(map[string]*call)
	}
	// If the key is already being handled, wait for that call's result
	if c, ok := g.m[key]; ok {
		// Another goroutine is doing the work; release the lock
		g.mu.Unlock()
		// Wait for the first call to finish
		c.wg.Wait()
		return c.val, c.err
	}
	// First call for this key
	c := new(call)
	c.wg.Add(1)
	g.m[key] = c
	// Release the lock before running fn
	g.mu.Unlock()
	// Store the result of the call in c
	c.val, c.err = fn()
	c.wg.Done()
	// While fn is running, duplicate callers block in c.wg.Wait()
	// until c.wg.Done() above releases them.
	// Take the mutex again when deleting the key
	g.mu.Lock()
	delete(g.m, key)
	g.mu.Unlock()

	return c.val, c.err
}

In simple terms, each key corresponds to one call. The first time a key is requested, its call is placed in Group.m, so later requests for the same key find the existing entry and wait on call.wg for the first call's result. Note that the result remains available to waiters even after the key is deleted from the map, because each waiter holds a pointer to the same call.
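
In practice you would combine singleflight with the cache lookup itself: check the cache first, and only on a miss go through Do, so that a single goroutine repopulates the cache while everyone else waits for its result. A rough sketch, reusing the hypothetical cache and queryDB placeholders from earlier (they are not groupcache APIs):

// Sketch only: cache (a sync.Map) and queryDB are made-up placeholders.
func getOnce(g *singleflight.Group, key string) (interface{}, error) {
	if val, ok := cache.Load(key); ok {
		return val, nil
	}
	// On a miss, at most one goroutine per key queries the database;
	// the rest block inside Do and share its result.
	return g.Do(key, func() (interface{}, error) {
		val, err := queryDB(key)
		if err != nil {
			return nil, err
		}
		cache.Store(key, val)
		return val, nil
	})
}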

This was the first time I really felt the power of code: such a simple approach solves a real problem in practice.