A few days ago, I was reading an article about BigCache, and I was intrigued by how it does two things:

  • Speeding up concurrent access
  • Avoiding high GC overhead

So I went and read the code. I thought it was awesome, so I wrote this post to share it with you.

BigCache is a fast, concurrent, evicting in-memory cache that keeps high performance even when storing a large number of entries. BigCache keeps entries on the heap while avoiding GC overhead for them. — from the BigCache README

// A simple usage example from the BigCache README

import "github.com/allegro/bigcache"

cache, _ := bigcache.NewBigCache(bigcache.DefaultConfig(10 * time.Minute))

cache.Set("my-unique-key", []byte("value"))

entry, _ := cache.Get("my-unique-key")
fmt.Println(string(entry))

BigCache repository: github.com/allegro/bigcache

Concurrent access

A cache needs to support concurrent access, whether the program spawns goroutines itself or an HTTP server allocates a goroutine per request. The most common solution is to use a read/write lock (sync.RWMutex) to ensure that only one goroutine at a time may modify the cache contents. If you do this, however, the lock can block subsequent operations, degrading cache performance.
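To make that baseline concrete, here is a minimal sketch of such a single-lock cache. This is my own illustration of the pattern described above, not code from BigCache:

package main

import (
    "fmt"
    "sync"
)

// naiveCache is the baseline design: one map guarded by one sync.RWMutex.
// Every writer excludes all other goroutines, so under heavy concurrency
// the single lock becomes the bottleneck.
type naiveCache struct {
    lock  sync.RWMutex
    items map[string][]byte
}

func newNaiveCache() *naiveCache {
    return &naiveCache{items: make(map[string][]byte)}
}

func (c *naiveCache) set(key string, value []byte) {
    c.lock.Lock()
    defer c.lock.Unlock()
    c.items[key] = value
}

func (c *naiveCache) get(key string) ([]byte, bool) {
    c.lock.RLock()
    defer c.lock.RUnlock()
    value, ok := c.items[key]
    return value, ok
}

func main() {
    c := newNaiveCache()
    c.set("key", []byte("value"))
    if v, ok := c.get("key"); ok {
        fmt.Println(string(v)) // value
    }
}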

To solve this problem, BigCache uses shards. What is a shard? A shard is simply a struct that contains its own lock and its own piece of the cache.

BigCache uses an array of N shards and spreads the data across them, so that when you read or write the cache, it first selects one shard to use (the selection strategy is described below). This greatly reduces lock granularity, because the lock scope narrows from the whole cache to a single shard. (Yoko's note: access to the shard array itself needs no lock, because the array is created at initialization and its size never changes afterward; all subsequent access to the array is read-only. The shard elements themselves are of course read and written, but that is independent of the array.)

High GC overhead

var cache map[string]Item

The simplest and most common way to implement a cache in Go is to use a map to store the elements. But if you use a map, the GC (garbage collector) touches every element of the map during the mark phase, which can be very costly to program performance when the map is very large.

Since Go 1.5, if a map's keys and values contain no pointers, the GC skips its contents.

var cache map[int]int
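If you want to observe this effect yourself, a rough experiment along the following lines works. This is my own sketch, not code from the article; absolute numbers depend on your machine and Go version, but the pointer-free map should show a much shorter GC pause:

package main

import (
    "fmt"
    "runtime"
    "time"
)

const n = 5_000_000

// gcPause forces a full collection and reports how long it took.
func gcPause() time.Duration {
    start := time.Now()
    runtime.GC()
    return time.Since(start)
}

func main() {
    // Keys and values contain no pointers: the GC can skip the contents.
    noPtr := make(map[int]int, n)
    for i := 0; i < n; i++ {
        noPtr[i] = i
    }
    fmt.Println("map[int]int    GC pause:", gcPause())

    // String keys contain pointers: every entry gets scanned in the mark phase.
    withPtr := make(map[string]int, n)
    for i := 0; i < n; i++ {
        withPtr[fmt.Sprint(i)] = i
    }
    fmt.Println("map[string]int GC pause:", gcPause())

    runtime.KeepAlive(noPtr)
    runtime.KeepAlive(withPtr)
}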

To avoid this problem, BigCache uses a map with no pointers in its keys or values, so that the GC ignores the map's contents. Concretely:

The map key is the hash value of the key stored in the cache.

Instead of storing cached values as map values, BigCache serializes them into []byte and appends the serialized bytes to a global []byte (each shard contains one such global []byte). The map then stores the offset of the serialized data within that global []byte.
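Based on the simplified implementation shown later in this post (a 4-byte length header followed by the payload), the layout of one shard's global []byte looks roughly like this:

// One shard's global []byte, with entries appended back to back.
// map[hashedKey] holds the offset of the entry's 4-byte length header.
//
//         offset stored in the map
//         v
// ... [ len | value bytes ][ len | value bytes ][ len | value bytes ] ...
//      4 B                  4 B                  4 B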

Using a global []byte is very clever: it adds only one extra object for the GC to track, and since a byte slice contains no pointers, the GC can mark the whole object in O(1).

Translator's note (yoko)

The optimization ideas should be clear by now. They boil down to two points:

First, hash a container with many elements into buckets (sub-containers). This is a common way to reduce lock granularity and improve performance.

Second, keep pointers out of the map's keys and values, as described above. This is essentially a targeted optimization for the characteristics of the Go GC.

On the second point, it is also worth mentioning that since the global []byte in each bucket is an array type, deleting an element from the middle is obviously expensive. Looking at BigCache's implementation, it does not provide an interface for deleting a specified key. In this type of cache, elements are typically removed either by a global expiration time (note that it is first-in-first-out; you cannot give each key its own expiration) or when the total size of the cache reaches a threshold; in other words, the array is used as a queue. The premise of this implementation, therefore, is a cache that evicts entries on its own rather than one whose elements are deleted manually. A sketch of the queue idea follows.
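The following is my own simplified sketch of that queue idea, not BigCache's actual code (its internal BytesQueue wraps around in place instead of shifting memory): a byte slice holding length-prefixed entries, where the only deletion ever needed is dropping the oldest entry at the head.

package main

import (
    "encoding/binary"
    "fmt"
)

// byteQueue treats a byte slice as a FIFO queue of length-prefixed entries.
type byteQueue struct {
    buf  []byte
    head int // offset of the oldest entry
    tail int // offset where the next entry is written
}

// push appends an entry, evicting the oldest entries if space runs out.
func (q *byteQueue) push(entry []byte) {
    need := 4 + len(entry)
    for len(q.buf)-q.tail < need {
        if q.head == q.tail {
            panic("entry larger than the whole queue")
        }
        q.pop() // FIFO eviction of the oldest entry
        // Slide the survivors to the front to reclaim the freed space.
        copy(q.buf, q.buf[q.head:q.tail])
        q.tail -= q.head
        q.head = 0
    }
    binary.LittleEndian.PutUint32(q.buf[q.tail:], uint32(len(entry)))
    copy(q.buf[q.tail+4:], entry)
    q.tail += need
}

// pop removes and returns the oldest entry.
func (q *byteQueue) pop() []byte {
    size := int(binary.LittleEndian.Uint32(q.buf[q.head:]))
    entry := q.buf[q.head+4 : q.head+4+size]
    q.head += 4 + size
    return entry
}

func main() {
    q := &byteQueue{buf: make([]byte, 32)}
    q.push([]byte("first"))
    q.push([]byte("second"))
    q.push([]byte("third!"))
    q.push([]byte("fourth")) // no room left: "first" is evicted
    for q.head < q.tail {
        fmt.Println(string(q.pop()))
    }
    // OUTPUT: second, third!, fourth
}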

In the remainder of the original English article, the author builds a simplified version of the cache modeled on BigCache and uses the code to illustrate the principles above. If you are already comfortable with those, you can skip the rest.

Look at the code

Here is a simple cache implementation I wrote, with expiration, capacity, and other features removed to better illustrate the ideas described above.

First, the hash function, which is copied from BigCache. It is an FNV-1a implementation that performs zero heap allocations.

hasher.go

package main

// newDefaultHasher returns a new 64-bit FNV-1a Hasher which makes no memory allocations.
// Its Sum64 method will lay the value out in big-endian byte order.
// See https://en.wikipedia.org/wiki/Fowler–Noll–Vo_hash_function
func newDefaultHasher() fnv64a {
    return fnv64a{}
}

type fnv64a struct{}

const (
    // offset64 FNVa offset basis. See https://en.wikipedia.org/wiki/Fowler–Noll–Vo_hash_function#FNV-1a_hash
    offset64 = 14695981039346656037
    // prime64 FNVa prime value. See https://en.wikipedia.org/wiki/Fowler–Noll–Vo_hash_function#FNV-1a_hash
    prime64 = 1099511628211
)

// Sum64 gets the string and returns its uint64 hash value.
func (f fnv64a) Sum64(key string) uint64 {
    var hash uint64 = offset64
    for i := 0; i < len(key); i++ {
        hash ^= uint64(key[i])
        hash *= prime64
    }
    return hash
}
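A quick way to check the zero-allocation claim, using the standard library's testing.AllocsPerRun. This is my own snippet, not part of the original post; it compiles together with hasher.go as its own main:

package main

import (
    "fmt"
    "testing"
)

func main() {
    h := newDefaultHasher()
    // AllocsPerRun reports the average number of heap allocations per call.
    allocs := testing.AllocsPerRun(1000, func() {
        _ = h.Sum64("my-unique-key")
    })
    fmt.Println("allocs per call:", allocs) // expected: 0
}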

Next comes the cache structure, which contains the logic for selecting a shard as well as the get and set methods.

In the section on concurrent access above, we saw a function that selects the shard for a given piece of data. It works by first converting the key into a hash value using the hash function above, then computing the shard array index from that hash and the number of shards. Notably, instead of taking the modulo, it performs a bitwise AND with a mask; this works because the number of shards is a power of two (see the quick check after the example below).

hashedKey & mask

    0111
AND 1101  (mask)
  = 0101
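A quick sanity check of the power-of-two claim (my own snippet, not from the original post):

package main

import "fmt"

func main() {
    const n = 1024 // number of shards; must be a power of two
    mask := uint64(n - 1)
    for _, h := range []uint64{7, 1023, 1024, 98765432123} {
        // For a power-of-two n, h % n equals h & (n-1).
        fmt.Println(h%n == h&mask)
    }
    // OUTPUT: true, four times
}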

cache.go

package main

var minShards = 1024

type cache struct {
    shards []*cacheShard
    hash   fnv64a
}

func newCache() *cache {
    cache := &cache{
        hash:   newDefaultHasher(),
        shards: make([]*cacheShard, minShards),
    }
    for i := 0; i < minShards; i++ {
        cache.shards[i] = initNewShard()
    }
    return cache
}

func (c *cache) getShard(hashedKey uint64) (shard *cacheShard) {
    return c.shards[hashedKey&uint64(minShards-1)]
}

func (c *cache) set(key string, value []byte) {
    hashedKey := c.hash.Sum64(key)
    shard := c.getShard(hashedKey)
    shard.set(hashedKey, value)
}

func (c *cache) get(key string) ([]byte, error) {
    hashedKey := c.hash.Sum64(key)
    shard := c.getShard(hashedKey)
    return shard.get(key, hashedKey)
}

Finally, the best part: each shard contains a byte array []byte and a map[uint64]uint32. The map stores, for each key, the index of its value within the shard's byte array; the values themselves are stored in the byte array.

A tail variable holds the index of the end of the used portion of the byte array.

shard.go

package main

import (
    "encoding/binary"
    "errors"
    "sync"
)

const (
    // Each entry is prefixed with a 4-byte header holding its length.
    headerEntrySize = 4
    defaultValue    = 1024 // for this example we use 1024 as the default size
)

type cacheShard struct {
    items        map[uint64]uint32
    lock         sync.RWMutex
    array        []byte
    tail         int
    headerBuffer []byte
}

func initNewShard() *cacheShard {
    return &cacheShard{
        items: make(map[uint64]uint32, defaultValue),
        array: make([]byte, defaultValue),
        // tail starts at 1 rather than 0, so that a map lookup returning
        // the zero value 0 can be recognized as "key not found" in get.
        tail:         1,
        headerBuffer: make([]byte, headerEntrySize),
    }
}

func (s *cacheShard) set(hashedKey uint64, entry []byte) {
    w := wrapEntry(entry)
    s.lock.Lock()
    index := s.push(w)
    s.items[hashedKey] = uint32(index)
    s.lock.Unlock()
}

func (s *cacheShard) push(data []byte) int {
    dataLen := len(data)
    index := s.tail
    s.save(data, dataLen)
    return index
}

func (s *cacheShard) save(data []byte, len int) {
    // Put in the first 4 bytes the size of the value
    binary.LittleEndian.PutUint32(s.headerBuffer, uint32(len))
    s.copy(s.headerBuffer, headerEntrySize)
    s.copy(data, len)
}

func (s *cacheShard) copy(data []byte, len int) {
    // Grow the array when full; otherwise writes past the initial
    // 1024 bytes would be silently truncated.
    if s.tail+len > cap(s.array) {
        newArray := make([]byte, 2*(s.tail+len))
        copy(newArray, s.array[:s.tail])
        s.array = newArray
    }
    // Using the tail to keep the order to write in the array
    s.tail += copy(s.array[s.tail:], data[:len])
}

func (s *cacheShard) get(key string, hashedKey uint64) ([]byte, error) {
    s.lock.RLock()
    itemIndex := int(s.items[hashedKey])
    if itemIndex == 0 {
        s.lock.RUnlock()
        return nil, errors.New("key not found")
    }

    // The 4 bytes at itemIndex hold the size of the value. Read them to
    // get blockSize, so the value itself occupies the range
    // [itemIndex+headerEntrySize, itemIndex+headerEntrySize+blockSize).
    blockSize := int(binary.LittleEndian.Uint32(s.array[itemIndex : itemIndex+headerEntrySize]))
    entry := s.array[itemIndex+headerEntrySize : itemIndex+headerEntrySize+blockSize]
    s.lock.RUnlock()
    return readEntry(entry), nil
}

func readEntry(data []byte) []byte {
    dst := make([]byte, len(data))
    copy(dst, data)

    return dst
}

func wrapEntry(entry []byte) []byte {
    // You can put more information like a timestamp if you want.
    blobLength := len(entry)
    blob := make([]byte, blobLength)
    copy(blob, entry)
    return blob
}

main.go

package main

import "fmt"

func main() {
    cache := newCache()
    cache.set("key", []byte("the value"))

    value, err := cache.get("key")
    if err != nil {
        fmt.Println(err)
    }
    fmt.Println(string(value))
    // OUTPUT:
    // the value
}

Original article: How BigCache avoids expensive GC cycles and speeds up concurrent access in Go

Copyright: yoko. Blog: pengrl.com/p/35302/. This article may be reproduced in any form, provided that this notice (including the link to the original, its source and author, and this copyright statement) is kept intact. Any future changes to this article will be made at the original address first.