A programmer’s “three highs”

Some time ago, a colleague went for a physical exam and the doctor told him he had the “three highs” (high blood pressure, high blood sugar, and high cholesterol).

I joked: aren’t a programmer’s three highs high performance, high concurrency, and high availability? Which three do you have?

Every developer who cares about performance chases high performance, and caching is an essential step on that road.

From CPU design to distributed caching, we deal with caches everywhere. Today we will look at how caches have evolved and how to implement a cache with a fixed size limit.

The evolution of caching

Ancient society – HashMap

When an application starts to see some traffic, or the database is being queried too frequently, we can use Java’s built-in HashMap or ConcurrentHashMap as a cache.

We can write this in code:

public class CustomerService {

    private HashMap<String, String> hashMap = new HashMap<>();
    private CustomerMapper customerMapper;

    public String getCustomer(String name) {
        String customer = hashMap.get(name);
        if (customer == null) {
            customer = customerMapper.get(name);
            hashMap.put(name, customer);
        }
        return customer;
    }
}

The problem with this is that a HashMap cannot evict its data, so memory grows without bound; because of that, the plain HashMap itself was quickly eliminated as a caching solution.

For example, we may still query Redis as before but also want to cache some hot data locally; a plain HashMap clearly cannot meet that requirement.

Of course, you can use weak references here to solve the problem of ever-growing memory.
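As a rough illustration of that idea, here is a minimal sketch using the JDK’s WeakHashMap (the keys and values are purely for demonstration): once a key is no longer strongly referenced elsewhere, its entry becomes eligible for garbage collection, so the map does not grow forever.

import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

public class WeakCacheExample {

    public static void main(String[] args) {
        // Entries can be reclaimed once their keys are no longer strongly referenced elsewhere
        Map<Object, String> cache = Collections.synchronizedMap(new WeakHashMap<>());

        Object key = new Object();
        cache.put(key, "some value");
        System.out.println(cache.get(key)); // prints: some value

        key = null;   // drop the strong reference to the key
        System.gc();  // after a GC the entry may be removed (timing is not guaranteed)
    }
}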

Of course, that is not to say the HashMap is completely useless, just as not everything from ancient society is outdated; the traditional virtues of the Chinese nation, for example, are never out of date.

Likewise, a HashMap can still serve as a cache in scenarios where no eviction mechanism is needed. Reflection is a typical example: if we looked up Method and Field objects through reflection on every call, performance would be poor, so caching them in a map pays off.
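A minimal sketch of that idea (the class and method names below are purely illustrative and not part of this project):

import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MethodCache {

    // No eviction is needed: the set of methods in an application is fixed and small
    private static final Map<String, Method> METHOD_CACHE = new ConcurrentHashMap<>();

    public static Method getMethod(Class<?> clazz, String methodName) throws NoSuchMethodException {
        String cacheKey = clazz.getName() + "#" + methodName;
        Method method = METHOD_CACHE.get(cacheKey);
        if (method == null) {
            // The reflective lookup is relatively expensive, so do it only once per method
            method = clazz.getMethod(methodName);
            METHOD_CACHE.put(cacheKey, method);
        }
        return method;
    }
}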

Modern society – LRUHashMap

The problem that plagued us in “ancient society” was that data could not be evicted, so memory grew indefinitely, which is obviously unacceptable.

Some people will say: then just evict some of the data. But which data should be evicted? Randomly?

Of course not. Imagine that you have just loaded A into the cache and it gets evicted right before the next access; the next read then goes back to the database, so what would be the point of caching?

So clever people invented several eviction algorithms. Three common ones are listed below: FIFO, LRU, and LFU (there are also ARC, MRU, and others that interested readers can look up themselves):

FIFO

First in, first out. In this eviction algorithm, whatever entered the cache first is evicted first. It is the simplest strategy, but it leads to a very low hit rate.

Imagine that a very frequently accessed piece of data happened to be loaded first; data accessed later, even if rarely used, would crowd out that frequently accessed entry.

LRU

Least recently used algorithm.

This algorithm avoids the problem above. Every time a piece of data is accessed it is moved to the tail of the queue, and when something must be evicted, only the head of the queue is removed.

However, there is still a problem. If a piece of data is accessed 10,000 times in the first 59 minutes of an hour (clearly hot data) but not at all in the last minute, while other data keeps being accessed, our hot data will still be evicted.

LFU

Least frequently used algorithm.

This algorithm improves on the above: it uses extra space to record how frequently each piece of data is used and evicts the entry with the lowest frequency. This avoids LRU’s blind spot with respect to longer time periods.
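To make the idea concrete, here is a minimal, illustrative LFU sketch (not the implementation used later in this article): a frequency counter per key, with the lowest-count key evicted when the cache is full. It is not thread-safe, ties are broken arbitrarily, and eviction is O(n).

import java.util.HashMap;
import java.util.Map;

public class SimpleLfuCache<K, V> {

    private final int maxSize;
    private final Map<K, V> data = new HashMap<>();
    private final Map<K, Integer> frequency = new HashMap<>();

    public SimpleLfuCache(int maxSize) {
        this.maxSize = maxSize;
    }

    public V get(K key) {
        V value = data.get(key);
        if (value != null) {
            // Record one more access for this key
            frequency.merge(key, 1, Integer::sum);
        }
        return value;
    }

    public void put(K key, V value) {
        if (!data.containsKey(key) && data.size() >= maxSize) {
            // Find the key with the smallest access count and evict it
            K victim = null;
            int minCount = Integer.MAX_VALUE;
            for (Map.Entry<K, Integer> entry : frequency.entrySet()) {
                if (entry.getValue() < minCount) {
                    minCount = entry.getValue();
                    victim = entry.getKey();
                }
            }
            data.remove(victim);
            frequency.remove(victim);
        }
        data.put(key, value);
        frequency.merge(key, 1, Integer::sum);
    }
}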

Of the three eviction strategies above, each is more expensive to implement than the previous one, but also achieves a better hit rate.

We usually choose the middle ground: LRU, whose implementation cost is moderate and whose hit rate is good. So how do we implement an LRUMap?

We can build a simple LRUMap by extending LinkedHashMap and overriding its removeEldestEntry method.

class LRUMap extends LinkedHashMap {

    private final int max;
    private final Object lock;

    public LRUMap(int max, Object lock) {
        // Capacity is set slightly above max so the map never needs to resize
        super((int) (max * 1.4f), 0.75f, true);
        this.max = max;
        this.lock = lock;
    }

    /**
     * Override LinkedHashMap's removeEldestEntry method.
     * It is called on every put; when it returns true, the eldest entry is removed.
     *
     * @param eldest the eldest entry
     * @return true if the eldest entry should be removed
     */
    @Override
    protected boolean removeEldestEntry(Map.Entry eldest) {
        return size() > max;
    }

    public Object getValue(Object key) {
        synchronized (lock) {
            return get(key);
        }
    }

    public void putValue(Object key, Object value) {
        synchronized (lock) {
            put(key, value);
        }
    }

    public boolean removeValue(Object key) {
        synchronized (lock) {
            return remove(key) != null;
        }
    }

    public boolean removeAll() {
        clear();
        return true;
    }
}

A LinkedHashMap maintains a linked list of entries (objects holding a key and a value). Because we enable access order, every get or put moves the newly inserted or just-queried entry to the tail of that list.

Notice that in the constructor we deliberately set the initial capacity to max * 1.4.

In removeEldestEntry we evict as soon as size() > max, so the map never reaches the point where it would need to resize.

By overriding a few methods of LinkedHashMap, we have implemented our own LRUMap.
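A quick usage sketch of the LRUMap above (keys and values chosen only for illustration):

LRUMap lruMap = new LRUMap(3, new Object());
lruMap.putValue("a", "1");
lruMap.putValue("b", "2");
lruMap.putValue("c", "3");
lruMap.getValue("a");                 // "a" becomes the most recently used entry
lruMap.putValue("d", "4");            // exceeds max = 3, so the least recently used key "b" is evicted
System.out.println(lruMap.keySet());  // [c, a, d]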

Contemporary society – Guava Cache

In modern society the LRUMap was invented to evict cached data, but it still has several problems:

  • Lock contention is severe. As you can see in the code above, the lock is a single global lock at the method level, so performance drops when call volume is high.

  • No expiration time is supported

  • Automatic refresh is not supported

The engineers at Google got fed up with these problems and created Guava Cache, which you can use as easily as the following code:

public static void main(String[] args) throws ExecutionException {
    LoadingCache<String, String> cache = CacheBuilder.newBuilder()
            .maximumSize(100)
            // Expire 30 ms after write
            .expireAfterWrite(30L, TimeUnit.MILLISECONDS)
            // Expire 30 ms after access
            .expireAfterAccess(30L, TimeUnit.MILLISECONDS)
            // Refresh 20 ms after write
            .refreshAfterWrite(20L, TimeUnit.MILLISECONDS)
            // Weak keys: when a key is garbage collected, its entry is reclaimed as well
            .weakKeys()
            .build(createCacheLoader());
    System.out.println(cache.get("hello"));
    cache.put("hello1", "I am hello1");
    System.out.println(cache.get("hello1"));
    cache.put("hello1", "I am hello2");
    System.out.println(cache.get("hello1"));
}

public static com.google.common.cache.CacheLoader<String, String> createCacheLoader() {
    return new com.google.common.cache.CacheLoader<String, String>() {
        @Override
        public String load(String key) throws Exception {
            return key;
        }
    };
}

Of course, there is no limit to the pursuit of performance.

Even better implementations exist:

Caffeine: houbb.github.io/2018/09/09/…

LevelDB: houbb.github.io/2018/09/06/…

We can learn more about these better implementations later.

In this article, we’ll look at how to implement a fixed-size cache.

Code implementation

The interface definition

To remain compatible with Map, we define the cache interface to extend the Map interface.

/**
 * Cache interface
 *
 * @author binbin.hou
 * @since 0.0.1
 */
public interface ICache<K, V> extends Map<K, V> {
}

Core implementation

Let’s look at the implementation of put:

@Override
public V put(K key, V value) {
    // 1. Try to evict an entry first
    CacheEvictContext<K,V> context = new CacheEvictContext<>();
    context.key(key).size(sizeLimit).cache(this);
    cacheEvict.evict(context);

    // 2. Check the size limit after eviction
    if (isSizeLimit()) {
        throw new CacheRuntimeException("The cache is full; the entry cannot be added!");
    }

    // 3. Perform the actual put
    return map.put(key, value);
}

Here we let the user specify the cache size, and a size limit naturally requires a corresponding eviction strategy.

Otherwise a fixed-size map would eventually be unable to hold new elements.
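The isSizeLimit() check called in put is not shown in this article; a plausible sketch, assuming the same sizeLimit and map fields used in put above, might look like this:

// Hypothetical helper, assuming the sizeLimit and map fields used in put() above
private boolean isSizeLimit() {
    final int currentSize = map.size();
    return currentSize >= this.sizeLimit;
}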

Eviction strategy

There are many possible eviction strategies, such as LRU, LFU, and FIFO; here we implement a basic FIFO.

Each strategy is implemented behind an interface so it can be replaced flexibly.

public class CacheEvictFIFO<K, V> implements ICacheEvict<K, V> {

    /**
     * Queue of keys, in insertion order
     */
    private Queue<K> queue = new LinkedList<>();

    @Override
    public void evict(ICacheEvictContext<K, V> context) {
        final ICache<K, V> cache = context.cache();
        // Evict the oldest entry once the limit is reached
        if (cache.size() >= context.size()) {
            K evictKey = queue.remove();
            // Remove the earliest inserted element
            cache.remove(evictKey);
        }

        // Record the new key at the tail of the queue
        final K key = context.key();
        queue.add(key);
    }
}

FIFO is relatively simple: we use a queue to record each key that is put in, and once the cache exceeds its maximum size we evict the earliest inserted key.

The bootstrap class

For the user’s convenience, we implement a bootstrap class similar to Guava’s.

All parameters have default values, and a fluent API is used to improve the user experience.

/**
 * Cache bootstrap class
 *
 * @author binbin.hou
 */
public final class CacheBs<K, V> {

    private CacheBs(){}

    /**
     * Create a new instance
     *
     * @param <K> key type
     * @param <V> value type
     * @return this
     */
    public static <K, V> CacheBs<K, V> newInstance() {
        return new CacheBs<>();
    }

    /**
     * Map implementation
     */
    private Map<K, V> map = new HashMap<>();

    /**
     * Size limit
     */
    private int size = Integer.MAX_VALUE;

    /**
     * Eviction strategy
     */
    private ICacheEvict<K, V> evict = CacheEvicts.fifo();

    /**
     * Set the map implementation
     *
     * @param map map
     * @return this
     */
    public CacheBs<K, V> map(Map<K, V> map) {
        ArgUtil.notNull(map, "map");

        this.map = map;
        return this;
    }

    /**
     * Set the size limit
     *
     * @param size size
     * @return this
     */
    public CacheBs<K, V> size(int size) {
        ArgUtil.notNegative(size, "size");

        this.size = size;
        return this;
    }

    /**
     * Set the eviction strategy
     *
     * @param evict eviction strategy
     * @return this
     */
    public CacheBs<K, V> evict(ICacheEvict<K, V> evict) {
        this.evict = evict;
        return this;
    }

    /**
     * Build the cache
     *
     * @return the cache
     */
    public ICache<K, V> build() {
        CacheContext<K, V> context = new CacheContext<>();
        context.cacheEvict(evict);
        context.map(map);
        context.size(size);

        return new Cache<>(context);
    }
}

Test usage

ICache<String, String> cache = CacheBs.<String, String>newInstance()
        .size(2)
        .build();
cache.put("1", "1");
cache.put("2", "2");
cache.put("3", "3");
cache.put("4", "4");
Assert.assertEquals(2, cache.size());
System.out.println(cache.keySet());

The default eviction strategy is first in, first out (FIFO), so keys "1" and "2" are evicted and the output is:

[3, 4]

Summary

At this point, we have a simplified cache with a configurable size limit.

The complete code for this project has not been open-sourced yet; you can follow [Lao Ma Xiao Xifeng] and reply "cache" in the background to get the source code.

So far this cache implementation is quite simple, and obviously not flexible enough for many everyday application scenarios.

In the next section we will learn how to implement a cache that supports expiration.