Redis Hash conflicts and progressive ReHash source code analysis

Redis DB data

Redis is a DB one by one, so what kind of data structure is this DB?

Here is the official Redis source code (5.0)

/* Redis database representation. An integer database that has multiple databases identifying it from 0(the default database) to the maximum configured. The database number is the "ID" field in the structure */
typedef struct redisDb {
    dict *dict;                 /* The key space of this database (dictionary type) */
    dict *expires;              /* Set the timeout key timeout */
    dict *blocking_keys;        /* The client waits for the data key (BLPOP) */
    dict *ready_keys;           /* Received push blocking key */
    dict *watched_keys;         /* EXEC CAS monitor key */
    int id;                     /* Database ID */
    long long avg_ttl;          /* Average TTL, used only for statistics */
    list *defrag_later;         /* List of key names to disk-by-disk collation */
} redisDb;
Copy the code

We can see that Redis database mainly stores data in dictionaries

Redis data Dict dictionary

Official website source address: github.com/redis/redis…

We found the dict dictionary definition:

// Dictionary type data definition
typedef struct dict {
    dictType *type; /* Dictionary type array */
    void *privdata; /* Private data */
    dictht ht[2]; /* Dictionary Hash table array */
    long rehashidx; /* If rehashidx == -1, no Rehash has been performed; if it is positive, Rehash has been performed */
    unsigned long iterators; /* The number of iterators currently running */
} dict;
Copy the code

The main data is stored in our array of dictionary Hash tables and we’re going to look at this dictht dictionary Hash table

// Dictionary Hash table type data definition
typedef struct dictht {
    dictEntry **table; /* Hash table, which holds one dictionary element after another, is actually an array */
    unsigned long size; /* Hash table size, that is, hash table array size */
    unsigned long sizemask; /* Hash table size mask, always equal to size-1, mainly used to calculate index */
    unsigned long used; /* Number of nodes in use, that is, the key-value logarithm */ has been used
} dictht;
Copy the code

So what’s more important is the element of each of our dictionaries, which represents the element data that we store

// Dictionary element type data definition
typedef struct dictEntry {
  	// No type pointer. Key points to Val
    void *key;
    // The value, which is a common object, may be a pointer, a 64-bit positive integer, or a 64-bit int, a floating point number
    union {
       	/ / value is a pointer
        void *val;
      	// A 64-bit positive integer
        uint64_t u64;
      	/ / 64 int
        int64_t s64;
      	/ / floating point number
        double d;
    } v;
  	// Next node, where each dictEntry is a linked list to handle Hash collisions
    struct dictEntry *next;
} dictEntry;
Copy the code

Redis handles Hash conflicts

We know that the dict at the top has two Hash tables, so why do we put two Hash tables?

The answer is that the Redis Hash table needs to be used for scaling, so let’s look at the source code.

int dictRehash(dict *d, int n);

Source location: github.com/redis/redis…

First of all, we definitely need to know where we did the expansion. It must be the method we located to Add when we did the Add operation:

/* Add an element to the target hash table */
int dictAdd(dict *d, void *key, void *val)
{
  	// Add keys to the dictionary
    dictEntry *entry = dictAddRaw(d,key,NULL);

    if(! entry)return DICT_ERR;
  	// Then set the value of the node
    dictSetVal(d, entry, val);
    return DICT_OK;
}
Copy the code

We then locate dictAddRaw, which uses the linked list to resolve Hash conflicts

/* Low-level add or find: * This function adds elements, but instead of setting values it returns the dictEntry structure to the user, which ensures that the value fields are filled in as needed. * * This function is also exposed directly to the user API to be called mainly to store non-pointers inside hash values, for example: * entry = dictAddRaw(dict,mykey,NULL); * if (entry ! = NULL) dictSetSignedIntegerVal(entry,1000); * * Return value: * * Returns NULL if the key already exists, or populates "* existing" with an existing entry if it does not. * If a key is added, the hash entry is returned for manipulation by the caller. * /
dictEntry *dictAddRaw(dict *d, void *key, dictEntry **existing)
{
    long index;
    dictEntry *entry;
    dictht *ht;
	// Determine if ReHash is in progress, calling _dictRehashStep (step in subsequent ReHash) if necessary, ReHash one data piece at a time until the entire ReHash is complete
    if (dictIsRehashing(d)) _dictRehashStep(d);

    ReHash(!!!) ReHash(!!! Key point) (first ReHash call) */
    if ((index = _dictKeyIndex(d, key, dictHashKey(d,key), existing)) == - 1)
        return NULL;

  	/* Resolve Hash conflicts and ReHash efficiency issues */
    /* Allocate memory and store new entries. The element is inserted at the top */, assuming that the recently added entry is more likely to be accessed more frequently in the database system
  	If yes, the current HashTable is the second in the dictionary. If no expansion is needed, use the original one
    ht = dictIsRehashing(d) ? &d->ht[1] : &d->ht[0];
    // Create an element and allocate memory
    entry = zmalloc(sizeof(*entry));
  	The next node of the element points to the corresponding index in the Hash table. If there was an element in the previous index, it is chained to the current element
    entry->next = ht->table[index];
  	// Set the Hash table node index to itself, replacing the original element
    ht->table[index] = entry;
    ht->used++;

    /* Sets the Hash element's Key. */
    dictSetKey(d, entry, key);
    return entry;
}
Copy the code

So once we’ve looked at the resolution of Hash conflicts let’s take a look at expansion. First let’s look at dictIsRehashing. How does dictIsRehashing say you need to ReHash

ReHash The ReHash process for expansion

 // If the dictionary's rehashidx is not -1, then a Hash expansion is requireddictIsRehashing(d) ((d)->rehashidx ! =- 1)
Copy the code

So where did we change rehashidx, when we evaluated Index

/* Returns the available slot-filled index based on the hash of "Key", or -1 if the Key already exists. Note that if we are rehashing the table, the index is always returned in the context of the second (new) hash table, i.e. Ht [1] */
static long _dictKeyIndex(dict *d, const void *key, uint64_t hash, dictEntry **existing)
{
    unsigned long idx, table;
    dictEntry *he;
    if (existing) *existing = NULL;

    /* Expand the hash table if necessary, or return -1 (ReHash expansion mechanism) */
    if (_dictExpandIfNeeded(d) == DICT_ERR)
        return - 1;
  	/* Query from two Hash tables, maybe this Key is in the second Hash table */
    for (table = 0; table <= 1; table++) {
  			/* Calculate the card slot */ according to the array length -1 and then take the module
        idx = hash & d->ht[table].sizemask;
        /* Retrieve the element based on the Hash table and determine if the Key is present in the Hash table. If so, return -1 */
        he = d->ht[table].table[idx];
        while(he) {
            if (key==he->key || dictCompareKeys(d, key, he->key)) {
                if (existing) *existing = he;
                return - 1;
            }
            he = he->next;
        }
        // If it is not ReHash, return the index slot of the first Hash, if it is ReHash then put idX into the second Hash
        if(! dictIsRehashing(d))break;
    }
    return idx;
}
Copy the code

You need to determine whether to expand the capacity

/* Expand the Hash table */ if necessary
static int _dictExpandIfNeeded(dict *d)
{
    /* If it is already in ReHash, return */
    if (dictIsRehashing(d)) return DICT_OK;

    If the hash table is empty, expand it to its initial size. Initial size 4 */
    if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);

    /* If we have a 1:1 ratio between the number of elements already in use and the length of the Hash array, then we need to expand (global setting) or we should avoid it, but the ratio between elements/buckets exceeds the "safe" threshold, we resize and double the number of buckets */
  	/* The element we are using is equal to the length of the array
    if (d->ht[0].used >= d->ht[0].size &&
        (dict_can_resize ||
         d->ht[0].used/d->ht[0].size > dict_force_resize_ratio))
    {
        return dictExpand(d, d->ht[0].used*2);
    }
    return DICT_OK;
}
Copy the code

If the second Hash table is reinitialized, the elements in subsequent rehashes will be added to the second Hash table

/* Expand capacity or create Hash table */
int dictExpand(dict *d, unsigned long size)
{
    /* Returns -1 */ if ReHash is being made, or if more than size * 2 is used
    if (dictIsRehashing(d) || d->ht[0].used > size)
        return DICT_ERR;

    dictht n; /* New hash table */
    unsigned long realsize = _dictNextPower(size);

    /* Resize to the same table size is not useful, return -1 */
    if (realsize == d->ht[0].size) return DICT_ERR;

    /* Allocate a new hash table and initialize all Pointers to NULL */
    n.size = realsize;
    n.sizemask = realsize- 1;
  	/* Allocate memory to expand space */
    n.table = zcalloc(realsize*sizeof(dictEntry*));
    n.used = 0;

    /* Is this the first initialization? If so, it's not really a restatement, we're just setting up the first hash table so that it can accept keys. * /
    if (d->ht[0].table == NULL) {
        d->ht[0] = n;
        return DICT_OK;
    }

    /* Prepare a second Hash table for incremental ReHash, reinitialize the second temporary Hash table, and start ReHash */
    d->ht[1] = n;
    d->rehashidx = 0;
    return DICT_OK;
}
Copy the code

The ReHash process is when we set the state to ReHash and write the new element to the second Hash table, and then we need to combine the second and first Hash tables

/* Dictionary ReHash operation, each time the first parameter indicates the dictionary, the second parameter indicates the number of ReHash each time, such as 100 to 200, if there is no Hash conflict, we need to pass 100 to complete the ReHash */
int dictRehash(dict *d, int n) {
    int empty_visits = n*10; /* Maximum number of empty buckets that can be accessed */
  	/* Does not return */ directly during ReHash
    if(! dictIsRehashing(d))return 0;
		/* ReHash the second table first */
    while(n-- && d->ht[0].used ! =0) {
        dictEntry *de, *nextde;

        /* Note that rehashidx does not overflow because we are sure there are more elements because ht [0].used! = 0 * /
        assert(d->ht[0].size > (unsigned long)d->rehashidx);
      	Rehashidx is set to 0 by default from the beginning. If you want to ReHash the original Hash table, you need to ReHash the entire Hash table from 0
        while(d->ht[0].table[d->rehashidx] == NULL) {
            d->rehashidx++;
          	// If n * 10 slots are still empty then return 1 and do not perform the operation
            if (--empty_visits == 0) return 1;
        }
      	// Get the Hash table for ht[0]
        de = d->ht[0].table[d->rehashidx];
        /* Then put all Key values in the card slot into HT [1] to move data from HT [0] to HT [1]*/
        while(de) {
            uint64_t h;

            nextde = de->next;
            /* Get the index in the new hash table */
            h = dictHashKey(d, de->key) & d->ht[1].sizemask;
            de->next = d->ht[1].table[h];
            d->ht[1].table[h] = de;
            d->ht[0].used--;
            d->ht[1].used++;
            de = nextde;
        }
      	// If it is empty then continue +1 until ht[0] becomes empty
        d->ht[0].table[d->rehashidx] = NULL;
        d->rehashidx++;
    }

  	// To complete ReHash, all data in HT [0] has been moved to HT [1], ht[1] is assigned to HT [0], and HT [1] is cleared in one ReHash operation
    /* Check if we have rehashed the first Hash table.. * /
    if (d->ht[0].used == 0) {
        zfree(d->ht[0].table);
        d->ht[0] = d->ht[1];
        _dictReset(&d->ht[1]);
        d->rehashidx = - 1;
        return 0;
    }

    /* Returns data. This step is usually partially ReHash (incomplete ReHash) */ because the ReHash has not been completed
  	// Returns 1 to indicate that a scheduled task loop is scheduled, while condition to indicate that no ReHash has completed
    return 1;
}
Copy the code

And there is a task scheduling ReHash

In Server github.com/redis/redis…

/* Database scheduled task */
void databasesCron(void) {
/* Rehash */
        if (server.activerehashing) {
            for (j = 0; j < dbs_per_call; j++) {
              	// Database ReHash
                int work_done = incrementallyRehash(rehash_db);
                if (work_done) {
                    break;
                } else {
                    /* If this db didn't need rehash, we'll try the next one. */rehash_db++; rehash_db %= server.dbnum; }}}// ReHash is performed 1 millisecond at a time per database
int incrementallyRehash(int dbid) {
    /* Keys dictionary */
    if (dictIsRehashing(server.db[dbid].dict)) {
        dictRehashMilliseconds(server.db[dbid].dict,1);
        return 1; /* Milliseconds have been used as the cycle. . * /
    }
    /* Expires */
    if (dictIsRehashing(server.db[dbid].expires)) {
        dictRehashMilliseconds(server.db[dbid].expires,1);
        return 1; /* Milliseconds have been used as the cycle. . * /
    }
    return 0;
}
  
/* Rehash in ms+"delta" ms. The delta value is large, less than 0, and in most cases less than 1. The exact upper bound depends on the running time of dictRehash(d,100) */
int dictRehashMilliseconds(dict *d, int ms) {
    if (d->iterators > 0) return 0;
		// Records start to synchronize
    long long start = timeInMilliseconds();
  	// Record the number of rehashes
    int rehashes = 0;
		// ReHash100 data at a time
    while(dictRehash(d,100)) {
        rehashes += 100;
      	// If the current time - start time > 1 ms, Break the command
        if (timeInMilliseconds()-start > ms) break;
    }
    return rehashes;
}
Copy the code

Redis Hash conflicts and progressive ReHash source code analysis

Redis DB data

Redis data Dict dictionary

Redis handles Hash conflicts

ReHash The ReHash process for expansion

Related Posts

Understanding and using Spring AOP

From zero to one, watch me play with the log

Log – based link tracking implementation scheme