Let Redis Zset support multi-conditional sorting

Some requirements often want us to achieve a ranking, if the amount of data is small, you can use RDB database sort, large amount of data can achieve their own algorithm or use NoSQL database sort, NoSQL database in the most convenient may be the use of Redis zset to achieve. For example, to implement a leaderboard of player achievement points:

>zadd r 100 A "1" >zadd r 200 B "1" >zadd r 200 C "1" >zadd r 300 D "1" >zrange r 0 -1 withscores 1) "A" 2) "100" 3) "B"  4) "200" 5) "C" 6) "200" 7) "D" 8) "300"Copy the code

The scores of B and C are the same. In this case, we may want to put the person who achieves the corresponding score first in the front row, which is equivalent to adding a ranking dimension. In this case, you can’t use score directly. You need to do some transformation. Here are a few methods THAT I will use.

Implementation method

Suppose we want a sort of three dimensions, the first dimension is the specific value, the second dimension is a flag bit, and the third dimension is the timestamp. Of these, krypton (0 no 1 yes) and early time are ranked first.

Here’s the raw data:

A 100 1 1571819021259 B 200 0 1571819021259 C 200 1 1571819021259 D 400 0 1571819021259 E 200 1 1571810001259Copy the code

1. Sort by key

If the last two dimensions do not change after initialization, you can use the sorting feature of Redis to write the invariant dimensions into the key, so that when the score is identical, the key will be used for sorting.

# reverse timestamp here (99999999999-1571810001259) Zadd r 100 A-1-8428180978740 "1" >zadd R 200 B-0-8428180978740 "1" >zadd R 200 C-1-8428180978740 "1" > Zadd R 100 A-1-8428180978740 "1" > Zadd R 100 B-0-8428180978740 "1" > Zadd R 100 C-1-8428180978740 "1" > Zadd R 400 D-0-8428180978740 "1" >zadd r 200 E-1-8428189998740 "1" >zrange r 0 -1 withscores 1) "A-1-8428180978740" 2) "100" 3)  "B-0-8428180978740" 4) "200" 5) "C-1-8428180978740" 6) "200" 7) "E-1-8428189998740" 8) "200" 9) "D-0-8428180978740" 10) "400"Copy the code

So that’s the first method, and it’s pretty simple to implement.

Score storage format

Before introducing the next two methods, let’s take a look at how zset score is stored and how much accuracy can be retained. Let’s look directly at the structure of the hop table node.

// struct zskiplistNode {// robj *obj; // double score; Struct zskiplistNode *backward; Struct zskiplistNode *forward; struct zskiplistNode *forward; // span unsigned int span; } level[]; } zskiplistNode;Copy the code

We can see that score is stored as a double, which is an IEEE 754 double floating-point number. (The detailed storage rules of the IEEE 754 standard can be searched for in other articles, focusing here on the accuracy of the reservation)

S: sign bit, first paragraph, 1 bit, 0 for positive number, 1 for negative number.
Exp: exponential word, second segment, accounting for 11 bits, the code shift represents positive and negative numbers, excluding the special cases where all zeros and all 1s represent zero and infinity, the exponential part is −1022 to +1023 plus 1023, and the index value ranges from 1 to 2046
Fraction: a valid number, consisting of 52 digits, with the decimal point at the beginning of the last digit

Actual number = (-1)^S * Fraction * 2^Exp

Double stores 52 valid digits in the third paragraph. If you go beyond that, you lose precision. Since the standard states that the decimal point is the first significant digit, you can actually store 53 digits. For example, if a number is 111___(53 ones), the significant digit is stored in.11___(52 ones). The effective digit will be added to the ones bit to get 1.11___(53 ones). The maximum value is 2^53 = 9007199254740991, or about log2^53 = 15.95 bits.

As you can see, as long as the significant number is less than or equal to 9007199254740991, the accuracy will not be lost.

Ok, now that we have the theory, let’s see how we can do this.

2. Binary segmentation

Second, you can fragment dimensions using binary 64-bit longs.

First of all, the timestamp is too long to store all of it. You can reduce it by doing some calculations, ignoring milliseconds and then calculating the difference with a large number.

int MAX_SECOND = 1861891200; Int STS = (MAX_SECOND - (int)(ts / 1000))Copy the code

That Narrows the time to less than 300 million.

Next, I partition the first flag bit, leaving 63 bits, then I partition the highest 33 bits to store the score, 1 bit to store the flag bit, and the last 29 bits to store the timestamp. The storage structure is like this:

If there is no loss of accuracy, the first segment can store bits = 53-1-29 = 23.

The maximum value of the first section = 2^33 = 8589934591, without loss of precision = 2^23 = 8388607.

The third maximum value = 2^29 = 536870911.

It can be used to shorten the length of the third timestamp, or if the timestamp is not accurate, the first fraction can be exceeded without losing precision, and only a little of the timestamp accuracy can be lost.

Then you can use the following method to calculate score, because it is binary, you can use all bits.

int size1 = 33; int size2 = 1; int size3 = 29; long d1Max = maxBySize(size1); long d2Max = maxBySize(size2); long d3Max = maxBySize(size3); Score public long genScore(long d1, long d2, long d3) { return ((d1 & d1Max) << (size2 + size3)) | ((d2 & d2Max) << size3) | (d3 & d3Max); } / / calculated to increase the value of public void incValue (d1) {return ((d1 & d1Max) < < | (size2 + size3)) ((0 & d2Max) < < size3) | (0  & d3Max); } private long maxBySize(int len) {long r = 0; while (len-- > 0) { r = ((r << 1) | 1); } return r; }Copy the code

To increase the high score, just set the low dimension to 0, calculate the score and call ZINCRBY.

Instead of calling ZINCRBY, we need to call ZADD, which we update with spin plus CAS to prevent overwriting the newly updated value.

/** * set dimension value ** @param key zset key * @param value zset value * @param d2 second bit * @param d3 third bit * @return */ public boolean setD(String key, String value, Long d2, Long d3) { long start = System.currentTimeMillis(); Long oldScore; long score; byte[] originBytes; do { if ((System.currentTimeMillis() - start) > 2000) { return false; } long d1 = 0; originBytes = zscoreBytes(key, value); oldScore = convertZscore(originBytes); if (originBytes ! Dscore = getD1ByScore(oldScore); If (d2 == null) {d2 = getD2ByScore(oldScore); } if (d3 == null) {d3 = getD3ByScore(oldScore); Score = genScore(d1, d2, d3); } while (! compareAndSetDimension(key, value, originBytes, score)); return true; } Long SUCCESS = 1L; String script = "if (((not (redis.call('zscore', KEYS[1], ARGV[1]))) and (ARGV[2] == '')) " + "or redis.call('zscore', KEYS[1], ARGV[1]) == ARGV[2]) " + " then redis.call('zadd',KEYS[1],ARGV[3],ARGV[1]) return 1 else return 0 end"; Private Boolean compareAndSetDimension(String setKey, String key, byte[] oldScore, long newScore) { Long result = redisTemplate.execute((RedisCallback<Long>) connection -> { try { return connection.eval(script.getBytes(), ReturnType.fromJavaType(Long.class), 1, setKey.getBytes(), key.getBytes(), oldScore, om.writeValueAsBytes(newScore)); } catch (JsonProcessingException e) { throw new RuntimeException(e); }}); return SUCCESS.equals(result); } public byte[] zscoreBytes(String key, String value) {String script = "return redis. Call ('zscore', KEYS[1], ARGV[1])"; return redisTemplate.execute((RedisCallback<byte[]>) connection -> connection.eval(script.getBytes(), ReturnType.fromJavaType(String.class), 1, key.getBytes(), value.getBytes())); }Copy the code

The initial score is obtained using zscoreBytes() instead of redistemplate.opsforzset ().score() in order to prevent loss of accuracy.

Go straight to the lowest execution method, zScore

// class RedisCommandBuilder

Command<K, V, Double> zscore(K key, V member) {
    notNullKey(key);

    return createCommand(ZSCORE, new DoubleOutput<>(codec), key, member);
}
Copy the code

DoubleOutput converts the return value of Redis to Double

Public class DoubleOutput<K, V> extends CommandOutput<K, V, Double> { public DoubleOutput(RedisCodec<K, V> codec) { super(codec, null); @override public void set(ByteBuffer bytes) {output = (bytes == null)? null : parseDouble(decodeAscii(bytes)); }}Copy the code

The original Redis return value is a string similar to 1.3094266712875436 +18, and DoubleOutput may lose precision when converted to Double, which is not lost when converted to BigDecimal.

String a = "1.3094266712875436 e+18"; long value = new BigDecimal(a).longValue();Copy the code

So if you don’t want to lose precision, you can modify the original method and re-package or use Lua to retrieve the original string and convert it to the value you want.

Finally, genScore method can be used to generate score value and sort

# genScore(100, 1, 290072179) >zadd r 108201125491 A "1" # genScore(200, 0, 290072179) >zadd r 215038436979 B "1" # genScore(200, 1, 290072179) >zadd r 215575307891 C "1" # genScore(400, 0, 290072179) >zadd r 429786801779 D "1" # genScore(200, 1, 290081199) >zadd r 215575316911 E "1" >zrange r 0 -1 withscores 1) "A" 2) "108201125491" 3) "B" 4) "215038436979" 5) "C"  6) "215575307891" 7) "E" 8) "215575316911" 9) "D" 10) "429786801779"Copy the code

3. Split double directly

You can split binary numbers and of course you can split decimal numbers directly. For convenience, you can also divide dimensions by decimals, such as fractions in integer places, markers and timestamps in decimal places.

A 100.1290072179
B 200.0290072179
C 200.1290072179
D 400.0290072179
E 200.1290081199
Copy the code

However, this does not lose precision storage score than binary split.

The principle is the same as binary splitting.

conclusion

In the case of small data volume, RDB can be used directly, which is more convenient. Use Redis, under the condition of invariable level 1 outside dimension can directly use the key sorting, relatively simple, if the dimension is updated, you can use the break up, stored in a binary or decimal method the advantage of the binary is stored number is larger, and can use bit operations, decimal calculation of the method is simple, better readability. The length of each dimension can also be used as a configuration item, so that different business requirements can be met.

If you find any errors or other methods, please leave a message

At the same time, welcome to follow my public number ~

Let Redis Zset support multi-conditional sorting

Implementation method

1. Sort by key

Score storage format

2. Binary segmentation

3. Split double directly

conclusion

Related Posts

Journaling like an expert

Manage and mount hard disks in Linux

Of the 15 potholes Redis will encounter, how many have you stepped on?