preface

Each HTTP request, the database transaction execution, we track the code execution process, need a unique value and these business operations associated with the standalone system, you can use the database increment ID or timestamp to add a local increment value, you can achieve a unique value. But how do you achieve a unique ID in distributed mode

  • Features of distributed ids
  • ID of the database
  • Redis Distributed ID
  • Zookeeper Distributed ID
  • Advantages and disadvantages of globally unique UUID
  • Twitter’s snowflake algorithm generates distributed ids

Pay attention to the public account, communicate together, wechat search: sneak forward

Making the addressThank you, star,

Features of distributed ids

  • Global uniqueness, necessity
  • Idempotent, if it’s generated from some information, you need to guarantee idempotent
  • Security. There’s something hidden in an ID that can’t be guessed, and it can’t be guessed. How does an ID get generated
  • Trend increasing. You can judge the time sequence of service operations during query and comparison

ID of the database

  • The implementation is simple, the ID is monotonously increased, and the value type query speed is fast. However, the single point DB has the risk of downtime, which cannot withstand high concurrency scenarios
CREATE TABLE FLIGHT_ORDER (
    id int(11) unsigned NOT NULLAuto_increment, # increment IDPRIMARY KEY (id),
) ENGINE=innodb;
Copy the code

How to ensure unique database ids in a cluster

  • With the development of the business, the service is extended to a large cluster of multiple servers. In order to solve the pressure of the single point database, the database will be transformed into a cluster accordingly. How to ensure the uniqueness of the database ID in the cluster
  • Each database instance sets a starting value and a growth step

  • Disadvantages: It is not conducive to subsequent expansion. If subsequent expansion is required, manual intervention is required to modify the start value and increase the step

Redis Distributed ID

  • If the system has hundreds of millions of data, rely on the self-increasing ID of the database in the sub-table sub-library, need to manually modify each database instance, poor capacity expansion, poor maintenance

Generate distributed global unique IDS based on Redis INCR command

  • The service obtains Id from Redis, and Id is decoupled from the database, which can solve the problem of Id and table and library. Moreover, Redis is faster than the database, and can support cluster services to obtain Id concurrently
  • The INCR command of Redis has atomic operation of INCR AND GET. Redis is a single-process single-thread architecture, INCR command does not have ID duplication
 @Autowired
 private StringRedisTemplate stringRedisTemplate;
 private static final String ID_KEY = "id_good_order";
 public Long incrementId(a) {
     return stringRedisTemplate.opsForValue().increment(ID_KEY);
 }
Copy the code

HINCRBY command

  • In fact, to store more information about serial numbers, you can use Redis’s Hash data structure, which also provides the HINCRBY command for Hash to implement the “INCR AND GET” atomic operation
//KEY_NAME is the hash Key,FIELD_NAME is the hash field, and INCR_BY_NUMBER is the incremental value
redis 127.0. 01.:6379> HINCRBY KEY_NAME FIELD_NAME INCR_BY_NUMBER 
Copy the code

Failed to recover the serial number

  • Redis is an in-memory database. If RDB or AOF persistence is not enabled, ID data will be lost in the event of an outage. Even if RDB persistence is enabled, the time difference between the last snapshot time and the time of the latest HINCRBY command may occur. As a result, the ID of a data set may be repeated after the RDB snapshot is used to restore the data set after a breakdown
  • Redis down serial number recovery scheme
    • A relational database is used to record the maximum number of serial numbers available in a short time MAX_ID. Only the number less than MAX_ID can be obtained from Redis
    • To calculate the maximum value, a scheduled task is required to periodically calculate the ID consumption RATE, which is stored in Redis. When the client obtains CUR_ID, RATE, and MAX_ID, it calculates whether CUR_ID is close to MAX_ID based on the ID consumption RATE. If so, it updates the MAX_ID of the database

Zookeeper Distributed ID

  • Zookeeper is a highly available cluster service, and the message submitted successfully is persistent, so there is no problem with machine downtime or stand-alone problems
< the dependency > < groupId > org. Apache. Curator < / groupId > < artifactId > curator - framework < / artifactId > < version > 4.2.0 < / version > </dependency> <dependency> <groupId>org.apache.curator</groupId> <artifactId>curator-recipes</artifactId> The < version > 4.2.0 < / version > < / dependency >Copy the code
  • The sample
RetryPolicy retryPolicy = new ExponentialBackoffRetry(500.3);
CuratorFramework client = CuratorFrameworkFactory.builder()
      .connectString("localhost:2181")
      .connectionTimeoutMs(5000)
      .sessionTimeoutMs(5000)
      .retryPolicy(retryPolicy)
      .build();
client.start();  
String sequenceName = "root/sequence/distributedId";
DistributedAtomicLong  distAtomicLong = new DistributedAtomicLong(client, sequenceName, retryPolicy);
// Use DistributedAtomicLong to generate an increment sequence
public Long sequence(a) throws Exception {
    AtomicValue<Long> sequence = this.distAtomicLong.increment();
    if (sequence.succeeded()) {
        return sequence.postValue();
    } else {
        return null; }}Copy the code

Advantages and disadvantages of UUID

  • Distributed ids based on databases, Redis and ZooKeeper are highly dependent on an external service. For some scenarios, how can distributed ids be generated if these external services do not exist
  • Take a unique ID in the JDK generator, has uniqueness in the world, this is the UUID, but it is a meaningless string, storage performance is poor, the query is also very time consuming, for ordering system, not suitable for as a unique ID, common optimization is transformed into two uint64 integer or binary storage (binary cannot guarantee uniqueness) after
  • But UUID can be used for logging systems, or simply as an associated attribute in the data that uniquely identifies the serial number
String uuid = UUID.randomUUID().toString().replaceAll("-"."");
Copy the code

Twitter’s snowflake algorithm generates distributed ids

  • Like UUID, the Snowflake algorithm does not rely on external services
  • Snowflake algorithm is the ID generation algorithm adopted in Twitter’s internal distributed project, which is widely praised by domestic companies. High efficiency without relying on third-party services

  • Snowflake ID is a 64-bit Long consisting of positive digits (1 bit), timestamp (41 bits), machine ID (5 bits), data center (5 bits), and auto-increment (12 bits).

1: the first bit (1bit) : In Java, the highest bit of long is the sign bit representing positive and negative. The positive number is 0, and the negative number is 1. Generally, the generated ID is a positive number, so the default is 0. 2: timestamp part (41bit) : millisecond time. It is not recommended to save the current timestamp, but to use the difference (current timestamp – fixed start timestamp), so that the generated ID can start from a smaller value. 3: working machine ID (10bit) : Also known as workId, this can be configured flexibly, either in machine room or machine number combination. 4: serial number part (12bit), self-value-added support a node within the same millisecond can generate 4096 ids

//Twitter's SnowFlake algorithm uses the SnowFlake algorithm to generate an integer
public class SnowFlakeShortUrl {
    // Start time stamp
    static long START_TIMESTAMP = 1624698370256L;
    // The number of bits occupied by each part
    static long SEQUENCE_BIT = 12;   // The number of digits occupied by the serial number
    static long MACHINE_BIT = 5;     // The number of bits occupied by the machine identifier
    static long DATA_CENTER_BIT = 5; // The number of bits occupied by the data center
    // The maximum value of each part
    static long MAX_SEQUENCE = -1L ^ (-1L << SEQUENCE_BIT);
    static long MAX_MACHINE_NUM = -1L ^ (-1L << MACHINE_BIT);
    static long MAX_DATA_CENTER_NUM = -1L ^ (-1L << DATA_CENTER_BIT);
    // Shift each part to the left
    static long MACHINE_LEFT = SEQUENCE_BIT;
    static long DATA_CENTER_LEFT = SEQUENCE_BIT + MACHINE_BIT;
    static long TIMESTAMP_LEFT = DATA_CENTER_LEFT + DATA_CENTER_BIT;
    //dataCenterId + machineId = 10bit working machineId
    private long dataCenterId;  // Data center
    private long machineId;     // Machine id
    private volatile long sequence = 0L; / / serial number
    private volatile long lastTimeStamp = -1L;  // Last time stamp
    private volatile long l currTimeStamp = -1L; // The current timestamp
    
    private long getNextMill(a) {
        long mill = System.currentTimeMillis();
        while (mill <= lastTimeStamp) mill = System.currentTimeMillis();
        return mill;
    }
    // Generate the specified serial number based on the specified data center ID and machine flag ID
    public SnowFlakeShortUrl(long dataCenterId, long machineId) {
        Assert.isTrue(dataCenterId >=0 && dataCenterId <= MAX_DATA_CENTER_NUM,"dataCenterId is illegal!");
        Assert.isTrue(machineId >= 0 || machineId <= MAX_MACHINE_NUM,"machineId is illegal!");
        this.dataCenterId = dataCenterId;
        this.machineId = machineId;
    }
    // Generate the next ID
    public synchronized long nextId(a) {
        currTimeStamp = System.currentTimeMillis();
        Assert.isTrue(currTimeStamp >= lastTimeStamp,"Clock moved backwards");
        if (currTimeStamp == lastTimeStamp) {
            // The serial number increases within the same millisecond
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0L) { // The number of sequences in the same millisecond has reached the maximum, get the next millisecondcurrTimeStamp = getNextMill(); }}else { 
            sequence = 0L; // Set the serial number to 0 for different milliseconds
        }
        lastTimeStamp = currTimeStamp;
        return (currTimeStamp - START_TIMESTAMP) << TIMESTAMP_LEFT // The timestamp part
                | dataCenterId << DATA_CENTER_LEFT       // Data center section
                | machineId << MACHINE_LEFT             // Machine identifier section
                | sequence;                             // Serial number section
    }
    
    public static void main(String[] args) {
        SnowFlakeShortUrl snowFlake = new SnowFlakeShortUrl(10.4);
        for (int i = 0; i < (1 << 12); i++) {
            / / decimalSystem.out.println(snowFlake.nextId()); }}}Copy the code

Corrections are welcome

Refer to the article

  • Comparison of common distributed global unique ID generation strategies and algorithms
  • Design of serial number service (distributed ID) based on Redis
  • 9 distributed ID generation schemes, let you learn enough at a time