1. Goals

1.1. Globally Unique

No two IDs may ever be the same; global uniqueness is the most basic requirement.

1.2. Trend-increasing order

Services often need paged queries and sorting. If the ID itself is roughly ordered, there is no need to create extra indexes or add extra query conditions. In addition, MySQL's InnoDB storage engine uses a clustered index on the primary key, so write performance is better when primary keys are inserted in increasing order.

1.3. High availability

An ID is the unique identifier of a piece of data. If ID generation fails, the service cannot proceed, so a good ID scheme must be highly available.

1.4. Information security

Although IDs trend upward, they should not follow a pattern that is easy to guess; otherwise the data behind them can be enumerated by crawlers. An interesting related case: the algorithm that generates a UUID from the MAC address leaks that address, and this leak was used to track down the creator of the Melissa virus.

2. Common schemes

2.1. UUID

Universally Unique Identifier (UUID) is the simplest generation scheme:

UUID.randomUUID().toString()

This generates a 36-character string in the 8-4-4-4-12 form, such as e811b49b-9ac1-47dc-8ab9-98fa7dd861d0.

Advantages
  • Simple
  • Good performance
  • Globally unique
Disadvantages
  • Unordered
  • The ID carries no business meaning and is hard to read.
  • The string is long and unordered, which hurts performance when it is used as a MySQL primary key.
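
As a small illustration of the points above, the following sketch generates two random UUIDs and prints their length and relative sort order. The class and method names (UuidSketch, newId) are made up for this example; only java.util.UUID comes from the JDK.

```java
import java.util.UUID;

final class UuidSketch {
    // Returns a random (version 4) UUID in its canonical text form.
    static String newId() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        String a = newId();
        String b = newId();
        // The canonical form is always 36 characters: 32 hex digits plus 4 hyphens.
        System.out.println(a + " (length " + a.length() + ")");
        System.out.println(b);
        // The sort order of random UUIDs says nothing about creation order,
        // so this line prints true on some runs and false on others.
        System.out.println("first sorts before second: " + (a.compareTo(b) < 0));
    }
}
```

Because the comparison result varies from run to run, inserts keyed by such values land at arbitrary positions in an ordered index, which is the drawback noted in the list.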

2.2. Snowflake

Snowflake is a distributed ID generation algorithm open-sourced by Twitter. The core idea is to pack a 64-bit long ID from a 41-bit millisecond timestamp, a 10-bit machine ID, and a 12-bit sequence number within the millisecond; the remaining highest bit is left unused as the sign bit.
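
The bit layout can be sketched roughly as follows. This is a minimal illustration, not Twitter's implementation: the class name SnowflakeSketch, the custom epoch (assumed here to be 2020-01-01 UTC), and the choice to throw when the clock moves backwards are all assumptions made for the example.

```java
final class SnowflakeSketch {
    // Bit widths from the text: 41-bit timestamp, 10-bit machine ID, 12-bit sequence.
    static final int MACHINE_BITS = 10;
    static final int SEQUENCE_BITS = 12;
    static final long MAX_MACHINE_ID = (1L << MACHINE_BITS) - 1; // 1023
    static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;  // 4095
    // Assumed custom epoch: 2020-01-01T00:00:00Z. Any fixed past instant works.
    static final long EPOCH_MS = 1577836800000L;

    private final long machineId;
    private long lastMs = -1L;
    private long sequence = 0L;

    SnowflakeSketch(long machineId) {
        if (machineId < 0 || machineId > MAX_MACHINE_ID) {
            throw new IllegalArgumentException("machineId must be in [0, " + MAX_MACHINE_ID + "]");
        }
        this.machineId = machineId;
    }

    // Packs the three fields: timestamp in the high bits, sequence in the low bits.
    static long compose(long timestampMs, long machineId, long sequence) {
        return ((timestampMs - EPOCH_MS) << (MACHINE_BITS + SEQUENCE_BITS))
                | (machineId << SEQUENCE_BITS)
                | sequence;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastMs) {
            // Assumed policy: refuse to generate rather than risk a duplicate.
            throw new IllegalStateException("clock moved backwards by " + (lastMs - now) + " ms");
        }
        if (now == lastMs) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // All 4096 values used in this millisecond: wait for the next one.
                while ((now = System.currentTimeMillis()) <= lastMs) {
                    // busy-wait
                }
            }
        } else {
            // A new millisecond always restarts the sequence at 0.
            sequence = 0;
        }
        lastMs = now;
        return compose(now, machineId, sequence);
    }

    public static void main(String[] args) {
        SnowflakeSketch generator = new SnowflakeSketch(1);
        for (int i = 0; i < 3; i++) {
            System.out.println(generator.nextId());
        }
    }
}
```

Within one generator the IDs are strictly increasing, because either the timestamp grows or the sequence does. Across machines they are only roughly ordered, since each machine reads its own clock.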

Advantages
  • The millisecond timestamp occupies the high bits and the incrementing sequence the low bits, so IDs trend upward.
  • Deployed as a service, it can be made highly available.
  • The bit allocation can be adjusted to fit the service.
Disadvantages
  • Each machine has its own clock; if a clock is set back, duplicate IDs may be generated.
  • With large data volumes, IDs are taken modulo the shard count to split databases and tables. Because the sequence number resets to 0 whenever a new millisecond starts, the distribution after the modulo can be unbalanced.
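
The modulo imbalance is easy to reproduce. The sketch below assumes the 41/10/12 bit layout, a shard count that is a power of two, and low traffic (at most one ID per millisecond, so the sequence is always 0). The class name ShardSkewSketch and its method are hypothetical.

```java
import java.util.Arrays;

final class ShardSkewSketch {
    static final int MACHINE_BITS = 10;
    static final int SEQUENCE_BITS = 12;

    // Builds `count` IDs, one per millisecond (so the sequence is always 0),
    // and counts how many fall on each shard under id % shards.
    static int[] distribute(int shards, int count, long machineId) {
        int[] hits = new int[shards];
        for (long ms = 0; ms < count; ms++) {
            long id = (ms << (MACHINE_BITS + SEQUENCE_BITS))
                    | (machineId << SEQUENCE_BITS); // sequence bits stay 0
            hits[(int) (id % shards)]++;
        }
        return hits;
    }

    public static void main(String[] args) {
        // The low 12 bits of every ID are 0, so id % 8 is always 0.
        System.out.println(Arrays.toString(distribute(8, 1000, 3)));
        // prints [1000, 0, 0, 0, 0, 0, 0, 0]
    }
}
```

Under these assumptions every row lands on shard 0. With other shard counts or higher traffic the skew is less extreme, but the low sequence values remain over-represented.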

2.3. Flickr scheme based on database

This scheme relies on MySQL's auto_increment columns together with the auto_increment_increment and auto_increment_offset settings.

Each new ID is obtained with the following SQL:

begin;
REPLACE INTO Tickets64 (stub) VALUES ('a');
SELECT LAST_INSERT_ID();
commit;

In a distributed system, multiple MySQL servers are deployed. Each server has a different initial value, and the step size equals the number of servers. If N servers are deployed, the step is N and the initial values are 0, 1, 2, ..., N-1 in sequence.
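
The effect of giving each server its own initial value and a shared step can be checked with plain arithmetic. The sketch below does not talk to MySQL; it only models the sequence start, start + step, start + 2*step, and so on for each server, with N = 3 assumed for the example. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class TicketServerSketch {
    // The first `count` IDs a server hands out: start, start + step, start + 2 * step, ...
    static List<Long> idsFor(long start, long step, int count) {
        List<Long> ids = new ArrayList<>();
        for (int k = 0; k < count; k++) {
            ids.add(start + k * step);
        }
        return ids;
    }

    public static void main(String[] args) {
        int servers = 3; // N: the step equals the number of servers
        for (int start = 0; start < servers; start++) {
            System.out.println("server " + start + ": " + idsFor(start, servers, 4));
        }
        // prints:
        // server 0: [0, 3, 6, 9]
        // server 1: [1, 4, 7, 10]
        // server 2: [2, 5, 8, 11]
    }
}
```

The sequences never overlap, because every ID from a given server leaves the same remainder when divided by N.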