Introduction to the

Tinyid is a distributed ID generation system developed in Java. It is based on the database number segment algorithm. For this algorithm, please refer to The principle of Meituan Leaf or Tinyid. Tinyid extends leaf-segment algorithm, supports multiple dB (master), and provides Java-client (SDK) to localize ID generation, achieving better performance and availability. Tinyid is used in Didi’s customer service department and is accessed through TinyID-client, generating ids of 100 million levels every day.

Tinyid system architecture diagram

Here are some notes about this architecture diagram:

NextId and getNextSegmentId are two HTTP interfaces provided by TinyID-server

When nextId is invoked, bizType is passed in. The ID data of each bizType is isolated. IdGenerator generated by bizType is used for id generation.

GetNextSegmentId is to get the next available number segment. Tinyid-client obtains the available number segment through this interface

IdGenerator is the interface for id generation

IdGeneratorFactory is a factory that produces concrete IDGenerators, one IdGenerator instance per Biz_type. With the factory, we can add biz_type to the DB at any time without restarting the service

IdGeneratorFactory actually has two subclasses, IdGeneratorFactoryServer and IdGeneratorFactoryClient. The difference is that getNextSegmentId is DbGet and HttpGet

CachedIdGenerator is the concrete ID generator object that holds currentSegmentId and nextSegmentId objects and is responsible for the core process of nextId. The nextId is eventually generated by the atomicLong.andandGet (delta) method.

Performance and availability

performance

The performance of HTTP access depends on the capability of the HTTP server and network transmission speed

In java-client mode, the id is locally generated. The longer the number segment (step), the larger the QPS. If the number segment is large enough, the QPS can reach 1000w+

availability

When the DB is not available, the server can be used for a period of time because of the cache. If multiple DBS are configured, as long as one DB survives, the service is available

With tiny-client, as long as one server is alive, it is theoretically available, and all servers are suspended because the client has cache and can continue to be used for some time

The characteristics of Tinyid

A globally unique LONG ID

A trend increasing ID does not guarantee that the next ID will be larger than the previous one

discontinuity

Access through HTTP and Java Client

You can obtain ids in batches

Support for generating 1,3,5,7,9… Sequence id

Supports multiple dB configurations without a single point

Applicable scenario: the system only cares about the id as a number and the trend is increasing, which can tolerate the ID discontinuity and waste scenarios. Not applicable scenario: the business similar to the order ID (because most of the generated ids are continuous, it is easy to scan the database or calculate the order quantity)

Recommended usage

Tinyid-server Is recommended to be deployed on multiple machines in multiple equipment rooms

In a multi-room deployment, the availability is higher. However, HTTP access requires users to consider the delay

It is recommended to use tinyid-client to obtain the ID. The benefits are as follows:

The id is locally generated (calling the atomicLong.addandget method) and the performance is greatly improved

Client access to the server becomes low-frequency, reducing the pressure on the server

Because of low frequency, there is no need to worry about latency even if the client and server are not in the same machine room

Note: If the tinyid-client mode is used, a large number of ids may be wasted if the client machine restarts frequently. In this case, HTTP is recommended

Two or more DB configurations are recommended:

If multiple DBS are configured and only one DB exists, the service can be configured with multiple DBS. If two DBS are configured, data must be written into the two DBS for each new service

The principle of tinyid

Id generation system essentials

In a simple system, we often use the DB ID increment method to identify and save data. With the complexity of the system and the increase of data, database and table has become a common scheme, and DB increment can not meet the requirements. This is where a globally unique ID generation system comes in handy. Of course, this is just one application scenario for ID generation. So what are the requirements for an ID generation system?

Globally unique ID: cannot be repeated in any way, is that the basic requirement

High performance: The base service takes as little time as possible, preferably if it can be generated locally

High availability: 100% availability is hard to achieve, but it should be infinitely close to 100%

Easy to use: be able to use, easy to access, while the system design and implementation should be as simple as possible

Implementation principle of Tinyid

Let’s take a look at the most common id generation method, DB auto_increment, I believe everyone is very familiar with, I have also seen some students in the actual combat to use this scheme to obtain an ID, the advantage of this scheme is simple, but the disadvantage is that only one ID can be obtained from DB at a time, the performance is poor, db access is more frequent. Db is going to be under a lot of pressure. Is it possible to optimize this scheme and get a batch of ids from DB at a time? The answer, of course, is yes. A batch of ids can be regarded as a range of IDS, such as (1000,2000). The number from 1000 to 2000 can also be called a “number segment”. We apply for a number segment from DB at a time, load it into memory, and then use the method of increment to generate ID. This relieves much of the strain on the DB, while generating ids directly in memory improves performance. How to design a table that holds db numbers?

DB number segment algorithm description

Db stores a range of ids (start_id,end_id). When the set of ids is used up, we do an update. Update start_id=2000(end_id), end_id=3000(end_id+1000). If the update succeeds, the next ID range is obtained. If you think about it, start_id doesn’t actually work, and the new number segment is always (end_id,end_id+1000). So let’s make a change here. The DB design should look like this

Here we add biz_type, which represents the business type, to isolate the ids of different businesses

Max_id = end_id, which represents the maximum available ID

Step indicates the length of the number segment. You can set a reasonable length based on the QPS of each service

Version is an optimistic lock, and is added to each update to ensure the correctness of concurrent updates. Therefore, we can obtain an available number segment by following several steps.

A. Query the current max_id information: select id, biz_type, max_id, step, version from tiny_id_info where biz_type=’test’;

B. Calculate the new max_id: new_max_id = max_id + step

C. Update DB max_id: update tiny_id_info set max_id=#{new_max_id} , verison=version+1 where id=#{id} and max_id=#{max_id} and version=#{version}

D. If the update is successful, the available number segment is successfully obtained. The new available number segment is (max_id, new_max_id].

E. If the update fails, the number segment may be obtained by another thread. Go to Step A and try again

A simple architecture for a segment generation scheme

Having completed the number segment generation logic above, our ID generation service architecture might look like this

The ID generation system provides HTTP service to the outside, and the request goes through our load balancing router to one of the TinyID-server, and obtains an ID from the pre-loaded number segment. If the number segment is not loaded or has been used up, it applies for a new available number segment from db. Due to the atomicity of number segment generation algorithm between multiple servers, the available number segment on each server is not heavy, so that id generation is not heavy. You can see that if tinyid-server is restarted, the number segment will be invalid and some id will be wasted. And the ID is not continuous; Each request may be made to a different machine, and the ID is not monotonically increasing, but trending increasing, although this is acceptable for most businesses.

Simple architecture problems

At this point, a simple ID generation system is complete, so is there still a problem? If you think back to our original id generation system requirements, high performance, high availability, and ease of use, there were at least the following problems with the above architecture:

When the ID is used up, you need to access the DB to load a new number segment, and version conflicts may exist in DB update. In this case, the id generation time increases significantly

Db is a single point, and although DB can be built with high availability architectures such as master and slave, it is always a single point

Using HTTP to obtain an ID, there is network overhead, performance and availability are not good

The optimization methods are as follows:

(1) Double-number segment cache

When the number segment is used up and needs to access DB, it is easy to think of asynchronously loading the next number segment when the number segment is used to a certain extent to ensure that there is always available number segment in memory, so as to avoid performance fluctuations.

(2) Add more DB support

When the DB has only one master, if the DB is unavailable (down or the master/slave latency is large), the number obtaining segment is unavailable. In fact, we can support multiple DBS, such as 2 DBS, A and B, and we can get the number segment randomly from one of them. So if A and B both get the same number segment, how do we make sure that the generated ID is not heavy? Tinyid does this by having A produce only even ids and B produce only odd ids, and the corresponding DB design adds two fields, as shown below

Delta represents the increment of id each time, and remainder represents the remainder. For example, if delta is set to 2 for both A and B, remainder is set to 0,1, respectively. The number segment of A generates only an even number segment, and B an odd number segment. With the two fields of delta and REMAINDER, we can flexibly design the number of dB according to the needs of the user, and also provide the user with an ID sequence that produces only odd numbers.

(3) increase the tinyid – client

Using HTTP to obtain an ID has network overhead. Can ID be generated locally? For this purpose, we provide tinyid-client, we can send a request to tinyid-server to obtain the available number segment, and then build a local double number segment and id generation. In this way, ID generation becomes a purely local operation and performance is greatly improved, because there is a local double number cache. Tinyid-server can be tolerated for a period of time down, availability also has a relatively large improvement.

(4) Final architecture of TinyID

Eventually our architecture might look something like this

Tinyid The tinyID client can be accessed through HTTP or tinyID-client

Tinyid-server Internal cache two number segments

Number segment is generated based on DB and has atomicity

Db supports multiple

Tinyid-server has built-in Easy-Router to select db

Need project address can pay attention to + private message “didi”