Pure! Developing database routing involves disturbance functions, hashes, and data source switching.

⚠️ this article for nuggets community first contract article, not authorized to forbid reprint

I blog at bugstack.cn

Precipitate, share, grow, let oneself and others can have a harvest! 😄

One, foreword

What? Java interviews are like building rockets 🚀

Pure! Before I have always wanted to Java interview on the good interview bai, ga ha yao always test some work is not used in the thing, will use Spring, MyBatis, Dubbo, MQ, the business needs to achieve not on the line!

But after a few years of work, when I need to improve myself (to add money), I started to feel that I am just a tool to call API and save interfaces. There is no breadth of knowledge, no depth of technology, and no consciousness to extract the common logic in the daily development of business code and develop it into common components. There is no thinking about what technology is used to realize some components in daily use.

So sometimes when you say that an interview is like building a rocket that you don’t use every day, it’s often not because the technology isn’t used, it’s because you didn’t use it (well, neither did I). When you have the idea to break through their own salary bottleneck, you need to know to understand the necessary data structures, learning how to learn the algorithm logic of Java, familiar with familiar with the design of general pattern, combining like Spring, ORM, RPC, this source implementation logic, assign the corresponding technical solution to their day-to-day business development, Solving common problems in a focused and refined way is your ability to rise above CRUD.

How do? Seems to sound reasonable, so for example, to a database routing requirements analysis and logic!

Two, demand analysis

If you want to do a database routing, what technical points do you need to do?

First of all, we need to know why to use sub-database sub-table, in fact, because of the large business volume, data growth is fast, so we need to split the user data into different database tables to reduce the pressure on the database.

The operations of library and table are mainly vertical split and horizontal split:

Vertical splitting: Tables are classified by service and distributed to different databases. In this way, the data burden is divided among different databases. In the end, a database consists of many tables, and each table corresponds to a different business, which is dedicated to the database.
Horizontal split: If the single machine bottleneck is encountered after vertical split, you can use horizontal split. The difference between vertical split and horizontal split is that vertical split is to split different tables into different databases, while horizontal split is to split the same table into different databases. For example, user_001 and user_002

In this section, we also want to implement the layout of horizontal split routes, as shown in Figure 1-1

So, such a database routing design to include what technical knowledge?

Is about the use of AOP aspect interception because methods that use database routing need to be labeled to facilitate the processing of database routing logic.
Switching data sources, since there are separate libraries, involves switching links between multiple data sources in order to assign data to different databases.
Database table addressing operation, a data is allocated to which database, which table, the need for index calculation. In the course of the method call, it is eventually recorded through ThreadLocal.
In order to make the data evenly distributed to different database tables, we also need to consider how to hash the data. If the database cannot be divided into tables, the data should be concentrated in a certain table of a certain library, which will lose the significance of dividing the database into tables.

To sum up, it can be seen that to complete the data storage under the data structure of database and table, I need to use the following technologies: AOP, data source switching, hash algorithm, hash addressing, ThreadLocal, SpringBoot Starter development method and other technologies. And hashing, addressing, data storage, in fact, there are so many things that are similar to HashMap, so the opportunity to build a rocket after learning the source code comes up if you’ve done a deep analysis and studied the source code of HashMap, Spring source code, middleware development, So in the design of such a database routing component must have a lot of ideas out. Next we try to learn from the source code to build rockets!

Third, technical research

In the JDK source code, contains the design of the data structure: array, LinkedList, Queue, Stack, red and black tree, concrete implementation of ArrayList, LinkedList, Queue, Stack, and these in the data storage are sequential storage, and does not use the way of hash index processing. HashMap and ThreadLocal, on the other hand, use hash indexes, hash algorithms, and zipper addressing and open addressing when data is inflated, so our analysis and reference will focus on these two features.

1. ThreadLocal

@Test
public void test_idx(a) {
    int hashCode = 0;
    for (int i = 0; i < 16; i++) {
        hashCode = i * 0x61c88647 + 0x61c88647;
        int idx = hashCode & 15;
        System.out.println("Fibonacci hash:" + idx + "Normal hash:" + (String.valueOf(i).hashCode() & 15)); }} Fibonacci hash:7Ordinary hash:0Fibonacci hash:14Ordinary hash:1Fibonacci hash:5Ordinary hash:2Fibonacci hash:12Ordinary hash:3Fibonacci hash:3Ordinary hash:4Fibonacci hash:10Ordinary hash:5Fibonacci hash:1Ordinary hash:6Fibonacci hash:8Ordinary hash:7Fibonacci hash:15Ordinary hash:8Fibonacci hash:6Ordinary hash:9Fibonacci hash:13Ordinary hash:15Fibonacci hash:4Ordinary hash:0Fibonacci hash:11Ordinary hash:1Fibonacci hash:2Ordinary hash:2Fibonacci hash:9Ordinary hash:3Fibonacci hash:0Ordinary hash:4
Copy the code

Data structure: The array structure of a hash table
Hash algorithm: Fibonacci hash
Addressing mode: Fibonacci hashing allows data to be more dispersed, open addressing in the event of a data collision, and storing elements backward from the collision node. Formula:F (k) = ((k * 2654435769) >> X) << Y For common 32-bit integers, f(k) = (k * 2654435769) >> 28, the golden ratio:(√ 5-1) / 2 = 0.6180339887 1.618:1 = = 1-0. 618
What you learned: You can refer to addressing and hashing algorithms, but this data structure is quite different from what you’re designing to implement on a database, although ThreadLocal can be used to store and pass data index information.

2. HashMap

public static int disturbHashIdx(String key, int size) {
    return (size - 1) & (key.hashCode() ^ (key.hashCode() >>> 16));
}
Copy the code

Data structure: hash bucket array + linked list + red-black tree
Hash algorithm: perturbation function, hash index, can make the data more hash distribution
Addressing mode: The zipper addressing mode is used to solve data collision. When data is stored, the index address will be carried out. In case of collision, a data linked list will be generated.
What you learned: You can apply hashing algorithms, addressing methods to the design and implementation of database routing, and the whole array + list approach is similar to the library + table approach.

Iv. Design implementation

1. Define route annotations

define

@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
public @interface DBRouter {

    String key(a) default "";

}
Copy the code

use

@Mapper
public interface IUserDao {

     @DBRouter(key = "userId")
     User queryUserInfoByUserId(User req);

     @DBRouter(key = "userId")
     void insertUser(User req);

}
Copy the code

First we need to define a custom annotation to be placed on the method that needs to be routed by the database.
It is used by means of method configuration annotations, which can be intercepted by our designated AOP aspect, and then the corresponding database routing calculation and judgment, and switch to the corresponding operational data source.

2. Parse route configurations

The above is the configuration of a data source after the implementation of the database routing component. In the use of data sources under the database and table, the information configuration of multiple data sources needs to be supported, so as to meet the extension of different requirements.
For this customization of large information configuration, you need to useorg.springframework.context.EnvironmentAwareInterface to retrieve the configuration file and extract the required configuration information.

Data source configuration extraction

@Override
public void setEnvironment(Environment environment) {
    String prefix = "router.jdbc.datasource.";    

    dbCount = Integer.valueOf(environment.getProperty(prefix + "dbCount"));
    tbCount = Integer.valueOf(environment.getProperty(prefix + "tbCount"));    

    String dataSources = environment.getProperty(prefix + "list");
    for (String dbInfo : dataSources.split(",")) { Map<String, Object> dataSourceProps = PropertyUtil.handle(environment, prefix + dbInfo, Map.class); dataSourceMap.put(dbInfo, dataSourceProps); }}Copy the code

Prefix is the starting information for the data source configuration. You can customize the starting information as required.
DbCount, tbCount, dataSources, and dataSourceProps extract configuration information and store it in dataSourceMap for future use.

3. Data source switchover

A SpringBoot Starter provides an instantiation of a DataSourceAutoConfig. The DataSource is dynamically transformed. In other words, it supports dynamic switching of data sources.

Creating a data source

@Bean
public DataSource dataSource(a) {
    // Create the data source
    Map<Object, Object> targetDataSources = new HashMap<>();
    for (String dbInfo : dataSourceMap.keySet()) {
        Map<String, Object> objMap = dataSourceMap.get(dbInfo);
        targetDataSources.put(dbInfo, new DriverManagerDataSource(objMap.get("url").toString(), objMap.get("username").toString(), objMap.get("password").toString()));
    }     

    // Set the data source
    DynamicDataSource dynamicDataSource = new DynamicDataSource();
    dynamicDataSource.setTargetDataSources(targetDataSources);
    return dynamicDataSource;
}
Copy the code

Here is a simplified example of creating an instantiation based on data source information read from the configuration information.
After the data source is created, save it toDynamicDataSourceIt is an inherited AbstractRoutingDataSource implementation class, this class can be stored and read the corresponding specific call data source information.

4. Section interception

Needs to be done in AOP’s aspect interception; The database route is calculated, the perturbation function is enforced, the database table index is calculated, and the data source is set to ThreadLocal.

@Around("aopPoint() && @annotation(dbRouter)")
public Object doRouter(ProceedingJoinPoint jp, DBRouter dbRouter) throws Throwable {
    String dbKey = dbRouter.key();
    if (StringUtils.isBlank(dbKey)) throw new RuntimeException(Annotation DBRouter key is null!);

    // Calculate the route
    String dbKeyAttr = getAttrValue(dbKey, jp.getArgs());
    int size = dbRouterConfig.getDbCount() * dbRouterConfig.getTbCount();

    // Perturbation function
    int idx = (size - 1) & (dbKeyAttr.hashCode() ^ (dbKeyAttr.hashCode() >>> 16));

    // Library table index
    int dbIdx = idx / dbRouterConfig.getTbCount() + 1;
    int tbIdx = idx - dbRouterConfig.getTbCount() * (dbIdx - 1);   

    // Set to ThreadLocal
    DBContextHolder.setDBKey(String.format("%02d", dbIdx));
    DBContextHolder.setTBKey(String.format("%02d", tbIdx));
    logger.info(Method: {} dbIdx: {} tbIdx: {}", getMethod(jp).getName(), dbIdx, tbIdx);
   
    // Return the result
    try {
        return jp.proceed();
    } finally{ DBContextHolder.clearDBKey(); DBContextHolder.clearTBKey(); }}Copy the code

First we extracted the number of library table products and used them as long as a HashMap.
Next, use the same perturbation function logic as HashMap to scatter the data more hashed.
After calculating an index position on the total length, you need to convert that position into the library table to see which library and which table the total length index falls into.
Finally, the computed index information is stored in the ThreadLocal, which is used to pass the index information that can be extracted during the method call.

5. Test verification

5.1 Library table creation

create database `bugstack_01`;
DROP TABLE user_01;
CREATE TABLE user_01 ( id bigint NOT NULL AUTO_INCREMENT COMMENT 'on the ID', userId varchar(9)COMMENT 'username ', userNickNamevarchar(32)COMMENT 'user name ', userHeadvarchar(16)COMMENT 'user profile ', userPasswordvarchar(64)COMMENT 'user password ', createTime datetime COMMENT' createTime ', updateTime datetime COMMENT 'updateTime ', PRIMARYKEY (id) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE user_02;
CREATE TABLE user_02 ( id bigint NOT NULL AUTO_INCREMENT COMMENT 'on the ID', userId varchar(9)COMMENT 'username ', userNickNamevarchar(32)COMMENT 'user name ', userHeadvarchar(16)COMMENT 'user profile ', userPasswordvarchar(64)COMMENT 'user password ', createTime datetime COMMENT' createTime ', updateTime datetime COMMENT 'updateTime ', PRIMARYKEY (id) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE user_03;
CREATE TABLE user_03 ( id bigint NOT NULL AUTO_INCREMENT COMMENT 'on the ID', userId varchar(9)COMMENT 'username ', userNickNamevarchar(32)COMMENT 'user name ', userHeadvarchar(16)COMMENT 'user profile ', userPasswordvarchar(64)COMMENT 'user password ', createTime datetime COMMENT' createTime ', updateTime datetime COMMENT 'updateTime ', PRIMARYKEY (id) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DROP TABLE user_04;
CREATE TABLE user_04 ( id bigint NOT NULL AUTO_INCREMENT COMMENT 'on the ID', userId varchar(9)COMMENT 'username ', userNickNamevarchar(32)COMMENT 'user name ', userHeadvarchar(16)COMMENT 'user profile ', userPasswordvarchar(64)COMMENT 'user password ', createTime datetime COMMENT' createTime ', updateTime datetime COMMENT 'updateTime ', PRIMARYKEY (id) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Copy the code

Create multiple inventory information of the same table structure, bugstack_01, bugstack_02

5.2 Statement Configuration

<select id="queryUserInfoByUserId" parameterType="cn.bugstack.middleware.test.infrastructure.po.User"
        resultType="cn.bugstack.middleware.test.infrastructure.po.User">
    SELECT id, userId, userNickName, userHead, userPassword, createTime
    FROM user_${tbIdx}
    where userId = #{userId}
</select>               

<insert id="insertUser" parameterType="cn.bugstack.middleware.test.infrastructure.po.User">
    insert into user_${tbIdx} (id, userId, userNickName, userHead, userPassword,createTime, updateTime)
    values (#{id},#{userId},#{userNickName},#{userHead},#{userPassword},now(),now())
</insert>
Copy the code

In MyBatis, the only change is to add a placeholder after the table name,${tbIdx}Used to write the current table ID.

5.3 Annotation Configuration

@DBRouter(key = "userId")
User queryUserInfoByUserId(User req);   

@DBRouter(key = "userId")
void insertUser(User req);
Copy the code

Add an annotation to a method that needs to use a separate table from the library, and then the method will be managed by the AOP aspect.

5.4 Unit Testing

22:38:20.067  INFO 19900- [the main] C.B.M.D b.r. Outer DBRouterJoinPoint: database routing method: queryUserInfoByUserId dbIdx:2TbIdx:3
22:38:20.594  INFO 19900- [the main] cn. Bugstack. Middleware. The test. The ApiTest: test results: {"createTime":1615908803000."id":2."userHead":"01 _50"."userId":"980765512"."userNickName":"Little Fuge."."userPassword":"123456"}
22:38:20.620  INFO 19900 --- [extShutdownHook] o.s.s.concurrent.ThreadPoolTaskExecutor  : Shutting down ExecutorService 'applicationTaskExecutor'1

Copy the code

This is a log that we performed using our own database routing component. You can see that this includes routing operations in table 2 of the database 3:Database route method: queryUserInfoByUserId dbIdx: 2 tbIdx: 3

Five, the summary

To sum up, we learned from HashMap, ThreadLocal, Spring, and other source code learning techniques, and applied such techniques to a database routing design. It is almost impossible to successfully develop such a middleware if you have not experienced the precipitation of technology that is always referred to as building rockets, so many times it is not the technology that is useless, but the chance to use it. Instead of thinking about the repeated pieces of CRUD, look at what knowledge can really improve your ability! If you are interested in middleware design and implementation, you can also refer to this nugget booklet: juejin.cn/book/694099…