In addition to using middleware to broker request distribution, another common approach is to shard databases and tables at the client level: with a suitable wrapper around the client code, database access operations remain easy to write. The sharding scheme in this article is based on the MyBatis framework, but it differs from the schemes commonly seen elsewhere, which generally rewrite SQL statements through complex MyBatis plug-ins. Such plug-in code tends to become so intricate that in the end only its original author fully understands it, which creates maintenance problems for the project. The scheme described here is simple and easy to understand, yet loses none of the convenience. Its design philosophy comes from Python's "explicit is better than implicit": it does not hide the process of routing to databases and tables.

Many sharding implementations try to hide the sharding logic entirely, but this is not really necessary. Users must be aware that the table is sharded; otherwise, how would they know to avoid global index lookups, or that they cannot join arbitrary tables at will? If you treat a sharded table as if it were a single table, you are headed for serious trouble once the system goes live.

Project name: Shardino. Project address: github.com/pyloque/shardino

Now let's look at what database operation code looks like under this scheme.

The post table is split into 64 tables, and each record is routed to one of them, either by hash or by date; the distribution logic is determined by the user code itself. The number of tables can be set to different values in different environments, for example 4 tables under unit tests and 64 tables online.

@Configuration
public class PartitionConfig {

    private int post = 64;

    public int post() {
        return post;
    }

    public void post(int post) {
        this.post = post;
    }
}

The post tables are then allocated across multiple databases, here simply by taking the partition number modulo the number of databases. Suppose there are four post databases and the post table is split into 64 tables, post_0 through post_63. Then post_0, post_4, post_8, ... are assigned to database 1; post_1, post_5, post_9, ... to database 2; post_2, post_6, post_10, ... to database 3; and post_3, post_7, post_11, ... to database 4.
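The table-to-database mapping described above can be sketched as a simple modulo. The helper below is purely illustrative (it is not shardino's actual routing code, and the name `databaseFor` is made up for this example); it returns a zero-based database index:

```java
public class PartitionRouting {
    // Hypothetical helper: maps a table partition number to a zero-based
    // database index by simple modulo, matching the round-robin layout
    // described above (post_0, post_4, post_8, ... share one database).
    static int databaseFor(int partition, int numDatabases) {
        return partition % numDatabases;
    }

    public static void main(String[] args) {
        System.out.println(databaseFor(0, 4));  // 0
        System.out.println(databaseFor(5, 4));  // 1
        System.out.println(databaseFor(63, 4)); // 3
    }
}
```

With 4 databases, indices 0 through 3 correspond to databases 1 through 4 in the description above.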

The MySQLGroupStore database group object is built from the configuration file. This object is the entry point for MySQL operations; through it you can locate the specific physical MySQL master and slave data sources.

@Configuration
public class RepoConfig {

    @Autowired
    private Environment env;

    private MySQLGroupBuilder mysqlGroupBuilder = new MySQLGroupBuilder();

    @Bean
    @Qualifier("post")
    public MySQLGroupStore postMySQLGroupStore() {
        MySQLGroupStore store = mysqlGroupBuilder.buildStore(env, "post");
        store.prepare(factory -> {
            factory.getConfiguration().addMapper(PostMapper.class);
        });
        return store;
    }
}

The configuration file application.properties is shown below

mysql.post0.master.addrWeights=localhost:3306
mysql.post0.master.db=sample
mysql.post0.master.user=sample
mysql.post0.master.password=123456
mysql.post0.master.poolSize=10

mysql.post0.slave.addrWeights=localhost:3307=100&localhost:3308=100
mysql.post0.slave.db=sample
mysql.post0.slave.user=sample
mysql.post0.slave.password=123456
mysql.post0.slave.poolSize=10

mysql.post1.master.addrWeights=localhost:3309
mysql.post1.master.db=sample
mysql.post1.master.user=sample
mysql.post1.master.password=123456
mysql.post1.master.poolSize=10

mysql.post1.slave.addrWeights=localhost:3310=100&localhost:3311=100
mysql.post1.slave.db=sample
mysql.post1.slave.user=sample
mysql.post1.slave.password=123456
mysql.post1.slave.poolSize=10

mysqlgroup.post.nodes=post0,post1
mysqlgroup.post.slaveEnabled=true

The database group here consists of multiple equivalent master-slaves pairs. Each master-slaves pair consists of one master database and multiple slave databases with different weights. The number of master-slaves pairs is the number of sub-databases.
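The addrWeights value packs addresses and weights into a single string such as `localhost:3307=100&localhost:3308=100`. A minimal parser for this format might look like the sketch below; this is an illustration of the format only, not shardino's actual parsing code, and the default weight of 100 for weightless entries (such as a lone master address) is an assumption:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AddrWeights {
    // Hypothetical parser for the addrWeights format in application.properties,
    // e.g. "localhost:3307=100&localhost:3308=50". Entries without an explicit
    // weight default to 100 here (an assumption for this sketch).
    static Map<String, Integer> parse(String addrWeights) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String entry : addrWeights.split("&")) {
            int eq = entry.lastIndexOf('=');
            if (eq < 0) {
                result.put(entry, 100);
            } else {
                result.put(entry.substring(0, eq),
                           Integer.parseInt(entry.substring(eq + 1)));
            }
        }
        return result;
    }
}
```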

mysqlgroup also has a special configuration option, slaveEnabled, which controls whether read/write splitting is enabled. It is disabled by default, in which case no slave-related objects are built at all; set it to true when slave databases are needed.

Each record maps to a partition number k, and the partition number identifies the physical table post_k. We compute the partition number from the record's content, use it to determine which physical database holds the table where the record resides, and then perform the corresponding read or write against that database.

In this example, the post table is hashed into 64 tables based on the userId field, evenly distributed across 2 pairs of physical databases, each pair containing one master and 2 slaves.

With the MySQLGroupStore instance, we can manipulate all the databases we want.

@Repository
public class PostMySQL {

    @Autowired
    private PartitionConfig partitions;

    @Autowired
    @Qualifier("post")
    private MySQLGroupStore mysql;

    public void createTables() {
        for (int i = 0; i < partitions.post(); i++) {
            int k = i;
            mysql.master(k).execute(session -> {
                PostMapper mapper = session.getMapper(PostMapper.class);
                mapper.createTable(k);
            });
        }
    }

    public void dropTables() {
        for (int i = 0; i < partitions.post(); i++) {
            int k = i;
            mysql.master(k).execute(session -> {
                PostMapper mapper = session.getMapper(PostMapper.class);
                mapper.dropTable(k);
            });
        }
    }

    public Post getPostFromMaster(String userId, String id) {
        Holder<Post> holder = new Holder<>();
        int partition = this.partitionFor(userId);
        mysql.master(partition).execute(session -> {
            PostMapper mapper = session.getMapper(PostMapper.class);
            holder.value(mapper.getPost(partition, id));
        });
        return holder.value();
    }

    public Post getPostFromSlave(String userId, String id) {
        Holder<Post> holder = new Holder<>();
        int partition = this.partitionFor(userId);
        mysql.slave(partition).execute(session -> {
            PostMapper mapper = session.getMapper(PostMapper.class);
            holder.value(mapper.getPost(partition, id));
        });
        return holder.value();
    }

    public void savePost(Post post) {
        int partition = this.partitionFor(post);
        mysql.master(partition).execute(session -> {
            PostMapper mapper = session.getMapper(PostMapper.class);
            Post curPost = mapper.getPost(partition, post.getId());
            if (curPost != null) {
                mapper.updatePost(partition, post);
            } else {
                mapper.insertPost(partition, post);
            }
        });
    }

    public void deletePost(String userId, String id) {
        int partition = this.partitionFor(userId);
        mysql.master(partition).execute(session -> {
            PostMapper mapper = session.getMapper(PostMapper.class);
            mapper.deletePost(partition, id);
        });
    }

    private int partitionFor(Post post) {
        return Post.partitionFor(post.getUserId(), partitions.post());
    }

    private int partitionFor(String userId) {
        return Post.partitionFor(userId, partitions.post());
    }
}

As the code above shows, the first step of every read, write, create-table, and drop-table operation is to compute the partition number, then select the target master or slave database accordingly, and finally operate on the target table. Here autoCommit is enabled by default, so there is no need to call session.commit() explicitly.

mysql.master(partition)
mysql.slave(partition)

// If there is no database sharding
mysql.master()
mysql.slave()

// If there is neither sharding nor read/write splitting
mysql.db()

// Operating on a specific table always requires the partition number
mapper.getPost(partition, postId)
mapper.savePost(partition, post)

When operating on a data table, the specific partition number must be passed in so that MyBatis knows which sub-table is being operated on.

public interface PostMapper {

    @Update("create table if not exists post_${partition}(id varchar(128) primary key not null, user_id varchar(1024) not null, title varchar(1024) not null, content text, create_time timestamp not null) engine=innodb")
    public void createTable(@Param("partition") int partition);

    @Update("drop table if exists post_${partition}")
    public void dropTable(@Param("partition") int partition);

    @Results({@Result(property = "createTime", column = "create_time"),
            @Result(property = "userId", column = "user_id")})
    @Select("select id, user_id, title, content, create_time from post_${partition} where id=#{id}")
    public Post getPost(@Param("partition") int partition, @Param("id") String id);

    @Insert("insert into post_${partition}(id, user_id, title, content, create_time) values(#{p.id}, #{p.userId}, #{p.title}, #{p.content}, #{p.createTime})")
    public void insertPost(@Param("partition") int partition, @Param("p") Post post);

    @Update("update post_${partition} set title=#{p.title}, content=#{p.content}, create_time=#{p.createTime} where id=#{p.id}")
    public void updatePost(@Param("partition") int partition, @Param("p") Post post);

    @Delete("delete from post_${partition} where id=#{id}")
    public void deletePost(@Param("partition") int partition, @Param("id") String id);
}

You might find it a bit tedious to have to pass the partition argument in every database operation. But it is also intuitive: it tells us exactly which physical table we are operating on.

In a MyBatis annotation-based Mapper class, if a method has more than one parameter, each needs a @Param annotation to give it a name, so the name can be referenced directly in the SQL statement. Otherwise you have to use the default placeholder names param1, param2, and so on, which is not intuitive.

We put the sharding hash algorithm in the entity class Post, using CRC32 as the hash function.

public class Post {
    private String id;
    private String userId;
    private String title;
    private String content;
    private Date createTime;

    public Post() {}

    public Post(String id, String userId, String title, String content, Date createTime) {
        this.id = id;
        this.userId = userId;
        this.title = title;
        this.content = content;
        this.createTime = createTime;
    }

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

    public String getUserId() {
        return userId;
    }

    public void setUserId(String userId) {
        this.userId = userId;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getContent() {
        return content;
    }

    public void setContent(String content) {
        this.content = content;
    }

    public Date getCreateTime() {
        return createTime;
    }

    public void setCreateTime(Date createTime) {
        this.createTime = createTime;
    }

    public int partitionFor(int num) {
        return partitionFor(userId, num);
    }

    public static int partitionFor(String userId, int num) {
        CRC32 crc = new CRC32();
        crc.update(userId.getBytes(Charsets.UTF8));
        return (int) (Math.abs(crc.getValue()) % num);
    }
}

The num argument of the partitionFor method is the total number of tables. If tables are split by date instead, this parameter may not be needed; partitionFor can simply return the date encoded as an integer such as 20190304.
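A date-based variant along those lines might look like the following. This is an illustrative sketch only (the class and method names are made up for this example, not part of shardino):

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class DatePartition {
    // Hypothetical date-based partitioning: the partition "number" is the day
    // encoded as yyyyMMdd, so each day's records land in their own table,
    // e.g. post_20190304.
    static int partitionFor(LocalDate date) {
        return Integer.parseInt(date.format(DateTimeFormatter.ofPattern("yyyyMMdd")));
    }
}
```

Note that unlike hash partitioning, the number of tables grows over time, so table creation has to happen on a schedule rather than once up front.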

The final question is how multiple weighted slave databases achieve probabilistic selection. One option is spring-jdbc's AbstractRoutingDataSource, a routing data source that can contain multiple child data sources and dynamically selects one according to some strategy; here the strategy is weighted randomization.

The problem is that I only need this one class, yet I would have to pull in the whole spring-boot-starter-jdbc package, which feels heavy-handed. After studying the AbstractRoutingDataSource code, I found its implementation is very simple, so modeling a simplified version on it avoids introducing the whole package.

public class RandomWeightedDataSource extends DataSourceAdapter {
    private int totalWeight;
    private Set<PooledDataSource> sources;
    private Map<Integer, PooledDataSource> sourceMap;

    public RandomWeightedDataSource(Map<PooledDataSource, Integer> srcs) {
        this.sources = new HashSet<>();
        this.sourceMap = new HashMap<>();
        for (Entry<PooledDataSource, Integer> entry : srcs.entrySet()) {
            // The weight value should not be too large
            int weight = Math.min(10000, entry.getValue());
            for (int i = 0; i < weight; i++) {
                sourceMap.put(totalWeight, entry.getKey());
                totalWeight++;
            }
            this.sources.add(entry.getKey());
        }
    }

    private PooledDataSource getDataSource() {
        return this.sourceMap.get(ThreadLocalRandom.current().nextInt(totalWeight));
    }

    public void close() {
        for (PooledDataSource ds : sources) {
            ds.forceCloseAll();
        }
    }

    @Override
    public Connection getConnection() throws SQLException {
        return getDataSource().getConnection();
    }

    @Override
    public Connection getConnection(String username, String password) throws SQLException {
        return getDataSource().getConnection(username, password);
    }
}
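The selection strategy above can be distilled into a standalone generic sketch: each item occupies `weight` consecutive slots in a map, so a uniform random slot index picks items with probability proportional to their weight. The class below is a self-contained illustration of that idea, not part of shardino:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

public class WeightedPicker<T> {
    // Each item occupies `weight` consecutive integer slots; a uniform random
    // slot index therefore selects items in proportion to their weights.
    private int totalWeight;
    private final Map<Integer, T> slots = new HashMap<>();

    public WeightedPicker(Map<T, Integer> weights) {
        for (Map.Entry<T, Integer> e : weights.entrySet()) {
            for (int i = 0; i < e.getValue(); i++) {
                slots.put(totalWeight++, e.getKey());
            }
        }
    }

    public T pick() {
        return slots.get(ThreadLocalRandom.current().nextInt(totalWeight));
    }
}
```

A slave with weight 100 is picked roughly twice as often as one with weight 50. This slot-expansion approach trades memory for O(1) selection, which is why the weight values are capped in the data source code above.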

If you want a deeper understanding of the implementation, clone the Shardino repository and run it locally:

git clone https://github.com/pyloque/shardino.git

The repository contains unit tests you can run. Before running them, make sure Docker is installed on your machine, then bring up the databases:

docker-compose up -d

This command starts two master-slaves pairs, each consisting of one master and two slaves.

Although this example uses Spring Boot for its convenient dependency injection and unit testing support, shardino itself does not depend on Spring Boot.

Shardino is not a polished open source library; it is sample implementation code, and you will need to tweak it if you use another database or a different MySQL version.