Sharding-jdbc: How to optimize large query volume?

The protagonist, Xiao Wang, works for a start-up company that is developing an App. In order to quickly develop a version that can be put into the market for publicity, Xiao Wang can work overtime every day very late, busy after a period of time finally put out the first version.

In the early stage, there were not many functions and tables. MySql was used to store business data. One node, of course, with a scheduled backup mechanism every morning.

Here is a current situation:

Thanks to the strong promotion of the operation staff, the App has achieved initial results. More and more registered users, query volume is more and more large, for the data that is not likely to update xiao Wang added cache, and held up for a period of time.

For some data or to check the database, according to the current business development, the single-node database has been fast to meet the demand. And read and write together, Wang is going to optimize a database to do read and write separation, a master from more.

The picture below is an improved status quo:

All the read requests go to the slave node, and the master node only writes to relieve the query pressure of data. The database deployment part just happens to have an operation and maintenance of Xiao Wang company, but the application part also has to support multiple data sources.

Xiao Wang is a vigorous person, action power is very strong, immediately there is a plan in the head, configuration of multiple data sources is not on the line, and then use different data sources for data operation is ok!

The pseudocode is as follows:

// Primary data source @bean (name ="primaryDataSource")
@Qualifier("primaryDataSource"// Specify the data source configuration prefix @configurationProperties (prefix ="spring.datasource.primary")
public DataSource primaryDataSource() {
    returnDataSourceBuilder.create().build(); } // From the data source @bean (name ="secondaryDataSource")
@Qualifier("secondaryDataSource") @primary // In the same DataSource, first use the annotated DataSource @configurationProperties (prefix ="spring.datasource.secondary")
public DataSource secondaryDataSource() {
    return DataSourceBuilder.create().build();
}
Copy the code

Suppose we use JdbcTemplate to manipulate the database:

@Bean(name = "primaryJdbcTemplate")
public JdbcTemplate primaryJdbcTemplate(@Qualifier("primaryDataSource") DataSource dataSource) {
    return new JdbcTemplate(dataSource);
}
@Bean(name = "secondaryJdbcTemplate")
public JdbcTemplate secondaryJdbcTemplate(@Qualifier("secondaryDataSource") DataSource dataSource) {
    return new JdbcTemplate(dataSource);
}
Copy the code

Once the configuration is complete, we can use a different JdbcTemplate to manipulate the data. There is a problem that once there are many slave nodes, which means there will be multiple JdbcTemplate, does it have to have an algorithm to use, which operation, it is quite troublesome.

So Xiao Wang found me. I am a warm-hearted person. Since I found it, I must help, of course, I am not to help Xiao Wang write code, just to provide him with ideas + scheme.

I said to Xiao Wang: ShardingSphere know, you can use this, it is much easier than you to match multiple data sources. ShardingSphere was planned later. At the beginning, there was only sharding-JDBC as a product, which was based on sub-library and sub-table in the form of client side. This evolved into Apache ShardingSphere(Incubator), an open source ecosystem of distributed database middleware solutions, It is composed of sharding-JDBC, Sharding-Proxy and Sharding-Sidecar (in planning), which are independent of each other but can be mixed and used together. They all provide standardized data sharding, distributed transaction and database governance functions, and can be applied to diverse application scenarios such as Java isomorphism, heterogeneous languages, containers, cloud native and so on.

After my guidance, Wang was able to use Sharding-JDBC to separate reading and writing. Here are the steps to share with you.

Step 1: Create two databases to simulate one master and one slave, although it is better if you have a master and slave environment already in place


CREATE DATABASE `ds_0` CHARACTER SET 'utf8' COLLATE 'utf8_general_ci';
CREATE DATABASE `ds_1` CHARACTER SET 'utf8' COLLATE 'utf8_general_ci';

CREATE TABLE `user`(
	id bigint(64) not null,
	city varchar(20) not null,
	name varchar(20) not null,
	PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Copy the code

Create a user table in each of the ds_0 and DS_1 libraries for data manipulation demonstrations.

Step 2: Create a Maven project, add the required dependencies, the following is only posted sharding-JDBC, I will give the rest of the source address for your reference:

<dependency> <groupId>org.apache.shardingsphere</groupId> <artifactId>sharding-jdbc-spring-boot-starter</artifactId> < version > 4.0.0 - RC1 < / version > < / dependency >Copy the code

Step 3: Configure read/write separated data sources

A collection of data source names that correspond to the following data source configuration names
spring.shardingsphere.datasource.names=master,slave

# primary data sourcespring.shardingsphere.datasource.master.type=com.alibaba.druid.pool.DruidDataSource spring.shardingsphere.datasource.master.driver-class-name=com.mysql.jdbc.Driver spring.shardingsphere.datasource.master.url=jdbc:mysql://localhost:3306/ds_0? characterEncoding=utf-8 spring.shardingsphere.datasource.master.username=root spring.shardingsphere.datasource.master.password=123456# from data sourcespring.shardingsphere.datasource.slave.type=com.alibaba.druid.pool.DruidDataSource spring.shardingsphere.datasource.slave.driver-class-name=com.mysql.jdbc.Driver spring.shardingsphere.datasource.slave.url=jdbc:mysql://localhost:3306/ds_1? characterEncoding=utf-8 spring.shardingsphere.datasource.slave.username=root spring.shardingsphere.datasource.slave.password=123456Read/write separation configuration
spring.shardingsphere.masterslave.load-balance-algorithm-type=round_robin
The final data source name
spring.shardingsphere.masterslave.name=dataSource
Primary database data source name
spring.shardingsphere.masterslave.master-data-source-name=master
List of data source names from the library, separated by multiple commas
spring.shardingsphere.masterslave.slave-data-source-names=slave
Copy the code

Load-balance-algorithm-type Indicates the type of the secondary library load balancing algorithm. Optional values: ROUND_ROBIN or RANDOM

After the configuration is completed, you can insert data for query and insert test. It has no impact on the application layer which ORM framework is used. You can use JdbcTemplate, which we mentioned before, and also can use Mybatis, etc

I will not write out the test steps, relatively simple, of course, I also provided the test code, for reference only:

Github.com/yinjihuan/s…

Remember to give me a Star if you feel good!

Another problem that often arises in read-write architectures is the problem of read latency.

I just inserted a piece of data, and I’m going to read it, and I’m not going to read it, right?

After all, the data is copied to the slave node after the master node writes the data. The reason why the data cannot be read is that the replication takes a long time. That is to say, before the data is copied to the slave node, you have already read the data from the slave node.

The master/slave replication of mysql5.7 is multi-threaded, which means that it will be faster, but it is not guaranteed to be 100% read immediately. There are two ways to solve this problem:

Business compromise, whether to read immediately after the operation
Sharding-jdbc also takes this problem into account, so it provides us with a function that allows users to specify whether to go to the master library for reading

Before reading, use the following method to set it:

public List<User> list() {// force the primary library hintManager.getInstance ().setMasterRouteOnly();return userRepository.list();
}
Copy the code

Sharding-jdbc: How to optimize large query volume?

Related Posts

Detailed explanation of interprocess communication mode

This article gives you an in-depth understanding of the JVM memory model

Roaming the world of data structures