Small knowledge, big challenge! This article is participating in the creation activity of “Essential Tips for Programmers”.

This article has participated in the “Digitalstar Project” and won a creative gift package to challenge the creative incentive money.

MyBatis, MyBatis, MyBatis, MyBatis, MyBatis, MyBatis, MyBatis, MyBatis .

But the previous article was also imperfect, for the following reasons: The performance of using “cyclic single insert” is too low, and the performance of using “MyBatis Plus batch Insert” is ok, but to introduce additional MyBatis Plus framework, the performance of using “MyBatis native batch insert” is the best, but when inserting a large amount of data, the program will report errors, then, Today we will provide a better solution.

“Pit” for native bulk inserts

First, let’s take a look at the pit in MyBatis’ native batch insert. When we batch insert 100,000 pieces of data, the implementation code is as follows:

import com.example.demo.model.User;
import com.example.demo.service.impl.UserServiceImpl;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;

import java.util.ArrayList;
import java.util.List;

@SpringBootTest
class UserControllerTest {

    // Maximum number of loops
    private static final int MAXCOUNT = 100000;

    @Autowired
    private UserServiceImpl userService;
    
    / * * * native splicing SQL, bulk insert * /
    @Test
    void saveBatchByNative(a) {
        long stime = System.currentTimeMillis(); // Start time of statistics
        List<User> list = new ArrayList<>();
        for (int i = 0; i < MAXCOUNT; i++) {
            User user = new User();
            user.setName("test:" + i);
            user.setPassword("123456");
            list.add(user);
        }
        // Batch insert
        userService.saveBatchByNative(list);
        long etime = System.currentTimeMillis(); // Statistics end time
        System.out.println("Execution time:"+ (etime - stime)); }}Copy the code

The implementation code in the core file usermapper.xml is as follows:


      
<! DOCTYPEmapper PUBLIC "- / / mybatis.org//DTD Mapper / 3.0 / EN" "http://mybatis.org/dtd/mybatis-3-mapper.dtd">
<mapper namespace="com.example.demo.mapper.UserMapper">
    <insert id="saveBatchByNative">
        INSERT INTO `USER`(`NAME`,`PASSWORD`) VALUES
        <foreach collection="list" separator="," item="item">
            (#{item.name},#{item.password})
        </foreach>
    </insert>

</mapper>
Copy the code

When we happily run the above program, the following happens:Oh, woof, the program is reporting an error!

This is because the size of the insert SQL using MyBatis native bulk insert splice is 4.56m, and the default maximum SQL that MySQL can execute is 4M, then the program will be reported at execution time.

The solution

MySQL failed to execute MySQL because the SQL file was too large for batch insert. The solution we immediately thought of was to split the large file into N smaller files so that the SQL would not be too large and cause execution errors. In other words, we can divide the List to be inserted into multiple smaller lists to perform batch insertion, and this operation is called List sharding.

So with that thought in mind, the next step is practice, so how do we fragment collections?

There are many ways to implement sharding, which we will discuss later. Next, we will use the simplest way, which is the Guava framework provided by Google to implement sharding.

Fragment Demo

To implement sharding, the first step is to add support to the Guava framework by adding the following reference to pom.xml:

<! -- Google Guava tool class --> <! -- https://mvnrepository.com/artifact/com.google.guava/guava -->
<dependency>
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
  <version>31.01.-jre</version>
</dependency>
Copy the code

Next we write a small demo, divide the following 7 personal names into 3 groups (each group has a maximum of 3), the implementation code is as follows:

import com.google.common.collect.Lists;

import java.util.Arrays;
import java.util.List;

/** * Guava fragments */
public class PartitionByGuavaExample {
    / / the original collection
    private static final List<String> OLD_LIST = Arrays.asList(
            "Tang Seng, Monkey Kong, Ba Jie, Sha Seng, Cao Cao, Liu Bei, Sun Quan.".split(","));

    public static void main(String[] args) {
        // Set fragments
        List<List<String>> newList = Lists.partition(OLD_LIST, 3);
        // Prints a collection of fragments
        newList.forEach(i -> {
            System.out.println("Set length:"+ i.size()); }); }}Copy the code

The execution results of the above procedures are as follows:From the above results, we can easily fragment a collection using Guava’s Lists. Partition method.

Native batch insert fragment implementation

That next, is to transform our MyBatis batch insert code, the specific implementation is as follows:

@Test
void saveBatchByNativePartition(a) {
    long stime = System.currentTimeMillis(); // Start time of statistics
    List<User> list = new ArrayList<>();
    // Build insert data
    for (int i = 0; i < MAXCOUNT; i++) {
        User user = new User();
        user.setName("test:" + i);
        user.setPassword("123456");
        list.add(user);
    }
    // Fragment batch inserts
    int count = (int) Math.ceil(MAXCOUNT / 1000.0); // Divided into N pieces, 1000 pieces per piece
    List<List<User>> listPartition = Lists.partition(list, count);
    // Fragment batch inserts
    for (List<User> item : listPartition) {
        userService.saveBatchByNative(item);
    }
    long etime = System.currentTimeMillis(); // Statistics end time
    System.out.println("Execution time:" + (etime - stime));
}
Copy the code

Execute the above program, and the final execution result is as follows:As can be seen from the figure above, the exception error reported during the previous batch insert is gone, and the execution efficiency of this implementation is higher than that of MyBatis Plus. The execution time of MyBatis Plus for batch insert of 10W data is as follows:

conclusion

In this paper, we demonstrate the problem of MyBatis’ native batch insertion: It may fail to run because too much data is inserted. We can solve this problem by fragmentation. The implementation steps of fragmentation batch insertion are as follows:

  1. Calculate the number of fragments (divided into N batches);
  2. The set is divided into N sets by Lists. Partition method.
  3. The operation of batch inserting a collection of fragments through a loop.

Follow the public account “Java Chinese Community” to see more MyBatis and Spring Boot series of articles.