The pain of comparing data

I don’t know if you, like me, have to do some database comparison work while writing code happily.

Admittedly, general data differences, such as the content of each line is different, there are mature comparison tools such as Compare2 in the market.

But if you’re comparing whether each column of the data is the same, this becomes a little more cumbersome.

V1.0 pure manual comparison

When we do some functions such as data migration, we often need to compare whether the data is correct. The most common method is manual comparison column by column.

At the beginning, old Ma is so right with colleagues, after a few days of feeling efficiency is really low, but also easy to see.

So I thought, this thing should be a lot easier to compare programmatically.

V2.0 semi-manual comparison

Insert into XXX (xx, xx, xx) values (xx, xx, xx); The utility class.

Then compare the two sides of the data, this for the comparison of hundreds of fields in a table, some become much faster, the accuracy is also much higher.

Don’t ask me why there are hundreds of fields, they are the treasures of history…

Ps: Whether the insert into statement is manually exported using the database connection tool.

Later, I found another problem: there were too many tables. If I wanted to change a data comparison, it would take me tens of minutes to manually export again. The key was repetition and boredom.

Since it is repetitive, can it be implemented programmatically?

V3.0 compares basic automation

So I stayed up late after work to implement this version: The Java program implements the data export persistence, and then makes changes before and after the difference comparison.

Below I share their ideas, as well as the core source code, at the end of the download benefits.

I hope I can help you with your work and study.

The whole idea

I want this tool to be the MVP concept, moving from simple to complex and gradually enriching features.

To have scalability, currently support mysql/ Oracle/SQL Server and other mainstream databases, users can customize development.

With as few dependencies as possible, using native JDBC without introducing frameworks like Mybatis.

The core depends on

Here are the core dependencies I used:

Fastjson is used to persist data to JSON

Mysql-connector-java database connection driver

Jsqlparser is not required to parse SQL

Implementation approach

  1. Automatically selects the corresponding JDBC implementation based on the specified JDBC connection information.

  2. Execute the corresponding SQL, parse the result into a map, and carry out JSON persistence

  3. The difference comparison of persistent JSON shows the difference results

With this idea, everything becomes unpretentious.

Of course, before this, we need to implement the code, the following to enter the BUG link:

JDBC implementation

The core interface

Considering different database implementations later, we define a unified query interface

/** * JDBC access layer *@authorThe old horse shouts the west wind@date2017/8/1 * /
public interface JdbcMapper {

    /** * Execute the query statement *@param querySql
     * @return* /
    ResultSet query(String querySql);

}
Copy the code

Abstract implementation

The basic abstract implementation is provided here.

Subclasses only need to implement the corresponding connection to get information.

public abstract class AbstractJdbcMapper implements JdbcMapper {

    protected JdbcVo jdbcVo;

    public AbstractJdbcMapper(JdbcVo jdbcVo) {
        this.jdbcVo = jdbcVo;
    }

    /** * get database connection *@return* /
    protected abstract Connection getConnection(a);

    @Override
    public ResultSet query(String querySql) {
        ResultSet rs = null;
        Connection connection = getConnection();
        try {
            Statement stmt = null;
            stmt = connection.createStatement();
            rs = stmt.executeQuery(querySql);
        } catch (Exception e) {
            System.out.println("SQL: " + querySql);
            throw new ExportdbException(e);
        }
        returnrs; }}Copy the code

JdbcVo Connection information

This object is mainly a database connection information object:

public class JdbcVo {

    /** * Driver class name */
    private String driverClassName;

    /** * database link */
    private String url;

    /** * User name */
    private String username;

    /** * Password */
    private String password;

    //getter & setter
}
Copy the code

Mysql implementation

Mysql is used as an example:

import com.github.houbb.exportdb.dto.JdbcVo;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** * mysql implementation *@authorThe old horse shouts the west wind@date2017/8/1 * /
public class MySqlJdbcMapper extends AbstractJdbcMapper {

    public MySqlJdbcMapper(JdbcVo jdbcVo) {
        super(jdbcVo);
    }

    @Override
    protected Connection getConnection(a) {
        try {
            Class.forName(jdbcVo.getDriverClassName());
            return DriverManager.getConnection(jdbcVo.getUrl(),
                    jdbcVo.getUsername(),
                    jdbcVo.getPassword());
        } catch (ClassNotFoundException | SQLException e) {
            e.printStackTrace();
        }
        return null; }}Copy the code

Here is mainly the initialization of the connection, connection to different databases, need to introduce the corresponding data source.

Line data export implementation

Here is the exported core implementation:

The interface definition

public interface IExportdb {

    /** * query *@paramContext *@param sql sql
     * @returnResults *@since0.0.1 * /
    QueryResultVo query(final ExportdbContext context, final String sql);

}
Copy the code

This specifies the SQL that needs to be executed.

Context in order to facilitate later expansion, currently only JdbcMapper.

Return QueryResultVo, which is the query result, as defined below:

public class QueryResultVo {
    /** * table name */
    private String tableName;

    /** * Database name **@sinceHundreds * /
    private String databaseName;

    /** * result set */
    private List<Map<String, Object>> resultMaps;

    /** * execute SQL */
    private String sql;

    //getter & setter
}
Copy the code

The default implementation

The default export implementation is as follows:

import com.github.houbb.exportdb.core.ExportdbContext;
import com.github.houbb.exportdb.core.IExportdb;
import com.github.houbb.exportdb.dal.JdbcMapper;
import com.github.houbb.exportdb.dto.QueryResultVo;
import com.github.houbb.exportdb.exception.ExportdbException;
import com.github.houbb.heaven.util.lang.StringUtil;
import net.sf.jsqlparser.JSQLParserException;
import net.sf.jsqlparser.parser.CCJSqlParserUtil;
import net.sf.jsqlparser.statement.Statement;
import net.sf.jsqlparser.statement.insert.Insert;
import net.sf.jsqlparser.statement.select.PlainSelect;
import net.sf.jsqlparser.statement.select.Select;

import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/ * * *@author binbin.hou
 * @since0.0.1 * /
public class Exportdb implements IExportdb {

    @Override
    public QueryResultVo query(ExportdbContext context, String sql) {
        try {
            final JdbcMapper jdbcMapper = context.jdbcMapper();

            ResultSet resultSet = jdbcMapper.query(sql);
            List<Map<String, Object>> maps = new ArrayList<>();

            String tableName = null;
            while (resultSet.next()) {
                final ResultSetMetaData metaData = resultSet.getMetaData();
                // Set the table name
                if(tableName == null) {
                    tableName = metaData.getTableName(1);
                }

                Map<String, Object> map = new LinkedHashMap<>();
                If the value is null, an error is reported if the value is greater than 1
                // Total number of columns
                int columnCount = metaData.getColumnCount();

                for (int i = 1; i <= columnCount; i++) {
                    String columnName = metaData.getColumnName(i);
                    Object value = resultSet.getObject(columnName);

                    map.put(columnName, value);
                }

                maps.add(map);
            }

            if(StringUtil.isEmptyTrim(tableName)) {
                Statement statement = CCJSqlParserUtil.parse(sql);
                Select select = (Select)statement;
                PlainSelect plainSelect = (PlainSelect) select.getSelectBody();
                tableName = plainSelect.getFromItem().toString();
            }

            return QueryResultVo.newInstance().tableName(tableName)
                    .databaseName("")
                    .sql(sql)
                    .resultMaps(maps);
        } catch (SQLException | JSQLParserException throwables) {
            throw newExportdbException(throwables); }}}Copy the code

In fact, the implementation is very simple, we mainly talk about two points:

(1) Table name

After the test, you can obtain mysql in the following ways:

resultSet.getMetaData();
tableName = metaData.getTableName(1);
Copy the code

When I was testing Oracle, I found it was not available. So sqlParser is used to parse our query.

For the time being, it mainly supports queries, so some of these are fixed and can be optimized later.

if(StringUtil.isEmptyTrim(tableName)) {
    Statement statement = CCJSqlParserUtil.parse(sql);
    Select select = (Select)statement;
    PlainSelect plainSelect = (PlainSelect) select.getSelectBody();
    tableName = plainSelect.getFromItem().toString();
}
Copy the code

(2) Column information

For each query, there may be multiple records.

Let’s look at the construction of each record:

while (resultSet.next()) {
    final ResultSetMetaData metaData = resultSet.getMetaData();
    Map<String, Object> map = new LinkedHashMap<>();
    If the value is null, an error is reported if the value is greater than 1
    // Total number of columns
    int columnCount = metaData.getColumnCount();
    for (int i = 1; i <= columnCount; i++) {
        String columnName = metaData.getColumnName(i);
        Object value = resultSet.getObject(columnName);
        map.put(columnName, value);
    }
    maps.add(map);
}
Copy the code

This often write JDBC partners must be familiar.

You say now all use mybatis, who still write JDBC ah, so low.

So, your own handwriting a Mybatis, these are also will.

Hand-write mybatis (a) MVP version from scratch

Differences in contrast

Use of exports

We can export one row of data, we can export it before and after we modify it.

If you export to different libraries, different tables, then export between different library tables.

Once you export the results, you need to compare them.

Compare the implementation

The interface definition

You can choose what to do with the exported results based on your own situation.

For example, export to CSV/JSON/INSERT, etc., comparison differences can also be customized according to their own requirements.

public interface IQueryResultHandler {

    /** * Result processing class *@paramQueryResultVo Query result */
    void handler(final QueryResultVo queryResultVo);

}
Copy the code

persistence

Here is a simple and practical way: JSON persistence.

import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.serializer.SerializerFeature;
import com.github.houbb.exportdb.dto.QueryResultVo;
import com.github.houbb.exportdb.support.result.IQueryResultHandler;
import com.github.houbb.heaven.util.io.FileUtil;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/ * * *@authorThe old horse shouts the west wind@since0.0.1 * /
public class FileJsonQueryResultHandler implements IQueryResultHandler {

    /** * Default file output path ** Automatically set according to the operating system *@since0.0.1 * /
    private final String dir;

    public FileJsonQueryResultHandler(String dir) {
        this.dir = dir;
    }

    public FileJsonQueryResultHandler(a) {
        this("D:\\exportdb\\");
    }

    /** * Result processing class **@paramQueryResultVo Query result */
    @Override
    public void handler(final QueryResultVo queryResultVo) {
        String path = dir+queryResultVo.tableName()+".edb";
        System.out.println("File path:" + path);

        List<Map<String, Object>> list = queryResultVo.resultMaps();
        List<String> lines = new ArrayList<>(list.size()+1);

        lines.add("--"+queryResultVo.sql());
        for(Map<String, Object> map : list) { lines.add(JSON.toJSONString(map, SerializerFeature.WriteMapNullValue)); } FileUtil.write(path, lines); }}Copy the code

We will row data persistence to a file, note here specifies the JSON toJSONString (map, SerializerFeature. WriteMapNullValue);

This allows null fields to be printed as well, making it easier to compare.

File difference comparison implementation

Above we assume that the output file to two files, the following file path can be used for comparison:

/** ** *@paramOldPath Original path *@paramNewPath newPath */
public static void differ(final String oldPath, final String newPath) {
    List<String> oldLines = FileUtil.readAllLines(oldPath);
    List<String> newLines = FileUtil.readAllLines(newPath);
    System.out.println(FileUtil.getFileName(oldPath)+"The comparison begins ---------------");
    for(int i = 0; i < oldLines.size(); i++) {
        String oldL = oldLines.get(i);
        String newL = newLines.get(i);
        if(oldL.startsWith("--")) {
            continue;
        }
        System.out.println("The first" + (i+1) +"Line comparison:");
        differMaps(oldL, newL);
    }
    System.out.println(FileUtil.getFileName(oldPath)+"End of comparison ---------------");
    System.out.println();
}

private static void differMaps(final String oldMap, final String newMap) {
    Map<String, Object> om = JSON.parseObject(oldMap);
    Map<String, Object> nm = JSON.parseObject(newMap);
    for(Map.Entry<String, Object> entry : om.entrySet()) {
        String key = entry.getKey();
        Object oldV = om.get(key);
        Object newV = nm.get(key);
        // Skip null comparisons
        if(oldV == null && newV == null) {
            continue;
        }
        if(! ObjectUtil.isEquals(oldV, newV)) { System.out.println("Difference column:" + key +", old value:" + oldV + ", new value:"+ newV); }}}Copy the code

The difference content is output directly to the console console.

folder

Of course, we can also compare the contents of the two folders.

The implementation is as follows:

public static void differDir(final String oldDir, final String newDir) {
    File[] oldFiles = new File(oldDir).listFiles();

    for(File file : oldFiles) { String fileName = file.getName(); String aop = file.getAbsolutePath(); String anp = newDir+fileName; differ(aop, anp); }}Copy the code

The bootstrap class

convenience

We’ve got the core implementation all figured out, but it’s still not user-friendly enough. Because the configuration and so on are not elegant.

So we introduce a bootstrap class to help users quickly use:

/ * * *@authorThe old horse shouts the west wind@since0.0.1 * /
public class ExportdbBs {

    private ExportdbBs(a){}

    /** * export implementation *@since0.0.1 * /
    private final IExportdb exportdb = new Exportdb();

    /** * Driver class name */
    private String driverName = DriverNameConstant.MYSQL;

    /** * database link */
    private String url = "jdbc:mysql://localhost:3306/test";

    /** * User name */
    private String username = "root";

    /** * Password */
    private String password = "123456";


    public static ExportdbBs newInstance(a) {
        return new ExportdbBs();
    }

    public ExportdbBs driverName(String driverName) {
        this.driverName = driverName;
        return this;
    }

    public ExportdbBs url(String url) {
        this.url = url;
        return this;
    }

    public ExportdbBs username(String username) {
        this.username = username;
        return this;
    }

    public ExportdbBs password(String password) {
        this.password = password;
        return this;
    }

    /** * query *@param sql sql
     * @returnResults *@since0.0.1 * /
    public QueryResultVo query(final String sql) {
        / / 1. Construct the vo
        JdbcVo jdbcVo = new JdbcVo(driverName, url, username, password);

        / / 2. Get the mapper
        final JdbcMapper jdbcMapper = getJdbcMapper(jdbcVo);

        //3. Build context
        final ExportdbContext context = ExportdbContext.newInstance().jdbcMapper(jdbcMapper);
        return this.exportdb.query(context, sql);
    }

    /** * query and process *@paramQueryResultHandler queryResultHandler *@param sql sql
     * @since0.0.1 * /
    public void queryAndHandle(final IQueryResultHandler queryResultHandler,
                               final String sql, final String... otherSqls) {
        QueryResultVo queryResultVo = this.query(sql);
        queryResultHandler.handler(queryResultVo);

        // Do the same with other SQL statements
        for(String os : otherSqls) {
            QueryResultVo vo = this.query(os); queryResultHandler.handler(vo); }}/** * query and process *@paramQueryResultHandler queryResultHandler *@paramSqlList SQL list *@sinceHundreds * /
    public void queryAndHandle(final IQueryResultHandler queryResultHandler,
                               List<String> sqlList) {
        // Do the same with other SQL statements
        for(String sql : sqlList) {
            System.out.println("Commence execution:" + sql);
            QueryResultVo vo = this.query(sql); queryResultHandler.handler(vo); }}private JdbcMapper getJdbcMapper(JdbcVo jdbcVo) {
        if(DriverNameConstant.MYSQL.equalsIgnoreCase(driverName)) {
            return new MySqlJdbcMapper(jdbcVo);
        }
        if(DriverNameConstant.ORACLE.equalsIgnoreCase(driverName)) {
            return new OracleJdbcMapper(jdbcVo);
        }
        if(DriverNameConstant.SQL_SERVER.equalsIgnoreCase(driverName)) {
            return new SqlServerJdbcMapper(jdbcVo);
        }

        throw newUnsupportedOperationException(); }}Copy the code

This section provides users with the most basic configuration of mysql and common query processing methods.

test

Here’s how it works:

Direct query

QueryResultVo resultVo = ExportdbBs.newInstance().query("select * from user;");
System.out.println(resultVo);
Copy the code

Query and process

final String sql = "select * from user;";
final IQueryResultHandler handler = new FileJsonQueryResultHandler();
ExportdbBs.newInstance().queryAndHandle(handler, sql);
Copy the code

You can specify file paths for two exports, for example:

D: \ exportdb \ old \ and D: \ exportdb \ new \

Compare the two results

final String oldP = "D:\\exportdb\\old\\";
final String newP = "D:\\exportdb\\new\\";

CompareUtil.differDir(oldP, newP);
Copy the code

The difference result is output to the console.

Everything is going well, but the revolution is not yet successful, students still need to work overtime

Deficiency in

This is a V0.0.1 release, and there’s a lot left to be desired.

Such as:

  • Export to CSV

  • The export is an INSERT/UPDATE statement

  • Name of the exported file User-defined policy

  • You can specify whether multiple SQL is generated in the same file

  • The export path is automatically changed based on the operating system

  • More convenient to use, for example, the page specified data source + SQL, the page shows the corresponding difference results.

But it’s basically usable, in line with our initial vision.

summary

How do you compare data at ordinary times?

If you need this tool, you can pay attention to [Lao Ma Xiao Xifeng], background reply [contrast] can be.

Hope you found this article helpful, and if you have any other ideas, please share them in the comments section.

All geeks’ likes, favorites and forwarding are the biggest motivation for Lao Ma to continue writing!