1. Introduction

To improve the performance of persistence-layer queries, MyBatis provides a caching mechanism that is divided into a level-1 cache and a level-2 cache according to scope. The level-1 cache is enabled by default and is scoped to a SqlSession; it is also known as the session cache or local cache. The level-2 cache is disabled by default; it has to be enabled manually and is scoped to a mapper namespace. This article does not discuss the level-2 cache. It only analyzes, from the source code, how the MyBatis level-1 cache is implemented.

As we know, SqlSession is the main interface that MyBatis exposes for operating on the database. When we open a new session from the SqlSessionFactory, a new SqlSession instance is created. Each SqlSession holds an Executor, the component MyBatis uses internally to run statements. When we execute a query, the call eventually reaches executor.query(), which checks whether the query hits the level-1 cache before going to the database. On a hit, the cached result is returned directly; otherwise, the query is sent to the database.

2. Source analysis

We go straight to the executor.query() method. It first obtains the BoundSql to execute from the request parameter (usually a ParamMap), then creates the CacheKey, and then calls the overloaded method.

@Override
public <E> List<E> query(MappedStatement ms, Object parameter, RowBounds rowBounds, ResultHandler resultHandler) throws SQLException {
  // Parse the SQL to execute according to the parameters
  BoundSql boundSql = ms.getBoundSql(parameter);
  // Create a cache key
  CacheKey key = createCacheKey(ms, parameter, rowBounds, boundSql);
  // Execute the query
  return query(ms, parameter, rowBounds, resultHandler, key, boundSql);
}

So we'll focus on the overloaded query method. It first reports to the ErrorContext that a query is being executed, and then decides whether the cache needs to be cleared: if flushCache="true" is set on the SQL node, the local cache is cleared first, so that statement never reads from the level-1 cache. Next it tries to fetch the result from the level-1 cache. On a hit, the cached result is returned directly. Otherwise queryFromDatabase is called, and inside queryFromDatabase the query result is stored in the level-1 cache.

@Override
public <E> List<E> query(MappedStatement ms, Object parameter, RowBounds rowBounds, ResultHandler resultHandler, CacheKey key, BoundSql boundSql) throws SQLException {
  // Report to ErrorContext that you are doing a query
  ErrorContext.instance().resource(ms.getResource()).activity("executing a query").object(ms.getId());
  if (closed) {
    throw new ExecutorException("Executor was closed.");
  }
  if (queryStack == 0 && ms.isFlushCacheRequired()) {
    // If you need to flush the cache, clear the local cache
    clearLocalCache();
  }
  List<E> list;
  try {
    queryStack++;
    // Attempts to retrieve data from the level-1 cache
    list = resultHandler == null ? (List<E>) localCache.getObject(key) : null;
    if (list != null) {
      handleLocallyCachedOutputParameters(ms, key, parameter, boundSql);
    } else {
      // Query the database
      list = queryFromDatabase(ms, parameter, rowBounds, resultHandler, key, boundSql);
    }
  } finally {
    queryStack--;
  }
  if (queryStack == 0) {
    for (DeferredLoad deferredLoad : deferredLoads) {
      deferredLoad.load();
    }
    deferredLoads.clear();
    if (configuration.getLocalCacheScope() == LocalCacheScope.STATEMENT) {
      // If the level 1 cache scope is STATEMENT, clear the cache
      clearLocalCache();
    }
  }
  return list;
}
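
The look-up-then-load flow above can be sketched without MyBatis. This is a simplified model, not the library's code: ToyExecutor, its string cache key and the Supplier standing in for the database are all hypothetical, and queryStack, deferred loading and the STATEMENT scope are left out.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical, simplified stand-in for BaseExecutor; not MyBatis code.
class ToyExecutor {
  // Plays the role of localCache (a PerpetualCache in MyBatis).
  private final Map<String, List<String>> localCache = new HashMap<>();
  // Counts how often the "database" was actually queried.
  int databaseHits = 0;

  List<String> query(String cacheKey, boolean flushCache, Supplier<List<String>> database) {
    if (flushCache) {
      // Mirrors clearLocalCache() for statements with flushCache="true".
      localCache.clear();
    }
    List<String> cached = localCache.get(cacheKey);
    if (cached != null) {
      return cached; // cache hit: skip the database
    }
    databaseHits++;
    List<String> rows = database.get();
    localCache.put(cacheKey, rows); // store the result for later queries
    return rows;
  }

  public static void main(String[] args) {
    ToyExecutor executor = new ToyExecutor();
    Supplier<List<String>> database = () -> Arrays.asList("alice", "bob");

    executor.query("selectUsers", false, database); // miss, loads from the database
    executor.query("selectUsers", false, database); // hit, served from the cache
    executor.query("selectUsers", true, database);  // flush forces a reload
    System.out.println("database hits: " + executor.databaseHits); // database hits: 2
  }
}
```

Three queries result in only two database hits, because the second one is answered from the cache.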

So let’s focus on the localCache object.

2.1 PerpetualCache

localCache is an instance of PerpetualCache. The name fits because the cache never evicts entries on its own; it is only cleared when a transaction is committed or rolled back, or when an update method is executed. PerpetualCache implements the Cache interface, so we should look at the interface before analyzing the implementation.

The following is the Cache interface, abridged to its core methods. It is simply responsible for maintaining cached result sets and provides an API for adding, removing and looking up cache entries.

public interface Cache {

  // Get the cache unique identifier
  String getId();

  // Add cache entries
  void putObject(Object key, Object value);

  // Get the cache according to the cache key
  Object getObject(Object key);

  // Delete the cache according to the cache key
  Object removeObject(Object key);

  // Clear the cached data
  void clear();
}

The implementation of PerpetualCache is very simple. It uses a HashMap container internally to maintain the cache. The Key is a CacheKey and the Value is a result set.

public class PerpetualCache implements Cache {

  // Unique identifier of the cache
  private final String id;

  // Use HashMap as the container for the data cache
  private final Map<Object, Object> cache = new HashMap<>();
}
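
To make the idea concrete, here is a minimal HashMap-backed cache in the spirit of PerpetualCache. It is a sketch rather than the MyBatis source: SimpleCache and MapCache are hypothetical names, and the interface only mirrors the five methods listed above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, trimmed-down cache contract mirroring the methods shown above.
interface SimpleCache {
  String getId();
  void putObject(Object key, Object value);
  Object getObject(Object key);
  Object removeObject(Object key);
  void clear();
}

// HashMap-backed implementation; every operation delegates to the map.
class MapCache implements SimpleCache {
  private final String id;
  private final Map<Object, Object> cache = new HashMap<>();

  MapCache(String id) {
    this.id = id;
  }

  @Override public String getId() { return id; }
  @Override public void putObject(Object key, Object value) { cache.put(key, value); }
  @Override public Object getObject(Object key) { return cache.get(key); }
  @Override public Object removeObject(Object key) { return cache.remove(key); }
  @Override public void clear() { cache.clear(); }

  public static void main(String[] args) {
    MapCache cache = new MapCache("demo");
    cache.putObject("key", "rows");
    System.out.println(cache.getObject("key")); // rows
    cache.clear();
    System.out.println(cache.getObject("key")); // null
  }
}
```

Nothing in such a class expires or evicts entries, which is why the owner of the cache has to call clear() at the right moments.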

2.2 CacheKey

CacheKey is the cache key class provided by MyBatis. Why write a separate class for the key? MyBatis has to check many conditions to decide whether a query hits the cache, and a simple string cannot express them conveniently, hence the CacheKey.

How does MyBatis determine whether a query can hit the cache? Which conditions need to be checked? In summary:

  1. The StatementID must be the same, i.e. the same method on the same Mapper interface is executed.
  2. The RowBounds must be the same, i.e. the range of data queried is identical.
  3. The SQL statement executed must be the same.
  4. The precompiled SQL must be populated with the same parameters.
  5. The query must go to the same data source, i.e. the same EnvironmentID.

All five conditions must be met at the same time to hit the cache. Because the number of populated parameters is not fixed, CacheKey uses a List to hold these conditions. Here are its properties:

public class CacheKey implements Cloneable, Serializable {
  // Default multiplier used in the hash calculation
  private static final int DEFAULT_MULTIPLIER = 37;
  // Default initial hash value
  private static final int DEFAULT_HASHCODE = 17;
  // The multiplier used in the hash calculation. The default is 37
  private final int multiplier;
  // Hash code, used to speed up equals
  private int hashcode;
  // Checksum, the sum of hash values
  private long checksum;
  // The number of updateList elements
  private int count;
  // The list of conditions, all conditions must be met to hit the cache.
  private List<Object> updateList;
}

Whether the cache is hit depends on whether the CacheKeys are equal. To speed up equals and avoid comparing the updateList entries one by one whenever possible, CacheKey keeps a hash value in its hashcode field. The hash value is computed from every condition object, using an algorithm that spreads the hash values as widely as possible.

You can add a condition object by calling the update method.

public void update(Object object) {
  // Calculate the hash value of a single object
  int baseHashCode = object == null ? 1 : ArrayUtil.hashCode(object);
  // The number of conditional objects increases
  count++;
  // Add the checksum
  checksum += baseHashCode;
  // Multiply the base hash value by the current count, so that the position of a condition affects the result
  baseHashCode *= count;
  // Multiply the running hash by the multiplier (37 by default) and add the base hash, again for dispersion
  hashcode = multiplier * hashcode + baseHashCode;
  // Add the conditional object to the List
  updateList.add(object);
}
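
A small worked example shows what the arithmetic in update does. The sketch below re-creates only the hash and checksum bookkeeping; ToyKey is a hypothetical class, and it calls object.hashCode() where MyBatis uses ArrayUtil.hashCode (which also handles arrays). The initial value 17 and the multiplier 37 are the defaults listed above.

```java
// Hypothetical re-creation of the arithmetic in update(); not the MyBatis class.
class ToyKey {
  private static final int MULTIPLIER = 37; // same default as DEFAULT_MULTIPLIER
  int hashcode = 17;                        // same default as DEFAULT_HASHCODE
  long checksum = 0;
  int count = 0;

  ToyKey update(Object object) {
    int baseHashCode = object == null ? 1 : object.hashCode();
    count++;
    checksum += baseHashCode;  // order-independent sum
    baseHashCode *= count;     // weight by position
    hashcode = MULTIPLIER * hashcode + baseHashCode;
    return this;
  }

  public static void main(String[] args) {
    ToyKey forward = new ToyKey().update(1).update(2);
    ToyKey reversed = new ToyKey().update(2).update(1);
    System.out.println(forward.hashcode + " / " + forward.checksum);   // 23314 / 3
    System.out.println(reversed.hashcode + " / " + reversed.checksum); // 23349 / 3
  }
}
```

Both keys contain the same two values, so their checksums agree, but the position weighting makes the hash codes differ: the order in which conditions are added is part of the key.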

The equals method is used to determine whether two cacheKeys are equal. To improve performance, a series of simple validations are performed before each condition object is matched individually.

@Override
public boolean equals(Object object) {
  if (this == object) { // Whether it is the same object
    return true;
  }
  if (!(object instanceof CacheKey)) { // Class must match
    return false;
  }

  final CacheKey cacheKey = (CacheKey) object;

  if (hashcode != cacheKey.hashcode) { // The hash values are not equal
    return false;
  }
  if (checksum != cacheKey.checksum) { // Match the checksum
    return false;
  }
  if (count != cacheKey.count) { // Number of matching conditions
    return false;
  }
  // The checks above are fast paths for equals; the updateList entries still have to be compared one by one
  for (int i = 0; i < updateList.size(); i++) {
    Object thisObject = updateList.get(i);
    Object thatObject = cacheKey.updateList.get(i);
    if (!ArrayUtil.equals(thisObject, thatObject)) {
      return false;
    }
  }
  return true;
}

If the cacheKeys are equal, the cache is hit.

2.3 Creating a Cache Key

As we said before, when we call the query method, a CacheKey is created first and then used to determine whether the cache can be hit. Finally, let's look at how the CacheKey is created.

The createCacheKey method is located in BaseExecutor. The process of creating a CacheKey is simple: instantiate a CacheKey object and then call the update method to save the conditions that need to be matched.

The five criteria for hitting the cache were described above, so createCacheKey needs the MappedStatement to get the StatementID, the parameterObject to get the request parameters, the RowBounds to get the paging information, and the BoundSql to get the SQL to execute. The EnvironmentID comes from the Configuration.

/**
 * @param ms              MappedStatement, the object that encapsulates the SQL node to execute
 * @param parameterObject the parameter object, usually a ParamMap
 * @param rowBounds       paging information
 * @param boundSql        the BoundSql
 * @return the cache key
 */
@Override
public CacheKey createCacheKey(MappedStatement ms, Object parameterObject, RowBounds rowBounds, BoundSql boundSql) {
  if (closed) {
    throw new ExecutorException("Executor was closed.");
  }
  // Instantiate the cache key
  CacheKey cacheKey = new CacheKey();
  // For StatementId to be consistent, you must call the same method on the same Mapper
  cacheKey.update(ms.getId());
  // Page information, the query data range must be consistent
  cacheKey.update(rowBounds.getOffset());
  cacheKey.update(rowBounds.getLimit());
  // The SQL to execute must be identical; with dynamic SQL, different generated SQL can never hit the cache
  cacheKey.update(boundSql.getSql());
  /*
   * Besides the basic conditions above, all parameters must match as well.
   * The same query method called with different parameters must not hit the cache.
   * The code below reads the parameter values from the ParamMap.
   */
  List<ParameterMapping> parameterMappings = boundSql.getParameterMappings();
  TypeHandlerRegistry typeHandlerRegistry = ms.getConfiguration().getTypeHandlerRegistry();
  // mimic DefaultParameterHandler logic
  for (ParameterMapping parameterMapping : parameterMappings) {
    if (parameterMapping.getMode() != ParameterMode.OUT) {
      Object value;
      String propertyName = parameterMapping.getProperty();
      if (boundSql.hasAdditionalParameter(propertyName)) {
        value = boundSql.getAdditionalParameter(propertyName);
      } else if (parameterObject == null) {
        value = null;
      } else if (typeHandlerRegistry.hasTypeHandler(parameterObject.getClass())) {
        value = parameterObject;
      } else {
        MetaObject metaObject = configuration.newMetaObject(parameterObject);
        value = metaObject.getValue(propertyName);
      }
      // Parameters are also components of the cache key
      cacheKey.update(value);
    }
  }
  // Check the running environment: if the data source differs, the cache cannot be hit either.
  if (configuration.getEnvironment() != null) {
    cacheKey.update(configuration.getEnvironment().getId());
  }
  return cacheKey;
}

Once the CacheKey is created, it is used to check whether the PerpetualCache already holds data for it. If it does, the cached data is returned directly from the PerpetualCache, which avoids a database round trip and improves query performance.
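
The effect of the five conditions can be illustrated with plain collections. In this sketch a List stands in for CacheKey and a HashMap for PerpetualCache; the class name, the statement id, the SQL text and the environment ids are made up for illustration, and the real CacheKey compares its entries through its own equals method rather than List.equals.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo: a List stands in for CacheKey, a HashMap for PerpetualCache.
class KeyConditionsDemo {
  // Collects the conditions in the same order createCacheKey adds them.
  static List<Object> key(String statementId, int offset, int limit,
                          String sql, Object parameter, String environmentId) {
    return Arrays.asList(statementId, offset, limit, sql, parameter, environmentId);
  }

  public static void main(String[] args) {
    Map<List<Object>, String> cache = new HashMap<>();
    String sql = "SELECT * FROM user WHERE id = ?";
    cache.put(key("UserMapper.selectById", 0, Integer.MAX_VALUE, sql, 1L, "dev"), "row for user 1");

    // All conditions identical: hit.
    System.out.println(cache.get(key("UserMapper.selectById", 0, Integer.MAX_VALUE, sql, 1L, "dev")));
    // Different parameter value: miss, prints null.
    System.out.println(cache.get(key("UserMapper.selectById", 0, Integer.MAX_VALUE, sql, 2L, "dev")));
    // Different environment: miss, prints null.
    System.out.println(cache.get(key("UserMapper.selectById", 0, Integer.MAX_VALUE, sql, 1L, "prod")));
  }
}
```

Changing any single component produces a different key, so the lookup falls through to the database.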

If you look at the rest of the BaseExecutor code, you can see that PerpetualCache never cleans itself up. Instead, BaseExecutor clears it every time an update (insert, update or delete) is executed and whenever a transaction commits or rolls back.

3. Summary

The level-1 cache is scoped to the SqlSession. When a session is opened, it creates an Executor, and the Executor holds a PerpetualCache that uses a HashMap to maintain the cached result sets. The keys of the HashMap are CacheKey objects, the cache key class provided by MyBatis. Since many conditions are involved in deciding whether the cache is hit, CacheKey uses a List to hold the condition objects, and the cache is hit only if all of them match.

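In a multi-session scenario, the level-1 cache can return stale data: SessionA reads a row, SessionB modifies it, and SessionA then reads the old value again from its own cache. This is much less of a concern when MyBatis is integrated with Spring: outside a transaction, MyBatis-Spring obtains a fresh SqlSession for each mapper call and closes it afterwards, so the level-1 cache does not survive between calls.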
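
The stale read described above can be reproduced with a toy model. Everything here is hypothetical (StaleReadDemo, ToySession, the static TABLE map acting as the database); it only assumes what the article states, namely that each session has a private cache and that an update clears the cache of the session that performs it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of two sessions with private caches over one shared table.
class StaleReadDemo {
  // The shared "database": id -> name.
  static final Map<Integer, String> TABLE = new HashMap<>();

  static class ToySession {
    private final Map<Integer, String> localCache = new HashMap<>();

    String selectName(int id) {
      // Level-1 lookup first; fall back to the table on a miss.
      return localCache.computeIfAbsent(id, TABLE::get);
    }

    void updateName(int id, String name) {
      TABLE.put(id, name);
      localCache.clear(); // an update only clears this session's own cache
    }
  }

  public static void main(String[] args) {
    TABLE.put(1, "old");
    ToySession sessionA = new ToySession();
    ToySession sessionB = new ToySession();

    System.out.println(sessionA.selectName(1)); // old (now cached in A)
    sessionB.updateName(1, "new");              // B's write does not touch A's cache
    System.out.println(sessionA.selectName(1)); // old, a stale read
    System.out.println(sessionB.selectName(1)); // new
  }
}
```

SessionA keeps returning the old value until its own cache is cleared, for example by an update, commit or rollback on that session, or until a new session is opened.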