Introduction: This article will be aimed at the development process of SQL coding defects still often appear, explain the principle behind it and the cause of formation. And in the form of several common loopholes, remind technical students to pay attention to relevant problems. Finally, solutions or mitigation solutions are provided based on principles.

The author | | ali ali cloud security team source technology public number

A preface

This article will be aimed at the development process is still often SQL coding defects, explain its behind the principle and formation reasons. And in the form of several common loopholes, remind technical students to pay attention to relevant problems. Finally, solutions or mitigation solutions are provided based on principles.

Second, the principle and cause of SQL injection vulnerability

SQL injection vulnerability, fundamentally, is the result of mistakenly executing external input as SQL code. The best solution is to precompile.

SQL statements go through the following three basic steps:

  1. Code semantic analysis
  2. Make an execution plan
  3. Get returned results

An SQL statement consists of two parts: code and data.

SELECT id, name, phone FROM userTable WHERE name = ‘xiaoming’;

SELECT id, name, phone FROM userTable WHERE name = code and ‘xiaoming’ = data.

Precompilation, in Mybatis for example, preanalyzes semantics with placeholders:

SELECT name, phone FROM userTable WHERE id = #{name};

And then we pass the data ‘xiaoming’ to the placeholder. This way, stagger the semantic analysis phase of the code so that it is not mistaken for part of the code.

In the earliest days, developers explicitly used JDBC to create connections themselves and execute SQL statements. In this case, vulnerabilities can occur if external controllable data is concatenated into SQL statements without sufficient filtering. This situation is rare in normal business development, and the ORM framework must be used to execute SQL without special circumstances, according to company policy.

However, some projects still use JDBC to write some tool scripts, such as Datamerge. Java, databaseclean. Java, using the flexibility of JDBC, through these scripts to perform database batch operations.

This type of code should not appear in the online version, lest it be called externally for various reasons.

Use Mybatis directly

1 fault point

At present, most of the platform code is based on Mybatis to handle the interaction between the persistence layer and the database. Mybatis passes in data with two placeholders {} and #{}. # {} and {}. {} can be understood as string concatenation before semantic analysis, which tells the parameters passed in intact.

For instance

SELECT id, name, phone FROM userTable WHERE name = ‘${name}’; SELECT id, name, phone FROM userTable WHERE name= ‘xiaoming’;

In practical application

SELECT id, name, phone FROM userTable WHERE ${col} = ‘xiaoming’; SELECT id, name, phone FROM userTable WHERE name = ‘xiaoming’;

As explained in the introduction to the principles of precompilation, there are no injection problems with #{} placeholders. However, there are some business scenarios where #{} cannot be used directly.

For example, in order by syntax

SELECT id, name, phone FROM userTable ORDER BY #{}; , an error will be reported during execution. Because the content after order by is a column name, it is part of the code semantics. SELECT ID, name, phone FROM userTable ORDER BY. There are bound to be grammatical errors.

Another example is the like scenario

SELECT id, name, phone FROM userTable WHERE name like ‘%#{name}%’; #{} will not be parsed, resulting in an error.

This is true for both in and between, so how do you solve this problem?

2 The correct way to write

Order by (group by) statement with ${}

1. Use condition judgment

2. Use global filtering to limit variable content after ORDER by to numbers, letters, and underscores.

If regular filtering is used:

keyword = keyword.replaceAll("[^a-zA-Z0-9_\s+]", "");
Copy the code

Note that whitelist is required for filtering, not blacklist. The blacklist cannot solve the injection problem.

LIKE statements

Because you need keywords in like wrapped in two % symbols, you can concatenate them using the CONCAT function.

Be careful not to use the CONCAT (‘ % ‘, ‘stuName’, ‘{stuName}’, ‘%’), so there are still loopholes. That is, the stuName ‘, ‘symbol is incorrect, and the # symbol is safe.

IN statement

#{return #{return #}

tenant_id in (${tenantIds})
Copy the code

The correct way to write this is:

Mybatis- Generator is safe to use

Under the pressure of heavy CRUD code, developers slowly began to automatically generate Mapper, POJO, Dao and other files through Mybatis- Generator, idea-Mybatis – Generator plug-in, universal Mapper, Mybatis- Generator – Plus.

These tools can automatically generate the files required for CRUD, but if not used properly, they can automatically create SQL injection vulnerabilities. Let’s take org.mybatis. Generator, the most common one, as an example to explain the problems that may occur.

1 Dynamic statement support

Mybatis -Generator provides functions to help users concatenate SQL conditions, such as the like syntax for multiple parameters and the compare syntax for multiple parameters. To keep things simple, you need to concatenate some semantic code into SQL statements. The {} placeholder is also passed in for external input if the developer uses it incorrectly. It creates a leak.

2 Configure the targetRuntime parameter

When configuring the generator, the configuration file generator-rds.xml has a targetRuntime attribute, which defaults to MyBatis3. In this case, Mybatis dynamic statement support is enabled, enabling enableSelectByExample, enableDeleteByExample, enableCountByExample, and enableUpdateByExample.

In the case of enableSelectByExample, the following dynamic modules are substituted into the XML mapping file:

Developers include this module to add where conditions, but if used incorrectly, it can lead to SQL injection vulnerabilities:

And add functions with custom parameters:

public Criteria addKeywordTo(String keyword) {
  StringBuilder sb = new StringBuilder();
  sb.append("(display_name like '%" + keyword + "%' or ");
  sb.append("org like '" + keyword + "%' or ");
  sb.append("status like '%" + keyword + "%' or ");
  sb.append("id like '" + keyword + "%') ");
  addCriterion(sb.toString());
  return (Criteria) this;
}
Copy the code

The purpose is to implement the like operation for display_name, org, status, and ID simultaneously. AddCriterion is a function that comes with Mybatis -Generator:

protected void addCriterion(String condition) { if (condition == null) { throw new RuntimeException("Value for condition  cannot be null"); } criteria.add(new Criterion(condition)); }Copy the code

The mistake here is that addCriterion itself provides support for multiple criteria, but developers think they need to spliced them together and pass the addCriterion method together. As with the code in the example, addCriterion ends up passing only one parameter. To execute the syntax statement Example_Where_Clause:

In other words, the developer inserted his own SQL statement directly into ${criterior. condition}, resulting in the vulnerability. According to Mybatis-generator, the correct way to write this should be:

public void addKeywordTo(String keyword, UserExample userExample) {
  userExample.or().andDisplayNameLike("%" + keyword + "%");
  userExample.or().andOrgLike(keyword + "%");
  userExample.or().andStatusLike("%" + keyword + "%");
  userExample.or().andIdLike("%" + keyword + "%");
}
Copy the code

The OR method is responsible for creating the Criteria, which triggers the logic

${criterion.condition} is replaced with like without single quotes. Like is used as semantic code to concatenate SQL statements before semantic analysis, and “%” + keyword + “%” is added as data to precompiled #{criterion.value}. Injection is avoided.

Similarly, a safe way to use the In syntax is provided:

Safe use for Beetween:

  example.or()
    .andField6Between(3, 7);
Copy the code

Mybatis-generator generates the order BY statement with ${} by default:

Without additional filtering of incoming parameters, injection problems can result.

3 order by

Mybatis-generator generates an order by statement with ${} by default:

Without additional filtering of incoming parameters, injection problems can result.

PS: In the actual minesweeping process, we found that many statements automatically generate the order BY syntax, but when the upper layer calls, this optional parameter is not passed in. In this case, the superfluous Order by syntax should be removed.

4 Other Plug-ins

Security vulnerabilities vary from plug-in to plug-in, but the following is a brief list of common plug-ins.

idea-mybatis-generator

This is a plugin for IDEA that automatically generates the files needed in CRUD from the IDE level during development. There are also some default security risks to be aware of when using this plug-in.

1) Customize order by processing

Like \in\between can be used according to official documents, without potential security risks.

However, this plug-in does not have a built-in order by processing, so you need to write it yourself. When writing it, refer to Case2

2) Check whether the default IF condition is null

The syntax generated by the plug-in by default looks like this:

When the ID parameter is null, the logic under the if tag is not added to the SQL statement, which may cause VULNERABILITIES such as DOS and permission bypass. Therefore, ensure that parameters are not null before being passed into the query statement.

com.baomidou.mybatis-plus

  1. When passing parameters to the apply method, use {}
  2. The last method, whose principle is to concatenate directly to the end of SQL statements, has injection vulnerability.

Other ORM frameworks

1 Hibernate

ORM stands for Object Relational Mapping. Simply put, it is to map the tables in the database into Java objects. These objects with attributes and no business logic are also called Plain Ordinary Java Objects (POJOs).

Hibernate is the first widely used ORM framework, which manages database connections through XML and provides a full table mapping model with a high degree of encapsulation. After configuring the mapping file and database link file, Hibernate can use the Session object to perform database operations. The developer does not need to touch the SQL statement, but only need to write the HQL statement.

Hibernate is often paired with Struts and Spring, the classic SSH framework in the Java world.

HQL has more syntax restrictions than SQL:

  1. You cannot query unmapped tables, and you can use the UNION syntax only when the relationship between models is clear.
  2. Table name, column name is case sensitive.
  3. No *, #, –.
  4. There is no delay function.

So HQL injection is much more difficult to exploit than SQL injection. From a code audit perspective, this is consistent with normal SQL injection:

Splicing causes injection vulnerabilities:

You can use placeholders and named parameters to prevent SQL statements, both of which are precompiled in nature.

Hibernate has many disadvantages:

  1. The full table mapping is inflexible, and all fields need to be sent during update, which affects the program running efficiency.
  2. Poor support for complex queries.
  3. Stored procedure support is poor.
  4. The HQL performance is poor and cannot be optimized based on SQL.

When auditing Hibernate related injection, you can quickly locate SQL operations by searching createQuery globally.

2 JPA

JPA, also known as the Java Persistence API, is a data Persistence specification provided by Java EE that allows developers to persist an object into a database through XML or annotations.

It mainly includes three aspects:

1.ORM mapping metadata, which describes the mapping between objects and data tables through XML or annotations. The framework can then automatically save the data in the object to the database.

Common annotations include @entity, @table, @column, and @TRANSIENT

2. Data operation API, built-in interface, convenient to perform CRUD operations on a data table, save developers writing SQL time.

Common methods are: entityManager.merge(T T);

3.JPQL, which provides an object-oriented rather than database-oriented query language, decouples programs from databases and SQL.

JPA is a set of specifications and Hibernate implements the JPA specification.

In the Spring framework, a simplified JPA implementation, Spirng Data JPA, is provided. Write dao layer interface according to the agreed method naming rules, you can realize the access and operation to the database without writing interface implementation. It also provides many functions in addition to CRUD, such as paging, sorting, complex queries, and so on. It’s easier to use, but the underlying JPA implementation of Hibernate is still used.

As with HQL injection, SQL injection occurs when data controlled by the user is inserted into a query statement using concatenation.

Secure queries should use precompilation techniques.

Spring Data JPA is precompiled as follows:

String getUser = "SELECT username FROM users WHERE id = ?" ; Query query = em.createNativeQuery(getUser); query.setParameter(1, id); String username = query.getResultList();Copy the code

Tip: Hibernate was invented earlier than the JPA specification. After Hibernate became mature, the JavaEE development team invited Hibernate core developers to work together to develop the JPA specification. Spring Data JPA has since been further optimized according to the specification. In addition, there are many products for implementing the JPA specification, such as Eclipse’s TopLink (OracleLink).

Six summarize

After the above introduction, especially the discussion about the error-prone points of Mybatis, we can draw the following conclusions:

  1. There are many types of persistence layer components.
  2. Developers’ misunderstanding of tool use is the main cause of bugs.
  3. Due to the dynamic nature of auto-generated plug-ins, automated discovery of SQL vulnerabilities cannot simply be found using ${}. Detailed matching rules must be made according to the global persistence layer component characteristics.

The original link

This article is the original content of Aliyun and shall not be reproduced without permission.