Introduction

I don’t like GORM very much, it feels too complicated (he must really like reflection and be good at it). So I want to understand how the built-in package is used, this article will briefly talk about how the built-in package does the query. Some details such as locking and the use of buffers are not discussed below. Let’s start with a simple example. This is a simple query:

cond := "SELECT name FROM users WHERE (pin=?) "
vals := []interface{} {"liangxiaohan"}
rows, err := db.Query(cond, vals...)
Copy the code

A simple query generally goes through the following steps, and some problems are introduced in understanding these steps:

  • To obtain a connection object, it can be cached or newly created
    • What is a connection object? Why do I need to cache connection objects? How to create a connection object
  • Create a Statement object that is bound to the MySQL connection
    • What is a Statement, why is it needed, and what does it do
  • From this object, query the results and move the results to the Rows object
    • What is the “Rows” object, how to read a “Rows”, and what is the process

Connection object

What is a connection object

The connection object refers to the connection between the GO program and the database. We make this connection an object. If we want to perform database operations, we can use this object directly, avoiding the time-consuming operation of creating temporary connections every time we perform operations.

The Database/SQL package is a generic package that is a lid and an interaction surface. It doesn’t care what database is being linked underneath. So it just provides a “query “/” execute” interface for someone else to implement. For example, if you are using a Mysql database, the “query” here is a pointer to the “Mysql database” query. The corresponding link object you create is a pointer to a connection to the “MYSQL database “.

The connection object is an interface

// Connect objects
type Conn interface {
  Prepare(query string) (Stmt, error)
  Close() error
  Begin() (Tx, error)
}
Copy the code

We might connect to a lot of different databases, MySQL,Postgres, etc., but for any database we just want it to add, delete, change, and check, so to speak. As long as you can provide the above three functions, you are a qualified database link object:

  • Prepare a Statement: A Statement(Stmt) abstracts a database operation that can be executed directly
  • Start a Transaction
  • Stop: Stop is either a query or a Transaction that invalidates the link object.

The realization of the interface

Further, in your daily development process, you need to perform database operations, you introduced the database/ SQL package to help you do database operations, so where is the interface implementation mentioned above? At this point you need to introduce a third-party package that implements all the interfaces you need to implement for your MySql database. You can then execute the MySql statement smoothly, as follows:

import (
  "database/sql"
	_ "github.com/go-sql-driver/mysql"
)
Copy the code
  • All interfaces are implemented in Go-SQL-driver /mysql, which you can think of as a drive
  • Database/SQL is our interactive interface (also GORM), which manages the work, manages the conversation, indicates our requirements, what we need to query. It’s the drive that’s really in charge

As a result, in order to use the database, you have to import two libraries at once, but this time we will just look at how the interface is designed and implemented.

Packaging and protection of connected objects

type driverConn struct {
	sync.Mutex  // Protect the following fields
	ci          driver.Conn
	closed      bool
	finalClosed bool 
	openStmt    map[*driverStmt]bool
	lastErr     error 
}
Copy the code

A database link object itself should not be used for concurrent operations (otherwise inconsistencies may occur). To solve this problem, we wrap the link object to generate a driverConn structure, whose main properties include:

  • A lock: Because the database link object cannot be used for concurrent operations, we attach a lock to it to protect it
  • Ci: Link object ontology

Talk about caching

When we’re trying to get a link object, we specify a “cachedOrNewConn”, usually “cachedOrNewConn”, either from cache or new, but sometimes we’ll specify that we need a new link, such as when an existing connection is invalid, We need to explicitly ask it to create a new link. We specify several questions:

  • What is the nature of connection pooling
    • Each database object (*DB) has its own connection pool, which is essentially an array,[]*driverConnThe content inside is the wrapped link object we just mentioned
    • The objects inside are valid, not closed, ready to use, ready to take, use and then put back
  • Why not Sync.Pool
    • This problem was mentioned in sync. Pool, where the Pool is emptied once every GC and replenished. If an object is valid/invalid/stateful/stateless, then it is not suitable for the Pool
  • Why do we need to keep a connection pool
    • Need not create repeatedly, more convenient, want to use also faster
    • The connection pool facilitates unified management of connection objects, and can effectively reclaim data for any time-out occupied links, avoiding resource leakage
  • Where do connection objects in the connection pool come from and how can they be replenished
    • If we need to create a new connection object before querying, we will consider adding connection objects to the pool
    • The process of creating a new link object involves communicating with the database link. The process of waiting for the link to communicate is asynchronous, so we also introduce select+ timeout (through the context package). We try to read the returned connection object, and if the object is valid (err==nil) we put it in the connection pool
  • Why and how can connection objects fail
    • Connection objects can fail in two ways. One is timeout. Any connection object set in the database has a lifetime
    • If the database connection is closed, then all connection objects in the database are invalid

Statement

What is the Statement

Stmt is the SQL statement that has been processed preliminarily. Before your SQL is queried, it will be parsed into a Statement(Stmt). Because the daily contact is not, so it will be relatively abstract. The SQL query statement will need to be parsed before it can be executed, and then the execution statement will be generated. Then execute the plan.

Why use Stmt

From this point of view, if we can make some common Query statements as STMT, we can not use SQL next time to use STMT, will save some time. The answer given by this person seems to be good, and he gives a conclusion at the end of the experiment: STMT can save our SQL query time, but this person’s experiment design has some problems, we will explain later, is STMT a universal answer, and what is the problem of this experiment

Generation and use of Stmt

Let’s start step by step. Since STMT behaves differently in transaction and regular mode, let’s just talk about how STMT is generated and then used in regular mode (db.query)

stmt,err := db.Prepare(<sql_string>)
Copy the code

This is how you generate the STMT. The Prepare function parses the SQL statement and returns any errors found in the SQL statement to err

  • First, we try to get a connection object and carry it down
  • Call the parse function defined in the MySql driver to parse the query and generate a bare STMT
  • Assigning the connection object brought up above to the naked STMT object generates the STMT object that can be used directly. Here we can see that the generated Stmt object contains a link object.

When we use the STMT object, we will first try whether the connection object in STMT is valid and busy. If it is valid and idle, we will use the STMT object. But if the link object is busy now, we need to create a new STMT object, using a free link object. With this in mind, most of your connected objects are busy on very, very busy days, when you think you’re going to reuse your STMT objects, but in fact they need to be rebuilt. So it’s not necessarily efficient

Going back to the beginning of the question, what might be the problem with the experiment designed by this person, that is, the efficiency of STMT will gradually decline under the condition of high concurrency/tight connection pool, while the experiment designed by this person is serial, that is, it is unlikely to run out of connection objects, so it does not show the efficiency of STMT

Rows

The “Rows” object refers to the result of your query, but the result of the query is not in “Rows”. It’s in the buffer of the connected object, and we need to “walk through” that buffer to get back the data we want

rows, err := db.Query(cond, vals...)
defer rows.Close()
for rows.Next() {
	var name string
	rows.Scan(&name)
}
Copy the code

Here is a simple query example, which goes through the following steps:

  • Request data from the database and store the results in Rows
    • Take the previously generated Stmt and go to the database to query. It turns out that the original data is stored in the buffer of the connected object. I understand that the purpose of this is because the buffer might be quite large, so let’s not read it all at once for now. We’ll just leave it here and go through it a little bit at a time. Then we create a new Rows object and assign the connection object to the Rows
  • Read it back one line at a time and save it
    • Each time at Next we take the link object in Rows, read a row from buffer, and dump the data into RowslastcolsObject. This object can hold only one row of data, representing the contents of each read, essentially one[]interface{}
  • Scan the row into the data structure and repeat the process until Next() can’t read any data, then close Rows
func(*scope) pluck(..){...defer rows.Close()
  for rows.Next() {
    elem := reflect.New(dest.Type().Elem()).Interface()
    scope.Err(rows.Scan(elem))
    dest.Set(reflect.Append(dest, reflect.ValueOf(elem).Elem()))
  }
  ...
}
Copy the code

And the way that GORM scans for data essentially wraps that step, so the one above is pluck, which essentially reads the data into the buffer, and then iterates through Next() and puts it into the target structure (just with a bunch of reflations)

An amway

Database/SQL is simple enough, but I still feel it is not enough to use this package directly. There is a small assistant I would like to recommend didi-Gendry. Briefly introduce the Builder package to help you generate query statements, it is very simple and convenient