Class diagram & process preview

This article explores the life cycle of a connection in Druid, using getConnection as the entry point. The overall process is divided into the following main flows:


Main flow 1: Obtaining a connection

Let’s start with the entry to see what it does to get a connection:

The above is the flow chart for obtaining a connection. First init is called to initialize the connection pool, then each filter on the responsibility chain is run, and finally getConnectionDirect is executed to obtain the real connection object. If testOnBorrow is enabled, the connection is tested for availability on every borrow, which is why setting testOnBorrow to true is not recommended: it hurts performance. The test here checks whether the long connection to the MySQL server has been dropped. Generally, the MySQL server keeps a long connection alive for 8 h, and the timer is refreshed each time the connection is used. If a connection has gone unused for longer than the keepalive time, it can no longer communicate with the MySQL server the next time it is used.

If testOnBorrow is not set to true, the testWhileIdle check is performed instead (enabling it is officially recommended, and it defaults to true): it determines whether the time since the connection was last used exceeds the configured check interval and, if so, performs one availability check. The interval is controlled by timeBetweenEvictionRunsMillis, which defaults to 60 s.

Each connection object records the time it was last used. The idle time is the current time minus that last-use time; it is compared with timeBetweenEvictionRunsMillis, and if it exceeds that value a connection availability check is performed. testWhileIdle defaults to true. If it is set to false while testOnBorrow is also false, and the database server's long-connection keepalive time is changed to 60 s, then a connection left unused for more than 60 s will report a connection error the next time it is used.
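The borrow-time decision described above can be sketched as a small pure function. This is a simplified model rather than Druid's actual code: the class name BorrowCheck and the method shouldValidate are hypothetical, and treating "exceeds" as a strict greater-than comparison is an assumption of the sketch.

```java
final class BorrowCheck {
    /**
     * Decides whether a borrowed connection should be validated.
     * Simplified model of the rule described in the text, not Druid's code.
     */
    static boolean shouldValidate(boolean testOnBorrow, boolean testWhileIdle,
                                  long lastActiveTimeMillis, long nowMillis,
                                  long timeBetweenEvictionRunsMillis) {
        if (testOnBorrow) {
            return true;  // validated on every borrow
        }
        if (!testWhileIdle) {
            return false; // neither check is enabled
        }
        long idleMillis = nowMillis - lastActiveTimeMillis;
        // Assumption: "exceeds" is modelled as strictly greater than.
        return idleMillis > timeBetweenEvictionRunsMillis;
    }

    public static void main(String[] args) {
        long interval = 60_000L; // the 60 s default mentioned in the text
        System.out.println(shouldValidate(false, true, 0L, 30_000L, interval));
        System.out.println(shouldValidate(false, true, 0L, 90_000L, interval));
    }
}
```

With both flags set to false the function never asks for a check, which is the situation in which a connection dropped by the server is only discovered when it is used.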

If testConnectionInternal returns false when testing the long connection, the connection has been dropped by the server or is unavailable for some other network reason, and discardConnection is triggered to discard it (flow 1.4). Because a connection has been discarded, this method also wakes up main flow 3 to check whether a new connection needs to be created. The whole process runs in an infinite loop until either a usable connection is obtained or the retry limit is exceeded and an error is thrown. A timed-out attempt can be retried as long as the pool has not reached its connection limit; by default there is one retry, controlled by the notFullTimeoutRetryCount attribute. So if a wait for a connection times out while the pool is not full, the maximum waiting time is roughly 2 × maxWait (this remains to be verified).
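The loop described above can be modelled in a few lines. This is a minimal sketch under stated assumptions, not Druid's implementation: BorrowLoop, Conn, and the maxTimeoutRetries parameter are hypothetical names, a null from the supplier stands for a wait that timed out, and the valid flag stands for the result of the availability test.

```java
import java.util.function.Supplier;

final class BorrowLoop {
    /** Stand-in for a pooled connection; 'valid' models the availability test result. */
    static final class Conn {
        final boolean valid;
        Conn(boolean valid) { this.valid = valid; }
    }

    int discardedCount = 0;

    /** source.get() returns a connection, or null to model a timed-out wait. */
    Conn borrow(Supplier<Conn> source, int maxTimeoutRetries, boolean poolFull) {
        int timeoutRetries = 0;
        for (;;) {
            Conn conn = source.get();
            if (conn == null) {
                // Retry a timeout only while the pool is not full and retries remain.
                if (timeoutRetries < maxTimeoutRetries && !poolFull) {
                    timeoutRetries++;
                    continue;
                }
                throw new IllegalStateException("timed out waiting for a connection");
            }
            if (!conn.valid) {
                discardedCount++; // models discarding the broken connection
                continue;         // loop again for another connection
            }
            return conn;
        }
    }

    public static void main(String[] args) {
        BorrowLoop loop = new BorrowLoop();
        Conn[] queue = { new Conn(false), new Conn(true) };
        int[] next = { 0 };
        Conn conn = loop.borrow(() -> queue[next[0]++], 1, false);
        System.out.println("valid=" + conn.valid + ", discarded=" + loop.discardedCount);
    }
}
```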

Special note ①

For performance, setting testOnBorrow to true is not recommended; Druid's default configuration gives the best performance. As mentioned above, the default long-connection check interval is 60 s. So with testOnBorrow disabled, to be safe you need to confirm the long-connection keepalive time of the MySQL server you connect to. The default is 8 h, but a DBA may set it much lower in a test environment. If it is less than 60 s, you need to set timeBetweenEvictionRunsMillis manually; if it is 8 h or longer, the default value is fine.

Special note ②

To prevent unnecessary pool expansion, for business and gateway services with high QPS it is recommended, provided the MySQL server has enough long connections to spare, to set the minimum number of idle connections minIdle and the maximum number of connections maxActive to the same value, raising maxActive as needed. Also enable keepAlive for connection liveness checks (see flow 4.1), so that connections are not created dynamically later on. (Building a connection is a heavy operation, so it is better to obtain all the connections you need at the very beginning. This is a personal opinion, for reference only.) By contrast, consider something like an admin back office: its QPS stays very low for long periods, but it occasionally runs heavy operations such as data exports that suddenly need more connections, and it has no special performance requirements. There it is more suitable to set minIdle smaller than maxActive: this avoids wasting connections, while the pool can still expand dynamically when a burst of connections is needed.

Main flow 2: Initializing the connection pool

As the flow diagram above shows, when a connection is acquired the first step is to check whether the connection pool has been initialized. This is tracked by the boolean inited (false when uninitialized, true once initialized), and the check is done inside the init method. If the pool is not initialized, the init logic runs (the purple part of main flow 1). Let’s see what happens in the init method:


A global reentrant lock is used to ensure thread safety during initialization and in all subsequent connection pool operations. When initializing the pool, init double-checks whether initialization has already happened; if not, it initializes the pool. At this point, additional filters for the responsibility chain are also loaded through the SPI mechanism.

Filters loaded this way require an @AutoLoad annotation on the class. Next come three containers: connections stores the connection objects in the pool, evictConnections stores the connections to be discarded during each check, and keepAliveConnections holds the connections that need a liveness check (see flow 4.1). The initial number of connections is then created and placed into connections, and the two essential daemon threads are started: one adds connections to the pool and the other removes unneeded connections from it. These two flows are fairly complex, so they are covered separately (main flow 3 and main flow 4).
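The double check around inited can be illustrated with a small stand-alone class. This is only a sketch of the pattern: LazyPool and initRuns are hypothetical names, and the expensive setup (creating connections, starting daemon threads) is reduced to a counter.

```java
import java.util.concurrent.locks.ReentrantLock;

final class LazyPool {
    private final ReentrantLock lock = new ReentrantLock();
    private volatile boolean inited = false;
    private int initRuns = 0;

    void init() {
        if (inited) {
            return; // fast path: already initialized, no locking needed
        }
        lock.lock();
        try {
            if (inited) {
                return; // second check: another thread finished init first
            }
            initRuns++; // the expensive setup would happen here
            inited = true;
        } finally {
            lock.unlock();
        }
    }

    int initRuns() {
        return initRuns;
    }

    public static void main(String[] args) {
        LazyPool pool = new LazyPool();
        pool.init();
        pool.init();
        System.out.println("init body ran " + pool.initRuns() + " time(s)");
    }
}
```

Because every caller goes through init, calling it once at startup moves the cost away from the first business request.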

Special note ①

If the connection pool is not initialized when it is instantiated, the first call to getConnection has to run the whole initialization logic, including the expensive work of establishing the long connections. If your program is highly concurrent, the first batch of getConnection calls will queue up behind initialization. It is therefore recommended to warm up the connection pool right after instantiating it, by calling either init or getConnection.

Special note ②

When the global reentrant lock is constructed, the lock object is used to create two conditions, empty and notEmpty, which work as follows:

While the pool has enough connections, the daemon thread that adds connections blocks on empty (main flow 3). When the pool runs short, the thread acquiring a connection (call it business thread A) blocks on notEmpty and wakes up the daemon thread blocked on empty, which then adds a connection. After that, business thread A is woken from notEmpty and tries again to acquire a connection.
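The interplay of the two conditions can be tried out with a toy pool. This is a minimal sketch, not Druid's code: TwoConditionPool, waiters, and giveBack are hypothetical names, connections are plain strings, and the cap check anticipates the maxActive rule described in main flow 3.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

final class TwoConditionPool {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition(); // borrowers wait here
    private final Condition empty = lock.newCondition();    // the creator thread waits here
    private final Deque<String> idle = new ArrayDeque<>();
    private final int maxActive;
    private int activeCount = 0;
    private int waiters = 0;
    private int created = 0;

    TwoConditionPool(int maxActive) {
        this.maxActive = maxActive;
        Thread creator = new Thread(this::createLoop, "creator");
        creator.setDaemon(true);
        creator.start();
    }

    private void createLoop() {
        lock.lock();
        try {
            for (;;) {
                // Wait while nobody needs a connection or the pool is at its cap.
                while (idle.size() >= waiters || activeCount + idle.size() >= maxActive) {
                    empty.await();
                }
                idle.addLast("conn-" + (++created)); // stands in for a physical connect
                notEmpty.signal();                   // wake one waiting borrower
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            lock.unlock();
        }
    }

    String borrow() {
        lock.lock();
        try {
            while (idle.isEmpty()) {
                waiters++;
                try {
                    empty.signal();                  // wake the creator thread
                    notEmpty.awaitUninterruptibly(); // block until a connection arrives
                } finally {
                    waiters--;
                }
            }
            activeCount++;
            return idle.pollLast();
        } finally {
            lock.unlock();
        }
    }

    void giveBack(String conn) {
        lock.lock();
        try {
            activeCount--;
            idle.addLast(conn);
            notEmpty.signal();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        TwoConditionPool pool = new TwoConditionPool(2);
        String first = pool.borrow();
        String second = pool.borrow();
        pool.giveBack(first);
        System.out.println(first + " " + second + " " + pool.borrow());
    }
}
```

Both conditions belong to the same lock, so the creator and the borrowers always re-check their loop condition under that lock, which is what prevents lost wake-ups in this sketch.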

Flow 1.1: Chain of responsibility

Note: this is easier to understand alongside the source code.


All filters here are implementations of Druid's Filter interface. The Filter interface has a number of methods that map to methods on the connection pool. For example, when the dataSource's getConnection method is triggered, the FilterChain runs dataSource_getConnection on each filter. As flow 1.1 shows, the datasource uses the FilterChain to trigger the execution of each filter. The FilterChain likewise has a set of methods that map to the datasource, such as dataSource_connect in the figure above. This method runs through all the filters in the datasource, and only when nextFilter yields no more filters does it call dataSource.getConnectionDirect. This is easier to follow alongside the code.
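The chain mechanics can be shown with a stripped-down model. This is a sketch of the pattern only: ChainDemo, its Filter and Chain types, and the filter names "stat" and "log" are hypothetical and much simpler than Druid's Filter and FilterChain interfaces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ChainDemo {
    interface Filter {
        String connect(Chain chain);
    }

    static final class Chain {
        private final List<Filter> filters;
        private final List<String> trace;
        private int pos = 0;

        Chain(List<Filter> filters, List<String> trace) {
            this.filters = filters;
            this.trace = trace;
        }

        String connect() {
            if (pos < filters.size()) {
                return filters.get(pos++).connect(this); // hand over to the next filter
            }
            trace.add("getConnectionDirect");            // no filter left: do the real work
            return "connection";
        }
    }

    static Filter named(String name, List<String> trace) {
        return chain -> {
            trace.add(name + ":before");
            String conn = chain.connect(); // continue down the chain
            trace.add(name + ":after");
            return conn;
        };
    }

    static List<String> run() {
        List<String> trace = new ArrayList<>();
        List<Filter> filters = Arrays.asList(named("stat", trace), named("log", trace));
        new Chain(filters, trace).connect();
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Each filter decides when to call back into the chain, so it can act both before and after the real connection is obtained.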

Flow 1.2: The process of getting connections from the pool


Druid supports two ways to create new connections. One is to start separate daemon threads that communicate through await and signal (the mode used in this article). The other creates connections asynchronously through a thread pool: pass asyncInit=true when initializing Druid and assign a thread pool object to createScheduler, and this mode is enabled. I have not studied this mode closely, so the flow charts and code blocks in this article avoid it.

When the pool has enough connections, the flow is simple: poolingCount is decremented, the connection is taken and returned, and activeCount is incremented. The interesting part is what happens when no connection is available: the business thread wakes the daemon thread that adds connections and then waits. Two things can wake it up. One is a connection being recycled into the pool after use (main flow 5); the other is the connection-adding daemon thread successfully adding a connection. Once awakened from await, the thread rejoins the lock competition. If it then finds that the number of connections in the pool is still 0 (meaning the connection that was just put in was taken by another thread during the lock contention), it awaits again. The wait uses the awaitNanos method with maxWait as the initial timeout; on each subsequent round the timeout is maxWait minus the time already spent blocking, so each await gets shorter until the remaining time reaches zero. The total time is approximately maxWait, but in practice slightly larger than maxWait, because the program itself takes time and the thread must compete for the lock again after being awakened.

If the connection cannot be retrieved, null is returned and the retry logic in main flow 1 is triggered.
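The shrinking timeout can be reproduced with the JDK's Condition.awaitNanos, which returns an estimate of the time still remaining. The following is a minimal sketch, assuming a string-based pool; TimedPoll and put are hypothetical names, and only the name pollLast is borrowed from the text.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

final class TimedPoll {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    private final Deque<String> idle = new ArrayDeque<>();

    void put(String conn) {
        lock.lock();
        try {
            idle.addLast(conn);
            notEmpty.signal();
        } finally {
            lock.unlock();
        }
    }

    /** Returns a connection, or null if none arrived within maxWaitMillis. */
    String pollLast(long maxWaitMillis) {
        long remainingNanos = TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        lock.lock();
        try {
            while (idle.isEmpty()) {
                if (remainingNanos <= 0) {
                    return null; // the budget is used up: give up
                }
                try {
                    // awaitNanos returns the time left, so every round waits less.
                    remainingNanos = notEmpty.awaitNanos(remainingNanos);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
            return idle.pollLast();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        TimedPoll pool = new TimedPoll();
        System.out.println(pool.pollLast(100)); // nothing in the pool
        pool.put("conn-1");
        System.out.println(pool.pollLast(100));
    }
}
```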

How does Druid prevent too many business threads from blocking if a connection cannot be obtained?

From the flow chart and the description above, in the extreme case where the pool can no longer produce any connection at all, many business threads will block, each for as long as maxWait. Is there a way to cap the number of threads blocked waiting for a connection, so that once the limit is exceeded an error is thrown directly instead of waiting?

Druid supports such a policy: it is enabled when maxWaitThreadCount is set to a value greater than 0. It is a load-shedding measure: if you don’t want too many business threads to block when the pool runs out of connections, configure this property, which specifies the maximum number of business threads allowed to block while connections are insufficient. The use of this switch is not shown in the diagram for flow 1.2. The notEmptyWaitThreadCount attribute is incremented each time pollLast starts waiting for an available connection and decremented when the wait ends. Before pollLast is called, getConnectionInternal evaluates the following code:

    if (maxWaitThreadCount > 0 && notEmptyWaitThreadCount >= maxWaitThreadCount) {
        connectErrorCountUpdater.incrementAndGet(this);
        // Throw the exception directly instead of waiting and blocking the business thread
        throw new SQLException("maxWaitThreadCount " + maxWaitThreadCount + ", current wait Thread count "
                + lock.getQueueLength());
    }

As you can see, if maxWaitThreadCount is set, the code checks whether the number of threads currently waiting has reached maxWaitThreadCount. If so, pollLast is not called (so no new waiting thread is added) and an exception is thrown directly.

In most cases you do not need to enable this. If you do, choose the value of maxWaitThreadCount carefully. A large number of waiting threads usually points to a problem in the code: for example, borrowed connections are not returned in time, or slow queries or slow transactions keep connections lent out for too long. Fixing these matters far more than maxWaitThreadCount. Using it as a last-resort protection is fine, but the value should be chosen for your actual situation.

Flow 1.3: Connection availability test

(1) init-checker

Before we get into this, let’s see how to initialize the checker for detecting connections. See the following figure for the process:


Checker initialization happens in the init phase. Druid supports multiple database drivers, so the checker adapts to different drivers. Checker initialization does one thing: it checks whether the driver has a ping method (supported since JDBC 4; mysql-connector-java has had one since as early as 3.x). If so, usePingMethod is set to true, which later decides whether connection checks use the ping method. If it is false, the ordinary SELECT 1 check is used: create a statement, execute SELECT 1, and see whether the connection is usable.

(2) testConnectionInternal

Now back to the method this section is about: testConnectionInternal in flow 1.3.


This method uses the isValidConnection method of the checker object (see init-checker) that was initialized in main flow 2 (the init phase). If ping is enabled, it calls the driver's ping method; otherwise it uses SELECT 1 (as init-checker shows, this depends on whether the loaded driver has a ping method).
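Detecting a method on the driver class is a plain reflection lookup, which can be sketched as follows. Everything here is illustrative: CheckerInit, PingableConnection, PlainConnection, and the method name "ping" are hypothetical stand-ins and do not claim to match the class or method that Druid actually inspects.

```java
import java.lang.reflect.Method;

final class CheckerInit {
    /** Hypothetical driver connection class that exposes a ping method. */
    static class PingableConnection {
        public void ping() { }
    }

    /** Hypothetical driver connection class without one. */
    static class PlainConnection { }

    static boolean hasPingMethod(Class<?> connectionClass) {
        for (Method method : connectionClass.getMethods()) {
            if (method.getName().equals("ping")) {
                return true;
            }
        }
        return false;
    }

    /** Chooses the validation strategy once, at init time. */
    static String chooseCheck(Class<?> connectionClass) {
        return hasPingMethod(connectionClass) ? "ping" : "SELECT 1";
    }

    public static void main(String[] args) {
        System.out.println(chooseCheck(PingableConnection.class));
        System.out.println(chooseCheck(PlainConnection.class));
    }
}
```

Doing the lookup once during init keeps reflection off the hot path: every later validation only reads the stored decision.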

Flow 1.4: Discarding connections


If the test in flow 1.3 finds the connection unavailable, the discard logic is triggered directly. The process is very simple, as shown in the figure above: the activeCount that was incremented when the connection was obtained in flow 1.2 is decremented again, since the removed connection is unusable and must not be counted as active. Note that Druid wraps the driver's connection object (conn) in its own class, DruidPooledConnection. The "close connection" step in the figure closes the underlying driver connection object. Calling close on the wrapper class DruidPooledConnection instead means recycling the connection object (see main flow 5).

Main flow 3: The daemon thread that adds connections


This flow is started during main flow 2 (the init phase). It runs independently and spends most of its time waiting without competing for CPU. When connections run short it is woken to add connections, and a successfully created connection wakes up the threads waiting for an available connection. For example:

In flow 1.2, when there are not enough connections, the business thread calls empty.signal to wake the thread that replenishes connections (the only thread blocked on empty is the single thread of main flow 3) and then blocks itself on notEmpty. When a connection has been added, the threads blocked on notEmpty are woken and rejoin the lock competition. This can be understood simply as a producer-consumer model. There are some details: activeCount plus poolingCount is the total number of connections currently created, and this number must not exceed maxActive; once it reaches maxActive, the daemon thread goes back to waiting. And when a connection is put into the pool, poolingCount is compared with maxActive to decide whether the connection actually enters the pool.

Main flow 4: The daemon thread that discards connections


Flow 4.1: Shrinking the connection pool, checking connection availability, and discarding redundant connections

The whole process is as follows:


The whole process consists of the main steps shown in the figure. First, poolingCount minus minIdle gives the range (evictCheck) of connection objects that are candidates for discarding; objects in this range are likely to be discarded. Whether they actually go into the discard queue evictConnections depends on two attributes:

minEvictableIdleTimeMillis: the minimum evictable idle time, 30 min by default. Official explanation: the minimum time a connection survives in the pool (applied together with the check range: a connection in the range whose idle time exceeds this value is discarded).

maxEvictableIdleTimeMillis: the maximum evictable idle time, 7 h by default. Official explanation: the maximum time a connection survives in the pool (the check range is ignored: any connection whose idle time exceeds this value is discarded).

If a connection object's idle time exceeds minEvictableIdleTimeMillis and its index falls within the evictCheck range, it is put into the discard queue evictConnections. If its idle time exceeds maxEvictableIdleTimeMillis, it goes straight into evictConnections. (Usually the first condition is the one that hits; the second matters only when a connection is outside the check range yet has been idle longer than maxEvictableIdleTimeMillis.)

If the connection object is outside the evictCheck range and the keepAlive attribute is true, its idle time is compared with keepAliveBetweenTimeMillis (60 s by default). If the idle time exceeds it, the connection needs an availability check and is placed in the keepAliveConnections queue.

Once both queues have been filled, the pool is compacted: the connection objects that remain are moved to the head of the queue.

The two queues are then processed. Objects in evictConnections are closed and released. Objects in keepAliveConnections are tested for liveness (see isValidConnection in flow 1.3): unavailable connections are discarded (flow 1.4), and available ones are put back into the pool.
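The classification rules above can be condensed into one function. This is a sketch under stated assumptions, not Druid's shrink method: ShrinkModel and Fate are hypothetical names, index 0 is assumed to hold the connection that has been idle longest, and the comparisons (at least the minimum, strictly more than the maximum) are choices made for the sketch.

```java
import java.util.Arrays;

final class ShrinkModel {
    enum Fate { EVICT, KEEP_ALIVE_CHECK, KEEP }

    /** idleMillis[i] is the idle time of the connection at pool index i. */
    static Fate[] classify(long[] idleMillis, int minIdle,
                           long minEvictableIdleTimeMillis, long maxEvictableIdleTimeMillis,
                           boolean keepAlive, long keepAliveBetweenTimeMillis) {
        int evictCheck = idleMillis.length - minIdle; // size of the candidate range
        Fate[] fates = new Fate[idleMillis.length];
        for (int i = 0; i < idleMillis.length; i++) {
            long idle = idleMillis[i];
            if (idle >= minEvictableIdleTimeMillis && i < evictCheck) {
                fates[i] = Fate.EVICT;            // in range and idle long enough
            } else if (idle > maxEvictableIdleTimeMillis) {
                fates[i] = Fate.EVICT;            // too old, range is ignored
            } else if (keepAlive && idle >= keepAliveBetweenTimeMillis) {
                fates[i] = Fate.KEEP_ALIVE_CHECK; // kept, but needs a liveness test
            } else {
                fates[i] = Fate.KEEP;
            }
        }
        return fates;
    }

    static long minutes(long n) {
        return n * 60_000L;
    }

    public static void main(String[] args) {
        long[] idle = { minutes(450), minutes(440), minutes(430), minutes(425), minutes(2), 0L };
        Fate[] fates = classify(idle, 3, minutes(30), minutes(420), true, minutes(1));
        System.out.println(Arrays.toString(fates));
    }
}
```

In the sample run minIdle is 3, yet four connections are evicted: the fourth lies outside the candidate range but has been idle for more than 7 h, so the maximum rule removes it anyway.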

This flow shows that idle connections are not immediately trimmed down to minIdle. If a batch of connections was created earlier (no more than maxActive) and they then go idle, it takes at least minEvictableIdleTimeMillis before they can be removed from the pool. And any connection idle for longer than maxEvictableIdleTimeMillis is reclaimed, so in extreme cases (for example, a pool that is never used again after initialization) the pool does not keep minIdle connections but ends up with none at all. This is uncommon in production: maxEvictableIdleTimeMillis defaults to 7 h, so it only happens on extremely quiet systems. Enabling keepAlive does not override this rule, because keepAlive has lower priority than maxEvictableIdleTimeMillis. keepAlive only guarantees that connections which do not need to be removed from the pool get a liveness check within the specified period, which decides whether they stay pooled or are discarded.

Flow 4.2: Proactively reclaim connections to prevent memory leaks

The process is as follows:

This flow runs only when removeAbandoned is set to true. It reclaims connections that were borrowed and not returned for a long time (returning means calling close, which triggers main flow 5).

activeConnections holds the connections currently lent out from the pool. As main flow 1 shows, each time getConnection is called with removeAbandoned enabled, the connection object is placed into activeConnections. If close is then not called for a long time, the borrowed connection can never be put back into the pool. This is troublesome: without close, the pool keeps producing new connections, which defeats its purpose (the point of a connection pool is to create as few connections as possible), and the connection objects lent out earlier may never be reclaimed, which risks a memory leak. This flow exists to solve that problem. It is very simple: connections that have been lent out and not returned are checked, and those that qualify are put into abandonedList and reclaimed. The abandoned flag of each connection object in abandonedList is set to true, marking it as already handled by this flow so that main flow 5 does not process it again.

Setting removeAbandoned=true is completely unnecessary if close is reliably called every time. If you use open source frameworks such as MyBatis or Spring, the framework calls close internally, so enabling this is generally not recommended; decide according to your situation.
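The scan over lent-out connections can be sketched as below. This is an illustration only: AbandonedScan, Borrowed, and the timeoutMillis parameter are hypothetical, and the text does not say which condition Druid uses to decide that a connection qualifies, so a simple "borrowed for at least the timeout" rule is assumed.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class AbandonedScan {
    static final class Borrowed {
        final String name;
        final long borrowedAtMillis;
        boolean abandoned = false;

        Borrowed(String name, long borrowedAtMillis) {
            this.name = name;
            this.borrowedAtMillis = borrowedAtMillis;
        }
    }

    /** Moves connections lent out for too long from activeConnections to the result list. */
    static List<Borrowed> scan(List<Borrowed> activeConnections, long nowMillis, long timeoutMillis) {
        List<Borrowed> abandonedList = new ArrayList<>();
        for (Iterator<Borrowed> it = activeConnections.iterator(); it.hasNext();) {
            Borrowed conn = it.next();
            if (nowMillis - conn.borrowedAtMillis >= timeoutMillis) {
                it.remove();           // no longer tracked as lent out
                conn.abandoned = true; // marks it so the normal recycle path skips it
                abandonedList.add(conn);
            }
        }
        return abandonedList;
    }

    public static void main(String[] args) {
        List<Borrowed> active = new ArrayList<>();
        active.add(new Borrowed("leaked", 0L));
        active.add(new Borrowed("fresh", 250_000L));
        List<Borrowed> abandoned = scan(active, 300_000L, 180_000L);
        System.out.println(abandoned.get(0).name + " abandoned, " + active.size() + " still active");
    }
}
```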

Main flow 5: Reclaiming connections

This flow is usually triggered by the close method of the DruidPooledConnection wrapper; its core is the recycle method.


The close method of DruidPooledConnection, Druid's Connection wrapper class, is what returns a connection. Through its own close or syncClose method, it indirectly triggers the recycle method of the dataSource object.

The key logic is in the recycle method:

(1) If removeAbandoned is set to true, traceEnable determines whether the connection object needs to be removed from activeConnections, so that flow 4.2 does not detect it again. If the recycling was actively triggered by flow 4.2, traceEnable has already been set to false and the remove is not triggered again. With removeAbandoned=true, traceEnable is set to true in main flow 1 when the connection is put into activeConnections; with removeAbandoned=false, traceEnable is always false.

(2) If unfinished transactions are found during recycling, a rollback is triggered. (This is most likely when flow 4.2 forcibly returns a connection, or when the user opened a transaction and closed the connection without committing.) The holder.reset method is then called to restore some properties of the connection object to their defaults. The holder also keeps the statement objects created from it in its own list; these are closed and the list is finally cleared. Each Statement in turn records all the ResultSet objects it has produced, and when a Statement is closed it loops over and closes any ResultSet objects still open. This protects against users who use a connection object and do not close the resources they opened.

(3) Determine whether testOnReturn is enabled. Like testOnBorrow, it is disabled by default and enabling it is not recommended.

(4) Finally, putLast puts the connection back into the pool and sets lastActiveTimeMillis to the current time. Because the last active time becomes the time of return, a special exception can occur (so extreme that it is almost impossible to trigger; feel free to skip it):

If testOnBorrow and testOnReturn are disabled and keepAlive is set to false, the decision to run the long-connection availability test is based on the idle time, calculated as the current time minus the last active time (lastActiveTimeMillis). The idle time is compared with timeBetweenEvictionRunsMillis (default 60 s), and the test runs only if the idle time exceeds it.

Suppose the MySQL server's long-connection keepalive time is manually changed to 60 s and timeBetweenEvictionRunsMillis is set to 59 s. This setting looks perfectly reasonable: it keeps the test interval shorter than the actual keepalive time. However, if a connection is borrowed and only closed 61 s later, its lastActiveTimeMillis is refreshed to the current time. If the connection is then borrowed again within 59 s, it bypasses the connection check, and using it reports that the connection is unavailable.
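The timeline of this edge case can be checked with simple arithmetic. The sketch below is a model, not Druid code: StaleBypass and its two methods are hypothetical, and it assumes the connection carried no traffic to the server while it was lent out, so the server's idle timer started at the moment of borrowing.

```java
final class StaleBypass {
    /** Pool side: the check is skipped while the idle time stays within the interval. */
    static boolean checkSkipped(long lastActiveTimeMillis, long nowMillis,
                                long timeBetweenEvictionRunsMillis) {
        return nowMillis - lastActiveTimeMillis <= timeBetweenEvictionRunsMillis;
    }

    /** Server side: the connection is dropped once it has been silent past the keepalive time. */
    static boolean closedByServer(long lastTrafficMillis, long nowMillis, long serverKeepAliveMillis) {
        return nowMillis - lastTrafficMillis > serverKeepAliveMillis;
    }

    public static void main(String[] args) {
        long borrowedAt = 0L;        // last traffic seen by the server (assumption)
        long returnedAt = 61_000L;   // close() refreshes lastActiveTimeMillis
        long borrowedAgainAt = 100_000L;

        boolean skipped = checkSkipped(returnedAt, borrowedAgainAt, 59_000L);
        boolean dead = closedByServer(borrowedAt, borrowedAgainAt, 60_000L);
        System.out.println("check skipped=" + skipped + ", closed by server=" + dead);
    }
}
```

Had lastActiveTimeMillis stayed at the moment of borrowing, the idle time would have been 100 s and the check would have run; refreshing it on return is what lets the dead connection slip through.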

Conclusion

That completes the walkthrough of the Druid connection pool's initialization and the whole life of a connection, from creation to destruction. It mainly covers the operational flows and how some of the main monitoring data is produced. SQL execution is not covered, because it is basically the same as using the native driver: Druid just wraps statements and the like in a layer that performs some of its own operations.