Preface

Thread pools are arguably the most commonly used concurrency framework in Java. They can be used for almost any task that needs to run asynchronously or concurrently. Using thread pools properly during development brings three benefits:

  • Reduce resource consumption. Reusing threads that have already been created avoids the cost of repeatedly creating and destroying threads.
  • Improve response speed. When a task arrives, it can be executed immediately instead of waiting for a thread to be created.
  • Improve thread manageability. Threads are a scarce resource; the thread pool framework allows them to be allocated, monitored, and tuned in a uniform way.

Nothing is perfect, and everything has two sides. Thread pools can also introduce risks if they are abused or used improperly, so they must be used wisely to get the most out of them. Next, let's take a systematic look at thread pools so that we can use them properly.

Principle of thread pool

How does the thread pool handle a task once it has been submitted? When a new task is submitted, the pool proceeds as follows:

  • Is the number of threads below corePoolSize? If so, create a thread (this requires the global lock) and execute the task; otherwise proceed to the next step.
  • Is the queue full? If not, add the task to the queue; otherwise proceed to the next step.
  • Is the number of threads below maximumPoolSize? If so, create a thread (this requires the global lock) and execute the task; otherwise proceed to the next step.
  • At this point the thread pool cannot accept the task, and the rejection policy is applied.

Can the order of steps 2 and 3 be reversed? The pool is designed this way because steps 1 and 3 create a thread, which requires acquiring the global lock. If tasks are submitted frequently, contention on that lock increases, whereas step 2 only enqueues the task and needs no global lock. Trying the queue before creating additional threads therefore keeps lock contention low. With this in mind, let's analyze the source code:

public void execute(Runnable command) {
        if (command == null)
            throw new NullPointerException();

        // ctl encodes both the number of worker threads and the state of the thread pool.
        int c = ctl.get();
        // If the current worker count (derived from ctl) is below corePoolSize ...
        if (workerCountOf(c) < corePoolSize) {
            // ... addWorker creates a thread that executes this task.
            // addWorker acquires the global lock internally.
            if (addWorker(command, true))
                return;
            // addWorker failed, so re-read the current ctl value.
            c = ctl.get();
        }
        // If the pool is RUNNING and the task was successfully added to the queue ...
        if (isRunning(c) && workQueue.offer(command)) {
            int recheck = ctl.get();
            // Re-check the pool state. If it is no longer RUNNING and the task
            // could be removed from the queue, reject the task.
            if (!isRunning(recheck) && remove(command))
                reject(command);
            // If there are no worker threads left, start one without a first task
            // so that the queued task is picked up and executed.
            else if (workerCountOf(recheck) == 0)
                addWorker(null, false);
        }
        // The task could not be queued, so try to create a new thread to execute it.
        // If addWorker returns false, the pool is saturated or no longer RUNNING,
        else if (!addWorker(command, false))
            // so the rejection policy is applied.
            reject(command);
    }

The source code above makes it clear how a task submitted to the thread pool is handled. Two important points in it have not been explained yet: the role of the ctl variable, and the addWorker method. These two points are the essence of the thread pool.
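The four-step flow described above can be observed from the outside with a small experiment. The following is a minimal sketch, not part of the JDK: the class name SubmissionFlowDemo and the pool sizes (1 core thread, at most 2 threads, queue capacity 1) are assumptions chosen only to make every step visible. Each task blocks on a latch so that the workers stay busy while the next task is submitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class SubmissionFlowDemo {

    /** Submits four blocking tasks and records what the pool did with each one. */
    static List<String> run() {
        // Hypothetical sizes: corePoolSize = 1, maximumPoolSize = 2, queue capacity = 1.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 60L, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(1));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocking = () -> {
            try {
                release.await(); // keep the worker busy until the latch is released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        List<String> log = new ArrayList<String>();
        try {
            for (int i = 1; i <= 4; i++) {
                try {
                    pool.execute(blocking);
                    log.add("task" + i + ": threads=" + pool.getPoolSize()
                            + ", queued=" + pool.getQueue().size());
                } catch (RejectedExecutionException e) {
                    // Default AbortPolicy: the pool is saturated.
                    log.add("task" + i + ": rejected");
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return log;
    }

    public static void main(String[] args) {
        for (String line : run()) {
            System.out.println(line);
        }
    }
}
```

With these sizes, the first task starts the core thread, the second waits in the queue, the third forces a second thread to be created, and the fourth is rejected.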

The ctl variable

The ctl variable is an AtomicInteger that packs two concepts into one value:

  • The number of valid threads in the thread pool, that is, the current number of worker threads;
  • The state of the thread pool.

This may seem strange. Why can ctl represent both a count and a state? If we read more of the JDK source code, we will find this technique in many places; its purpose is to keep performance and memory usage as efficient as possible, so several pieces of state are often represented by a single variable. To understand the ctl variable more clearly, here is the comment from the source code:

In order to pack them into one int, we limit workerCount to (2^29)-1 (about 500 million) threads rather than (2^31)-1 (2 billion) otherwise representable. If this is ever an issue in the future, the variable can be changed to be an AtomicLong, and the shift/mask constants below adjusted. But until the need arises, this code is a bit faster and simpler using an int.

So that both the number of valid threads and the state of the thread pool can be represented by one int, the number of threads is limited to 2^29-1 (approximately 500 million). The low 29 bits hold the number of valid threads and the high 3 bits hold the state of the thread pool. For a better understanding, let's go directly to the source code:

    private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
    private static final int COUNT_BITS = Integer.SIZE - 3;
    private static final int CAPACITY   = (1 << COUNT_BITS) - 1;

    // runState is stored in the high-order bits
    private static final int RUNNING    = -1 << COUNT_BITS;
    private static final int SHUTDOWN   =  0 << COUNT_BITS;
    private static final int STOP       =  1 << COUNT_BITS;
    private static final int TIDYING    =  2 << COUNT_BITS;
    private static final int TERMINATED =  3 << COUNT_BITS;
    
    private static int runStateOf(int c)     { return c & ~CAPACITY; }
    private static int workerCountOf(int c)  { return c & CAPACITY; }
    private static int ctlOf(int rs, int wc) { return rs | wc; }

Based on the source code, the actual values can be derived:

COUNT_BITS = Integer.SIZE - 3 = 32 - 3 = 29
CAPACITY   = (1 << 29) - 1 = 2^29 - 1 (about 500 million)
RUNNING    = -1 << 29   -->  high 3 bits: 111
SHUTDOWN   =  0 << 29   -->  high 3 bits: 000
STOP       =  1 << 29   -->  high 3 bits: 001
TIDYING    =  2 << 29   -->  high 3 bits: 010
TERMINATED =  3 << 29   -->  high 3 bits: 011

The runStateOf() method retrieves the current state of the thread pool by computing ctl & ~CAPACITY. ~CAPACITY inverts every bit of CAPACITY, so its high 3 bits are 1 and its low 29 bits are 0. ctl & ~CAPACITY therefore keeps only the high 3 bits of ctl. Similarly, the workerCountOf() method keeps only the low 29 bits of ctl, which represent the current number of valid threads.

With the above analysis, the meaning of the ctl variable should now be clear.
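To make the bit layout tangible, the packing logic can be replayed outside the JDK. The sketch below copies the constants and the three helper methods shown above into a standalone class; the class name CtlDemo and the bits() helper are assumptions added only for illustration.

```java
class CtlDemo {
    static final int COUNT_BITS = Integer.SIZE - 3;       // 29
    static final int CAPACITY   = (1 << COUNT_BITS) - 1;  // low 29 bits set to 1

    // runState lives in the high 3 bits
    static final int RUNNING    = -1 << COUNT_BITS;
    static final int SHUTDOWN   =  0 << COUNT_BITS;
    static final int STOP       =  1 << COUNT_BITS;

    static int runStateOf(int c)     { return c & ~CAPACITY; }
    static int workerCountOf(int c)  { return c & CAPACITY; }
    static int ctlOf(int rs, int wc) { return rs | wc; }

    /** Hypothetical helper: 32 binary digits with the 3 state bits split off. */
    static String bits(int c) {
        String s = String.format("%32s", Integer.toBinaryString(c)).replace(' ', '0');
        return s.substring(0, 3) + " " + s.substring(3);
    }

    public static void main(String[] args) {
        int c = ctlOf(RUNNING, 5); // a RUNNING pool with 5 workers
        System.out.println(bits(c));
        System.out.println("state is RUNNING: " + (runStateOf(c) == RUNNING));
        System.out.println("worker count: " + workerCountOf(c));
    }
}
```

Packing and unpacking are plain bitwise operations, which is why a single atomic read of ctl yields a consistent view of both the state and the worker count.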

The addWorker() method

In the thread pool, threads are not used as bare Thread objects; each one is wrapped in a Worker. Worker is an inner class of ThreadPoolExecutor. The addWorker() method therefore decides, based on the current state of the thread pool, whether to create a Worker and start it. After a Worker has executed its current task, it does not exit immediately; it keeps fetching tasks from the queue in a loop and executes them. The source code confirms this:

final void runWorker(Worker w) {
        Thread wt = Thread.currentThread();
        Runnable task = w.firstTask;
        w.firstTask = null;
        w.unlock(); // allow interrupts
        boolean completedAbruptly = true;
        try {
            // If task is not null, it is executed directly.
            // If task is null, the next task is fetched from the queue.
            while (task != null || (task = getTask()) != null) {
                w.lock();
                // If the pool is stopping, make sure the thread is interrupted;
                // otherwise, make sure a stale interrupt flag is cleared.
                if ((runStateAtLeast(ctl.get(), STOP) ||
                     (Thread.interrupted() &&
                      runStateAtLeast(ctl.get(), STOP))) &&
                    !wt.isInterrupted())
                    wt.interrupt();
                try {
                    // Hook function called before the task is executed
                    beforeExecute(wt, task);
                    Throwable thrown = null;
                    try {
                        // Execute the task
                        task.run();
                    } catch (RuntimeException x) {
                        thrown = x; throw x;
                    } catch (Error x) {
                        thrown = x; throw x;
                    } catch (Throwable x) {
                        thrown = x; throw new Error(x);
                    } finally {
                        // Hook function called after the task is executed
                        afterExecute(task, thrown);
                    }
                } finally {
                    task = null;
                    w.completedTasks++;
                    w.unlock();
                }
            }
            completedAbruptly = false;
        } finally {
            processWorkerExit(w, completedAbruptly);
        }
    }
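The "first task, then the queue" loop can be illustrated with a toy worker. This is only a sketch of the pattern and not the JDK implementation: the class MiniWorkerDemo is hypothetical, it has no locking or state checks, and it uses a timed poll on the queue as a stand-in for getTask(), so the worker exits once the queue stays empty for 100 ms.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class MiniWorkerDemo {

    /** Runs one toy worker and returns the order in which tasks were executed. */
    static List<String> run() {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<Runnable>();
        List<String> log = Collections.synchronizedList(new ArrayList<String>());
        queue.add(() -> log.add("queued-1"));
        queue.add(() -> log.add("queued-2"));
        Runnable firstTask = () -> log.add("first");

        Thread worker = new Thread(() -> {
            Runnable task = firstTask;
            try {
                // Same shape as the loop in runWorker: run the first task,
                // then keep pulling tasks from the queue until none is left.
                while (task != null
                        || (task = queue.poll(100, TimeUnit.MILLISECONDS)) != null) {
                    try {
                        task.run();
                    } finally {
                        task = null; // forces the next iteration to read the queue
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<String>(log);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Because a single thread executes everything, the first task always runs before the queued ones, and the queued ones run in FIFO order.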

Hook function

ThreadPoolExecutor reserves several hook functions for the worker, which the worker triggers while executing a task. If we need some special processing (such as measuring how long each task takes), we can extend ThreadPoolExecutor and override the hook functions for that purpose.

Hook function      Description
beforeExecute()    Triggered before a task is executed
afterExecute()     Triggered after a task is executed
terminated()       Triggered when the thread pool state becomes TIDYING
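As an illustration of these hooks, the sketch below extends ThreadPoolExecutor to measure the total execution time of its tasks. The class name TimingThreadPool, its fields, and the runDemo() helper are assumptions made for this example; only beforeExecute(), afterExecute(), and terminated() come from the JDK.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

class TimingThreadPool extends ThreadPoolExecutor {
    // Start time of the task currently running on each worker thread.
    private final ThreadLocal<Long> startNanos = new ThreadLocal<Long>();
    final AtomicLong totalNanos = new AtomicLong();
    final AtomicInteger finishedTasks = new AtomicInteger();
    volatile boolean terminatedCalled = false;

    TimingThreadPool(int threads) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        startNanos.set(System.nanoTime());
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        try {
            totalNanos.addAndGet(System.nanoTime() - startNanos.get());
            finishedTasks.incrementAndGet();
        } finally {
            super.afterExecute(r, t);
        }
    }

    @Override
    protected void terminated() {
        terminatedCalled = true; // called once, when the pool reaches TIDYING
        super.terminated();
    }

    /** Hypothetical helper: runs three trivial tasks and waits for termination. */
    static TimingThreadPool runDemo() {
        TimingThreadPool pool = new TimingThreadPool(2);
        for (int i = 0; i < 3; i++) {
            pool.execute(() -> { /* simulated work */ });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return pool;
    }

    public static void main(String[] args) {
        TimingThreadPool pool = runDemo();
        System.out.println(pool.finishedTasks.get() + " tasks, "
                + pool.totalNanos.get() + " ns in total");
    }
}
```

beforeExecute() and afterExecute() run on the worker thread that executes the task, which is why a ThreadLocal is enough to pair the start time with the end time.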

Status of the thread pool

There are five states in a thread pool.

State        Description
RUNNING      The running state. A thread pool is RUNNING as soon as it has been created. In this state it accepts new tasks and executes the tasks in the queue.
SHUTDOWN     The thread pool accepts no new tasks, but completes the tasks in progress and the tasks in the queue.
STOP         The thread pool accepts no new tasks, no longer executes the tasks in the queue, and interrupts the tasks in progress.
TIDYING      workerCount is 0 and the thread pool is about to call the terminated() method.
TERMINATED   The thread pool enters this state once the terminated() method has completed; the life cycle of the thread pool is over.

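Thread pool state transitions:

  • RUNNING -> SHUTDOWN: when shutdown() is called;
  • RUNNING or SHUTDOWN -> STOP: when shutdownNow() is called;
  • SHUTDOWN -> TIDYING: when both the queue and the pool are empty;
  • STOP -> TIDYING: when the pool is empty;
  • TIDYING -> TERMINATED: when the terminated() method has completed.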

Note that the state transitions of a thread pool and the state transitions of a thread are completely different concepts.
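The internal state constants are private, but part of the life cycle can be observed through the public methods isShutdown(), isTerminating(), and isTerminated(). The following sketch assumes a hypothetical class PoolStateDemo and a single task that blocks on a latch, so that the pool is guaranteed to stay in SHUTDOWN for a moment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolStateDemo {

    /** Walks a pool from RUNNING to TERMINATED and records the public flags. */
    static List<String> run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        CountDownLatch release = new CountDownLatch(1);
        List<String> log = new ArrayList<String>();

        pool.execute(() -> {
            try {
                release.await(); // keeps the pool from terminating too early
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        log.add("RUNNING: isShutdown=" + pool.isShutdown());

        pool.shutdown(); // RUNNING -> SHUTDOWN; the running task is not interrupted
        log.add("SHUTDOWN: isShutdown=" + pool.isShutdown()
                + ", isTerminating=" + pool.isTerminating()
                + ", isTerminated=" + pool.isTerminated());

        release.countDown(); // the last task finishes -> TIDYING -> TERMINATED
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        log.add("TERMINATED: isTerminated=" + pool.isTerminated());
        return log;
    }

    public static void main(String[] args) {
        for (String line : run()) {
            System.out.println(line);
        }
    }
}
```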

Use of thread pools

We now have a clear idea of how thread pools work. Next, let's look at how thread pools are used in practice, together with some techniques.

Create a thread pool quickly

There are many ways to create a thread pool, for example:

// 1. Create a fixed-size thread pool (the size is a required argument; 4 is only an example)
Executors.newFixedThreadPool(4);
// 2. Create a cached thread pool, which is based on SynchronousQueue
Executors.newCachedThreadPool();
// 3. Create a thread pool for delayed and periodic tasks (the size is required; 2 is only an example).
//    Internally it relies on a delay queue, similar in principle to the DelayQueue analyzed in the previous article.
Executors.newScheduledThreadPool(2);
// 4. Create a thread pool with only one thread
Executors.newSingleThreadExecutor();

These are four quick ways to create thread pools. If we dig a little deeper, we find that all of them are built, directly or indirectly, on the constructors provided by ThreadPoolExecutor. Each of the four pools has its own characteristics, and anyone who understands queues well will see that the essential difference lies in the choice of queue: the special behavior of each pool comes from the properties of the queue it uses. In practice, these factory methods are fine for creating a thread pool quickly when the scenario is simple and the workload is small. In a medium to large project, however, it is better to create custom thread pools directly with ThreadPoolExecutor, because they can be tailored to the real scenario.
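To show what "choosing different queues" means, the sketch below spells out constructor calls that correspond to newFixedThreadPool and newCachedThreadPool as they are implemented in current JDK versions. The class FactoryEquivalents and its methods are hypothetical, and the cast in queueTypeOf() relies on an implementation detail (both factory methods happen to return a plain ThreadPoolExecutor), so treat it as an assumption rather than a guarantee.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class FactoryEquivalents {

    /** Same shape as Executors.newFixedThreadPool(n): unbounded LinkedBlockingQueue. */
    static ThreadPoolExecutor fixed(int n) {
        return new ThreadPoolExecutor(n, n, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    /** Same shape as Executors.newCachedThreadPool(): direct hand-off queue. */
    static ThreadPoolExecutor cached() {
        return new ThreadPoolExecutor(0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
    }

    /** Inspects the queue of a pool; the cast is an implementation-detail assumption. */
    static String queueTypeOf(ExecutorService service) {
        return ((ThreadPoolExecutor) service).getQueue().getClass().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(queueTypeOf(Executors.newFixedThreadPool(3)));
        System.out.println(queueTypeOf(Executors.newCachedThreadPool()));
    }
}
```

The unbounded queue of the fixed pool and the unbounded thread count of the cached pool are exactly the properties that can exhaust memory under load, which is one reason to prefer an explicit constructor call with bounded values.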

Create a thread pool based on the constructor

ThreadPoolExecutor provides the following constructors:

public ThreadPoolExecutor(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit, BlockingQueue<Runnable> workQueue);
public ThreadPoolExecutor(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit, BlockingQueue<Runnable> workQueue, RejectedExecutionHandler handler);
public ThreadPoolExecutor(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit, BlockingQueue<Runnable> workQueue, ThreadFactory threadFactory);
public ThreadPoolExecutor(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit, BlockingQueue<Runnable> workQueue, ThreadFactory threadFactory, RejectedExecutionHandler handler);

Building thread pools through these constructors is the most flexible approach, which is why the Alibaba Java Development Specification strongly recommends creating thread pools this way. The meaning of each parameter is as follows:

Parameter          Meaning
corePoolSize       Number of core threads.
maximumPoolSize    Maximum number of threads.
keepAliveTime      How long threads in excess of the core threads may stay idle before they are terminated. Used together with unit.
unit               Time unit of keepAliveTime.
workQueue          Work queue. When all core threads are busy, new tasks are queued first. Unless there are special requirements, choose a bounded queue; otherwise a surge of tasks can exhaust memory, degrading performance or even crashing the application.
threadFactory      Thread factory used to create threads. If not specified, the default factory is used.
handler            Task rejection policy, applied when the thread pool can no longer accept new tasks. If not specified, the default rejection policy is used.
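Putting the seven parameters together might look like the following sketch. All concrete values (2 core threads, 4 maximum threads, a 30-second keep-alive, a queue capacity of 100) and the names CustomPoolFactory and "order-worker" are assumptions for illustration, not recommendations.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class CustomPoolFactory {

    /** Builds a pool with a bounded queue and recognizable thread names. */
    static ThreadPoolExecutor newPool(String namePrefix) {
        AtomicInteger sequence = new AtomicInteger(1);
        // Naming threads makes thread dumps and logs much easier to read.
        ThreadFactory factory =
                r -> new Thread(r, namePrefix + "-" + sequence.getAndIncrement());
        return new ThreadPoolExecutor(
                2,                                      // corePoolSize
                4,                                      // maximumPoolSize
                30L, TimeUnit.SECONDS,                  // keepAliveTime + unit
                new ArrayBlockingQueue<Runnable>(100),  // bounded workQueue
                factory,                                // threadFactory
                new ThreadPoolExecutor.AbortPolicy());  // handler
    }

    /** Hypothetical helper: returns the name of the thread that ran the first task. */
    static String firstThreadName() {
        ThreadPoolExecutor pool = newPool("order-worker");
        try {
            Callable<String> whoAmI = () -> Thread.currentThread().getName();
            return pool.submit(whoAmI).get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(firstThreadName());
    }
}
```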

Task rejection strategy

When the thread pool is saturated (the queue is full and the maximum number of threads has been reached), the rejection policy is applied. ThreadPoolExecutor provides four rejection policies:

Rejection policy       Description
AbortPolicy            When the thread pool is saturated, the new task is rejected and a RejectedExecutionException is thrown. This is the default policy if none is specified.
CallerRunsPolicy       When the thread pool is saturated, the task is executed on the thread that submitted it; the thread pool itself does not execute the task.
DiscardPolicy          When the thread pool is saturated, the new task is silently dropped. Tip: in real development, use this policy only if losing the task does not affect the business; otherwise never use it.
DiscardOldestPolicy    When the thread pool is saturated, the oldest task at the head of the queue is silently dropped, and the submitted task is then retried. Tip: same as above.
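The effect of CallerRunsPolicy is easy to verify by recording which thread executes an overflowing task. The sketch below is built on assumptions chosen for determinism: a hypothetical class CallerRunsDemo, one worker thread, and a queue of capacity 1, with the first two tasks blocking on a latch so that the third submission finds the pool saturated.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class CallerRunsDemo {

    /** Returns the name of the thread that executed the overflowing task. */
    static String run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(1),
                new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocking = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        AtomicReference<String> ranOn = new AtomicReference<String>();
        try {
            pool.execute(blocking); // occupies the only worker thread
            pool.execute(blocking); // fills the queue
            // The pool is saturated, so this task runs on the submitting thread.
            pool.execute(() -> ranOn.set(Thread.currentThread().getName()));
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return ranOn.get();
    }

    public static void main(String[] args) {
        System.out.println("overflow task ran on: " + run());
    }
}
```

Because the submitter is busy running the task itself, it cannot submit further tasks in the meantime, which acts as a simple form of back-pressure.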

Closing the thread pool

Thread pools need to be shut down. When an application is about to exit, it can close its thread pools through a registered callback (for example, a JVM shutdown hook). If the application is killed forcibly instead, the tasks in progress and the tasks in the queue are lost. Pay attention to this in enterprise projects. Thread pools provide two ways to shut down:

Method           Description
shutdown()       The thread pool state changes to SHUTDOWN. The thread pool accepts no new tasks, but the tasks that were already accepted are completed.
shutdownNow()    The thread pool state changes to STOP. The thread pool accepts no new tasks, attempts to interrupt all tasks in progress, and removes the tasks waiting in the queue, returning them to the caller without executing them.

Which method to use in a real project depends on the situation. If losing a task would affect the business, choose shutdown(); otherwise shutdownNow() can be used where appropriate.
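A common way to combine the two methods is to call shutdown() first, wait for a bounded time, and fall back to shutdownNow() only if the pool has not terminated. The sketch below follows that pattern; the class GracefulShutdown, the method names, and the timeouts are assumptions for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class GracefulShutdown {

    /** Returns true if the pool terminated within the given time budget. */
    static boolean shutdownGracefully(ExecutorService pool, long timeout, TimeUnit unit) {
        pool.shutdown(); // stop accepting new tasks, finish the accepted ones
        try {
            if (pool.awaitTermination(timeout, unit)) {
                return true;
            }
            pool.shutdownNow(); // interrupt running tasks, drop queued ones
            return pool.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    /** Registers a JVM shutdown hook that closes the pool when the application exits. */
    static void registerHook(ExecutorService pool) {
        Runtime.getRuntime().addShutdownHook(new Thread(
                () -> shutdownGracefully(pool, 10, TimeUnit.SECONDS),
                "pool-shutdown-hook"));
    }

    /** Hypothetical helper used to try the pattern on a small pool. */
    static boolean demo() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int i = 0; i < 4; i++) {
            pool.execute(() -> { /* simulated work */ });
        }
        return shutdownGracefully(pool, 5, TimeUnit.SECONDS) && pool.isTerminated();
    }

    public static void main(String[] args) {
        System.out.println("terminated cleanly: " + demo());
    }
}
```

Note that shutdownNow() can only interrupt tasks; a task that ignores interruption keeps running, so the second awaitTermination() may still time out.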

Thread count setting policy

Threads are a scarce resource in Java applications. Should the number of threads be set as large as possible? No. To get the best performance out of a computer system, its subsystems must be configured and used in a balanced way, and the same is true for the number of threads. To choose a reasonable number of threads, first analyze the characteristics of the tasks from the following perspectives:

  • Task nature: CPU intensive, IO intensive, hybrid.
  • Task priority: high, medium, low.
  • Task execution time: long, medium, short.
  • Task dependencies: Whether they depend on other resources, such as database connections.

The thread count can be chosen based on the nature of the task. Generally speaking, CPU-intensive tasks should get as few threads as possible; a common setting is the number of CPU cores + 1. IO-intensive tasks leave threads idle for much of the time, so they should get more threads; a common setting is 2 * the number of CPU cores. A mixed task can be split into a CPU-intensive part and an IO-intensive part. As long as the execution times of the two parts do not differ too much, running them separately performs better than running them serially. If the execution times differ greatly, splitting the task is unnecessary.
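The two rules of thumb above translate directly into code. This is a minimal sketch; the class PoolSizing and its method names are hypothetical, and the resulting numbers are starting points that should be validated by load testing.

```java
class PoolSizing {

    /** CPU-intensive tasks: number of CPU cores + 1. */
    static int cpuBound(int cores) {
        return cores + 1;
    }

    /** IO-intensive tasks: 2 * number of CPU cores. */
    static int ioBound(int cores) {
        return 2 * cores;
    }

    public static void main(String[] args) {
        // Number of processors available to the JVM at this moment.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores=" + cores
                + ", cpuBound=" + cpuBound(cores)
                + ", ioBound=" + ioBound(cores));
    }
}
```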

Thread pool monitoring

By now, using a thread pool should no longer be a problem. A robust application, however, also needs good monitoring, so that when something goes wrong during task execution the problem can be located, analyzed, and solved quickly. ThreadPoolExecutor provides some basic and useful methods for monitoring the health of a thread pool:

Method                     Description
getTaskCount()             Approximate total number of tasks that have ever been scheduled for execution (completed, running, and queued)
getCompletedTaskCount()    Approximate total number of tasks that have completed execution
getActiveCount()           Approximate number of threads that are actively executing tasks

If we want to monitor the running state of the thread pool and the execution of tasks more comprehensively, we can build a custom thread pool by extending ThreadPoolExecutor.
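The monitoring methods can be combined into a one-line snapshot that is logged periodically or exposed through a metrics endpoint. The sketch below assumes a hypothetical class PoolMonitor; getPoolSize() and getQueue() are additional JDK methods not listed in the table above. While the pool is running the counters are approximations, so the demo takes its snapshot after termination, when they are stable.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolMonitor {

    /** Formats the main health indicators of a pool as a single line. */
    static String snapshot(ThreadPoolExecutor pool) {
        return "taskCount=" + pool.getTaskCount()
                + ", completed=" + pool.getCompletedTaskCount()
                + ", active=" + pool.getActiveCount()
                + ", poolSize=" + pool.getPoolSize()
                + ", queued=" + pool.getQueue().size();
    }

    /** Hypothetical helper: runs five trivial tasks and snapshots the finished pool. */
    static String demo() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        for (int i = 0; i < 5; i++) {
            pool.execute(() -> { /* simulated work */ });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return snapshot(pool);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```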

Conclusion

This article introduced the implementation of thread pools around ThreadPoolExecutor, and how to use thread pools properly in real projects. Writing it gave me a fresh perspective on thread pools; the ctl variable, for example, is really well designed. Great respect to Doug Lea!