This is the 13th day of my participation in the August More Text Challenge

It’s better to create a thread pool manually. This makes the pool’s runtime rules explicit and avoids the risk of resource exhaustion.

So let’s take a look at the risks of letting the JDK create thread pools for us automatically. Along the way, we’ll get familiar with the typical thread pools the JDK provides.

FixedThreadPool

First, let’s look at newFixedThreadPool in code. We’ll create a new FixedThreadPoolDemo class that demonstrates how to execute tasks with a FixedThreadPool. The specific code is as follows:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class FixedThreadPoolDemo {
    public static void main(String[] args) {
        ExecutorService executorService = Executors.newFixedThreadPool(5);
        for (int i = 0; i < 1000; i++) {
            executorService.execute(new Task());
        }
    }
}

class Task implements Runnable {
    @Override
    public void run() {
        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println(Thread.currentThread().getName());
    }
}

In the main method, we create a thread pool through the Executors helper class, passing a single parameter that specifies the number of threads to keep alive. Because this is a fixed thread pool, that count never changes; whatever we set it to, it stays. Here we set it to 5, and with that we have our thread pool. Tasks are then submitted to it.

Next, we create a new task class. The task class is very simple: it sleeps for 500 milliseconds and then prints the current thread’s name.

In main we use a for loop to submit tasks — say, 1,000 of them, each an instance of our task class.

And then we’re going to execute our program. The result is as follows:

As you can see, the thread names cycle through 1, 2, 3, 4, 5 and never go beyond that, because the 1,000 tasks are executed by exactly 5 threads. So even if the load is extremely heavy, no new thread is created.

So why is that? The reason becomes clear once we look at the source code.

public static ExecutorService newFixedThreadPool(int nThreads) {
    return new ThreadPoolExecutor(nThreads, nThreads,
                                  0L, TimeUnit.MILLISECONDS,
                                  new LinkedBlockingQueue<Runnable>());
}

newFixedThreadPool actually calls the ThreadPoolExecutor constructor internally, so it is essentially a ThreadPoolExecutor. The first argument, nThreads, is the parameter we passed in, and it becomes corePoolSize: with our program above, the core thread count is 5. The second argument is the maximum pool size, which the source sets to the same value, so the pool can never grow beyond 5.

The third argument is the keep-alive time, set to 0. Since the pool can never grow beyond its core size, no threads are ever reclaimed, so this value is effectively meaningless. The fourth argument is its time unit — here milliseconds — which pairs with the keep-alive time to represent the idle timeout. The last argument is the work queue, and the queue used in the source is the very typical LinkedBlockingQueue. As described earlier, this is an unbounded queue, so no matter how many tasks come in, they can all wait in the LinkedBlockingQueue.

By passing these arguments to the constructor, we get a pool whose core thread count equals its maximum — both set to the value we passed in. The important point is that the queue is an unbounded LinkedBlockingQueue: as requests increase, more and more tasks accumulate in it. Eventually the queued tasks occupy a large amount of memory and an OOM exception, namely OutOfMemoryError, occurs, which can affect the entire program with serious consequences.

So let’s now demonstrate how a FixedThreadPool error actually happens. The specific code is as follows:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class FixedThreadPoolOOMDemo {
    private static ExecutorService executorService = Executors.newFixedThreadPool(1);

    public static void main(String[] args) {
        for (int i = 0; i < Integer.MAX_VALUE; i++) {
            executorService.execute(new TaskFixedThreadPoolOOM());
        }
    }
}

class TaskFixedThreadPoolOOM implements Runnable {
    @Override
    public void run() {
        try {
            Thread.sleep(1000000000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

We create a FixedThreadPoolOOMDemo class to demonstrate running out of memory. Even though the design looks reasonable — just multithreading, with tasks waiting in the queue — it becomes a big problem when more and more tasks pile into that queue.

Here we create a pool with a fixed number of threads — just one. We want tasks processed as slowly as possible, because the goal is for the queue to become overloaded with tasks.

Then, in the main method, we submit an enormous number of tasks — up to Integer.MAX_VALUE of them, which is more than enough — and execute them.

Which task do we submit? The task needs certain characteristics — not every task will smoothly trigger the memory overload, so we design it for our purpose. Its defining feature is obvious: the task thread simply sleeps, and the sleep time is set very long, so the task is effectively always asleep. We design it this way because we don’t want any task to finish; once a task completes, the pool naturally moves on to the next one.

Our goal is to fill the queue, so we want none of the queued tasks to end while the queue grows larger and larger over time. We therefore use the for loop to keep pushing tasks into the queue, and we need to make some adjustments before running.

With the default JVM configuration, it would take far too long to reach the overflow, so since this is a demo, let’s make the heap smaller. Set the JVM memory size in IDEA with the -Xms8m and -Xmx8m VM options. This doesn’t change the conclusion: as long as we can observe the memory overflow, we’ve shown how dangerous it is. Now run the program. The result is as follows:

The result is an OutOfMemoryError, which occurs because tasks are constantly added to the unbounded queue until they occupy too much memory. This shows one of the things that can go wrong with newFixedThreadPool: we must pay extra attention when using a fixed-size thread pool.

SingleThreadExecutor

Now let’s explore the second thread pool, SingleThreadExecutor.

So first of all, let’s go through the code by creating a new SingleThreadExecutorDemo class. The specific code is as follows:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SingleThreadExecutorDemo {
    public static void main(String[] args) {
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        for (int i = 0; i < 1000; i++) {
            executorService.execute(new TaskSingleThreadExecutor());
        }
    }
}

class TaskSingleThreadExecutor implements Runnable {
    @Override
    public void run() {
        System.out.println(Thread.currentThread().getName());
    }
}

What are the characteristics of this thread pool? As the name suggests, it is relatively simple: a single thread, meaning the pool contains only one thread. It is created with the newSingleThreadExecutor() method, which takes no arguments because the thread count is fixed at 1 — there is nothing to specify. Again we use a for loop to submit tasks. The result of running the program is as follows:

 

The console output shows the thread name, and it’s obvious that every task is executed by the first — and only — thread in the pool. With no other threads available, execution is not particularly fast. That is the nature of this pool: it has exactly one thread.

Why is that? Let’s take a look at the source code that creates it. The specific code is as follows:

public static ExecutorService newSingleThreadExecutor() {
    return new FinalizableDelegatedExecutorService
        (new ThreadPoolExecutor(1, 1,
                                0L, TimeUnit.MILLISECONDS,
                                new LinkedBlockingQueue<Runnable>()));
}

 

Looking at the source, we can see at a glance that newSingleThreadExecutor works the same way as newFixedThreadPool: the core thread count passed in is 1 and the maximum is 1.

Since the core count equals the maximum, the keep-alive time is again negligible, and the same unbounded queue, LinkedBlockingQueue, is passed in. So this pool is basically the fixed-size pool we discussed earlier; the only difference is the core and maximum thread counts passed in.

This pool behaves much like the previous one, just with the thread count set to 1 — and it suffers the same failure mode: when requests pile up, the queue can occupy a large amount of memory and an OOM exception occurs.

CachedThreadPool

Next, let’s look at a third thread pool, CachedThreadPool, which is cacheable and looks quite powerful. What does “cacheable” mean? It is an unbounded thread pool that can automatically reclaim surplus threads.

This thread pool is represented by a diagram. As shown below:

Let’s walk through the diagram to see how it works. We start with the SynchronousQueue, which we covered earlier: it is a direct-handoff queue with an internal capacity of 0, so tasks cannot wait in it — there is no storage at all. The moment a task is submitted, it is handed directly to a thread for execution, and since the pool’s maximum thread count is Integer.MAX_VALUE, there is effectively no upper limit.
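As a minimal sketch of this direct handoff (the class name is made up for illustration): a SynchronousQueue has zero capacity, so offer() fails immediately unless a consumer thread is already waiting on the queue.

```java
import java.util.concurrent.SynchronousQueue;

public class SynchronousQueueHandoffDemo {
    public static void main(String[] args) {
        SynchronousQueue<Runnable> queue = new SynchronousQueue<>();
        // No consumer is blocked in take(), and the queue has zero
        // internal capacity, so the task cannot be buffered anywhere.
        boolean accepted = queue.offer(() -> { });
        System.out.println(accepted); // false
    }
}
```

This is exactly why CachedThreadPool must create a new thread whenever no idle thread is waiting on the queue: a submitted task has nowhere else to go.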

So no matter how many tasks are submitted, that many threads are created to execute them. “Cacheable” means the surplus threads are reclaimed after a period of idleness: once tasks finish and threads sit idle past the timeout, they are collected. The default timeout is 60 seconds.

Let’s take a look at how this thread pool is used. The specific code is as follows:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CachedThreadPoolDemo {
    public static void main(String[] args) {
        ExecutorService executorService = Executors.newCachedThreadPool();
        for (int i = 0; i < 1000; i++) {
            executorService.execute(new TaskCachedThreadPool());
        }
    }
}

class TaskCachedThreadPool implements Runnable {
    @Override
    public void run() {
        System.out.println(Thread.currentThread().getName());
    }
}

The result of executing the program is as follows:

The main thing to notice is the thread numbers, which are quite different from before. The fixed and single pools used 5 threads or 1, but here hundreds of threads are created to execute the tasks. After a while, when those threads find no more tasks to run, they are reclaimed — that is the nature of this pool.

There are problems with this thread pool. What’s the problem? Let’s take a look at its source code. The specific code is as follows:

public static ExecutorService newCachedThreadPool() {
    return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                                  60L, TimeUnit.SECONDS,
                                  new SynchronousQueue<Runnable>());
}

You can see that the core thread count is 0 and the maximum is Integer.MAX_VALUE, which means there is effectively no maximum. Since threads can be created almost without limit, if too many tasks force too many threads, we can exceed the operating system’s thread limit and fail to create new threads, or simply run out of memory — either way, an OOM exception is possible.

ScheduledThreadPool

Let’s take a look at the characteristics of this thread pool. This thread pool supports timed and periodic execution of tasks. The specific implementation code is as follows:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ScheduleThreadPoolDemo {
    public static void main(String[] args) {
        ScheduledExecutorService scheduledExecutorService = Executors.newScheduledThreadPool(10);
        scheduledExecutorService.schedule(new TaskScheduleThreadPool(), 5, TimeUnit.SECONDS);
    }
}

class TaskScheduleThreadPool implements Runnable {
    @Override
    public void run() {
        System.out.println(Thread.currentThread().getName());
    }
}

Result of execution run:

 

There are several uses for this thread pool implementation, and we’ll explore each of them below.

public <V> ScheduledFuture<V> schedule(Callable<V> callable,
                                       long delay, TimeUnit unit);

Creates and executes a ScheduledFuture that becomes enabled after the given delay. callable: the function to execute; delay: the time from now to delay execution; unit: the time unit of the delay parameter.
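As a small sketch (class name and values are made up for illustration), here is how the Callable variant can return a result through the ScheduledFuture:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class ScheduleCallableDemo {
    public static void main(String[] args) throws Exception {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(1);
        // Run the Callable after a 100 ms delay; get() blocks until
        // the task has executed and its result is available.
        ScheduledFuture<Integer> future =
                pool.schedule(() -> 21 * 2, 100, TimeUnit.MILLISECONDS);
        System.out.println(future.get()); // 42
        pool.shutdown();
    }
}
```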

public ScheduledFuture<?> schedule(Runnable command,
                                   long delay, TimeUnit unit);

Creates and executes a one-shot action that becomes enabled after the given delay. command: the task to execute; delay: the time from now to delay execution; unit: the time unit of the delay parameter.

public ScheduledFuture<?> scheduleAtFixedRate(Runnable command,
                                              long initialDelay,
                                              long period, TimeUnit unit);

Creates and executes a periodic action that becomes enabled first after the given initialDelay, and subsequently with the given period. command: the task to execute; initialDelay: the time to delay the first execution; period: the period between successive executions; unit: the time unit of the initialDelay and period parameters.

public ScheduledFuture<?> scheduleWithFixedDelay(Runnable command,
                                                 long initialDelay,
                                                 long delay, TimeUnit unit);

Creates and executes a periodic action that becomes enabled first after the given initial delay, and subsequently with the given delay between the termination of one execution and the start of the next. command: the task to execute; initialDelay: the time to delay the first execution; delay: the delay between the end of one execution and the start of the next; unit: the time unit of the initialDelay and delay parameters.
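To make the difference between the last two methods concrete, here is a minimal sketch (class name and timings are invented) that counts how often each variant fires:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class FixedRateVsFixedDelayDemo {
    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);
        AtomicInteger rateRuns = new AtomicInteger();
        AtomicInteger delayRuns = new AtomicInteger();

        // Period is measured from the start of one run to the start of
        // the next; a slow task makes subsequent runs fire back to back.
        pool.scheduleAtFixedRate(rateRuns::incrementAndGet,
                0, 200, TimeUnit.MILLISECONDS);

        // Delay is measured from the end of one run to the start of the
        // next, so the gap between runs is always honored.
        pool.scheduleWithFixedDelay(delayRuns::incrementAndGet,
                0, 200, TimeUnit.MILLISECONDS);

        Thread.sleep(1100);
        pool.shutdownNow();
        System.out.println("fixed-rate runs:  " + rateRuns.get());
        System.out.println("fixed-delay runs: " + delayRuns.get());
    }
}
```

Because these tasks finish almost instantly, both counters end up similar here; the two methods diverge when the task itself takes a noticeable fraction of the period.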

Why thread pools should be created manually

The correct way to create a thread pool is not via the automatic factory methods: those pools are designed in advance and may not fit our business. It is best to set the pool’s parameters according to our own business scenario.

For example: how much memory do we have? How should threads be named? If we want pool threads to carry our own names, we need to pass in our own thread factory. How should rejected tasks be logged? All of these are tied to our business. Our level of concurrency may determine how many threads we need, and that is best decided after investigation, combined with the business. Parameters customized this way truly fit our scenario.
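As a hedged sketch of what manual creation can look like (the pool sizes, queue capacity, thread-name prefix, and rejection behavior below are invented placeholders, not recommendations), we pass our own ThreadFactory for naming and our own RejectedExecutionHandler for logging:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ManualThreadPoolDemo {
    public static void main(String[] args) {
        // Hypothetical business-specific choices: 4 core threads, 8 max,
        // a *bounded* queue of 100 tasks, named threads, logged rejections.
        AtomicInteger counter = new AtomicInteger(1);
        ThreadFactory namedFactory = runnable -> {
            Thread t = new Thread(runnable);
            t.setName("order-service-" + counter.getAndIncrement());
            return t;
        };
        RejectedExecutionHandler logAndDrop = (task, pool) ->
                System.err.println("Rejected task, pool is saturated: " + task);

        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                4, 8,
                60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(100),
                namedFactory,
                logAndDrop);

        executor.execute(() ->
                System.out.println(Thread.currentThread().getName()));
        executor.shutdown();
    }
}
```

With a bounded queue and an explicit handler, saturation shows up as a logged rejection instead of a silent, ever-growing queue.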

What is the appropriate number of threads? CPU cores versus thread count

Let’s discuss the appropriate number of threads and the relationship between CPU core count and thread count. Earlier we said it is better to configure the thread pool manually. But how do we determine these parameters?

The most important parameters in the thread pool are corePoolSize and maxPoolSize, which determine the number of threads we have.

My advice is based on years of personal experience, and there are different rules depending on what we do.

First, CPU-intensive tasks: heavy computation such as encryption, decryption, hashing, and compression are very typical examples, where the CPU works continuously at nearly full load. Here you should set the thread count to 1–2 times the number of CPU cores — with an 8-core CPU, somewhere between 8 and 16. That is sufficient, because the threads are already working at full capacity: each thread effectively gets its own CPU and keeps executing. If we create too many threads, each one competes for CPU time to run its own task, causing unnecessary context switches. In that case, adding threads does not improve performance — it degrades it.

In this case, it is also a good idea to consider what other programs on the same machine may use significant CPU, and balance overall CPU usage accordingly.

Second, IO-intensive tasks: reading and writing databases or files, and network communication. This is different — why? Because during network calls the CPU usually isn’t working, and peripherals are slower than the CPU when reading and writing files, so the CPU often sits idle. In this case, you can set the thread count to many times the core count — for example, 10 times on an 8-core CPU, which gives 80 threads to execute tasks. This is in fact reasonable: even though there appear to be 80 threads, most of them are actually waiting, either for a file or for the network, so this configuration gets the most out of the CPU.

So there are different sizing rules depending on the business: set the pool’s thread count according to the type of task.

Brian Goetz, author of Java Concurrency in Practice, recommends the following calculation:

Formula for setting the number of threads: thread count = number of CPU cores * (1 + average wait time / average compute time).

Using this formula, we can calculate a reasonable thread count: the count increases when a task’s average wait time is long, and decreases when, as in the CPU-intensive case above, the wait time is short.

Wait time is the time a thread spends blocked — most obviously while reading from a database; the longer the read takes, the longer the wait. Suppose the wait time is 100 s and the average compute time is short, say 1 s: the thread waits 100 s for the database to return data, then computes for 1 s. With that ratio, the result is 101 * the number of CPU cores. That is how the calculation works.
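The arithmetic above can be sketched in code (the class and method names are invented for illustration):

```java
public class ThreadCountFormula {
    // Brian Goetz's sizing formula:
    // threads = cores * (1 + waitTime / computeTime)
    static int idealThreads(int cores, double waitTime, double computeTime) {
        return (int) (cores * (1 + waitTime / computeTime));
    }

    public static void main(String[] args) {
        // CPU-bound: almost no waiting, so roughly one thread per core.
        System.out.println(idealThreads(8, 0, 1));    // 8
        // IO-bound example from the text: wait 100 s, compute 1 s.
        System.out.println(idealThreads(8, 100, 1));  // 808
    }
}
```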

The ideal thread count actually varies with the task, so this formula is only a guideline. For the most accurate number, run load tests against the actual application while monitoring JVM thread counts and CPU load, and derive a suitable thread count from the real behavior — that makes reasonable, full use of resources. For a rough estimate, the formula above is enough.