Welcome to King of Concurrency. This is the 25th article in this series, and the second in the current module.

In the last article, we learned about ThreadPoolExecutor, a thread pool that handles concurrent tasks by efficiently managing task queues and worker threads. However, ThreadPoolExecutor has two obvious disadvantages. First, it cannot split a large task: each task can only be executed by a single thread. Second, worker threads contend with one another when taking tasks from the shared queue. Both disadvantages hurt throughput, and every millisecond counts in a highly concurrent scenario.

The ForkJoinPool described in this article provides alternative answers to both of these problems. We will start with the divide-and-conquer algorithm, then get hands-on experience by implementing a custom task for ForkJoinPool, and finally dive into the Java source code to understand the principles and usage of ForkJoinPool.

This article is about 20,000 words long, so it is recommended that you skim the table of contents before reading, or bookmark it for later.

The divide-and-conquer algorithm and the Fork/Join model

In concurrent computing, the Fork/Join model is often used for parallel processing of large tasks. It recursively splits a task into smaller pieces and then merges their results. Conceptually, Fork/Join is not complex: it is essentially an application of the divide-and-conquer algorithm.

The basic idea of the divide-and-conquer algorithm is to decompose a problem of size N into K smaller subproblems that are independent of each other and have the same structure as the original problem. Solving the subproblems yields the solution to the original problem. The steps of the divide-and-conquer algorithm are as follows:

  • (1) Decompose: split the problem to be solved into a number of smaller, similar subproblems;
  • (2) Solve: when a subproblem is small enough, solve it directly with a simpler method;
  • (3) Merge: merge the solutions of the subproblems layer by layer, according to the requirements of the original problem, to form the solution of the original problem.

The same is true for the Fork/Join process of splitting tasks and merging results, which can be represented by the following pseudocode:

solve(problem):
    if problem is small enough:
        // If the task is small enough, execute it
        solve problem directly (sequential algorithm)
    else:
        // Split the task
        for part in subdivide(problem):
            fork subtask to solve(part)
        // Combine the results
        join all subtasks spawned in previous loop
        return combined results

Therefore, to understand the Fork/Join model and the ForkJoinPool thread pool, you must first understand the idea of the algorithm behind them, because ForkJoinPool is just one implementation and application of that algorithm.

Application scenarios and hands-on experience with Fork/Join

In line with the idea -> implementation -> source code approach advocated in this series, now that we understand the Fork/Join idea, we will first implement a RecursiveTask by hand for a concrete scenario, so that you can better experience how Fork/Join is used.

Scenario: Given two natural numbers, calculate the sum between the two numbers. For example, the sum between 1 and n: 1+2+3+… +n

To solve this problem, we create a core class called TheKingRecursiveSumTask, which extends RecursiveTask. RecursiveTask is one of the task types in ForkJoinPool; you don’t need to know more about it for now, as it is covered in detail later in this article.

TheKingRecursiveSumTask defines the start and end of the range to be summed (sumBegin and sumEnd), the split threshold (threshold), and the core computation logic in compute().

import java.util.concurrent.RecursiveTask;
import java.util.concurrent.atomic.AtomicInteger;

public class TheKingRecursiveSumTask extends RecursiveTask<Long> {
    private static final AtomicInteger taskCount = new AtomicInteger();
    private final int sumBegin;
    private final int sumEnd;
    /** Task splitting threshold. When the task size is larger than this value, the task will be split */
    private final int threshold;

    public TheKingRecursiveSumTask(int sumBegin, int sumEnd, int threshold) {
        this.sumBegin = sumBegin;
        this.sumEnd = sumEnd;
        this.threshold = threshold;
    }

    @Override
    protected Long compute() {
        if ((sumEnd - sumBegin) > threshold) {
            // If the range is larger than the threshold, split the task into two halves
            TheKingRecursiveSumTask subTask1 = new TheKingRecursiveSumTask(sumBegin, (sumBegin + sumEnd) / 2, threshold);
            TheKingRecursiveSumTask subTask2 = new TheKingRecursiveSumTask((sumBegin + sumEnd) / 2, sumEnd, threshold);
            subTask1.fork();
            subTask2.fork();
            taskCount.incrementAndGet();
            return subTask1.join() + subTask2.join();
        }
        // The range is small enough: compute it directly
        long result = 0L;
        for (int i = sumBegin; i < sumEnd; i++) {
            result += i;
        }
        return result;
    }

    public static AtomicInteger getTaskCount() {
        return taskCount;
    }
}

In the following code, we sum the range from 0 to 10,000,000. When the size of a sub-range exceeds 100, the task is split, and the maximum parallelism is set to 16.

public static void main(String[] args) {
    int sumBegin = 0, sumEnd = 10000000;
    computeByForkJoin(sumBegin, sumEnd);
    computeBySingleThread(sumBegin, sumEnd);
}

private static void computeByForkJoin(int sumBegin, int sumEnd) {
    ForkJoinPool forkJoinPool = new ForkJoinPool(16);
    long forkJoinStartTime = System.nanoTime();
    TheKingRecursiveSumTask theKingRecursiveSumTask = new TheKingRecursiveSumTask(sumBegin, sumEnd, 100);
    long forkJoinResult = forkJoinPool.invoke(theKingRecursiveSumTask);
    System.out.println("======");
    System.out.println("ForkJoin task splits: " + TheKingRecursiveSumTask.getTaskCount());
    System.out.println("ForkJoin result: " + forkJoinResult);
    System.out.println("ForkJoin computation time (ms): " + (System.nanoTime() - forkJoinStartTime) / 1000000);
}

private static void computeBySingleThread(int sumBegin, int sumEnd) {
    long computeResult = 0L;
    long startTime = System.nanoTime();
    for (int i = sumBegin; i < sumEnd; i++) {
        computeResult += i;
    }
    System.out.println("======");
    System.out.println("Single-thread result: " + computeResult);
    System.out.println("Single-thread computation time (ms): " + (System.nanoTime() - startTime) / 1000000);
}

The running results are as follows:

======
ForkJoin task splits: 131071
ForkJoin result: 49999995000000
ForkJoin computation time (ms): 207
======
Single-thread result: 49999995000000
Single-thread computation time (ms): 40

Process finished with exit code 0

The results show that ForkJoinPool split the task 131,071 times, the final result is 49999995000000, and the computation took 207 milliseconds.

If you were careful, however, you may have noticed that ForkJoin’s parallel calculation takes more time than the single-threaded one, in fact nearly five times longer! Before you panic: we will come back to ForkJoin’s performance later in this article.

ForkJoinPool design and source code analysis

In Java, ForkJoinPool is an implementation of the Fork/Join model. It was introduced in Java 7 and is widely used in Java 8. ForkJoinPool allows other threads to submit tasks to it and, depending on its settings, splits those tasks into finer-grained subtasks that are executed in parallel by the worker threads inside the pool; worker threads can also steal tasks from one another.

ForkJoinPool and ThreadPoolExecutor are similar in their interface implementations and inheritance: both implement the Executor and ExecutorService interfaces and inherit from the AbstractExecutorService abstract class. In terms of task types, ForkJoinPool mainly works with two kinds of tasks, RecursiveAction and RecursiveTask, both of which inherit from ForkJoinTask. The relationships are shown in the figure below:

Reading the ForkJoinPool source code is not easy. Although the idea is simple, the implementation has to handle many details, and some of the code is not very readable, so explaining the entire source code is impractical and unnecessary. In the following sections, we will mainly walk through the code related to task submission and execution; if you are interested in the rest, you can read it on your own.

1. Different ways to construct ForkJoinPool

ForkJoinPool has four core parameters, which control the parallelism of the pool, the creation of worker threads, exception handling, and the queueing mode. The parameters are described as follows:

  • int parallelism: the parallelism level. ForkJoinPool determines the number of worker threads based on this setting. If it is not set, Runtime.getRuntime().availableProcessors() is used as the parallelism level;
  • ForkJoinWorkerThreadFactory factory: the factory used when creating worker threads. Note that it must implement ForkJoinWorkerThreadFactory, not ThreadFactory. If no factory is specified, the default DefaultForkJoinWorkerThreadFactory is responsible for creating the threads;
  • UncaughtExceptionHandler handler: the exception handler. When a task throws an error while running, it is handled by this handler;
  • boolean asyncMode: from the name you might think this is an "asynchronous mode" switch, but it actually sets how the work queues operate: asyncMode ? FIFO_QUEUE : LIFO_QUEUE. When asyncMode is true, first-in-first-out queues are used; when false, last-in-first-out queues are used.

ForkJoinPool provides three constructors built around these four core parameters, and you can choose one of them as needed.

(1) Method 1: Default no-argument construction

With this constructor, you don’t set any parameters. ForkJoinPool sets the parallelism based on the current number of processors and uses the default thread factory. This approach is not recommended.

public ForkJoinPool() {
    this(Math.min(MAX_CAP, Runtime.getRuntime().availableProcessors()),
         defaultForkJoinWorkerThreadFactory, null, false);
}

(2) Method 2: Construction by parallel number

With this constructor, you can specify the parallelism level to better balance the number of processors and the expected load. It is recommended to set the parallelism level lower than the current number of processors.

public ForkJoinPool(int parallelism) {
    this(parallelism, defaultForkJoinWorkerThreadFactory, null, false);
}

(3) Method 3: Constructing with all parameters

Both of the constructors above delegate to this one, which lets you configure all of the core parameters. To manage a ForkJoinPool more precisely, it is recommended that you use this constructor.

public ForkJoinPool(int parallelism,
                    ForkJoinWorkerThreadFactory factory,
                    UncaughtExceptionHandler handler,
                    boolean asyncMode) {
    this(checkParallelism(parallelism),
         checkFactory(factory),
         handler,
         asyncMode ? FIFO_QUEUE : LIFO_QUEUE,
         "ForkJoinPool-" + nextPoolId() + "-worker-");
    checkPermission();
}
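As a quick illustration, here is a minimal sketch of the full constructor in use. The thread-naming factory and the logging exception handler below are assumptions made for this example, not code from the JDK or from the scenario above.

// Minimal sketch, assuming a custom thread-name prefix and a simple logging handler.
ForkJoinPool.ForkJoinWorkerThreadFactory factory = pool -> {
    ForkJoinWorkerThread thread =
            ForkJoinPool.defaultForkJoinWorkerThreadFactory.newThread(pool);
    thread.setName("the-king-worker-" + thread.getPoolIndex()); // illustrative name
    return thread;
};
Thread.UncaughtExceptionHandler handler =
        (t, e) -> System.err.println(t.getName() + " failed: " + e);
// parallelism = 8, LIFO work queues (asyncMode = false)
ForkJoinPool pool = new ForkJoinPool(8, factory, handler, false);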

2. Submit tasks by type

Task submission is one of the core capabilities of ForkJoinPool. You have three options when submitting a task, as shown in the table below:

| | Called from a non-fork/join thread | Called from within a fork/join computation |
| --- | --- | --- |
| Submit for asynchronous execution | execute(ForkJoinTask) | ForkJoinTask.fork() |
| Wait for and get the result | invoke(ForkJoinTask) | ForkJoinTask.invoke() |
| Submit and get a Future result | submit(ForkJoinTask) | ForkJoinTask.fork() (ForkJoinTasks are Futures) |

(1) The first core method: invoke

The invoke method accepts a ForkJoinTask and returns its (generic) result when the task finishes. If the submitted task is null, a NullPointerException is thrown.

public <T> T invoke(ForkJoinTask<T> task) {
    if (task == null)
        throw new NullPointerException();
    externalPush(task);
    return task.join();
}

(2) The second core method: execute

Methods of the execute type do not return a result after the task is submitted. Note that ForkJoinPool accepts not only ForkJoinTask tasks but also Callable and Runnable tasks, so you can use it the same way you use an ordinary ExecutorService.

Of course, a Callable or Runnable task is converted to a ForkJoinTask internally, as you can see in the source code below. Is there any difference between submitting such a task and submitting a ForkJoinTask? There is. Since such a task cannot be split, it does not benefit from task splitting, but it still benefits from task stealing and the resulting performance gains.

public void execute(ForkJoinTask<?> task) {
    if (task == null)
        throw new NullPointerException();
    externalPush(task);
}

public void execute(Runnable task) {
    if (task == null)
        throw new NullPointerException();
    ForkJoinTask<?> job;
    if (task instanceof ForkJoinTask<?>) // avoid re-wrap
        job = (ForkJoinTask<?>) task;
    else
        job = new ForkJoinTask.RunnableExecuteAction(task);
    externalPush(job);
}
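For completeness, here is a minimal usage sketch (an assumed example, not from the original scenario): execute() is fire-and-forget, so the Runnable’s outcome cannot be retrieved by the caller.

ForkJoinPool pool = ForkJoinPool.commonPool();
// The Runnable is wrapped into a ForkJoinTask internally; nothing is returned to the caller.
pool.execute(() -> System.out.println("running in " + Thread.currentThread().getName()));
// Wait briefly for the task to finish before the demo exits (illustrative only).
pool.awaitQuiescence(1, java.util.concurrent.TimeUnit.SECONDS);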

(3) The third core method: submit

Methods of the submit type support three kinds of tasks: ForkJoinTask, Callable, and Runnable. After the task is submitted, a ForkJoinTask is returned. A NullPointerException is thrown if the submitted task is null, and a RejectedExecutionException is thrown if the task cannot be scheduled for execution.

public <T> ForkJoinTask<T> submit(ForkJoinTask<T> task) {
    if (task == null)
        throw new NullPointerException();
    externalPush(task);
    return task;
}

public <T> ForkJoinTask<T> submit(Callable<T> task) {
    ForkJoinTask<T> job = new ForkJoinTask.AdaptedCallable<T>(task);
    externalPush(job);
    return job;
}

public <T> ForkJoinTask<T> submit(Runnable task, T result) {
    ForkJoinTask<T> job = new ForkJoinTask.AdaptedRunnable<T>(task, result);
    externalPush(job);
    return job;
}

public ForkJoinTask<?> submit(Runnable task) {
    if (task == null)
        throw new NullPointerException();
    ForkJoinTask<?> job;
    if (task instanceof ForkJoinTask<?>) // avoid re-wrap
        job = (ForkJoinTask<?>) task;
    else
        job = new ForkJoinTask.AdaptedRunnableAction(task);
    externalPush(job);
    return job;
}
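A small usage sketch (an assumed example): because the returned ForkJoinTask is also a Future, you can wait for the result with join() or get().

ForkJoinPool pool = new ForkJoinPool();
ForkJoinTask<Integer> task = pool.submit(() -> 1 + 2); // submitted as a Callable<Integer>
System.out.println(task.join());                       // prints 3
pool.shutdown();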

3. ForkJoinTask

ForkJoinTask is one of the core components of ForkJoinPool. It is the actual carrier of a task and defines the task’s execution logic and splitting logic. The example code earlier in this article inherits from it. As an abstract class, ForkJoinTask behaves somewhat like a thread, but it is much more lightweight because it does not maintain its own run-time stack, program counter, and so on.

In terms of class design, ForkJoinTask implements the Future interface, so it can also be regarded as a lightweight Future. The relationship between them is shown in the following figure.

(1) Fork and join

fork() and join() are the core methods of ForkJoinTask, and indeed of ForkJoinPool as a whole. Together they carry the main task-coordination work: one submits a task, the other retrieves its result.

fork() – submit a task

The fork() method submits a task to the thread pool in which the current task is running, such as subTask1.fork() in the example code above. Note that, unlike with other thread pools, task submission is done by the task itself calling fork(). Don’t be surprised: internally, fork() associates the task with the current thread.

In the source code, if the current thread is a ForkJoinWorkerThread, the task is pushed onto that thread’s work queue; otherwise it is submitted to the common pool. More on the common pool later.

public final ForkJoinTask<V> fork() {
    Thread t;
    if ((t = Thread.currentThread()) instanceof ForkJoinWorkerThread)
        ((ForkJoinWorkerThread) t).workQueue.push(this);
    else
        ForkJoinPool.common.externalPush(this);
    return this;
}

join() – get the task's result

You already know that you can submit a task with fork(). Correspondingly, you can get the task’s result with join().

When join() is called, the current thread blocks until the corresponding subtask finishes and returns its result. In the source code, the core logic of join() is handled by doJoin(). Although doJoin() is short, it is not very readable.

public final V join() {
    int s;
    // If doJoin() returns an abnormal status, report the exception
    if ((s = doJoin() & DONE_MASK) != NORMAL)
        reportException(s);
    // Execution finished normally; return the raw result
    return getRawResult();
}

private int doJoin() {
    int s;
    Thread t;
    ForkJoinWorkerThread wt;
    ForkJoinPool.WorkQueue w;
    // If already done, return the status
    return (s = status) < 0 ? s :
        // Not done yet. If the current thread is a ForkJoinWorkerThread, try to unpush this task
        // from its work queue and execute it directly. If that completes, return the status;
        // otherwise wait via the pool's awaitJoin() method.
        ((t = Thread.currentThread()) instanceof ForkJoinWorkerThread) ?
        (w = (wt = (ForkJoinWorkerThread) t).workQueue).
        tryUnpush(this) && (s = doExec()) < 0 ? s :
        wt.pool.awaitJoin(w, this, 0L) :
        // The current thread is not a ForkJoinWorkerThread; wait via externalAwaitDone()
        externalAwaitDone();
}

final int doExec() {
    int s;
    boolean completed;
    if ((s = status) >= 0) {
        try {
            completed = exec();
        } catch (Throwable rex) {
            return setExceptionalCompletion(rex);
        }
        // After execution completes, set the status to NORMAL
        if (completed)
            s = setCompletion(NORMAL);
    }
    return s;
}
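Incidentally, the compute() in our earlier example forks both halves and then joins both. A commonly used variant, shown in this sketch (an illustration, not the article’s original code), forks only one half and computes the other half in the current thread, so the calling worker keeps doing useful work instead of merely waiting in join(). The Fibonacci example later in this section uses the same pattern.

@Override
protected Long compute() {
    if ((sumEnd - sumBegin) <= threshold) {
        long result = 0L;
        for (int i = sumBegin; i < sumEnd; i++) {
            result += i;
        }
        return result;
    }
    int mid = (sumBegin + sumEnd) / 2;
    TheKingRecursiveSumTask left = new TheKingRecursiveSumTask(sumBegin, mid, threshold);
    TheKingRecursiveSumTask right = new TheKingRecursiveSumTask(mid, sumEnd, threshold);
    left.fork();                         // submit the left half asynchronously
    long rightResult = right.compute();  // compute the right half in the current thread
    return left.join() + rightResult;    // then wait for the forked half and combine
}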

(2) RecursiveAction and RecursiveTask

There are two kinds of tasks commonly used with ForkJoinPool: those that return a result and those that do not, much like the Runnable and Callable tasks used with thread pools such as ThreadPoolExecutor. The two classes are RecursiveAction and RecursiveTask; as the class diagram shows, both inherit from ForkJoinTask.

RecursiveAction: No result returned

RecursiveAction is used for tasks that execute recursively but do not need to return a result; the sorting task below is a typical application. When you use RecursiveAction, you need to extend it and implement its core method compute().

static class SortTask extends RecursiveAction {
    final long[] array;
    final int lo, hi;

    SortTask(long[] array, int lo, int hi) {
        this.array = array;
        this.lo = lo;
        this.hi = hi;
    }

    SortTask(long[] array) {
        this(array, 0, array.length);
    }

    // Core computation method
    protected void compute() {
        if (hi - lo < THRESHOLD)
            // Small enough: sort directly
            sortSequentially(lo, hi);
        else {
            // Split the task
            int mid = (lo + hi) >>> 1;
            invokeAll(new SortTask(array, lo, mid),
                      new SortTask(array, mid, hi));
            merge(lo, mid, hi);
        }
    }

    // implementation details follow:
    static final int THRESHOLD = 1000;

    void sortSequentially(int lo, int hi) {
        Arrays.sort(array, lo, hi);
    }

    void merge(int lo, int mid, int hi) {
        long[] buf = Arrays.copyOfRange(array, lo, mid);
        for (int i = 0, j = lo, k = mid; i < buf.length; j++)
            array[j] = (k == hi || buf[i] < array[k]) ? buf[i++] : array[k++];
    }
}

RecursiveTask: Returns the result

RecursiveTask is used for recursively executed tasks that need to return a result, such as the summation in the earlier example or the Fibonacci computation below. When using RecursiveTask, you also need to extend it and implement its core method compute().

class Fibonacci extends RecursiveTask<Integer> {
    final int n;

    Fibonacci(int n) { this.n = n; }

    Integer compute() {
        if (n <= 1)
            return n;
        Fibonacci f1 = new Fibonacci(n - 1);
        f1.fork();
        Fibonacci f2 = new Fibonacci(n - 2);
        return f2.compute() + f1.join();
    }
}

(3) Limitations when using ForkJoinTask

Although ForkJoinTask can execute efficiently by splitting tasks in some scenarios, it is important to note that it is not suitable for every scenario. ForkJoinTask comes with some limitations, and ignoring them can be counterproductive or even disastrous.

Why do you say that?

This is because ForkJoinTask works best for pure computations, that is, pure-function-style calculations over objects that are independent and have no external dependencies. Imagine what would happen if a large number of tasks, or the subtasks they are split into, depended on each other or blocked heavily on external resources. It is no exaggeration to say that external dependencies make both task execution and problem diagnosis unpredictable.

Therefore, tasks submitted to a ForkJoinPool should ideally avoid blocking I/O, to prevent accidents that are hard to manage. Of course, this is not absolute: you can define ForkJoinTasks that block if necessary, but doing so safely takes a lot of care and effort, and it is not covered in this article.

4. Work queues and task stealing

As mentioned earlier, a significant difference between ForkJoinPool and ThreadPoolExecutor is that ForkJoinPool introduces a work-stealing design, which is one of the keys to its performance.

Task stealing, simply put, allows idle threads to steal tasks from the double-ended queues (deques) of busy threads. By default, a worker thread takes tasks from the head of its own deque. When its own queue is empty, however, the thread takes a task from the tail of another busy thread’s deque. This approach minimizes the chance of threads contending for the same tasks.

Most operations in ForkJoinPool happen in the work-stealing queues, which are implemented by the inner class WorkQueue. There is nothing magical about this queue: it is a special form of deque that supports only three operations: push, pop, and poll (also known as stealing). There are, however, strict access rules: push and pop can only be called from the thread that owns the queue, while poll may be called from other threads. In other words, the first two methods are for the owner’s own use, and the third is how other threads steal tasks. The task-stealing process is illustrated in the picture below, which is worth saving for reference:

Why does a worker thread always take tasks from the head of its own queue? Why is it designed this way? Wouldn’t it make more sense to handle the tasks that have been waiting longer first?

The answer is no. The main reason is performance: by always picking the most recently submitted task, the data it touches is more likely to still be in the CPU cache, so the CPU can process it faster. The stealing thread takes tasks from the tail in order to reduce contention between threads; after all, if everyone took tasks from the same end, contention would be far more likely.

There is one more consideration behind this design. Because tasks are splittable, the older tasks near the tail of the queue are likely to be the coarsest-grained ones, since they may not have been split yet, and an idle thread has relatively more "energy" to complete these larger tasks.

5. ForkJoinPool monitoring

For a framework this complex, understanding the internal state of a ForkJoinPool in real time is essential. ForkJoinPool therefore provides some commonly used methods; with them, you can inspect the current worker threads, task processing, and so on.

(1) Get the total number of threads in the running state

public int getRunningThreadCount() {
    int rc = 0;
    WorkQueue[] ws;
    WorkQueue w;
    if ((ws = workQueues) != null) {
        for (int i = 1; i < ws.length; i += 2) {
            if ((w = ws[i]) != null && w.isApparentlyUnblocked())
                ++rc;
        }
    }
    return rc;
}

(2) Obtain the number of active threads

public int getActiveThreadCount() {
    int r = (config & SMASK) + (int)(ctl >> AC_SHIFT);
    return (r <= 0) ? 0 : r; // suppress momentarily negative values
}

(3) Determine whether the ForkJoinPool is idle

public boolean isQuiescent() {
    return (config & SMASK) + (int)(ctl >> AC_SHIFT) <= 0;
}

(4) Obtain the number of tasks stolen

public long getStealCount() {
    AtomicLong sc = stealCounter;
    long count = (sc == null) ? 0L : sc.get();
    WorkQueue[] ws;
    WorkQueue w;
    if ((ws = workQueues) != null) {
        for (int i = 1; i < ws.length; i += 2) {
            if ((w = ws[i]) != null)
                count += w.nsteals;
        }
    }
    return count;
}

(5) Obtain the number of tasks in the queue

public long getQueuedTaskCount() {
    long count = 0;
    WorkQueue[] ws;
    WorkQueue w;
    if ((ws = workQueues) != null) {
        for (int i = 1; i < ws.length; i += 2) {
            if ((w = ws[i]) != null)
                count += w.queueSize();
        }
    }
    return count;
}

(6) Obtain the number of submitted tasks

public int getQueuedSubmissionCount() {
    int count = 0;
    WorkQueue[] ws;
    WorkQueue w;
    if ((ws = workQueues) != null) {
        for (int i = 0; i < ws.length; i += 2) {
            if ((w = ws[i]) != null)
                count += w.queueSize();
        }
    }
    return count;
}
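To tie these together, here is a minimal monitoring sketch (an assumed example that reuses the summing task from earlier in this article; the output labels are illustrative, and the values read immediately after submission are only momentary snapshots):

ForkJoinPool pool = new ForkJoinPool(16);
pool.submit(new TheKingRecursiveSumTask(0, 10000000, 100));
System.out.println("parallelism:        " + pool.getParallelism());
System.out.println("pool size:          " + pool.getPoolSize());
System.out.println("active threads:     " + pool.getActiveThreadCount());
System.out.println("running threads:    " + pool.getRunningThreadCount());
System.out.println("queued tasks:       " + pool.getQueuedTaskCount());
System.out.println("queued submissions: " + pool.getQueuedSubmissionCount());
System.out.println("steal count:        " + pool.getStealCount());
pool.shutdown();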

Be alert to ForkJoinPool#commonPool

In the source code shown above, you may have noticed commonPool appearing in several places. In ForkJoinPool, commonPool is a shared, static thread pool that is lazily created when first used; it is what CompletableFuture and parallel streams use in Java 8. However, while you can specify your own thread pool when using CompletableFuture, parallel streams cannot, and that deserves caution. Why?

commonPool was designed to reduce the repeated creation of thread pools by letting tasks share the same pool, since creating pools and threads is expensive. However, every coin has two sides. commonPool lets you reuse a pool in some situations, but once you decide to share a space with others, it may no longer be entirely yours when you need it. In other words, by the time you want to use commonPool, it may already be filled with other tasks.

Tasks submitted to a ForkJoinPool fall into two broad categories: computational and blocking. Consider a scenario where the shared pool is used in multiple places in your application, and somewhere someone makes a mistake, for example by dropping a blocking task into the pool. What happens? The entire thread pool may end up blocked, and with it the whole application is at risk of being dragged down. This is why you should be especially wary of parallel streams in Java 8.

So how do you avoid this? The answer is to avoid commonPool as much as possible, and to create separate thread pools, isolated from the rest of the system, when blocking tasks need to run, so that risk does not spread.
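As a hedged sketch of that isolation advice (an illustration, not code from this article): you can run a parallel stream inside a dedicated ForkJoinPool by submitting the stream pipeline to that pool. This relies on the widely used, though undocumented, behavior that parallel-stream tasks started from a ForkJoinPool worker run in the same pool.

ForkJoinPool isolatedPool = new ForkJoinPool(4);
long sum = isolatedPool.submit(
        // The whole stream pipeline runs inside isolatedPool, not in commonPool.
        () -> java.util.stream.LongStream.rangeClosed(1, 1000000).parallel().sum()
).join();
isolatedPool.shutdown();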

ForkJoinPool performance evaluation

To test the performance of ForkJoinPool, I ran a set of simple, informal experiments. The experiments were divided into three groups; to make the data of each group as objective as possible, each experiment was run 5 times and the average was taken.

  • Experimental code: Sample code from the first part of this article;
  • Experiment environment: Mac;
  • JDK version: 8;
  • Task splitting threshold: 100

The experimental results are shown in the following table:

| Run | N=1,000 Fork/Join (ms) | N=1,000 single thread (ms) | N=1,000,000 Fork/Join (ms) | N=1,000,000 single thread (ms) | N=1,000,000,000 Fork/Join (ms) | N=1,000,000,000 single thread (ms) |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 4 | 0 | 34 | 5 | 1157 | 313 |
| 2 | 3 | 0 | 34 | 6 | 848 | 344 |
| 3 | 5 | 0 | 16 | 9 | 1069 | 325 |
| 4 | 4 | 0 | 35 | 8 | 955 | 307 |
| 5 | 5 | 0 | 30 | 22 | 922 | 385 |
| Average | 4.2 | 0 | 29.8 | 10 | 990.2 | 334.8 |

Based on these results (0 means less than 1 millisecond), ForkJoinPool is less efficient than a single thread! Quite a surprise, isn’t it? But why is this?

Don’t be surprised: the reason for this result is that the task-splitting granularity is too fine! In the test above, the splitting threshold is only 100, which leads to a huge number of split operations during the Fork/Join computation. In other words, the tasks are split too finely, and all that splitting and management carries extra cost.

Taking the sum from 0 to 1000000 as an example, when the threshold is raised from 100 to 100000, the result is as follows. As you can see, Fork/Join now shows its advantage.

======
ForkJoin task splits: 16383
ForkJoin result: 4999999500000000
ForkJoin computation time (ms): 143
======
Single-thread result: 4999999500000000
Single-thread computation time (ms): 410

So, what factors affect Fork/Join performance?

Based on experience and experimentation, the total number of tasks, the execution time of each task, and the parallelism level all affect performance. So when you use the Fork/Join framework, evaluate these three metrics carefully, preferably by simulating and comparing, rather than putting it straight into production.

Summary

That’s all for ForkJoinPool. Fork/Join is a model based on the divide-and-conquer algorithm, and it has significant advantages for the concurrent processing of computational tasks. Its efficiency gains come mainly from two aspects:

  • Task splitting: large tasks are split into smaller, finer-grained tasks so that more threads can participate in their execution;
  • Task stealing: task stealing makes full use of idle threads and reduces contention.

When using ForkJoinPool, it is important that tasks are of the pure-function type, that is, they should not depend on shared state or external changes; this is the safest approach. If a task is of the blocking type, you need to evaluate the technical options carefully. Although ForkJoinPool can handle blocking tasks, they impose complex management costs.

In terms of performance, it is important to recognize that Fork/Join performance does not come out of the box: you need to evaluate and validate the important parameters, and draw your conclusions from the comparison of the data.

In addition, ForkJoinPool provides commonPool, but because of the potential risks it is not recommended, or should at least be used with caution.

Try it yourself

  • Hands-on: use ForkJoinPool to sort a list or an array.

Further reading and references

  • "King of Concurrency" series outline and progress: juejin.cn/post/696727…
  • A Java Fork/Join Framework (Doug Lea): gee.cs.oswego.edu/dl/papers/f…
  • Java Fork and Join using ForkJoinPool: tutorials.jenkov.com/java-util-c…
  • Introduction to the Fork/Join Framework: www.pluralsight.com/guides/intr…
  • The Unfairly Unknown ForkJoinPool: medium.com/swlh/the-un…
  • A Java Fork-Join Calamity: coopsoft.com/ar/Calamity…

About the author

Follow [Technology 8:30] to get article updates in time. We share quality technical articles, record the coming-of-age stories of ordinary people, and occasionally talk about life and ideals. Quality original articles are pushed at 8:30 in the morning and in-depth industry articles at 20:30 in the evening.

If this article is helpful to you, you are welcome to like and follow, and let's go from Bronze to King together.