J.U.C – AQS

java.util.concurrent (J.U.C) greatly improves concurrency performance, and AQS (AbstractQueuedSynchronizer) is considered the heart of J.U.C.

CountDownLatch

Used to make one or more threads wait until operations running in a set of other threads have completed.

A counter is maintained internally. Each call to the countDown() method decrements the counter by 1; when the counter reaches zero, the threads that are waiting in the await() method are woken up.

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CountdownLatchExample {

    public static void main(String[] args) throws InterruptedException {
        final int totalThread = 10;
        CountDownLatch countDownLatch = new CountDownLatch(totalThread);
        ExecutorService executorService = Executors.newCachedThreadPool();
        for (int i = 0; i < totalThread; i++) {
            executorService.execute(() -> { // submit totalThread tasks
                System.out.print("run..");
                countDownLatch.countDown();
            });
        }
        countDownLatch.await();
        System.out.println("end");
        executorService.shutdown();
    }
}
run.. run.. run.. run.. run.. run.. run.. run.. run.. run.. end

CyclicBarrier

It is used to make multiple threads wait for each other: execution continues only once all of them have arrived at the barrier.

Like CountdownLatch, it is implemented by maintaining a counter. When a thread calls the await() method, the counter is decremented by 1 and the thread blocks; once the counter reaches zero, all the waiting threads that called await() can continue.

One difference between CyclicBarrier and CountdownLatch is that a CyclicBarrier’s counter can be reset and reused by calling the reset() method, hence its name.

CyclicBarrier has two constructors: parties is the initial value of the counter, and barrierAction is executed once each time all threads reach the barrier.

public CyclicBarrier(int parties, Runnable barrierAction) {
    if (parties <= 0) throw new IllegalArgumentException();
    this.parties = parties;
    this.count = parties;
    this.barrierCommand = barrierAction;
}

public CyclicBarrier(int parties) {
    this(parties, null);
}


import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CyclicBarrierExample {

    public static void main(String[] args) {
        final int totalThread = 10;
        CyclicBarrier cyclicBarrier = new CyclicBarrier(totalThread);
        ExecutorService executorService = Executors.newCachedThreadPool();
        for (int i = 0; i < totalThread; i++) {
            executorService.execute(() -> {
                System.out.print("before..");
                try {
                    cyclicBarrier.await();
                } catch (InterruptedException | BrokenBarrierException e) {
                    e.printStackTrace();
                }
                System.out.print("after..");
            });
        }
        executorService.shutdown();
    }
}
before.. before.. before.. before.. before.. before.. before.. before.. before.. before.. after.. after.. after.. after.. after.. after.. after.. after.. after.. after..
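The two-argument constructor and the reusability of the barrier can be seen in a minimal sketch along the following lines (the class name and printed strings are illustrative, not from the original example):

import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

public class CyclicBarrierActionExample {

    public static void main(String[] args) {
        final int totalThread = 3;
        // barrierAction runs once each time all parties arrive
        CyclicBarrier barrier = new CyclicBarrier(totalThread,
                () -> System.out.println("all arrived"));
        for (int i = 0; i < totalThread; i++) {
            new Thread(() -> {
                try {
                    barrier.await(); // first generation
                    barrier.await(); // the barrier resets itself and can be reused
                } catch (InterruptedException | BrokenBarrierException e) {
                    e.printStackTrace();
                }
            }).start();
        }
    }
}

Here "all arrived" is printed twice, once per generation, without any explicit reset() call.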

Semaphore

Semaphore is similar to an operating-system semaphore: it controls the number of threads that can access a mutually exclusive resource at the same time.

The following code simulates concurrent requests to a service that can only be accessed by three clients at a time, with a total of 10 requests.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

public class SemaphoreExample {

    public static void main(String[] args) {
        final int clientCount = 3;
        final int totalRequestCount = 10;
        Semaphore semaphore = new Semaphore(clientCount);
        ExecutorService executorService = Executors.newCachedThreadPool();
        for (int i = 0; i < totalRequestCount; i++) {
            executorService.execute(() -> {
                try {
                    semaphore.acquire();
                    System.out.print(semaphore.availablePermits() + " ");
                } catch (InterruptedException e) {
                    e.printStackTrace();
                } finally {
                    semaphore.release();
                }
            });
        }
        executorService.shutdown();
    }
}
2 1 2 2 2 2 2 1 2 2
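When a request should give up rather than block while no permit is free, Semaphore also provides tryAcquire(); a hedged sketch (the timeout value is arbitrary):

import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class SemaphoreTryExample {

    public static void main(String[] args) throws InterruptedException {
        Semaphore semaphore = new Semaphore(3);
        // Wait at most 100 ms for a permit instead of blocking indefinitely
        if (semaphore.tryAcquire(100, TimeUnit.MILLISECONDS)) {
            try {
                System.out.println("got a permit");
            } finally {
                semaphore.release();
            }
        } else {
            System.out.println("gave up");
        }
    }
}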

J.U.C – Other components

FutureTask

When we introduced Callable, we learned that it can have a return value, which is wrapped in a Future. FutureTask implements the RunnableFuture interface, which extends both the Runnable and Future interfaces, so a FutureTask can be executed as a task and can also produce a return value.

public class FutureTask<V> implements RunnableFuture<V>

public interface RunnableFuture<V> extends Runnable, Future<V>

FutureTask is useful when an execution result needs to be retrieved asynchronously or a task may be cancelled. When a computation takes a long time to execute, it can be wrapped in a FutureTask, and the main thread fetches the result after finishing its own work.

import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

public class FutureTaskExample {

    public static void main(String[] args) throws ExecutionException, InterruptedException {
        FutureTask<Integer> futureTask = new FutureTask<Integer>(new Callable<Integer>() {
            @Override
            public Integer call() throws Exception {
                int result = 0;
                for (int i = 0; i < 100; i++) {
                    Thread.sleep(10);
                    result += i;
                }
                return result;
            }
        });
        Thread computeThread = new Thread(futureTask);
        computeThread.start();

        Thread otherThread = new Thread(() -> {
            System.out.println("other task is running...");
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        });
        otherThread.start();
        System.out.println(futureTask.get());
    }
}
other task is running...
4950
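In practice a FutureTask is often created indirectly: ExecutorService.submit() wraps the Callable in a FutureTask internally and hands it back as a Future. A minimal sketch (class name illustrative):

import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SubmitExample {

    public static void main(String[] args) throws ExecutionException, InterruptedException {
        ExecutorService executorService = Executors.newCachedThreadPool();
        // submit() wraps the Callable in a FutureTask internally
        Future<Integer> future = executorService.submit(() -> 1 + 2);
        System.out.println(future.get()); // 3
        executorService.shutdown();
    }
}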

BlockingQueue

The java.util.concurrent.BlockingQueue interface has the following blocking-queue implementations:

  • FIFO queues: LinkedBlockingQueue, ArrayBlockingQueue (fixed length)
  • Priority queue: PriorityBlockingQueue

It provides blocking take() and put() methods: if the queue is empty, take() blocks until the queue has an element; if the queue is full, put() blocks until the queue has free space.

Implementing the producer-consumer problem with BlockingQueue:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ProducerConsumer {

    private static BlockingQueue<String> queue = new ArrayBlockingQueue<>(5);

    private static class Producer extends Thread {
        @Override
        public void run() {
            try {
                queue.put("product");
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            System.out.print("produce..");
        }
    }

    private static class Consumer extends Thread {
        @Override
        public void run() {
            try {
                String product = queue.take();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            System.out.print("consume..");
        }
    }
}
public static void main(String[] args) {
    for (int i = 0; i < 2; i++) {
        Producer producer = new Producer();
        producer.start();
    }
    for (int i = 0; i < 5; i++) {
        Consumer consumer = new Consumer();
        consumer.start();
    }
    for (int i = 0; i < 3; i++) {
        Producer producer = new Producer();
        producer.start();
    }
}
produce.. produce.. consume.. consume.. produce.. consume.. produce.. consume.. produce.. consume..
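Besides the blocking take() and put(), BlockingQueue also has timed variants that give up after a deadline; a short sketch (capacity and timeouts are arbitrary):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class TimedQueueExample {

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        // offer() waits up to the timeout for free space, then returns false
        boolean offered = queue.offer("product", 100, TimeUnit.MILLISECONDS);
        // poll() waits up to the timeout for an element, then returns null
        String product = queue.poll(100, TimeUnit.MILLISECONDS);
        System.out.println(offered + " " + product);
    }
}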

ForkJoin

It is used for parallel computing: similar to MapReduce, a large computing task is split into multiple small tasks that are computed in parallel.

import java.util.concurrent.RecursiveTask;

public class ForkJoinExample extends RecursiveTask<Integer> {

    private final int threshold = 5;
    private int first;
    private int last;

    public ForkJoinExample(int first, int last) {
        this.first = first;
        this.last = last;
    }

    @Override
    protected Integer compute() {
        int result = 0;
        if (last - first <= threshold) {
            // The task is small enough to compute directly
            for (int i = first; i <= last; i++) {
                result += i;
            }
        } else {
            // Split into smaller subtasks
            int middle = first + (last - first) / 2;
            ForkJoinExample leftTask = new ForkJoinExample(first, middle);
            ForkJoinExample rightTask = new ForkJoinExample(middle + 1, last);
            leftTask.fork();
            rightTask.fork();
            result = leftTask.join() + rightTask.join();
        }
        return result;
    }
}
public static void main(String[] args) throws ExecutionException, InterruptedException {
    ForkJoinExample example = new ForkJoinExample(1, 10000);
    ForkJoinPool forkJoinPool = new ForkJoinPool();
    Future result = forkJoinPool.submit(example);
    System.out.println(result.get());
}

ForkJoin tasks are run by a ForkJoinPool, a special thread pool whose number of threads depends on the number of CPU cores.

public class ForkJoinPool extends AbstractExecutorService


ForkJoinPool implements a work-stealing algorithm to improve CPU utilization. Each thread maintains a double-ended queue (deque), which supports insertion and removal at both ends, to store the tasks it needs to execute. The work-stealing algorithm allows an idle thread to steal a task from another thread’s deque. To avoid contending with the queue’s owner, the thief takes tasks from the opposite end of the deque from the one the owner works on: for example, Thread2 can steal the oldest task from one end of Thread1’s deque while Thread1 pops the newest task from the other end, so the two normally do not race. Contention can still occur, however, when only one task remains in the queue.

A thread-unsafe example

If multiple threads access the same shared data without any synchronization, the results of their operations may be inconsistent.

In the following code, 1000 threads increment cnt concurrently; the final value may be less than 1000.

public class ThreadUnsafeExample {

    private int cnt = 0;

    public void add() {
        cnt++;
    }

    public int get() {
        return cnt;
    }
}
public static void main(String[] args) throws InterruptedException {
    final int threadSize = 1000;
    ThreadUnsafeExample example = new ThreadUnsafeExample();
    final CountDownLatch countDownLatch = new CountDownLatch(threadSize);
    ExecutorService executorService = Executors.newCachedThreadPool();
    for (int i = 0; i < threadSize; i++) {
        executorService.execute(() -> {
            example.add();
            countDownLatch.countDown();
        });
    }
    countDownLatch.await();
    executorService.shutdown();
    System.out.println(example.get());
}

997

Java memory model

The Java memory model attempts to mask the differences in memory access between hardware and operating systems, so that Java programs can achieve a consistent memory access effect across all platforms.

Main memory vs. working memory

Registers on a processor can read and write several orders of magnitude faster than main memory; to resolve this speed mismatch, caches are inserted between them.

The addition of caches brings a new problem: cache coherence. When multiple caches share the same main-memory region, the data in different caches may become inconsistent, and some protocol is required to resolve this.

All variables are stored in main memory, while each thread also has its own working memory that holds copies of the variables the thread uses. A thread can directly manipulate only the variables in its working memory; variable values are passed between threads through main memory.

Intermemory operation

The Java memory model defines eight operations to interact with main and working memory.

  • read: transfers the value of a variable from main memory to working memory
  • load: executes after read, putting the value obtained by read into the working-memory copy of the variable
  • use: passes the value of a variable in working memory to the execution engine
  • assign: assigns a value received from the execution engine to a variable in working memory
  • store: transfers the value of a variable in working memory to main memory
  • write: executes after store, putting the value obtained by store into the variable in main memory
  • lock: acts on a variable in main memory, marking it as exclusively held by one thread
  • unlock: acts on a variable in main memory, releasing a locked variable so that other threads can lock it

The memory model has three main features: atomicity, visibility, and ordering.

Atomicity

Atomicity means an operation is indivisible and cannot be interrupted: it either executes completely or not at all.

Visibility

Visibility means that when one thread changes the value of a shared variable, other threads are immediately aware of the change.

There are three main ways to achieve visibility:

  • volatile
  • synchronized: before a variable can be unlocked, its value must be synchronized back to main memory.
  • final: a field modified by the final keyword is visible to other threads once its initialization in the constructor completes, provided no this escape has occurred (other threads accessing the half-initialized object through a this reference).

Marking the cnt variable from the earlier thread-unsafe example as volatile does not make it thread-safe, because volatile does not guarantee that operations on it are atomic.
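What volatile does guarantee is visibility. A typical use is a stop flag, sketched below (the class and field names are illustrative):

public class VolatileFlagExample {

    // Without volatile, the worker might never see the update to this flag
    private static volatile boolean stopped = false;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!stopped) {
                // busy work
            }
            System.out.println("worker stopped");
        });
        worker.start();
        Thread.sleep(100);
        stopped = true; // the write is immediately visible to the worker
        worker.join();
    }
}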

Ordering

Ordering means that, observed from within a thread, all operations appear ordered. Observed from one thread looking at another, all operations appear out of order, because instruction reordering has occurred.

The volatile keyword prevents instruction reordering by inserting memory barriers: instructions that come after a barrier cannot be reordered to positions before it.

Ordering can also be guaranteed by synchronized, which ensures that only one thread executes a synchronized block at any moment, effectively forcing threads to execute that code sequentially.
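A classic case where the reordering guarantee of volatile matters is double-checked locking; a sketch of a lazily initialized singleton under that pattern:

public class Singleton {

    // volatile prevents the steps inside "new Singleton()" from being
    // reordered, so no thread can observe a half-constructed instance
    private static volatile Singleton instance;

    private Singleton() {
    }

    public static Singleton getInstance() {
        if (instance == null) {                 // first check, without the lock
            synchronized (Singleton.class) {
                if (instance == null) {         // second check, with the lock
                    instance = new Singleton();
                }
            }
        }
        return instance;
    }
}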

Happens-before principle

As mentioned above, volatile and synchronized can be used to guarantee ordering. In addition, the JVM defines the happens-before rules below, which guarantee that one operation completes before another without any explicit synchronization.

Single-thread rule

Within a thread, actions earlier in program order happen-before actions that come later.

Monitor lock rule

An unlock operation on a lock happens-before a subsequent lock operation on the same lock.

Volatile variable rule

A write to a volatile variable happens-before a subsequent read of that variable.

Thread start rule

A call to a Thread object’s start() method happens-before every action in the started thread.

Thread join rule

All actions in a thread happen-before the return of a join() call on that thread.

Thread interrupt rule

A call to a thread’s interrupt() method happens-before the point where the interrupted thread’s code detects the interrupt (for example, via the interrupted() method).

Object finalization rule

The completion of an object’s initialization (the end of constructor execution) happens-before the start of its finalize() method.

finalize() is a method of Object that is called before the garbage collector reclaims the object’s memory. When the virtual machine judges an object to be dead, finalize() is called first, letting the object handle its final affairs (at this point the object can even escape death by re-establishing a reference to itself).

Transitivity

If operation A happens-before operation B, and operation B happens-before operation C, then operation A happens-before operation C.

Thread safety

A class is thread-safe if it behaves correctly no matter how multiple threads access it, with no synchronization required in the calling code.

Thread safety can be implemented in the following ways:

Immutability

Immutable objects are inherently thread-safe and require no thread-safety safeguards. As long as an immutable object is constructed correctly, it can never be observed in an inconsistent state by other threads. In multithreaded environments, objects should be made immutable wherever possible to satisfy thread safety.
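A hand-rolled immutable class typically declares the class and its fields final and provides no mutators; a minimal sketch (the Point class is illustrative):

public final class Point {

    // final fields are safely visible to other threads once construction completes
    private final int x;
    private final int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public int getX() { return x; }

    public int getY() { return y; }
}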

Immutable types include:

  • Primitive-type fields modified by the final keyword
  • String
  • Enumerated types
  • Number subclasses, including the numeric wrapper types such as Long and Double and the big-number types BigInteger and BigDecimal (however, the Number subclasses AtomicInteger and AtomicLong are mutable)

For collection types, the Collections.unmodifiableXXX() methods can be used to obtain an unmodifiable collection.

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class ImmutableExample {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        Map<String, Integer> unmodifiableMap = Collections.unmodifiableMap(map);
        unmodifiableMap.put("a", 1);
    }
}
Exception in thread "main" java.lang.UnsupportedOperationException
    at java.util.Collections$UnmodifiableMap.put(Collections.java:1457)
    at ImmutableExample.main(ImmutableExample.java:9)


Collections.unmodifiableXXX() wraps the original collection in a view; every method that would modify the collection simply throws an exception.

public V put(K key, V value) {
    throw new UnsupportedOperationException();
}


Mutex synchronization

Mutual exclusion can be achieved with synchronized and ReentrantLock.
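Both achieve the same mutual exclusion; a hedged sketch showing them side by side (the counter class is illustrative):

import java.util.concurrent.locks.ReentrantLock;

public class MutexExample {

    private final ReentrantLock lock = new ReentrantLock();
    private int count = 0;

    // Mutual exclusion with the synchronized keyword
    public synchronized void addWithSynchronized() {
        count++;
    }

    // The same effect with ReentrantLock; unlock() belongs in finally
    public void addWithLock() {
        lock.lock();
        try {
            count++;
        } finally {
            lock.unlock();
        }
    }
}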

Nonblocking synchronization

The main problem with mutex synchronization is the performance problem caused by thread blocking and waking up, so it is also called blocking synchronization.

Mutex synchronization is a pessimistic concurrency strategy: it assumes that without proper synchronization, problems are bound to occur. Whether or not the shared data is actually contended, it performs locking (though the virtual machine optimizes away a large portion of unnecessary locks), user-mode to kernel-mode transitions, lock-counter maintenance, and checks for blocked threads that need to be woken.

As hardware instruction sets have evolved, we can instead use an optimistic concurrency strategy based on conflict detection: perform the operation first; if no other thread contends for the shared data, the operation succeeds, otherwise compensate (usually by retrying until success). Many implementations of this optimistic strategy do not require blocking threads, so this kind of synchronization is called non-blocking synchronization.

CAS

A CAS instruction takes three operands: the memory address V, the old expected value A, and the new value B. When the instruction executes, the value at V is updated to B only if the current value at V equals A.

The semantics of CAS are: “I think V holds the value A; if it does, update V to B; otherwise, do not change it, and tell me what V’s actual value is.” CAS is an optimistic-locking technique: when multiple threads attempt to update the same variable with CAS at the same time, only one of them succeeds and the rest fail. A failed thread is not suspended; it is simply told that it lost the race and may try again.

AtomicInteger

The methods of the atomic integer class AtomicInteger in the J.U.C package call the CAS operations of the Unsafe class.

The following code uses AtomicInteger to perform the increment operation.

private AtomicInteger cnt = new AtomicInteger();

public void add(a) {
    cnt.incrementAndGet();
}


The following code is the source of incrementAndGet(), which calls getAndAddInt() of the Unsafe class.

public final int incrementAndGet() {
    return unsafe.getAndAddInt(this, valueOffset, 1) + 1;
}

The following code is the source of getAndAddInt(): var1 is the object, var2 is the field’s offset from the object’s base address, and var4 is the value to add (1 here). getIntVolatile(var1, var2) reads the old expected value, and compareAndSwapInt() performs the CAS comparison: if the value at the field’s memory address still equals var5, the variable at address var1 + var2 is updated to var5 + var4.

As you can see, getAndAddInt() runs in a loop, retrying repeatedly when there is a conflict.

public final int getAndAddInt(Object var1, long var2, int var4) {
    int var5;
    do {
        var5 = this.getIntVolatile(var1, var2);
    } while(!this.compareAndSwapInt(var1, var2, var5, var5 + var4));

    return var5;
}


ABA

If a variable is first read with value A, then changed to B, and then changed back to A, a CAS operation will mistakenly conclude that it was never changed.

The J.U.C package addresses this with the stamped atomic reference class AtomicStampedReference, which guarantees the correctness of CAS by also versioning the variable’s value. In most cases ABA does not affect program correctness; if it must be addressed, traditional mutex synchronization may be more efficient than the atomic classes.
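A minimal sketch of AtomicStampedReference (values and stamps are illustrative): each CAS must match both the reference and the stamp, so a change from A to B and back to A is detected because the stamp has advanced.

import java.util.concurrent.atomic.AtomicStampedReference;

public class AbaExample {

    public static void main(String[] args) {
        AtomicStampedReference<String> ref = new AtomicStampedReference<>("A", 0);

        int stamp = ref.getStamp();
        // Simulate another thread changing A -> B -> A, bumping the stamp each time
        ref.compareAndSet("A", "B", stamp, stamp + 1);
        ref.compareAndSet("B", "A", stamp + 1, stamp + 2);

        // A plain value comparison would succeed, but the stale stamp makes this fail
        boolean success = ref.compareAndSet("A", "C", stamp, stamp + 1);
        System.out.println(success); // false
    }
}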

No-synchronization schemes

Thread safety does not necessarily require synchronization. If a method involves no shared data in the first place, it naturally needs no synchronization measures to be correct.

Stack confinement

There are no thread-safety issues when multiple threads access local variables of the same method because local variables are stored in the virtual machine stack and are thread-private.

public class StackClosedExample {
    public void add100() {
        int cnt = 0;
        for (int i = 0; i < 100; i++) {
            cnt++;
        }
        System.out.println(cnt);
    }
}
public static void main(String[] args) {
    StackClosedExample example = new StackClosedExample();
    ExecutorService executorService = Executors.newCachedThreadPool();
    executorService.execute(() -> example.add100());
    executorService.execute(() -> example.add100());
    executorService.shutdown();
}

100
100

Thread Local Storage

If the data needed by one piece of code must be shared with other code, check whether the code that shares the data can be guaranteed to execute in the same thread. If so, the visibility of the shared data can be limited to that one thread, and no synchronization is needed to prevent data contention between threads.

Applications that fit this pattern are not uncommon: most architectural patterns that use consumption queues (such as the producer-consumer pattern) try to consume the product within a single thread. A prominent example is the thread-per-request processing of the classic Web interaction model; its widespread adoption allows many Web server applications to use thread-local storage to solve thread-safety problems.

You can use the java.lang.ThreadLocal class to implement thread-local storage.

In the following code, thread1 sets threadLocal to 1 and thread2 sets it to 2. Some time later, thread1 still reads 1 from threadLocal, unaffected by thread2.

public class ThreadLocalExample {
    public static void main(String[] args) {
        ThreadLocal threadLocal = new ThreadLocal();
        Thread thread1 = new Thread(() -> {
            threadLocal.set(1);
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            System.out.println(threadLocal.get());
            threadLocal.remove();
        });
        Thread thread2 = new Thread(() -> {
            threadLocal.set(2);
            threadLocal.remove();
        });
        thread1.start();
        thread2.start();
    }
}
1

Reentrant Code

Reentrant code, also known as pure code, can be interrupted at any point in its execution to run another piece of code (including a recursive call to itself), and after control returns, the original program will run without any errors.

Reentrant code has some common characteristics, such as not relying on data stored on the heap and common system resources, using state quantities passed in from parameters, and not calling non-reentrant methods.
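For example, a method that depends only on its parameters and local variables is reentrant (a trivial sketch):

public class ReentrantCodeExample {

    // Depends only on its argument and local state: safe to call from any
    // number of threads, or recursively, without synchronization
    public static int factorial(int n) {
        return n <= 1 ? 1 : n * factorial(n - 1);
    }
}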

Lock optimization

Lock optimization here refers primarily to JVM optimization for synchronized.

Spin locks

Entering the blocked state under mutex synchronization is expensive and should be avoided where possible. In many applications, shared data stays locked for only a short time. The idea of a spin lock is to have the thread that requests such a lock run a busy loop (spin) for a while; if the lock becomes available during that time, the thread avoids entering the blocked state.

Busy loop: a busy loop is a way for the programmer to make a thread wait by spinning. Unlike wait(), sleep(), or yield(), which give up CPU control, a busy loop does not yield the CPU; it simply runs an empty loop. This is done to preserve the CPU caches.

Spin locks reduce overhead by avoiding blocking, but the busy loop consumes CPU time, so they are suitable only when shared data is locked for very short periods.

Adaptive spin locking was introduced in JDK 1.6. Adaptive means that the number of spins is no longer fixed, but is determined by the number of spins on the same lock the previous time and the state of the lock owner.
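To make the busy-wait idea concrete, here is a toy user-level spin lock built on CAS (this illustrates the concept only; it is not how the JVM implements spinning internally):

import java.util.concurrent.atomic.AtomicBoolean;

public class SpinLock {

    private final AtomicBoolean locked = new AtomicBoolean(false);

    public void lock() {
        // Busy loop: keep retrying the CAS instead of blocking the thread
        while (!locked.compareAndSet(false, true)) {
            // spin
        }
    }

    public void unlock() {
        locked.set(false);
    }
}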

Lock elimination

Lock elimination refers to the elimination of locks on shared data that are detected to be impossible to compete with.

Lock elimination is mainly supported by escape analysis. If shared data on the heap cannot escape and be accessed by other threads, it can be treated as private data and thus unlocked.

Many locks are added implicitly even in code that does not appear to use them. For example, the following string concatenation code implicitly locks:

public static String concatString(String s1, String s2, String s3) {
    return s1 + s2 + s3;
}


String is an immutable class, so the compiler automatically optimizes String concatenation. Prior to JDK 1.5, concatenation was translated into successive append() operations on a StringBuffer object:

public static String concatString(String s1, String s2, String s3) {
    StringBuffer sb = new StringBuffer();
    sb.append(s1);
    sb.append(s2);
    sb.append(s3);
    return sb.toString();
}


Each append() call contains a synchronized block. The virtual machine looks at the variable sb and quickly discovers that its dynamic scope is confined to the concatString() method: no reference to sb ever escapes concatString(), and no other thread can access it, so the lock can be eliminated.

Lock coarsening

If a series of continuous operations repeatedly lock and unlock the same object, frequent locking operations can lead to performance degradation.

The successive append() calls in the previous section’s example are such a case. If the virtual machine detects that a piecemeal sequence of operations all locks the same object, it extends (coarsens) the scope of the lock to cover the whole sequence. In the example above, the lock is extended to before the first append() and after the last one, so locking is needed only once.

Lightweight lock

Lightweight locks are not intended to replace heavyweight locks; rather, in the absence of multithreaded contention, they reduce the performance cost of traditional heavyweight locks, which rely on operating-system mutexes.

Biased locking

The idea of biased locking is to favor the first thread to acquire the lock object so that the thread does not need to perform synchronization operations, or even CAS operations, after acquiring the lock.

Good practice for multithreaded development

  • Give threads meaningful names to make bugs easier to find.
  • Narrow the synchronization scope to reduce lock contention; with synchronized, for example, prefer synchronized blocks to synchronized methods.
  • Prefer the synchronization tools to wait() and notify(). First, synchronization classes such as CountDownLatch, CyclicBarrier, Semaphore and Exchanger simplify coding, whereas complex control flows are hard to implement with wait() and notify(). Second, these classes are written and maintained by the best companies and will continue to be optimized and refined in subsequent JDKs.
  • Use BlockingQueue to implement producer-consumer problems.
  • Prefer concurrent collections to synchronized collections; for example, use ConcurrentHashMap instead of Hashtable.
  • Use local variables and immutable classes to ensure thread safety.
  • Use thread pools instead of creating threads directly, because thread creation is expensive and a thread pool can run tasks effectively with a limited number of threads.
