This article is a contracted first-release article for the Juejin (Nuggets) community. Reprinting without authorization is prohibited.

Multicore machines are now very common; even a mobile phone ships with a powerful multi-core processor. With multiple processes and multiple threads, several CPU cores can work at the same time to speed up task execution.

Multithreading is one of the more advanced topics in programming. Because threads share resources, it is easy to introduce problems when coding. The java.util.concurrent package provides many tools that simplify synchronizing shared variables, but learning to apply them is still full of twists and turns.

This article briefly introduces the basics of multithreading in Java, then highlights the areas where beginners most often get into trouble, many of them lessons learned the hard way. Steering clear of these pitfalls avoids roughly 90% of the nasty multithreading bugs.

1. Basic concepts of multithreading

1.1 Lightweight Processes

In the JVM, a thread is actually a lightweight process (LWP). A lightweight process is an interface through which a user process uses the system kernel; underneath, each one is backed by a kernel-level thread (KLT).

In fact, JVM thread creation, destruction, and scheduling are all operating system dependent. If you look at the various functions in the Thread class, you’ll see that many of them are native and directly call the underlying operating system functions.

The following is a simple threading model for the JVM on Linux.

As you can see, threads frequently switch between user mode and kernel mode. This switch is expensive and is commonly known as a context switch.

1.2 The JMM

Before introducing thread synchronization, it is worth introducing a new term: the Java Memory Model (JMM).

The JMM does not refer to the partitioning of memory into areas such as the heap and Metaspace. It is an entirely different concept: the model that describes how Java threads interact with memory at run time.

Because many operations in Java code are not atomic when executed, the result can differ if their instructions run in an unexpected order. For example, i++ is translated into the following bytecode.

getfield      // Field value:I
iconst_1
iadd
putfield      // Field value:I

And that is just at the bytecode level. Add the caches of each CPU core and execution becomes even more subtle. If we want i-- to run after i++, the bare bytecode instructions are not enough. We need some form of synchronization.

The figure above shows the JMM, which is divided into main memory and working memory. The variables a thread normally operates on are actually copies of the values in main memory. After a change is made, it must be written back to main memory before other threads can see it.

1.3 Common Thread synchronization methods in Java

To work within the JMM, Java provides a number of synchronization mechanisms that keep variables consistent between threads.

  1. The base class Object provides the wait and notify primitives, which synchronize threads through an object’s monitor. They rarely appear in business code
  2. Use synchronized on a method, or lock an object to synchronize a code block
  3. Use the reentrant locks in the concurrent package. These locks are based on AQS
  4. Use the lightweight volatile keyword to make changes to a variable visible across threads
  5. Use the Atomic classes for atomic increments and decrements
  6. Use ThreadLocal variables to achieve thread confinement
  7. Implement producer-consumer patterns with the tools in the concurrent package, such as LinkedBlockingQueue. Underneath, these also rely on AQS
  8. Run concurrent tasks in sequence using Thread’s join and the various await methods

As the list above shows, there is a lot to learn in multithreaded programming. Fortunately, although there are many ways to synchronize, there are very few ways to create a thread.

There are two ways to use the Thread class directly. The first is to extend Thread and override its run method. The second is to implement the Runnable interface and its run method. The third way to create a thread is through a thread pool.

In the end, there is only one way to start a thread, and that is Thread. Thread pools and Runnable are just convenient wrappers around it.
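To make the three approaches concrete, here is a minimal sketch. The class name ThreadCreationDemo, the Worker subclass, and the ran list are made up for illustration; only Thread, Runnable, and ExecutorService come from the JDK.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; all names here are illustrative only.
class ThreadCreationDemo {
    // Thread-safe list that records which variants actually ran.
    static final List<String> ran = new CopyOnWriteArrayList<>();

    // Variant 1: extend Thread and override run().
    static class Worker extends Thread {
        @Override
        public void run() {
            ran.add("subclass");
        }
    }

    static List<String> runAll() throws InterruptedException {
        ran.clear();
        Thread t1 = new Worker();
        // Variant 2: pass a Runnable (here a lambda) to a Thread.
        Thread t2 = new Thread(() -> ran.add("runnable"));
        t1.start();
        t2.start();
        t1.join();
        t2.join();

        // Variant 3: hand the Runnable to a pool, which owns its worker threads.
        ExecutorService pool = Executors.newFixedThreadPool(1);
        pool.execute(() -> ran.add("pool"));
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return ran;
    }
}
```

In all three variants a Thread object ends up executing the run logic; the pool simply creates and reuses those Thread objects for you.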

Multithreading is complex and error-prone. What are the common problems, and how do we avoid them? Below, I introduce 10 frequently encountered pitfalls and offer solutions.

2. Pitfall avoidance guide

2.1. Thread pool crashes the machine

First, let’s talk about a very, very basic multithreading mistake with serious consequences.

In general, threads are created through Thread, Runnable, or a thread pool. With the popularity of Java 1.8, the thread pool approach is now the most common.

Once, one of our production servers froze so badly that we could not even log in over SSH, and we had no choice but to restart it. We then found that the freeze came back within minutes of launching one particular application. I finally tracked it down to a few rather ironic lines of code.

A colleague who was not very familiar with multithreading used a thread pool to process messages asynchronously. Usually, we keep the thread pool in a static or member variable of the class. But this colleague created it inside the method. That is, every time a request came in, a new thread pool was created. As the number of requests grew, system resources were exhausted and the whole machine died.

void realJob() {
    // Bug: every call creates a new pool that is never shut down
    ThreadPoolExecutor exe = new ThreadPoolExecutor(...);
    exe.submit(new Runnable() {...});
}

How can this problem be avoided? Only through code review. Multithreaded code, even something as simple as a synchronized keyword, should be written by experienced people. If that is not possible, review the code very carefully.
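One way to write the handler safely is to hold a single pool in a static field and only submit tasks inside the method. This is just a sketch: MessageHandler, handle, shutdown, and the pool size of 4 are my own assumptions, not the code from the incident.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical message handler; the pool size of 4 is an arbitrary assumption.
class MessageHandler {
    // One pool for the whole class, created once and reused by every request.
    private static final ExecutorService POOL = Executors.newFixedThreadPool(4);

    // Each call only submits a task; it never creates a new pool.
    static Future<String> handle(String message) {
        return POOL.submit(() -> "handled " + message);
    }

    // Call once when the application stops so the worker threads can exit.
    static void shutdown() {
        POOL.shutdown();
    }
}
```

However many requests arrive, the number of threads stays bounded by the pool size, and extra tasks simply wait in the pool’s queue.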

2.2. Locks must be released

The locks in the concurrent package offer more flexibility than the exclusive lock provided by the synchronized keyword. You can choose between fair and unfair locks, and between read locks and write locks, as required.

A Lock that has been acquired must be released. That is, lock and unlock must appear in pairs. Otherwise the lock leaks and other threads can never acquire it.

In the following code, an exception occurs after we call lock, the logic in the try block is interrupted, and unlock is never executed. The lock acquired by the thread is then never released.

private final Lock lock = new ReentrantLock();
void doJob() {
    try {
        lock.lock();
        // An exception occurs here
        lock.unlock(); // never reached
    } catch (Exception e) {
    }
}

The correct approach is to place the unlock call in a finally block so that it always executes.
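A minimal sketch of that pattern follows. The class SafeJob and the fail parameter are hypothetical and exist only to simulate the exception; ReentrantLock and its isLocked method are from the JDK.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical class used only to illustrate the lock/finally pattern.
class SafeJob {
    private final ReentrantLock lock = new ReentrantLock();

    void doJob(boolean fail) {
        lock.lock(); // acquire before the try, so finally never unlocks a lock we do not hold
        try {
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
        } finally {
            lock.unlock(); // runs on both the normal and the exceptional path
        }
    }

    // True while any thread holds the lock.
    boolean isLocked() {
        return lock.isLocked();
    }
}
```

Even when doJob throws, the lock is free again afterwards, so other threads are not blocked forever.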

Because a lock is an ordinary object, it can be passed as an argument to a function. If you pass a lock from function to function, the order of the lock and unlock calls becomes hard to follow. In everyday code, avoid passing locks around as arguments.

2.3. Wrap wait in two layers

Object, the base class of all Java classes, provides four methods, wait, wait(timeout), notify, and notifyAll, for thread synchronization. This shows how important wait and its companions are. Developers who write business code rarely use them in everyday work, so problems arise easily once they do.

There is one major precondition for using these methods: they must be called inside a synchronized block, otherwise an IllegalMonitorStateException is thrown. The following code, for example, fails when executed.

final Object condition = new Object();
public void func() throws InterruptedException {
    condition.wait(); // throws IllegalMonitorStateException: the monitor is not held
}

Similarly, the Condition object in the Concurrent package must appear between lock and unlock when used.

Why must we synchronize on the object before calling wait? Because the JVM requires the thread to hold the object’s monitor when it executes wait, and the synchronized keyword does exactly that.

However, this is not enough. A wait call is usually placed in a while loop, and the JDK says so explicitly in its documentation comments.

Important: returning from wait means the thread was notified and continues executing. By that time, however, the condition being waited for may no longer hold, because it may have changed again in the meantime, so it must be checked again. Writing the wait inside a while loop is the simple way to do this.

final Object condition = new Object();
public void func() throws InterruptedException {
    synchronized (condition) {
        while (...) {
            condition.wait();
        }
    }
}

A wait call is therefore wrapped in two layers, synchronized and while, rather than guarded by a plain if. This is the correct way to use wait and its companions.
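Here is a small sketch of the two layers working together. The Mailbox class, its message field, and the take and put methods are hypothetical names I chose for illustration; the condition being waited for is simply "a message is present".

```java
// Hypothetical one-slot mailbox; class and field names are illustrative.
class Mailbox {
    private final Object monitor = new Object();
    private String message; // guarded by monitor

    String take() throws InterruptedException {
        synchronized (monitor) {          // layer 1: hold the monitor
            while (message == null) {     // layer 2: re-check after every wake-up
                monitor.wait();
            }
            String result = message;
            message = null;
            return result;
        }
    }

    void put(String value) {
        synchronized (monitor) {          // notifyAll also requires the monitor
            message = value;
            monitor.notifyAll();
        }
    }
}
```

If put runs before take, the while condition is already false and take returns without waiting; if take runs first, it sleeps until put changes the state and notifies.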

2.4. Do not overwrite lock objects

When synchronized is applied to an instance method, it locks the this object; when applied to a static method, it locks the class. Besides methods, synchronized can also name the object to lock and guard just a code block, which gives fine-grained lock control.

What happens if the lock object is overwritten, as in the example below?

List<Listener> listeners = new ArrayList<>();

void add(Listener listener, boolean upsert) {
    synchronized (listeners) {
        List<Listener> results = new ArrayList<>();
        for (Listener ler : listeners) {
            ...
        }
        listeners = results; // the lock object is replaced while it is held
    }
}

The code above can make the lock ineffective, because the listeners object that serves as the lock is reassigned inside the block. Threads that arrive afterwards synchronize on the new list, not on the one that is still held.

To be on the safe side, we usually declare lock objects as final.

final List listeners = new ArrayList();

Or declare a dedicated lock object and define it as a plain Object.

final Object listenersLock = new Object();
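With a dedicated lock, the guarded list itself can still be reassigned safely. The sketch below assumes a hypothetical ListenerRegistry class and uses String in place of the Listener type, which is not shown in this article.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical registry; String stands in for the article's Listener type.
class ListenerRegistry {
    // The lock never changes, so every thread synchronizes on the same monitor.
    private final Object listenersLock = new Object();
    // The guarded field may be reassigned because it is not the lock.
    private List<String> listeners = new ArrayList<>();

    void add(String listener) {
        synchronized (listenersLock) {
            List<String> results = new ArrayList<>(listeners);
            results.add(listener);
            listeners = results; // safe: the monitor in use is listenersLock
        }
    }

    int size() {
        synchronized (listenersLock) {
            return listeners.size();
        }
    }
}
```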

2.5. Handle exceptions in the loop

Running scheduled tasks, or batch jobs with very long execution times, in asynchronous threads is a common requirement. More than once, I’ve seen colleagues’ programs stop halfway through.

The root cause of these aborts is that one row of data fails to process, which kills the entire thread.

Let’s take a look at the code template.

volatile boolean run = true;
void loop() {
    while (run) {
        for (Task task : taskList) {
            // do something
            int a = 1 / 0;
        }
    }
}

In the loop function, we run the real business logic. When an exception occurs while a task is executing, the thread does not keep running; the exception propagates and the thread terminates. We all know this behavior from ordinary functions, but many people forget it when they get to multithreading.

Note that even unchecked exceptions such as NullPointerException terminate the thread. So it is a very good habit to always wrap the logic you execute in a try/catch.

volatile boolean run = true;
void loop() {
    while (run) {
        for (Task task : taskList) {
            try {
                // do something
                int a = 1 / 0;
            } catch (Exception ex) {
                // log
            }
        }
    }
}

2.6. Correct usage of HashMap

In a multi-threaded environment, HashMap can fall into an infinite loop. This problem is widely known because its consequences are severe: the CPU is pegged, the code makes no progress, and a jstack dump shows threads stuck in the get method.

How to make HashMap more efficient, and when it converts between a linked list and a red-black tree, are perennial hot topics.

Detailed articles on the web describe the scenarios in which the infinite loop arises. The main cause is that, during rehashing, the linked list in a HashMap bucket can end up forming a cycle, and some get requests then walk that cycle forever. The JDK maintainers do not consider this a bug, despite its nasty impact.

If you decide that your collection class will be used by multiple threads, use thread-safe ConcurrentHashMap instead.

Removing elements from a HashMap safely is another problem. It has little to do with multithreading, but because it throws ConcurrentModificationException it looks like a multithreading problem. Let’s look at it together.

Map<String, String> map = new HashMap<>();
map.put("xjjdog0", "Dog 1");
map.put("xjjdog1", "Dog 2");

for (Map.Entry<String, String> entry : map.entrySet()) {
    String key = entry.getKey();
    if ("xjjdog0".equals(key)) {
        map.remove(key);
    }
}

The above code throws an exception due to the fail-fast mechanism of HashMap: the map is modified directly while an iterator is still traversing it. If we want to delete elements safely, we should remove them through the iterator.


Iterator<Map.Entry<String, String>> iterator = map.entrySet().iterator();
while (iterator.hasNext()) {
    Map.Entry<String, String> entry = iterator.next();
    String key = entry.getKey();
    if ("xjjdog0".equals(key)) {
        iterator.remove();
    }
}
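On Java 8 and later, a shorter alternative is Collection.removeIf, which performs the removal as part of its own traversal. The sketch below is only an illustration; the RemoveDemo class and removeDog0 method are hypothetical names, and the keys mirror the example above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; the keys mirror the article's example.
class RemoveDemo {
    static Map<String, String> removeDog0() {
        Map<String, String> map = new HashMap<>();
        map.put("xjjdog0", "Dog 1");
        map.put("xjjdog1", "Dog 2");
        // The key set is a view of the map, so removing a key removes the entry.
        // removeIf handles the traversal itself, so no explicit iterator is needed.
        map.keySet().removeIf("xjjdog0"::equals);
        return map;
    }
}
```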

2.7. Scope of protection for thread safety

Is code written using thread-safe classes necessarily thread-safe? The answer is no.

A thread-safe class only guarantees that its own methods are thread-safe. If we wrap another layer of logic around it, whether the result is still thread-safe has to be reconsidered.

In this example, we use a thread-safe ConcurrentHashMap to store counts. ConcurrentHashMap itself is thread-safe and no longer has the infinite-loop problem. The addCounter function, however, is clearly incorrect: the get and the put are separate steps, so the function needs to be wrapped with synchronized.

private final ConcurrentHashMap<String, Integer> counter = new ConcurrentHashMap<>();
public int addCounter(String name) {
    // Not atomic: another thread can update the entry between get and put
    Integer current = counter.getOrDefault(name, 0);
    int newValue = ++current;
    counter.put(name, newValue);
    return newValue;
}

This is one of the most common pitfalls for developers. To achieve thread safety, you need to look at the scope being protected. If the logic at a larger scope has synchronization problems, even thread-safe collections will not give the desired result.
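Two possible fixes are sketched below, under my own assumptions: NameCounter is a hypothetical class, and the merge variant is an alternative I am adding, not something the text above prescribes. Pick one variant per map; mixing them on the same map would bring the race back, because merge does not take the lock that synchronized uses.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical counter class showing two alternative fixes.
class NameCounter {
    private final ConcurrentHashMap<String, Integer> counter = new ConcurrentHashMap<>();

    // Option 1: make the whole read-modify-write sequence one critical section.
    public synchronized int addCounterSynchronized(String name) {
        Integer current = counter.get(name);
        int newValue = (current == null) ? 1 : current + 1;
        counter.put(name, newValue);
        return newValue;
    }

    // Option 2: let the map perform the compound update atomically.
    public int addCounter(String name) {
        return counter.merge(name, 1, Integer::sum);
    }
}
```

The merge variant avoids a lock around the whole method and relies on ConcurrentHashMap performing the update of a single key atomically.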

2.8. Volatile is of limited use

The volatile keyword solves the visibility problem: it makes your changes immediately readable by other threads.

Interviews ask a lot of questions about it, including how ConcurrentHashMap optimizes its use of volatile. In everyday code, though, you will mostly only use it for value changes on boolean variables.

volatile boolean closed;

public void shutdown() {
    closed = true;
}

Never use it for counting or thread synchronization, such as the following.

volatile int count = 0;
void add() {
    ++count; // read, add, write: three steps, not atomic
}

This code does not count accurately in a multithreaded environment, because volatile guarantees only visibility, not atomicity, so concurrent increments can be lost.

It’s better to just use an Atomic class or the synchronized keyword. Do you really care about the nanosecond difference?
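A minimal sketch of the Atomic approach, assuming a hypothetical SafeCounter class; AtomicInteger and incrementAndGet are from java.util.concurrent.atomic.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical counter; AtomicInteger replaces the volatile int shown above.
class SafeCounter {
    private final AtomicInteger count = new AtomicInteger();

    // Increments atomically and returns the updated value.
    int add() {
        return count.incrementAndGet();
    }

    int get() {
        return count.get();
    }
}
```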

2.9. Be careful with dates

Dates are a frequent source of problems, usually because a global Calendar, SimpleDateFormat, or similar object is shared. When multiple threads call format or parse on the same instance at the same time, the data gets corrupted.

// Shared by all threads, but SimpleDateFormat is not thread-safe
SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

Date getDate(String str) throws ParseException {
    return format.parse(str);
}

As an improvement, we usually put the SimpleDateFormat in a ThreadLocal, one copy per thread, which avoids the problem. Of course, nowadays we can use the thread-safe DateTimeFormatter instead.
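The ThreadLocal variant might look like the sketch below. DateParser and its method names are hypothetical, and the pattern uses HH (24-hour clock), which is my assumption about the intended format.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper; the pattern with HH (24-hour clock) is an assumption.
class DateParser {
    // Each thread lazily gets its own SimpleDateFormat instance.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

    static Date getDate(String str) throws ParseException {
        return FORMAT.get().parse(str);
    }

    static String format(Date date) {
        return FORMAT.get().format(date);
    }
}
```

Because no SimpleDateFormat instance is ever shared between threads, no locking is needed around parse or format.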

static final DateTimeFormatter FORMATTER = DateTimeFormatter.ofPattern("MM/dd/yyyy HH:mm:ss");
public static void main(String[] args) {
    ZonedDateTime zdt = ZonedDateTime.now();
    System.out.println(FORMATTER.format(zdt));
}

2.10. Do not start threads in constructors

Starting a new thread in a constructor, or in a static code block, is not an error in itself. However, it is not recommended.

Because Java has inheritance, starting a thread in a constructor makes the behavior of subclasses hard to predict. In addition, the this object may be used elsewhere before construction is complete, causing unexpected behavior.

It is better to start the thread in a normal method, such as start. It reduces the chance of bugs.
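As a rough sketch of that advice, the constructor below only prepares the thread and a separate start method launches it. The Poller class and its methods are hypothetical names chosen for illustration.

```java
// Hypothetical component; all names are illustrative.
class Poller {
    private final Thread worker;
    private volatile boolean started;

    Poller() {
        // The constructor only builds the thread object; nothing runs yet,
        // so no other thread can observe a half-constructed Poller.
        worker = new Thread(this::run, "poller");
    }

    // Callers start the thread explicitly, after construction has finished.
    void start() {
        worker.start();
    }

    void join() throws InterruptedException {
        worker.join();
    }

    boolean hasRun() {
        return started;
    }

    private void run() {
        started = true;
    }
}
```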

End

wait and notify are especially error-prone, and the coding pattern they require is very strict. The synchronized keyword is simpler, but there are still a number of things to watch for when synchronizing code blocks. These lessons also apply to the various APIs provided by the concurrent package. We also have to handle all kinds of exceptions in multithreaded logic to avoid interruptions and deadlocks. Once you get past these pitfalls, you have essentially made it through the beginner stage of multithreaded coding.

Many Java developers have only limited exposure to multithreaded development and do not use it much in everyday work. If you work on a CRUD business system, you write multithreaded code even less often. But there are always exceptions: when your program becomes slow, or when you are troubleshooting a problem, you will be dealing with multithreaded code directly.

Our tools and software also make extensive use of multithreading. From Tomcat, to all kinds of middleware, to database connection pools and caches, there is multithreaded code everywhere.

Even experienced developers fall into multithreading pitfalls. Because asynchrony can scramble the order of events, data synchronization has to be enforced explicitly. In multithreaded code, ensure correctness first, for example by using thread-safe collections for data storage; efficiency, which is after all the goal of multithreading, comes after that.

Hopefully, these practical cases in this article will give you a better understanding of multithreading.
