This post summarizes a free MOOC course on ThreadLocal. The course explains ThreadLocal clearly and goes deep into its underlying principles and design ideas. It is the best material on ThreadLocal I have seen, so I am writing it up here in my own words.

This is the first of four planned articles.

Consistency problem

What is a consistency problem?

Multithreading makes full use of multi-core CPUs and gives our programs high performance. However, sometimes multiple threads need to work together, and this can raise data consistency issues.

Data consistency refers to the problem that occurs when “multiple subjects” cannot reach “consensus” on “the same” data. The multiple subjects could be multiple threads or multiple server nodes.

Of course, the “multiple subjects” can also be friends or a married couple; the saying “people who follow different paths cannot make plans together” describes the same principle.

The data consistency problem is the price we pay for using distributed systems or multithreading, and it makes the program or system more complicated. So what can we do to solve it?

How to solve the consistency problem?

There are a few ways to approach the consistency problem.

The first is “queuing”. If two people cannot agree on a value, they line up and change it one at a time, so that the person behind always sees the value written by the person in front, and the data stays consistent. Operating-system concepts such as locks, mutexes, monitors, and barriers all build on the idea of queuing.

Queuing ensures data consistency, but it costs performance, because everyone has to wait their turn.

The second is “voting”: several participants can make a decision or modify the data at the same time, and a vote decides which modification succeeds. This approach is efficient, but it also raises many problems, such as network outages and dishonest participants. Achieving consistency through voting is complicated, usually relies on rigorous mathematics, and needs a number of “messengers” to pass messages back and forth, which adds some performance overhead.
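
As a minimal sketch of the queuing idea (the class and method names are made up for illustration), a `synchronized` method makes threads take turns when they update a shared counter:

```java
public class QueuingSketch {
    private int count = 0;

    // Only one thread at a time can run a synchronized method on this instance;
    // the other threads wait (queue) for the lock.
    public synchronized void increment() {
        count++;
    }

    public synchronized int get() {
        return count;
    }

    // Starts `threads` workers, each incrementing `perThread` times,
    // and returns the final count.
    public static int run(int threads, int perThread) {
        QueuingSketch counter = new QueuingSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000)); // prints 40000
    }
}
```

The result is always correct, but every increment has to acquire the lock, which is where the performance cost comes from.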

The Paxos and Raft algorithms, which are common in distributed systems, use voting to solve consistency problems. If you are interested, you can read my previous article on these two algorithms, “Distributed consensus algorithm”, on my personal website.

The third is “avoid”. Since data consistency is hard, can we avoid consistency issues between multiple threads altogether? Git is an example that we programmers know well: everyone modifies their own local copy of the same file, and version control and “conflict resolution” reconcile the changes afterwards.

Today’s topic, ThreadLocal, also uses this “avoid” approach.

We cannot avoid all data inconsistency problems, so we still need to learn queuing and voting to solve data inconsistency problems in different scenarios.
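
To make the “avoid” idea concrete, here is a small sketch built on a well-known case: `SimpleDateFormat` is not thread-safe, so instead of sharing one instance behind a lock, each thread gets its own. The class name `AvoidSketch` and the date pattern are assumptions chosen for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class AvoidSketch {
    // Each thread lazily creates and keeps its own formatter,
    // so no two threads ever touch the same instance.
    private static final ThreadLocal<SimpleDateFormat> FORMAT = ThreadLocal.withInitial(() -> {
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f;
    });

    public static String format(Date date) {
        return FORMAT.get().format(date);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);
        for (int i = 0; i < 3; i++) {
            new Thread(() -> System.out.println(
                    Thread.currentThread().getName() + ": " + format(epoch))).start();
        }
    }
}
```

There is nothing to agree on here, because no data is shared between threads in the first place.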

What is a ThreadLocal?

Definition

ThreadLocal provides “thread-local variables”: a variable that has a separate value (copy) in each thread.

Is that a little confusing? When I first learned about ThreadLocal, I was not sure what it meant either. My biggest puzzle was: since the value is private to each thread, why not simply declare a local variable in the method that the thread runs and use it directly?

It turns out that a single thread may call many different classes and methods, and the variable may be needed in many of those places. Building this yourself, for example by passing the value through every call, is actually quite costly.

ThreadLocal is an out-of-the-box, thread-safe utility class with little extra overhead, and it solves this problem neatly.

ThreadLocal is not unique to the Java language; there are implementations of ThreadLocal in almost every language that offers multithreading features. In Java, ThreadLocal is implemented using hash tables.

Threading model

This diagram provides an intuitive explanation of ThreadLocal’s threading model.

The threading model of ThreadLocal

It’s not complicated. The large black circle on the left represents a process. A process has a table of threads, and the red wavy lines represent individual threads.

Each thread has its own exclusive data. Every Thread holds a ThreadLocalMap object, which is a hash table containing that thread’s local variables (the red rectangles). This ThreadLocalMap is the heart of ThreadLocal.

Related source code:

// Field in the Thread class:
ThreadLocal.ThreadLocalMap threadLocals = null;

// Definition of ThreadLocalMap:
static class ThreadLocalMap {
    static class Entry extends WeakReference<ThreadLocal<?>> {
        /** The value associated with this ThreadLocal. */
        Object value;

        Entry(ThreadLocal<?> k, Object v) {
            super(k);
            value = v;
        }
    }
    // ...
}
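
The structure above can be imitated with a toy model. This is only a sketch of the idea, not the JDK implementation: `ToyThread` and `ToyThreadLocal` are hypothetical names, and a `WeakHashMap` stands in for the JDK's own hash table with weakly referenced keys.

```java
import java.util.Map;
import java.util.WeakHashMap;

public class ToyThreadLocalSketch {

    // Hypothetical stand-in for Thread: each thread owns its own map.
    static class ToyThread extends Thread {
        // Keys are held weakly, loosely mirroring Entry extends WeakReference<ThreadLocal<?>>.
        final Map<ToyThreadLocal<?>, Object> locals = new WeakHashMap<>();

        ToyThread(Runnable task) {
            super(task);
        }
    }

    // Hypothetical stand-in for ThreadLocal: it stores no values itself,
    // it is only the key into the current thread's map.
    static class ToyThreadLocal<T> {
        void set(T value) {
            current().locals.put(this, value);
        }

        @SuppressWarnings("unchecked")
        T get() {
            return (T) current().locals.get(this);
        }

        void remove() {
            current().locals.remove(this);
        }

        private static ToyThread current() {
            // Works only on ToyThread instances; the real ThreadLocal works on any Thread.
            return (ToyThread) Thread.currentThread();
        }
    }

    // Two threads write different values under the same key;
    // returns what each thread read back.
    public static String[] demo() {
        ToyThreadLocal<String> name = new ToyThreadLocal<>();
        String[] seen = new String[2];
        ToyThread a = new ToyThread(() -> { name.set("A"); seen[0] = name.get(); });
        ToyThread b = new ToyThread(() -> { name.set("B"); seen[1] = name.get(); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = demo();
        System.out.println(seen[0] + " " + seen[1]); // prints "A B"
    }
}
```

The point of the model is the direction of ownership: the data lives in the thread, and the ThreadLocal object is just the lookup key.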

Basic API

The basic API is divided into four parts:

  • Constructor: ThreadLocal()
  • Initialization: initialValue()
  • Accessors: get() / set()
  • Cleanup: remove()

ThreadLocal is a generic class, and the type argument is the type of the thread-local variable you want to use. initialValue() supplies the default value that get() returns when set() has not been called in the current thread. If initialValue() is not overridden, it returns null.

If set() is called and then get(), initialValue() is not invoked.

If set() is called, then remove(), then get(), initialValue() is invoked.

JDK 8 added the static factory method withInitial(), which makes initialization more convenient.
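
These rules can be checked in a single thread. The sketch below overrides initialValue() in an anonymous subclass and records each step in a list; the class name and the logged strings are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class InitialValueSketch {

    // Runs set/get/remove/get on one thread and returns the recorded steps.
    public static List<String> run() {
        List<String> log = new ArrayList<>();
        ThreadLocal<String> local = new ThreadLocal<String>() {
            @Override
            protected String initialValue() {
                log.add("initialValue()");
                return "default";
            }
        };

        local.set("custom");
        log.add("get -> " + local.get()); // set() came first, so initialValue() is skipped

        local.remove();
        log.add("get -> " + local.get()); // value was removed, so initialValue() runs now
        log.add("get -> " + local.get()); // value is cached again, no second call

        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
        // prints:
        // get -> custom
        // initialValue()
        // get -> default
        // get -> default
    }
}
```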

Sample code:

import static java.util.concurrent.TimeUnit.SECONDS;

public class ThreadLocalDemo {
    public static final ThreadLocal<String> THREAD_LOCAL = ThreadLocal.withInitial(() -> {
        System.out.println("invoke initial value");
        return "default value";
    });

    public static void main(String[] args) throws InterruptedException {
        // set() then get(): the initializer is not invoked
        new Thread(() -> {
            THREAD_LOCAL.set("first thread");
            System.out.println(THREAD_LOCAL.get());
        }).start();

        new Thread(() -> {
            THREAD_LOCAL.set("second thread");
            System.out.println(THREAD_LOCAL.get());
        }).start();

        // set(), remove(), then get(): the initializer is invoked
        new Thread(() -> {
            THREAD_LOCAL.set("third thread");
            THREAD_LOCAL.remove();
            System.out.println(THREAD_LOCAL.get());
        }).start();

        // get() without set(): the initializer is invoked
        new Thread(() -> {
            System.out.println(THREAD_LOCAL.get());
        }).start();

        // Give the threads time to finish
        SECONDS.sleep(1L);
    }
}

// Output (lines from different threads may interleave differently):
// first thread
// second thread
// invoke initial value
// default value
// invoke initial value
// default value

Four core scenarios

What is ThreadLocal used for in a real project? Four core application scenarios are summarized here.

Resource holding

Let’s say we have three different classes. During one Web request, instances of these three classes are invoked in different places and at different times. The user is the same throughout, though, so the user data can be stored in “one thread”.

Thread resource holding

At this point, we can put the user data into the ThreadLocalMap in program 1 and read it in programs 2 and 3.

The advantage is that the thread holds the resource for every part of the code it runs: any part can fetch it globally, which reduces “programming difficulty.”
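
A minimal sketch of this scenario follows. All class names (`UserContext`, `AuthFilter`, `OrderService`, `AuditLog`) are hypothetical stand-ins for “program 1”, “program 2”, and “program 3”, and the user is reduced to a plain string.

```java
public class UserContextSketch {

    // Hypothetical holder: one slot per thread.
    static final class UserContext {
        private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

        static void set(String user) { CURRENT_USER.set(user); }
        static String get() { return CURRENT_USER.get(); }
        static void clear() { CURRENT_USER.remove(); }
    }

    // "Program 1": stores the user at the start of the request.
    static class AuthFilter {
        void handle(String user) { UserContext.set(user); }
    }

    // "Program 2": reads the user without receiving it as a parameter.
    static class OrderService {
        String createOrder() { return "order for " + UserContext.get(); }
    }

    // "Program 3": same thread, same user.
    static class AuditLog {
        String record() { return "audited " + UserContext.get(); }
    }

    public static String handleRequest(String user) {
        try {
            new AuthFilter().handle(user);
            return new OrderService().createOrder() + "; " + new AuditLog().record();
        } finally {
            // Clear the slot so a reused (pooled) thread does not see stale data.
            UserContext.clear();
        }
    }

    public static void main(String[] args) {
        new Thread(() -> System.out.println(handleRequest("alice"))).start();
        new Thread(() -> System.out.println(handleRequest("bob"))).start();
    }
}
```

Neither `OrderService` nor `AuditLog` takes a user parameter, yet each request thread sees only its own user.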

Thread consistency

Take JDBC as an example. We use transactions a lot, but how do they work?

Thread resource consistency

For a JDBC transaction to work, every call that asks for a connection from the same thread, no matter which part of the code makes it, must get the same connection. Frameworks that manage JDBC transactions typically achieve this with ThreadLocal.

When a part of the code asks for a connection, the ThreadLocal is checked first to see whether a connection is already bound to that thread. If there is one, it is returned. If not, a connection is requested from the pool and placed in the ThreadLocal.

This ensures that all parts of a transaction share one connection. ThreadLocal helps maintain this consistency and reduces “programming difficulty.”
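
The check-then-bind logic described above can be sketched as follows. This is not the code of any particular framework: `ConnectionHolderSketch` is a hypothetical name, and `FakeConnection` stands in for a real `java.sql.Connection` so the example runs without a database or a pool.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ConnectionHolderSketch {

    // Stand-in for java.sql.Connection (an assumption for this sketch).
    public static final class FakeConnection {
        private static final AtomicInteger IDS = new AtomicInteger();
        final int id = IDS.incrementAndGet();
    }

    private static final ThreadLocal<FakeConnection> HOLDER = new ThreadLocal<>();

    // Returns the connection bound to the current thread, binding a new one on first use.
    public static FakeConnection getConnection() {
        FakeConnection conn = HOLDER.get();
        if (conn == null) {
            conn = new FakeConnection(); // a real implementation would borrow from a pool
            HOLDER.set(conn);
        }
        return conn;
    }

    // Unbinds the connection, for example at the end of the transaction.
    public static void release() {
        HOLDER.remove();
    }

    // Two lookups on the same thread return the same object.
    public static boolean sameWithinThread() {
        try {
            return getConnection() == getConnection();
        } finally {
            release();
        }
    }

    // A different thread gets a different connection.
    public static boolean differentAcrossThreads() {
        FakeConnection mine = getConnection();
        FakeConnection[] other = new FakeConnection[1];
        Thread t = new Thread(() -> {
            other[0] = getConnection();
            release();
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release();
        }
        return other[0] != mine;
    }

    public static void main(String[] args) {
        System.out.println(sameWithinThread());       // prints true
        System.out.println(differentAcrossThreads()); // prints true
    }
}
```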

Thread safety

Suppose a thread runs a long call chain. What should we do if an exception occurs partway through? We can put the error information into a ThreadLocal when the error occurs and read it in later steps of the chain. Because each thread has its own copy, multiple threads handling this scenario at the same time do not interfere with each other.

Thread safety
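
A minimal sketch of this scenario, with hypothetical step names: an early step records its error in a ThreadLocal, and a later step on the same thread reads it without any parameter passing.

```java
public class ErrorContextSketch {
    // Holds the error of the call chain running on the current thread only.
    private static final ThreadLocal<String> LAST_ERROR = new ThreadLocal<>();

    // An early step in the chain; on failure it records the error instead of aborting.
    static void stepOne(boolean fail) {
        try {
            if (fail) {
                throw new IllegalStateException("step one failed");
            }
        } catch (IllegalStateException e) {
            LAST_ERROR.set(e.getMessage());
        }
    }

    // A later step; it sees only the error recorded by its own thread.
    static String stepTwo() {
        String error = LAST_ERROR.get();
        return error == null ? "ok" : "skipped: " + error;
    }

    public static String process(boolean fail) {
        try {
            stepOne(fail);
            return stepTwo();
        } finally {
            LAST_ERROR.remove(); // leave the thread clean for its next task
        }
    }

    public static void main(String[] args) {
        new Thread(() -> System.out.println(process(true))).start();  // skipped: step one failed
        new Thread(() -> System.out.println(process(false))).start(); // ok
    }
}
```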

Concurrent computing

Distributed computing

If we have a big task, we can break it up into many smaller tasks, compute them separately, and finally combine the results. In distributed computing, each node may first store its partial result locally. For multi-threaded computation on a single machine, each thread can store its result in a ThreadLocal, and the results are then aggregated.

How do I retrieve the values of all threads in a ThreadLocal? See the next article for analysis.
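
In the meantime, here is one simple way to sketch the idea. It is an assumption of this example and may differ from the approach analyzed in the next article: each thread's accumulator is registered in a shared queue when it is first created, so the partial results can be collected after the workers finish. The class and method names are hypothetical.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class PartialSumSketch {

    // Sums 1..n with `threads` workers; each worker accumulates into its own slot.
    public static long sum(int n, int threads) {
        // Every per-thread slot is also registered here so it can be found later.
        Queue<long[]> allSlots = new ConcurrentLinkedQueue<>();
        ThreadLocal<long[]> partial = ThreadLocal.withInitial(() -> {
            long[] slot = new long[1];
            allSlots.add(slot);
            return slot;
        });

        Thread[] workers = new Thread[threads];
        for (int w = 0; w < threads; w++) {
            final int start = w + 1;
            workers[w] = new Thread(() -> {
                // This worker handles start, start + threads, start + 2 * threads, ...
                for (int i = start; i <= n; i += threads) {
                    partial.get()[0] += i; // no lock needed: the slot belongs to this thread
                }
            });
            workers[w].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        // Aggregate the per-thread results after all workers have finished.
        long total = 0;
        for (long[] slot : allSlots) {
            total += slot[0];
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(100, 4)); // prints 5050
    }
}
```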

About the author

WeChat official account: made up a process

Personal website: https://yasinshaw.com

I’m Yasin, a not-so-well-known Java programmer. Follow my official account and grow with me ~
