This article explains how ThreadLocal works. It covers the following four topics:

  • How ThreadLocal keeps values written by set() and read by get() from being accessed by other threads.
  • How weak references are used in ThreadLocalMap.
  • How ThreadLocalMap implements its hash map functionality.
  • An example and analysis of a memory leak caused by using ThreadLocal.

First, let’s look at what ThreadLocal can do:

public class ThreadLocalDemo {
    public static void main(String[] args) {
        final ThreadLocal<Integer> local = new ThreadLocal<>();
        local.set(100);
        Thread t = new Thread(new Runnable() {
            @Override
            public void run() {
                System.out.println(Thread.currentThread().getName() + " local: " + local.get());
            }
        });
        t.start();
        System.out.println("Main local: " + local.get());
    }
}

The print result is as follows:

Thread-0 local: null
Main local: 100

The value that the main thread set on local can be read back by calling get() on the main thread, but calling get() in thread t returns null.

In this article, the set method called by local is used as the entry point to explore the reasons for this result.

set()

In the ThreadLocal source code set() is implemented like this:

public void set(T value) {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
}

First, set() gets the Thread object t that is currently executing local.set(), and then calls getMap(t) to obtain the ThreadLocalMap object held by t. That map is a field declared in the Thread class:

ThreadLocal.ThreadLocalMap threadLocals = null;

The source of getMap() simply returns the thread's threadLocals field:

ThreadLocalMap getMap(Thread t) {
    return t.threadLocals;
}

map != null

If map != null, map.set(this, value) is executed, where this is local.

Note that local remains a single shared object; no copy of local is created for each thread. When set() is called, local itself and the incoming value are stored as a key-value pair in the ThreadLocalMap object held internally by the calling thread.

map == null

If map == null, createMap(t, value) is executed:

void createMap(Thread t, T firstValue) {
    t.threadLocals = new ThreadLocalMap(this, firstValue);
}

This creates a ThreadLocalMap object and assigns it to the thread's threadLocals field.

At this point, the basic principle of ThreadLocal is clear: threads that operate on a shared ThreadLocal instance actually operate on the ThreadLocalMap object that each of them holds internally, using that instance as the key.
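
To make this principle concrete, here is a toy model of it. ToyThread and ToyLocal are hypothetical names invented for this sketch and are not JDK classes. The real ThreadLocal looks up Thread.currentThread() itself, while this sketch takes the thread as an explicit parameter so that it can run on a single thread.

```java
import java.util.HashMap;
import java.util.Map;

final class ToyLocalDemo {
    /** Hypothetical stand-in for java.lang.Thread: each "thread" owns its own map. */
    static final class ToyThread {
        // Plays the role of Thread.threadLocals; stays null until first use.
        Map<ToyLocal<?>, Object> locals;
    }

    /** Hypothetical stand-in for ThreadLocal: the instance is only used as a key. */
    static final class ToyLocal<T> {
        void set(ToyThread t, T value) {
            if (t.locals == null) {
                t.locals = new HashMap<>();   // corresponds to createMap(t, value)
            }
            t.locals.put(this, value);        // corresponds to map.set(this, value)
        }

        @SuppressWarnings("unchecked")
        T get(ToyThread t) {
            return t.locals == null ? null : (T) t.locals.get(this);
        }
    }

    public static void main(String[] args) {
        ToyLocal<Integer> local = new ToyLocal<>();
        ToyThread main = new ToyThread();
        ToyThread worker = new ToyThread();

        local.set(main, 100);
        System.out.println("main:   " + local.get(main));    // 100
        System.out.println("worker: " + local.get(worker));  // null
    }
}
```

The value lives in the map of the thread that stored it, which is why the worker sees null even though both threads share the same key object.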

In addition to set(), ThreadLocal also provides get(), remove() and other operations. Their implementations follow the same pattern, so they are not described in detail here.
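
As a small usage sketch of get() and remove(), the example below records what one thread sees before set(), after set(), and after remove(). The class name GetRemoveDemo and the field COUNTER are made up for illustration. ThreadLocal.withInitial requires Java 8 or later.

```java
import java.util.Arrays;

final class GetRemoveDemo {
    // withInitial supplies the value get() returns when the thread has no entry.
    static final ThreadLocal<Integer> COUNTER = ThreadLocal.withInitial(() -> 0);

    /** Returns the values seen before set(), after set(), and after remove(). */
    static int[] observe() {
        int before = COUNTER.get();       // no entry yet: the initial value
        COUNTER.set(42);
        int afterSet = COUNTER.get();     // the value stored for this thread
        COUNTER.remove();                 // drops this thread's entry
        int afterRemove = COUNTER.get();  // the initial value is computed again
        COUNTER.remove();                 // leave the thread clean for later callers
        return new int[] {before, afterSet, afterRemove};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(observe()));  // [0, 42, 0]
    }
}
```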

ThreadLocalMap structure

To really understand ThreadLocal, you also need to know what a ThreadLocalMap is.

The source comment describes it this way: ThreadLocalMap is a customized hash map suitable only for maintaining thread local values.

ThreadLocalMap is a custom map. It is a static nested class of ThreadLocal that provides hash table functionality, and it is unrelated to the Map classes in the java.util package. It contains a static Entry class, which is described next.

Entry Implementation Principle

First, the class code looks like this:

static class Entry extends WeakReference<ThreadLocal<?>> {
    /** The value associated with this ThreadLocal. */
    Object value;

    Entry(ThreadLocal<?> k, Object v) {
        super(k);
        value = v;
    }
}

Here I quote the comment given in the code: The entries in this hash map extend WeakReference, using its main ref field as the key (which is always a ThreadLocal object). Note that null keys (i.e. entry.get() == null) mean that the key is no longer referenced.

The first sentence tells us that Entry extends WeakReference and uses the reference's own referent field (the "main ref field") as the key of the entry.

When entry.get() == null, it means that the key is no longer referenced.

Both points are examined below.

Weak reference basics

Before starting this subsection, there are two things to know:

  • What is a weak reference? “An object that is only associated with weak references survives only until the next garbage collection. When the garbage collector runs, it reclaims objects that are only weakly referenced, regardless of whether there is currently enough memory.”
  • How object references are passed as method arguments. This is basic Java SE knowledge and is not covered here.

Next, back to the source code. When the Entry constructor is called, the key k is passed to super(), so the parent class’s constructor runs first:

public WeakReference(T referent) {
    super(referent);
}

WeakReference’s constructor in turn calls the constructor of its parent class, Reference:

Reference(T referent) {
    this(referent, null);
}

Reference(T referent, ReferenceQueue<? super T> queue) {
    this.referent = referent;
    this.queue = (queue == null) ? ReferenceQueue.NULL : queue;
}

In addition, we don’t see any native methods in the Reference class, but we do see some instance methods, such as get(), which we’ll talk about later.

The comments contain the phrase “special treatment by the garbage collector.” In other words, the behavior of WeakReference is implemented by the garbage collector, so it is not expanded on here; interested readers can follow the links at the end of the article. Here we only need to understand how to use WeakReference.

Weak and strong references are not used in the same way. Here is an example of a weak reference:

import java.lang.ref.WeakReference;

public class WeakReferenceDemo {
    public static void main(String[] args) {
        WeakReference<Fruit> fruitWeakReference = new WeakReference<>(new Fruit());
        // Fruit f = fruitWeakReference.get();

        if (fruitWeakReference.get() != null) {
            System.out.println("Before GC, this is the result");
        }

        // Request a collection; the JVM treats this as a hint, not a guarantee.
        System.gc();

        if (fruitWeakReference.get() != null) {
            System.out.println("After GC, fruitWeakReference.get() is not null");
        } else {
            System.out.println("After GC, fruitWeakReference.get() is null");
        }
    }
}

class Fruit {}

The following output is displayed:

Before GC, this is the result
After GC, fruitWeakReference.get() is null

With fruitWeakReference.get(), you can get the object the weak reference points to. In this run, that object was reclaimed after System.gc() was executed, because nothing else referenced it strongly.

Use a graph to show the relationship between strong and weak references:

To be clear, a reference such as “Object obj = new Object()” is a strong reference, so fruitWeakReference itself is a strong reference, and it points to a WeakReference object. When that WeakReference object is created, a new Fruit object is passed in, and the purpose of the whole line is to create a weak reference to that Fruit object. The weak reference lives inside the object that fruitWeakReference points to.

To use a loose analogy, a weak reference is like Schrödinger’s cat: we want to know its state, but ordinary Java code cannot observe the referent without affecting it. If the double-slash comment in WeakReferenceDemo above is removed, the variable f holds what fruitWeakReference.get() returns, which simply creates a strong reference to the object the weak reference points to. Running the program then gives the following result:

Before GC, this is the result
After GC, fruitWeakReference.get() is not null

Process finished with exit code 0

Because the object is strongly referenced, it is not garbage collected.

The weakly referenced key of Entry

Given the basics, it’s easy to understand how Entry is constructed. For the sake of illustration, let’s assume we can create an Entry object as follows:

Entry entry = new Entry(local, 100);

The relationship between strong and weak references is as follows:

Entry extends WeakReference, so it internally maintains a weak reference that points to the object local points to in the main method. entry.get() returns the object the weak reference points to. If entry.get() == null, it naturally means that the key is no longer referenced.

So, unlike the Entry class of an ordinary Map, the key of a ThreadLocalMap Entry is held through a weak reference. With that, the basic structure of the ThreadLocalMap inside ThreadLocal is clear.
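
ThreadLocalMap.Entry is not accessible from application code, so the sketch below uses a hypothetical look-alike, WeakKeyEntry, to show the same shape: the key is the weak referent and the value is an ordinary strongly referenced field. The class names are invented for illustration.

```java
import java.lang.ref.WeakReference;

final class EntrySketch {
    /** Hypothetical stand-in for ThreadLocalMap.Entry. */
    static final class WeakKeyEntry extends WeakReference<ThreadLocal<?>> {
        Object value;   // held through an ordinary strong reference

        WeakKeyEntry(ThreadLocal<?> key, Object value) {
            super(key);  // the key becomes the weak referent
            this.value = value;
        }
    }

    public static void main(String[] args) {
        ThreadLocal<Integer> local = new ThreadLocal<>();
        WeakKeyEntry entry = new WeakKeyEntry(local, 100);

        // While `local` is strongly reachable, get() returns that same object.
        System.out.println(entry.get() == local);  // true
        System.out.println(entry.value);           // 100
    }
}
```

Note the asymmetry: only the key is weak. The value stays reachable for as long as the entry itself is reachable, which matters for the memory leak discussed later.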

set() in depth

set() in ThreadLocal:

public void set(T value) {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
}

Recall that when local calls set(), map.set(this, value) is executed if threadLocals is not null. The previous section examined the structure of ThreadLocalMap; this section focuses on its set() method.

The source of ThreadLocalMap's set() is as follows:

private void set(ThreadLocal<?> key, Object value) {

    Entry[] tab = table;
    int len = tab.length;
    // Compute the hash position i
    int i = key.threadLocalHashCode & (len - 1);
    // Probe from position i until an empty slot is found
    for (Entry e = tab[i];
         e != null;
         e = tab[i = nextIndex(i, len)]) {
        ThreadLocal<?> k = e.get();

        if (k == key) {
            e.value = value;
            return;
        }

        if (k == null) {
            replaceStaleEntry(key, value, i);
            return;
        }
    }
    // Save the newly created Entry object in the hash table
    tab[i] = new Entry(key, value);
    int sz = ++size;
    if (!cleanSomeSlots(i, sz) && sz >= threshold)
        rehash();
}

In this code, i is the index into the hash table (the hash bucket), that is, the position where the new Entry will be stored. Before storing, the existing contents of that position must be checked. threadLocalHashCode is obtained as follows:

private static AtomicInteger nextHashCode = new AtomicInteger();

private static final int HASH_INCREMENT = 0x61c88647;

private static int nextHashCode() {
    return nextHashCode.getAndAdd(HASH_INCREMENT);
}

private final int threadLocalHashCode = nextHashCode();

0x61c88647 is used to spread the hash codes more evenly: each newly created ThreadLocal object gets a threadLocalHashCode that is 0x61c88647 larger than the previous one. As for why 0x61c88647 hashes well, it involves the Fibonacci hashing algorithm (this number is the Fibonacci hashing constant 0x9e3779b9 with its bits inverted and 1 added, that is, its two's complement negation); see the references at the end of this article.
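
The effect of the increment can be checked with a few lines of code. The sketch below is not JDK code; HashIncrementDemo and distinctSlots are names made up for this illustration. It replays the sequence of hash codes that nextHashCode() would hand out, starting from 0, and counts how many different slots they land in.

```java
import java.util.HashSet;
import java.util.Set;

final class HashIncrementDemo {
    static final int HASH_INCREMENT = 0x61c88647;

    /** Counts the distinct slots hit by the first `count` hash codes in a table of `len` slots. */
    static int distinctSlots(int count, int len) {
        Set<Integer> slots = new HashSet<>();
        int hash = 0;                       // the first code handed out
        for (int n = 0; n < count; n++) {
            slots.add(hash & (len - 1));    // same masking as in ThreadLocalMap.set()
            hash += HASH_INCREMENT;         // int overflow wraps around silently
        }
        return slots.size();
    }

    public static void main(String[] args) {
        // Negating the 32-bit Fibonacci hashing constant gives HASH_INCREMENT.
        System.out.println(HASH_INCREMENT == -0x9e3779b9);  // true
        System.out.println(distinctSlots(16, 16));          // 16
        System.out.println(distinctSlots(32, 32));          // 32
    }
}
```

Because the increment is odd, the first len hash codes always fall into len different slots when len is a power of two, so consecutively created ThreadLocal objects do not collide with each other until the table would be full anyway.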

Computing i involves a bitwise AND, which is simple. For example, if len is 16 (2 to the fourth power) before any expansion, the binary form of len - 1 is 1111, so the AND keeps only the low four bits of the hash code and the result always falls in the range 0 to 15.
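
A quick way to convince yourself of this is to compare the mask with a non-negative modulo. The sketch below assumes len is a power of two, as in the example above; MaskDemo and indexFor are hypothetical names for this illustration.

```java
final class MaskDemo {
    /** Index computation by masking; only valid when len is a power of two. */
    static int indexFor(int hash, int len) {
        return hash & (len - 1);
    }

    public static void main(String[] args) {
        int len = 16;  // len - 1 is 1111 in binary
        int[] hashes = {0, 0x61c88647, 0x61c88647 * 2, -7};
        for (int h : hashes) {
            // Both columns agree, including for negative hash codes.
            System.out.println(indexFor(h, len) + " " + Math.floorMod(h, len));
        }
    }
}
```

The mask is cheaper than a division and, unlike the % operator, never yields a negative index.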

To resolve collisions, linear probing is used here rather than separate chaining. The rule for computing the next index to probe is as follows:

private static int nextIndex(int i, int len) {
  return ((i + 1 < len) ? i + 1 : 0);
}

The execution logic of the for loop looks like this:

  1. First, get the Entry element tab[i] from the hash table;
  2. Check whether tab[i] is null. If it is, no Entry instance has been stored at this position, so the loop exits and the new Entry object is saved at this position in the hash table;
  3. If tab[i] is not null, then either it holds a key pointing to the same object, in which case its value is changed to the new value and the method returns; or its weak reference returns null, in which case replaceStaleEntry is executed and the method returns;
  4. Otherwise, update i with nextIndex and go back to step 2.

After the loop exits and the new Entry has been saved at the appropriate position in the hash table, size is incremented by 1. If !cleanSomeSlots(i, sz) && sz >= threshold is true, that is, no stale entry was cleaned up and the size has reached the threshold, rehash() is executed.

The main job of both replaceStaleEntry and cleanSomeSlots is to remove entries whose weakly referenced key is null. The latter scans on the order of log2(n) slots; neither is expanded on here due to space limits. threshold works much like the threshold in HashMap: it controls capacity expansion, and is set to len * 2 / 3.
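
The probing loop walked through above can be modeled in isolation. The following is a simplified sketch, not the JDK implementation: ProbingTable is a hypothetical class, it compares keys by identity as ThreadLocalMap does, and it leaves out stale-entry cleanup and resizing, so it assumes the table always has at least one free slot.

```java
final class ProbingTable {
    private final Object[] keys;
    private final Object[] values;

    ProbingTable(int len) {              // len is assumed to be a power of two
        keys = new Object[len];
        values = new Object[len];
    }

    private static int nextIndex(int i, int len) {
        return (i + 1 < len) ? i + 1 : 0;   // wrap around at the end of the array
    }

    /** Stores the pair and returns the slot used. */
    int put(Object key, int hash, Object value) {
        int len = keys.length;
        int i = hash & (len - 1);
        // Walk forward from the home slot until the key or an empty slot is found.
        while (keys[i] != null && keys[i] != key) {
            i = nextIndex(i, len);
        }
        keys[i] = key;
        values[i] = value;
        return i;
    }

    Object get(Object key, int hash) {
        int len = keys.length;
        for (int i = hash & (len - 1); keys[i] != null; i = nextIndex(i, len)) {
            if (keys[i] == key) {
                return values[i];
            }
        }
        return null;                     // reached an empty slot: the key is absent
    }

    public static void main(String[] args) {
        ProbingTable table = new ProbingTable(4);
        Object a = new Object();
        Object b = new Object();
        System.out.println(table.put(a, 3, "first"));   // 3: the home slot is free
        System.out.println(table.put(b, 3, "second"));  // 0: slot 3 is taken, wraps around
        System.out.println(table.get(b, 3));            // second
    }
}
```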

Memory cleanup

Using the original example, if local is set to null, the ThreadLocal object created with new is only weakly referenced by the ThreadLocalMap instance in the thread, and it is reclaimed at the next garbage collection, for example one triggered by calling System.gc(). What if you want to break the weak reference actively? Java provides the following method:

clear()

It is a method provided by the Reference abstract class.
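
A minimal sketch of clear() on a plain WeakReference follows; ClearDemo and clearedAfterClear are names invented for this example. Unlike waiting for the garbage collector, the effect of clear() is immediate and deterministic.

```java
import java.lang.ref.WeakReference;

final class ClearDemo {
    /** Returns true if get() yields null right after clear(). */
    static boolean clearedAfterClear(Object target) {
        WeakReference<Object> ref = new WeakReference<>(target);
        ref.clear();                 // severs the weak link without waiting for GC
        return ref.get() == null;
    }

    public static void main(String[] args) {
        Object target = new Object();
        WeakReference<Object> ref = new WeakReference<>(target);
        System.out.println(ref.get() == target);  // true: the link is intact
        ref.clear();
        System.out.println(ref.get() == null);    // true: the link is gone
    }
}
```

clear() only detaches this one reference object. The target itself is untouched and stays alive as long as something else references it strongly.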

Let’s use an example to discuss possible memory leaks with ThreadLocal.

Memory leak example

The example source code is as follows:

import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ThreadLocalTest {

    public static void main(String[] args) throws InterruptedException {
        MyThreadLocal<Create50MB> local = new MyThreadLocal<>();

        ThreadPoolExecutor poolExecutor = new ThreadPoolExecutor(5, 5, 1,
                TimeUnit.MINUTES, new LinkedBlockingQueue<Runnable>());
        for (int i = 0; i < 5; i++) {
            final int[] a = new int[1];
            final ThreadLocal[] finallocal = new MyThreadLocal[1];
            finallocal[0] = local;
            a[0] = i;
            poolExecutor.execute(new Runnable() {
                @Override
                public void run() {
                    finallocal[0].set(new Create50MB());
                    System.out.println("add i = " + a[0]);
                }
            });
        }
        Thread.sleep(50000);
        local = null;
    }

    static class Create50MB {
        private byte[] bytes = new byte[1024 * 1024 * 50];
    }

    static class MyThreadLocal<T> extends ThreadLocal<T> {
        private byte[] bytes = new byte[1024 * 1024 * 500];
    }
}

First, the design of this small program:

The program constructs a memory leak scenario: while the pool threads are idle after finishing their tasks, the local variable in main is set to null, so the MyThreadLocal object can be reclaimed. The ThreadLocalMap instances of the threads in the pool hold only weak references to MyThreadLocal, but the values they hold are still strongly referenced and cannot be reclaimed.

In this program, MyThreadLocal is defined so that each new MyThreadLocal object occupies 500MB. Create50MB wraps a 50MB byte array, so the value finally held by each thread is a 50MB Create50MB object. The thread pool is constructed with explicit parameters for better control and runs at most five threads at a time. The for loop uses two temporary variables to get around the language restriction that a local variable referenced from an anonymous inner class must be final.

Start the program, and the running status is shown in the following figure:

The heap in use was 750MB, as expected: the new MyThreadLocal object accounts for 500MB, and the five threads hold 50MB each, for a total of 750MB.

After 50 seconds, local is set to null, so no strong reference to the MyThreadLocal object remains. A garbage collection is then performed, with the following result:

The heap in use dropped to 250MB. This alone does not prove that each thread holds a weak reference to the MyThreadLocal object, but it does show that no strong reference remains.

I have studied the thread pool source code before: pool threads are not destroyed after executing a task, and in this example they stay in the waiting state. As a result, the program stays at 250MB and that memory cannot be released; if the scale of the program grows large enough, there will be obvious performance problems. The usual solution is to call ThreadLocal’s remove() method inside the thread once the value is no longer needed. ThreadLocal does not expose many public APIs, but this one is sufficient to solve the problem.
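
One common way to apply this is to call remove() in a finally block at the end of each task. The sketch below is an illustration under assumptions, not the program from this article: PoolCleanupDemo, BUFFER and valueSurvives are hypothetical names, and a single-thread executor is used so that the second task is guaranteed to run on the same worker thread as the first.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class PoolCleanupDemo {
    static final ThreadLocal<byte[]> BUFFER = new ThreadLocal<>();

    /** Runs a task that sets BUFFER, then reports whether a later task still sees the value. */
    static boolean valueSurvives(boolean callRemove) {
        ExecutorService pool = Executors.newSingleThreadExecutor();  // one reused worker
        try {
            pool.submit(() -> {
                BUFFER.set(new byte[1024]);
                try {
                    BUFFER.get()[0] = 1;       // stand-in for real work
                } finally {
                    if (callRemove) {
                        BUFFER.remove();       // drop the entry before the thread is reused
                    }
                }
            }).get();
            // This task runs on the same worker thread as the one above.
            return pool.submit(() -> BUFFER.get() != null).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(valueSurvives(false));  // true: the value outlives its task
        System.out.println(valueSurvives(true));   // false: remove() cleaned it up
    }
}
```

Putting remove() in finally means the entry is dropped even if the task throws, so an idle pool thread does not keep the value alive.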

Summary

I have to say, I learned a lot from analyzing ThreadLocal. The whole article was written in one sitting (so it may contain errors). I suspect that whenever shared variables need thread-private values, this approach can serve as a reference. Previously I only knew the four kinds of references in theory; now I understand how to use one of them. Resolving hash collisions with linear probing, unlike HashMap, is also a characteristic of ThreadLocal. The final memory leak example puts the preceding material into practice.

cool.

References

WeakReference

JVM principles and implementation — Reference

What is the meaning of 0x61C88647 constant in ThreadLocal.java

Fibonacci Hashing

Printing GC information: -XX:+PrintGCDetails; for more, see: The VM parameters used to view GC logs