Background analysis

I believe that many programs in the normal implementation of the function of the process, will encounter some static variables, whether single thread or multi-thread in use, will not affect each other, that is, the static variables between threads is read and write isolation.

The SimpleDateFormat dateFormat dateformat utility has multiple threads. If you look on the web, there will be a lot of people who say ThreadLocal.

define

So what exactly is ThreadLocal? Thread-local means thread-local, and we are used to store variables that we need to be thread-isolated, that is thread-local variables. That is, when we store a variable in a ThreadLocal, we can achieve thread-isolation for that variable.

example

Let’s take a look at two examples that also happen to involve two concepts, value passing and reference passing.

  • Value passed
public class ThreadLocalTest {

    private static ThreadLocal<Integer> threadLocal = new ThreadLocal<Integer>(){
        protected Integer initialValue(a){
            return 0; }};/ / value
    @Test
    public void testValue(a){
        for (int i = 0; i < 5; i++){
            new Thread(() -> {
                Integer temp = threadLocal.get();
                threadLocal.set(temp + 5);
                System.out.println("current thread is " + Thread.currentThread().getName() + " num is " + threadLocal.get());
            }, "thread-"+ i).start(); }}}Copy the code

The output of the above program is:

current thread is thread-1 num is 5
current thread is thread-3 num is 5
current thread is thread-0 num is 5
current thread is thread-4 num is 5
current thread is thread-2 num is 5
Copy the code

As you can see, each thread prints a 5, even if I fetch the variable using threadlocal.get () and then set it in, I still don’t overwrite it.

This is called thread isolation.

But for reference passing, we need to pay more attention to the example.

  • reference
public class ThreadLocalTest {

    static NumIndex numIndex = new NumIndex();
    private static ThreadLocal<NumIndex> threadLocal1 = new ThreadLocal<NumIndex>(){
        protected NumIndex initialValue(a){
            returnnumIndex; }};static class NumIndex{
        int num = 0;
        public void increment(a){ num++; }}// Reference passing
    @Test
    public void testReference(a){
        for (int i = 0; i < 5; i++){
            new Thread(() -> {
                NumIndex index = threadLocal1.get();
                index.increment();
                threadLocal1.set(index);
                System.out.println("current thread is " + Thread.currentThread().getName() + " num is " + threadLocal1.get().num);
            }, "thread-"+ i).start(); }}}Copy the code

Let’s look at the results of the run

current thread is thread-0 num is 2
current thread is thread-2 num is 3
current thread is thread-1 num is 2
current thread is thread-4 num is 5
current thread is thread-3 num is 4
Copy the code

We see that not only are values not quarantined, but there are thread safety issues.

So it’s important to note the difference between value passing and reference passing, which we won’t cover here.

Source code analysis

To get a better understanding of what ThreadLocal does, go back to the source code and see how Josh Bloch and Doug Lea implement it. The entire class adds up to only 800 or 800 lines.

In this article, I will analyze the source code of ThreadLocal and ThreadLocalMap.

ThreadLocalMap source code analysis

After much deliberation, I decided to explain the source code of ThreadLocalMap first.

ThreadLocalMap is a static inner class of ThreadLocal, but it is a key thing that we need to think about. If we want to implement this function, what should we do? And look at other people’s code and think, why are people doing this?

I hope that through my article, I can not bring you any awesome technology, but at least let you understand that what we need to learn is the rigorous thinking logic of these great men.

Anyway, what exactly is ThreadLocalMap? Think of it this way, since it’s a thread-local variable, and we can get and assign values using the GET and set methods.

1. What structure is stored in when we assign values?

2. How does it achieve thread isolation?

3, When I get and set, how does it achieve thread-value correspondence to save?

With these three questions and the name ThreadLocalMap, I think you get a sense of what this is.

That’s right, it’s the core of ThreadLocal, the class that maintains the relationship between our threads and variables, and if we see the end of Map, we know that it’s actually a key-value pair. As for what KEY is, we will see it in the source code analysis.

Entry inner class

The following source code is extracted to explain part of the content to show

static class ThreadLocalMap {

    /** * create a custom Entry class that inherits a weak reference * to store the mapping between ThreadLocal and Value ** The weak reference is used to resolve the problem that a strong binding between a thread and a ThreadLocal will cause that if the thread is not collected, The GC will never be able to reclaim this part */
    static class Entry extends WeakReference<ThreadLocal<? >>{
        /** The value associated with this ThreadLocal. */Object value; Entry(ThreadLocal<? > k, Object v) {super(k); value = v; }}/** * The initial capacity -- MUST be a power of two
    private static final int INITIAL_CAPACITY = 16;

    /** * The table, resized as necessary. * table.length MUST always be a power of two. * 
      
        array * must be a power N of 2 * this can be used to see why arrays maintained in HashMap must also be a power N of 2 * mainly to reduce collisions, The key code is hashcode & table.length - 1 */
      ,>
    private Entry[] table;

    ** * The number of entries in The table. * The number of entries in The table */
    private int size = 0;

    /** * The next size value at which to resize
    private int threshold; // Default to 0

    /** * Set the resize threshold to maintain at worst a 2/3 load factor. */
    private void setThreshold(int len) {
        threshold = len * 2 / 3;
    }

    /** * The entry array is actually a circular structure */ as you can see from the following two codes that get next and prev
    /** * Increment I modulo len. * Increment I modulo len
    private static int nextIndex(int i, int len) {
        return ((i + 1 < len) ? i + 1 : 0);
    }

    /** * Decrement I modulo len. * Returns the last index or an index of length -1 */ if -1 is negative
    private static int prevIndex(int i, int len) {
        return ((i - 1> =0)? i -1 : len - 1);
    }

    /** * construct argument creates a ThreadLocalMap code * ThreadLocal is key, our generic is value */ThreadLocalMap(ThreadLocal<? > firstKey, Object firstValue) {// Initialize the table size to 16
        table = new Entry[INITIAL_CAPACITY];

        // Use the bit operation hashCode & (length -1) to determine the position of the key-value pair
        int i = firstKey.threadLocalHashCode & (INITIAL_CAPACITY - 1);

        // Create a new node and store it in the table
        table[i] = new Entry(firstKey, firstValue);

        // Set the table element to 1
        size = 1;

        // Set the capacity expansion threshold
        setThreshold(INITIAL_CAPACITY);
    }

    [InheritableThreadLocal] [InheritableThreadLocal] [InheritableThreadLocal] [InheritableThreadLocal] [InheritableThreadLocal] [InheritableThreadLocal] [InheritableThreadLocal] [InheritableThreadLocal] [InheritableThreadLocal] [InheritableThreadLocal] [InheritableThreadLocal@param parentMap the map associated with parent thread.
         */
    private ThreadLocalMap(ThreadLocalMap parentMap) {
        Entry[] parentTable = parentMap.table;
        int len = parentTable.length;
        setThreshold(len);
        table = new Entry[len];

        for (int j = 0; j < len; j++) {
            Entry e = parentTable[j];
            if(e ! =null) {
                @SuppressWarnings("unchecked")
                ThreadLocal<Object> key = (ThreadLocal<Object>) e.get();
                if(key ! =null) {
                    Object value = key.childValue(e.value);
                    Entry c = new Entry(key, value);
                    int h = key.threadLocalHashCode & (len - 1);
                    while(table[h] ! =null)
                        h = nextIndex(h, len);
                    table[h] = c;
                    size++;
                }
            }
        }
    }
}
Copy the code

Some simple things directly read my above notes can be.

We can see that in the inner class of ThreadLocalMap, we define an Entry inner class that inherits from weak references. The generic is ThreadLocal, and there is a constructor in it. From this we can roughly guess, The key in a ThreadLocalMap is actually the current ThreadLocal object.

Why weak references? ThreadLocal is essentially a thread-local variable isolation utility class. When a thread runs out, it must want to reclaim the resources generated by this part of the ThreadLocal, so it uses weak references.

I’m sure some people are wondering, what if it gets recycled when I need it? The following code will show you step by step that the problems you have considered have already been figured out and solved. Continue to learn!

GetEntry and getEntryAfterMiss methods

Using the method name, we can see that the Entry node is fetched from the ThreadLocalMap corresponding to ThreadLocal, which is where we need to think about it.

1) What should we obtain the corresponding Entry through

2) We know from above that weak references are used, what if the GC does not collect them?

3) What if it is not on the calculated subscript?

4) What if ThreadLocalMap does not exist?

These are some of the questions THAT COME to mind when I look at the source code for THreadLocalMap. Some of the answers are not obvious when I look at the source code for THreadLocalMap.

Here we have a look at Daniel’s source code is how to solve the above problems.

/** * get the index position of ThreadLocal, get the content by subscript index */
private Entry getEntry(ThreadLocal
        key) {
    // Use hashcode to determine the subscript
    int i = key.threadLocalHashCode & (table.length - 1);
    Entry e = table[i];

    // If found, return directly
    if(e ! =null && e.get() == key)
        return e;
    else
        // If you can't find it, start at position I and traverse backwards. Based on the linear detection method, it is possible to find it after I
        return getEntryAfterMiss(key, i, e);
}

private Entry getEntryAfterMiss(ThreadLocal<? > key,int i, Entry e) {
    Entry[] tab = table;
    int len = tab.length;

    // Loop over backwards
    while(e ! =null) {

        // Get k of the nodeThreadLocal<? > k = e.get();// Return if the value is equal
        if (k == key)
            return e;

        // If null, a sequential segment clearing is triggered
        if (k == null)
            expungeStaleEntry(i);

        // get the next index and judge
        else
            i = nextIndex(i, len);
        e = tab[i];
    }
    return null;
}
Copy the code

By looking at the two method names, we know that these are the methods that get the Entry node.

GetEntry (ThreadLocal
key) and getEntryAfterMiss (ThreadLocal
key, int I, Entry e); > key, int I, Entry e

Select * from key hashcode and table. length-1

2, if the direct return is obtained, if not, then back traversal to see if it can be obtained (because the use of linear detection method, backward traversal is likely to obtain the results)

3. Enter the getEntryAfterMiss method for linear detection, and return directly if it is obtained; If the key obtained is null, a sequential cleanup is triggered. (In fact, this method is triggered in many methods, often, and is the core cleanup method of ThreadLocal.)

ExpungeStaleEntry method

This is a very core cleanup of ThreadLocal, so why is it needed? Maybe a lot of people don’t understand, whether we use List or Map, there is no need to clean up the contents.

However, this is a local variable that is isolated to the thread and uses a weak reference, which can be reclaimed during GC.

1) If there are many entries that have been reclaimed but still have space in the table array, it will be a waste of resources not to clean up

2) When the nodes are cleared, the subscripts of subsequent non-empty Entry nodes can be recalculated for emission. In this way, resources can be quickly located and efficiency can be accelerated during GET.

Let’s see how others source code is done!

/** * this function can be used as a central cleaning function in ThreadLocal. It does a lot of work by starting from staleSlot and setting null to the Entry node where ThreadLocal is collected. The size is reduced by 1 * 2, and non-NULL entries are rehashed to the next null location * * so it is actually a clean and rehash of the contiguous segment starting from staleSlot */
private int expungeStaleEntry(int staleSlot) {
    // The new reference points to table
    Entry[] tab = table;

    // Get the length
    int len = tab.length;

    // expunge entry at staleSlot
    // set the subscript passed to null
    tab[staleSlot].value = null;
    tab[staleSlot] = null;

    / / the size of table 1
    size--;

    // Rehash until we encounter null
    Entry e;
    int i;
    // Delete all subsequent nodes of the specified node from which ThreadLocal is reclaimed
    for(i = nextIndex(staleSlot, len); (e = tab[i]) ! =null;
         i = nextIndex(i, len)) {
        // Get the key of the entryThreadLocal<? > k = e.get();// If ThreadLocal is null, set value and array subscripts to null for GC
        / / and the size 1
        if (k == null) {
            e.value = null;
            tab[i] = null;
            size--;
        } else {    // If not null
            // recalculate the subscript of key
            int h = k.threadLocalHashCode & (len - 1);

            // If it is the current position, iterate over the next one
            // if it is not the current position, start from I and find the next coordinate that is null
            if(h ! = i) { tab[i] =null;

                // Unlike Knuth 6.4 Algorithm R, we must scan until
                // null because multiple entries could have been stale.
                while(tab[h] ! =null) h = nextIndex(h, len); tab[h] = e; }}}return i;
}
Copy the code

I’m sure the code comments above make it pretty clear that this method actually does a sequential cleanup and rehash from staleSlot.

Set method series

Then we look at the set method, which naturally stores our variables into a ThreadLocal, essentially a ThreadLocalMap, and we have a couple of questions to consider here.

1) What if ThreadLocalMap does not already exist?

2) What if the calculated subscript already has an Entry node in the table?

I want to explain the above part of the code, to these two problems, we also have more ideas.

As usual, let’s look at the code implementation

/** * the set method of ThreadLocalMap is used to resolve conflicts */
private void set(ThreadLocal
        key, Object value) {

    // create a new reference to table
    Entry[] tab = table;

    // Get the table length
    int len = tab.length;

    // Get the subscript of ThreadLocal in the table
    int i = key.threadLocalHashCode & (len-1);

    /** * loops through * 1 from the subscript. If the same key exists, replace value * 2 directly. If the key has been reclaimed, replace the invalid key */
    for(Entry e = tab[i]; e ! =null; e = tab[i = nextIndex(i, len)]) { ThreadLocal<? > k = e.get();if (k == key) {
            e.value = value;
            return;
        }

        // If k is null, the current invalid Entry node where K resides is replaced
        if (k == null) {
            replaceStaleEntry(key, value, i);
            return; }}// Find the empty location, create an Entry object and insert it
    tab[i] = new Entry(key, value);

    // The size of the element in the table increases
    int sz = ++size;
    if(! cleanSomeSlots(i, sz) && sz >= threshold) rehash(); }private void replaceStaleEntry(ThreadLocal<? > key, Object value,int staleSlot) {
    // create a new reference to table
    Entry[] tab = table;

    // Get the table length
    int len = tab.length;
    Entry e;

    // Record the current failed node subscript
    int slotToExpunge = staleSlot;

    /** * From the prevIndex(staleSlot, len) of the for loop, we can see that * this is scanned forward from the staleSlot subscript */
    for (inti = prevIndex(staleSlot, len); (e = tab[i]) ! =null;
         i = prevIndex(i, len))
        if (e.get() == null)
            slotToExpunge = i;

    /** * For nextIndex(staleSlot, len
    for (inti = nextIndex(staleSlot, len); (e = tab[i]) ! =null;
         i = nextIndex(i, len)) {

        // Get the ThreadLocal object corresponding to the Entry nodeThreadLocal<? > k = e.get();/** * If the new key is assigned to value *, the I and staleSlot subscripts */ are replaced directly
        if (k == key) {
            e.value = value;

            tab[i] = tab[staleSlot];
            tab[staleSlot] = e;

            // Start expunge at preceding stale entry if it exists
            // There is no case where value is null before I
            if (slotToExpunge == staleSlot)
                slotToExpunge = i;

            /** * The expungeStaleEntry method is called before cleanSomeSlots is called for a heuristic cleanup from slotToExpunge to the null subscript of table * returns table[] as null subscript We then do a heuristic cleanup with that subscript --len. The method actually calls expungeStaleEntry again
            cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
            return;
        }

        /** * If the current subscript is invalid and no invalid Entry node * is found during the backward scan, slotToExpunge is assigned to the current position */
        if (k == null && slotToExpunge == staleSlot)
            slotToExpunge = i;
    }

    // If key not found, put new entry in stale slot
    // If the key is not found in the table, a new Entry is created in the current position
    tab[staleSlot].value = null;
    tab[staleSlot] = new Entry(key, value);

    /** * During the for loop probe above * if any invalid Entry is found, slotToExpunge is reassigned * triggering sequential and heuristic cleanup */
    if(slotToExpunge ! = staleSlot) cleanSomeSlots(expungeStaleEntry(slotToExpunge), len); }/** * This method is called in two places ** before determining whether resize is required. It will clean up and rehash * 2. It will also clean up */ when replacing failed nodes
private boolean cleanSomeSlots(int i, int n) {
    boolean removed = false;
    Entry[] tab = table;
    int len = tab.length;
    do {
        i = nextIndex(i, len);
        Entry e = tab[i];
        // The Entry object is not empty, but the ThreadLocal key is null
        if(e ! =null && e.get() == null) {
            n = len;
            removed = true;

            /** * This method does not actually recycle only I, but does a cleanup and rehash of all nodes in the null subscript from I to tablei = expungeStaleEntry(i); }}while ( (n >>>= 1) != 0);
    return removed;
}


/** * expand the size of the table by 2 */
private void resize(a) {

    // Get the length of the old table and create an Entry array twice the length of the old table
    Entry[] oldTab = table;
    int oldLen = oldTab.length;
    int newLen = oldLen * 2;
    Entry[] newTab = new Entry[newLen];

    // Records the number of valid Entry nodes inserted
    int count = 0;

    /** * insert into a new table * 1, if the key is null, set the value to null, so that GC can retrieve * 2, calculate the subscript with hashcode & len-1, if there is already an Entry array, Insert */ backwards through linear probe
    for (int j = 0; j < oldLen; ++j) {
        Entry e = oldTab[j];
        if(e ! =null) { ThreadLocal<? > k = e.get();if (k == null) {
                e.value = null; // Help the GC
            } else {
                int h = k.threadLocalHashCode & (newLen - 1);
                while(newTab[h] ! =null) h = nextIndex(h, newLen); newTab[h] = e; count++; }}}// Reset the capacity expansion threshold
    setThreshold(newLen);

    / / update the size
    size = count;

    // Points to a new Entry array
    table = newTab;
}
Copy the code

The above code is a set of code that calls ThreadLocalMap to store the k-V relationship. I won’t go through it one by one, so it looks more convenient and continuous.

We can take a look at the entire set process:

1. Use hashCode & (len-1) to locate the subscript of ThreadLocal in the table

2. The for loop iterates backwards

1) If the key of the Entry node is equal to the ThreadLocal we need to operate on, we replace value directly

2) If a key is null, the replaceStaleEntry method is called to replace it.

3. If both cases are true, insert a new Entry in the calculated subscript.

4. Perform a heuristic cleanup and if the size of the node after insertion is greater than the capacity expansion threshold, call the resize method for capacity expansion.

The remove method

So if we’re storing it in Map form, and we have put, we’re going to have remove, and any data structure, we’re going to have to be able to add, delete, change, and look.

Let’s go straight to the code.

/** * Remove the entry for key. */ Remove the entry from the table
private void remove(ThreadLocal
        key) {
    Entry[] tab = table;
    int len = tab.length;
    int i = key.threadLocalHashCode & (len-1);
    for(Entry e = tab[i]; e ! =null;
         e = tab[i = nextIndex(i, len)]) {
        if (e.get() == key) {
            // Set the reference to null to facilitate GC
            e.clear();

            // Perform a continuous segment cleanup from this position
            expungeStaleEntry(i);
            return; }}}Copy the code

As we can see, linear detection is also used for remove node. When the corresponding key is found, clear will be called to point the reference to NULL, and a continuous segment cleaning will be triggered.

I believe that the above source code analysis of ThreadLocalMap has given you a basic understanding of the concept of ThreadLocal, and I believe that the concept of ThreadLocal is no longer just for implementing thread-local variables.

Let’s take a look at the source code for ThreadLocal.

ThreadLocal source code analysis

The source code for ThreadLocal is much simpler, since it’s the ThreadLocalMap inner class that does the work, managing our local variables.

Get method series

/** * gets the value of the current thread local variable */
public T get(a) {
    // Get the current thread
    Thread t = Thread.currentThread();

    // Get the ThreadLocalMap of the current thread
    ThreadLocalMap map = getMap(t);

    // If map is not empty
    if(map ! =null) {

        // If the Entry of the current ThreadLocal object still exists
        ThreadLocalMap.Entry e = map.getEntry(this);

        // The Entry value is not null and the corresponding value is returned. Otherwise, the setInitialValue method is executed
        if(e ! =null) {
            @SuppressWarnings("unchecked")
            T result = (T)e.value;
            returnresult; }}// If ThreadLocalMap does not already exist, the initialization method is performed
    return setInitialValue();
}


ThreadLocalMap getMap(Thread t) {
    return t.threadLocals;
}


private T setInitialValue(a) {
    // Get the initial value, usually subclass override
    T value = initialValue();

    // Get the current thread
    Thread t = Thread.currentThread();

    // Get the ThreadLocalMap of the current thread
    ThreadLocalMap map = getMap(t);

    // If map is not null
    if(map ! =null)

        // Call ThreadLocalMap's set method for assignment
        map.set(this, value);

    // Otherwise create a ThreadLocalMap for assignment
    else
        createMap(t, value);
    return value;
}


/** * construct argument creates a ThreadLocalMap code * ThreadLocal is key, our generic is value */ThreadLocalMap(ThreadLocal<? > firstKey, Object firstValue) {// Initialize the table size to 16
    table = new Entry[INITIAL_CAPACITY];

    // Use the bit operation hashCode & (length -1) to determine the position of the key-value pair
    int i = firstKey.threadLocalHashCode & (INITIAL_CAPACITY - 1);

    // Create a new node and store it in the table
    table[i] = new Entry(firstKey, firstValue);

    // Set the table element to 1
    size = 1;

    // Set the capacity expansion threshold
    setThreshold(INITIAL_CAPACITY);
}
Copy the code

ThreadLocal’s get method isn’t hard, either, with just a few lines of code, but when it’s combined with ThreadLocalMap’s method, the whole logic is worth delving into the mind of the person who wrote the tool.

Let’s take a look at one of its processes.

1. Get the current thread and get the corresponding ThreadLocalMap based on the current thread

2. Get the Entry node of the ThreadLocalMap and return the corresponding value

3. If ThreadLocalMap is null, the setInitialValue method is called

1) When the setInitialValue method is called, the ThreadLocalMap is double guaranteed to be fetched again

2) If it is still null, the constructor of ThreadLocalMap is finally called

Set method series

I’m not going to go into too much detail about the set methods of ThreadLocalMap, but with the set methods of ThreadLocalMap above, I think I can get a general answer to the question posed by each of the above methods.

public void set(T value) {
    // Get the current thread
    Thread t = Thread.currentThread();

    // Get the ThreadLocalMap for each thread
    ThreadLocalMap map = getMap(t);

    // If map is not empty, then k-v is assigned to this, which is the current ThreaLocal object
    if(map ! =null)
        map.set(this, value);

    // If the map obtained is empty, create one and save the k-v relationship
    else
        createMap(t, value);
}
Copy the code

The set method of ThreadLocal is very simple, the most important is to call ThreadLocalMap’s set method, which is the core of the execution process.

But let’s look at the process:

1. Get the current thread and get the corresponding ThreadLocalMap based on the current thread

2. If the corresponding ThreadLocalMap is not null, call its set method to save the mapping

3. If the map is null, the constructor of ThreadLocalMap is finally called to create a ThreadLocalMap and save the corresponding relationship

Summary of Execution process

Source code analysis summary

ThreadLocal and ThreadLocalMap (ThreadLocalMap) ¶ I really admire Josh Bloch and Doug Lea for not just implementing this thing, but thinking about GC issues, how to maintain data storage, and how to establish correspondence between threads and local variables.

They write code logic is very rigorous, they saw that a few hundred lines of code, really to find that we actually are not mainly on the technology gap with others, but a whole set of thinking logic in function to realize the above and they have a huge gap, the most obvious is that we are simple to implement and realize, basically won’t consider other anomalies, Some GC issues are not considered.

So through the analysis of the source code, let me really realize that we can not just look at the source code to do translation, we must learn how they think to achieve such a function, we have to learn how they think about the logic of each function.