GitHub 2.1 K Star Java engineers become god’s way, not to know?

GitHub 2.1K Star Java engineers become god’s way, really not to know?

GitHub 2.1K Star Java engineers become god’s way, really sure not to check out?

I have introduced the Fail-fast mechanism in Java in my article “Why Alibaba prohibits element remove/ Add operation in foreach loop”, but I did not introduce it in depth. In this article, I will introduce fail-fast in depth.

What is the fail – fast

First of all, let’s take a look at wikipedia’s explanation of Fail-fast:

In systems design, a fail-fast system is one which immediately reports at its interface any condition that is likely to indicate a failure. Fail-fast systems are usually designed to stop normal operation rather than attempt to continue a possibly flawed process. Such designs often check the system’s state at several points in an operation, so any failures can be detected early. The responsibility of a fail-fast module is detecting errors, then letting the next-highest level of the system handle them.

In system design, a system that immediately reports anything that might indicate a failure. Rapid failure systems are usually designed to stop normal operation rather than try to continue a process where there may be a defect. This design typically checks the state of the system at multiple points during operation, so any failures can be detected early. The responsibility of the fast failure module is to detect errors and then let the next highest level of the system handle the errors.

In fact, this is a concept, to put it bluntly, in the design of the system to consider the anomaly, once the anomaly occurs, directly stop and report.

Take the simplest example of fail-fast:

public int divide(int divisor,int dividend){
    if(dividend == 0){
        throw new RuntimeException("dividend can't be null");
    }
    return divisor/dividend;
}
Copy the code

The above code is a method that divides two integers. In Divide, we do a simple check on the divider. If it is 0, we throw an exception and tell us why. This is a practical application of the fail-Fast concept.

The advantage of this is that some error cases can be identified in advance, on the one hand, to avoid the execution of complex other code, on the other hand, the exception can be identified to do some specific processing.

Now that you know fail-fast, it’s not a mystery, and you probably use it in your code all the time.

Since Fail-fast is a good mechanism, why does the article title say fail-fast has pits?

The reason is that Java’s collection class uses the Fail-fast mechanism to design. Once it is improperly used, the code designed by the Fail-fast mechanism will be triggered and unexpected situations will occur.

Fail-fast in the collection class

By default, the fail-fast mechanism in Java refers to an error detection mechanism in Java collections. When multiple threads to the operation of the part of the change in the structure of the collection, is likely to fail – fast mechanism, this time will throw ConcurrentModificationException (later replaced with CME).

CMException, thrown when a method detects concurrent changes to an object but does not allow such changes.

A lot of times cmExceptions are thrown in code, and many programmers are confused as to why they throw concurrency-related exceptions when their code is not executed in a multithreaded environment. At what point does this situation get thrown? Let’s take a closer look.

Abnormal emersion

In Java, a fail-fast mechanism is triggered when a foreach loop removes/adds elements of a collection, throwing a CMException.

Such as the following code:

List<String> userNames = new ArrayList<String>() {{
    add("Hollis");
    add("hollis");
    add("HollisChuang");
    add("H");
}};

for (String userName : userNames) {
    if (userName.equals("Hollis")) {
        userNames.remove(userName);
    }
}

System.out.println(userNames);
Copy the code

The code above iterates through the element using an enhanced for loop and attempts to remove the Hollis string element. Running the above code throws the following exception:

Exception in thread "main" java.util.ConcurrentModificationException
at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909)
at java.util.ArrayList$Itr.next(ArrayList.java:859)
at com.hollis.ForEach.main(ForEach.java:22)
Copy the code

Similarly, the reader can try adding elements in an enhanced for loop using the add method, and the result will be the same.

Before we dive into the principle, let’s try to decode foreach and see how it works.

We use jad to decompile the compiled class and get the following code:

Public static void main(String[] args) {// Use ImmutableList to initialize a List of names. List<String> userNames = new ArrayList<String>() {{  add("Hollis"); add("hollis"); add("HollisChuang"); add("H"); }}; Iterator iterator = userNames.iterator(); do { if(! iterator.hasNext()) break; String userName = (String)iterator.next(); if(userName.equals("Hollis")) userNames.remove(userName); } while(true); System.out.println(userNames); }Copy the code

As you can see, foreach relies on the while loop and Iterator implementation.

Exception principle

Through the exception stack of the above code, we can trace the code that actually threw the exception:

java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909)
Copy the code

This method is called in iterator.next(). Let’s look at the implementation of this method:

final void checkForComodification() { if (modCount ! = expectedModCount) throw new ConcurrentModificationException(); }Copy the code

As above, modCount and expectedModCount are compared in this method and a CMException is thrown if they do not want to wait.

So what are modCount and expectedModCount? What’s the reason why their values don’t want to wait?

ModCount is a member variable in the ArrayList. It represents the number of times the collection has actually been modified.

List<String> userNames = new ArrayList<String>() {{
    add("Hollis");
    add("hollis");
    add("HollisChuang");
    add("H");
}};
Copy the code

This variable is available when the collection is initialized with the above code. The initial value is 0.

ExpectedModCount is a member variable in Itr, an inner class in the ArrayList.

Iterator iterator = userNames.iterator();
Copy the code

This code gives you an Itr class that implements the Iterator interface.

ExpectedModCount indicates how many times the iterator expects the set to be modified. Its value is initialized as the Itr is created. The value will change only if the collection is operated on by an iterator.

So, let’s look at usernames.remove (userName); What is happening in the method and why is causing the expectedModCount and modCount values to be different?

Through the code, we can also find that the core logic of remove method is as follows:

private void fastRemove(int index) {
    modCount++;
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work
}
Copy the code

As you can see, it only modifies the modCount and does nothing to the expectedModCount.

Draw a simple picture to describe the above scenario:

To recap, the CMException is thrown because our code uses an enhanced for loop, in which collection traversal is done through iterator, but element add/remove is directly used by the collection class’s own methods. This causes the iterator to iterate over an element that was deleted/added without realizing it, and to throw an exception indicating that concurrent changes may have occurred!

So, if a CMException occurs in Java’s collection classes, the fail-fast case is given priority. In fact, there is no concurrency, but the Iterator uses fail-fast’s protection mechanism. Then an exception will be thrown.

How to solve this problem is described in “Why Alibaba forbids the remove/add operation of elements in foreach loop”, which will not be described here.

fail-safe

To avoid triggering fail-fast and causing exceptions, we can use some of the fail-Safe collection classes provided in Java.

The traversal of such a collection container does not directly access the contents of the collection, but copies the contents of the original collection and traverses the copied collection.

The containers under the java.util.concurrent package are fail-safe and can be used and modified concurrently in multiple threads. You can also add/remove in foreach.

Let’s take a quick look at the fail-Safe collection class CopyOnWriteArrayList.

public static void main(String[] args) {
    List<String> userNames = new CopyOnWriteArrayList<String>() {{
        add("Hollis");
        add("hollis");
        add("HollisChuang");
        add("H");
    }};

    userNames.iterator();

    for (String userName : userNames) {
        if (userName.equals("Hollis")) {
            userNames.remove(userName);
        }
    }

    System.out.println(userNames);
}
Copy the code

The above code, using CopyOnWriteArrayList instead of ArrayList, does not raise an exception.

All changes to the fail-safe collection are made to a copy of the collection and then on the replica collection, not directly to the original collection. And these modification methods, such as add/remove, control concurrency by locking.

Therefore, iterators in CopyOnWriteArrayList do not need to do fail-Fast concurrency detection during iteration. (Because the main purpose of Fail-Fast is to identify concurrency and then notify users of exceptions.)

But, although based on the copy content has the advantage of avoiding the ConcurrentModificationException, but by the same token, the iterator and cannot access to the content of the modified. Such as the following code:

public static void main(String[] args) { List<String> userNames = new CopyOnWriteArrayList<String>() {{ add("Hollis"); add("hollis"); add("HollisChuang"); add("H"); }}; Iterator it = userNames.iterator(); for (String userName : userNames) { if (userName.equals("Hollis")) { userNames.remove(userName); } } System.out.println(userNames); while(it.hasNext()){ System.out.println(it.next()); }}Copy the code

After we get the Iterator from CopyOnWriteArrayList, we go through the for loop to delete the values from the original array. Finally, we print the Iterator at the end.

[hollis, HollisChuang, H]
Hollis
hollis
HollisChuang
H
Copy the code

Iterators iterate over a copy of the collection at the beginning of the iteration. The iterators are not aware of any changes to the collection that occurred during the iteration.

Copy-On-Write

After reading CopyOnWriteArrayList, I don’t know if you will have such a question: his add/remove methods have locked, but also need a copy and then modify why? Superfluous? What’s the difference between a thread-safe collection and a Vector?

Copy-on-write, for short, is an optimization strategy used in program design. The basic idea is that everyone is sharing the same content from the beginning, and when someone wants to change the content, they will actually Copy the content to form a new content and then change it, which is a kind of delayed lazy strategy.

A CopyOnWrite container is a container for copying while writing. The common understanding is that when we add elements to a container, we do not directly add to the current container, but first Copy the current container, Copy the new container, then add elements to the new container, after adding elements, the original container reference to the new container.

Add /remove and other write methods in CopyOnWriteArrayList need to be locked to avoid copying N copies, resulting in concurrent write.

However, the read methods in CopyOnWriteArrayList are unlocked.

public E get(int index) {
    return get(getArray(), index);
}
Copy the code

The advantage of this is that we can do concurrent reads to the CopyOnWrite container, although the data we read may not be up to date. Because the idea of copy-on-write is to achieve the final consistency of data through a delayed update strategy, it is not strong consistency.

** So CopyOnWrite container is an idea of read-write separation, reading and writing different containers. Vector uses the same container for reading and writing, reads and writes are mutually exclusive and can only do one thing at a time.