Visibility, Atomicity, and Ordering

The core contradiction among the CPU, memory, and I/O devices is the speed difference between the three. The solutions:

  1. The CPU adds caches to bridge the speed gap with memory;
  2. The operating system adds processes and threads to time-share the CPU, balancing the speed gap between the CPU and I/O devices;
  3. The compiler reorders instruction execution so that caches can be used more efficiently.

Visibility issues caused by caching

Visibility means that when one thread changes a shared variable, other threads can immediately see the change.

(Figures: a single CPU cache on a single-core machine vs. per-core caches on a multi-core machine.)

Atomicity issues with thread switching

The operating system allows a process to execute for a short period, say 50 milliseconds, after which it selects another process to run (we call this a "task switch"). This 50-millisecond period is called a "time slice".

A single statement in a high-level language usually maps to several CPU instructions. For example, count += 1 requires at least three:

  • Instruction 1: First, we need to load the variable count from memory into the CPU register;
  • Instruction 2: After that, the +1 operation is performed in the register;
  • Instruction 3: Finally, write the result to memory (the caching mechanism makes it possible to write to the CPU cache instead of memory).

We call the ability of one or more operations to be executed by the CPU without interruption atomicity.
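To make the problem concrete, here is a minimal runnable sketch (the class name AtomicityDemo and the counts are illustrative, not from the original text): two threads each increment a shared counter 10,000 times. The unsynchronized version can lose updates because the three-instruction read-modify-write sequence of count += 1 interleaves; the synchronized version always reaches the expected total.

```java
// Illustrative sketch: why "count += 1" is not atomic.
class AtomicityDemo {
    static long unsafeCount = 0;
    static long safeCount = 0;

    // The whole read-modify-write runs under one lock, so it is atomic.
    static synchronized void safeAdd() { safeCount += 1; }

    static long runSafe() throws InterruptedException {
        safeCount = 0;
        Runnable task = () -> { for (int i = 0; i < 10_000; i++) safeAdd(); };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start(); t1.join(); t2.join();
        return safeCount;   // always 20,000
    }

    static long runUnsafe() throws InterruptedException {
        unsafeCount = 0;
        // Unsynchronized: the load/add/store of each increment can interleave
        // with the other thread's, so some increments are lost.
        Runnable task = () -> { for (int i = 0; i < 10_000; i++) unsafeCount += 1; };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start(); t1.join(); t2.join();
        return unsafeCount; // often less than 20,000
    }
}
```

The unsafe total is never more than 20,000 (lost updates can only make it smaller), which is exactly the symptom of a torn read-modify-write.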

Order problems with compiler optimization

Double check to create a singleton object.

public class Singleton {
  static Singleton instance;
  static Singleton getInstance() {
    if (instance == null) {
      synchronized (Singleton.class) {
        if (instance == null) {
          instance = new Singleton();
        }
      }
    }
    return instance;
  }
}

If two threads, A and B, call getInstance() at the same time, they both find instance == null and try to lock Singleton.class. The JVM guarantees that only one of them acquires the lock; the other (say thread B) waits. Thread A creates the Singleton instance and releases the lock. Thread B then wakes up, acquires the lock, and does not create another instance, because the inner null check finds that one has already been created.

This all looks perfect and impeccable, but the getInstance() method isn't. What's the problem? We assume the new operation proceeds as follows:

  1. Allocate a block of memory M;
  2. Initialize the Singleton object on memory M;
  3. M’s address is then assigned to the instance variable.

But the actual optimized execution path looks like this:

  1. Allocate a block of memory M;
  2. Assign M’s address to the instance variable;
  3. Finally, the Singleton object is initialized on memory M.

What problem does this optimization cause? Suppose thread A executes getInstance() first, and a thread switch happens right after instruction 2 finishes, switching to thread B. If thread B also executes getInstance(), it finds instance != null and returns instance directly, even though the instance has not been initialized yet. If we then access a member variable of instance, we may trigger a null-pointer exception.
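A commonly used remedy (not shown in the text above, but standard since Java 1.5) is to declare the instance field volatile, which forbids reordering steps 2 and 3 of the new operation, so no thread can ever observe a non-null but uninitialized instance. A minimal sketch:

```java
// Double-checked locking fixed with volatile (standard Java 1.5+ idiom).
class Singleton {
    // volatile forbids reordering "assign address" before "initialize",
    // so readers never see a half-constructed object.
    private static volatile Singleton instance;

    static Singleton getInstance() {
        if (instance == null) {
            synchronized (Singleton.class) {
                if (instance == null) {
                    instance = new Singleton();
                }
            }
        }
        return instance;
    }
}
```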

Java memory model

What is the Java Memory model?

The Java Memory Model specifies how the JVM provides ways to disable caching and compiler optimizations on demand. Concretely, these include the keywords volatile, synchronized, and final, as well as six happens-before rules.

The confusion of using volatile

For example, when we declare a volatile variable, volatile int x = 0, we are telling the compiler that reads and writes of this variable must go to memory, not to the CPU cache.

For example, suppose thread A executes writer(); by volatile semantics, the variable "v = true" is written to memory. Suppose thread B executes reader(); again by volatile semantics, thread B reads the variable v from memory. If thread B sees "v == true", what value of x does thread B see?

Intuitively, it should be 42, but what is it actually? It depends on the Java version: on versions earlier than 1.5, x may be 42 or 0; on 1.5 and later, x is guaranteed to be 42.

class VolatileExample {
  int x = 0;
  volatile boolean v = false;
  public void writer() {
    x = 42;
    v = true;
  }
  public void reader() {
    if (v == true) {
      // what is x here?
    }
  }
}

Why could x = 0 occur in versions prior to 1.5? As you may have guessed, the variable x can be cached by the CPU, causing a visibility problem. This was satisfactorily addressed in version 1.5, where the Java memory model strengthened the semantics of volatile. How? The answer is the happens-before rules.

Happens-before rules

The result of one operation is visible to subsequent operations. Happens-before constrains compiler optimization: optimization is allowed, but it must obey the happens-before rules.

1. Program order rule

Within a single thread, each action happens-before any subsequent action in program order. In the preceding example, "x = 42;" happens-before "v = true;" because it comes first in program order.

2. Volatile variable rules

A write to a volatile variable happens-before any subsequent read of that same volatile variable.

In other words, a write to a volatile variable is visible to subsequent reads of it; this is what "disabling the cache" means. Taken on its own this rule seems unremarkable, but combined with rule 3 it becomes much more interesting.

3. Transitivity

This rule means that if A happens-before B, and B happens-before C, then A happens-before C. From the figure, we can see:

  1. "x = 42" happens-before the write of the variable "v = true"; this is the content of rule 1;
  2. The write of the variable "v = true" happens-before the read of the variable "v = true"; this is the content of rule 2.

By the transitivity rule, we get: "x = 42" happens-before the read of the variable "v = true". What does that mean?

If thread B reads "v == true", then the "x = 42" written by thread A is visible to thread B; that is, thread B is guaranteed to see x == 42. Does that feel like a revelation? This strengthening of volatile semantics in 1.5 is what the concurrency toolkit (java.util.concurrent) relies on for visibility, as we'll discuss in more detail later.
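The transitivity chain above can be exercised in a small runnable sketch (the class name TransitivityDemo and the spin loop are illustrative, not from the text): the reader spins until the volatile write becomes visible, at which point the happens-before chain guarantees it also sees x == 42.

```java
// x = 42  happens-before  v = true (rule 1, program order)
// v = true happens-before the read that sees true (rule 2, volatile)
// therefore x = 42 happens-before that read (rule 3, transitivity)
class TransitivityDemo {
    int x = 0;
    volatile boolean v = false;

    static int runDemo() throws InterruptedException {
        TransitivityDemo d = new TransitivityDemo();
        Thread writer = new Thread(() -> {
            d.x = 42;     // plain write
            d.v = true;   // volatile write publishes it
        });
        writer.start();
        while (!d.v) {
            // spin until the volatile write is visible
        }
        writer.join();
        return d.x;   // guaranteed 42 on Java 1.5+
    }
}
```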

4. Monitor lock rule

Unlocking a lock happens-before a subsequent locking of that same lock.

To understand this rule, you first need to know what a monitor is. A monitor is a general synchronization primitive; in Java, synchronized is the language's implementation of the monitor.

Locking a monitor is implicit in Java. For example, in the following code, the lock is automatically acquired before entering the synchronized block and automatically released when the block finishes.

synchronized (this) { // automatically locked here
  if (this.x < 12) {
    this.x = 12;
  }
} // automatically unlocked here

Combined with rule 4, the monitor lock rule, this can be understood as follows: suppose the initial value of x is 10. After thread A finishes the block (automatically releasing the lock), x is 12. When thread B enters the block (automatically acquiring the lock), it sees thread A's write, i.e. x == 12. This is intuitive and should be easy to understand.
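A small sketch of the rule (class name MonitorRuleDemo and method names are illustrative, not from the text). For determinism the threads are sequenced with join(); the point is that once thread B acquires the same monitor that thread A released, A's write to x under that monitor is visible to B.

```java
class MonitorRuleDemo {
    int x = 10;
    // A's unlock at the end of update() happens-before B's lock in read().
    synchronized void update() { if (x < 12) x = 12; }
    synchronized int read()    { return x; }

    static int runDemo() throws InterruptedException {
        MonitorRuleDemo d = new MonitorRuleDemo();
        Thread a = new Thread(d::update);
        a.start();
        a.join();   // sequence the threads so the demo is deterministic
        final int[] seen = new int[1];
        Thread b = new Thread(() -> seen[0] = d.read());
        b.start();
        b.join();
        return seen[0];   // thread B sees thread A's write: 12
    }
}
```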

5. Thread start() rule

This one is about starting threads. It means that after main thread A starts child thread B, child thread B can see everything the main thread did before starting it.

In other words, if thread A calls thread B’s start() method (that is, starting thread B in thread A), the start() operation is happens-before any operation in thread B.

// shared member variable
int var = 0;
Thread B = new Thread(() -> {
  // everything the main thread did before B.start()
  // is visible here; in this case, var == 77
});
// modify the shared variable, then start the child thread
var = 77;
B.start();

6. Thread join() rule

This one is about thread waiting. It means that main thread A waits for child thread B to complete (by calling B's join() method), and when B completes (A's call to join() returns), the main thread can see all of the child thread's operations. By "see", of course, we mean operations on shared variables.

In other words, if join() of thread B is called in thread A and returns successfully, then any operation in thread B happens-before the return of the join() operation.

Thread B = new Thread(() -> {
  // update the shared variable
  var = 66;
});
// start the child thread
B.start();
// wait for it to finish
B.join();
// after B.join() returns, all of the child thread's changes to
// shared variables are visible to the main thread: var == 66

final

Looking at it from the other direction: is there a way to tell the compiler that it can optimize more aggressively? That is what the final keyword is for.

Marking a variable final tells the compiler that it never changes, so it can be optimized freely. Prior to 1.5, the Java compiler indeed optimized so aggressively that it sometimes got the optimizations wrong.

Of course, since 1.5 the Java memory model has constrained the reordering of final fields. Now, as long as we write the constructor correctly and avoid "escape", we are fine.

"Escape" is a bit abstract, so let's take an example. In the code below, the constructor assigns this to the global variable global.obj. That is "escape", and we must avoid it.

final int x;
// a constructor with an error: this "escapes"
public FinalFieldExample() {
  x = 3;
  y = 4;
  // this escapes before construction completes
  global.obj = this;
}

The happens-before relation was first proposed in the paper "Time, Clocks, and the Ordering of Events in a Distributed System". In that paper, the semantics of happens-before are causality: in the real world, if A is the cause of B, then A must happen before B. That is the real-world reading of happens-before.

In the Java language, the semantics of happens-before are essentially visibility. A happens-before B means that the result of operation A is visible to operation B, regardless of whether A and B occur on the same thread. For example, if A occurs on thread 1 and B occurs on thread 2, the happens-before rules guarantee that A's result is seen on thread 2 as well.

Mutex

How to solve the atomic problem?

The source of the atomicity problem is thread switching. Wouldn’t it be possible to disable thread switching to solve this problem? The operating system relies on CPU interrupts to do thread switching, so disabling CPU interrupts can prevent thread switching.

This worked in the early single-core CPU era and was widely used, but it is not suitable for multi-core scenarios. To illustrate, consider writing a long variable on a 32-bit CPU. A long is 64 bits, so on a 32-bit CPU a write is split into two operations: writing the high 32 bits and writing the low 32 bits (as shown in the figure below).

In a single-core scenario, only one thread executes at a time. Disabling CPU interrupts means the operating system cannot reschedule, i.e. thread switching is forbidden, so the running thread keeps the CPU and executes continuously. The two write operations are therefore either both performed or neither is performed: atomicity holds.

In the multi-core scenario, however, two threads may execute simultaneously, one on CPU-1 and one on CPU-2. Disabling CPU interrupts still guarantees that each thread executes continuously on its CPU, but it no longer guarantees that only one thread executes at a time. If both threads write the high 32 bits of the same long variable at the same time, you can get the strange bug we mentioned at the beginning.

The condition "only one thread executes at a time" is so important that we give it a name: mutual exclusion. If we can ensure that modifications to shared variables are mutually exclusive, atomicity is guaranteed on both single-core and multi-core CPUs.
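As a sketch of that idea (the class name SafeLong is illustrative, not from the text): guarding a 64-bit field with one mutex means a writer can never expose a "half-written" value, because both 32-bit halves of the store happen inside the critical section. A reader always observes one of the values that was actually written.

```java
// Guarding a long with a single lock: no torn reads or writes.
class SafeLong {
    private long value = 0L;
    synchronized void set(long v) { value = v; }   // both halves under one lock
    synchronized long get()      { return value; }

    // Writer alternates 0L and -1L (all-zero vs. all-one bit patterns);
    // a torn write would surface as some other value.
    static boolean runDemo() throws InterruptedException {
        SafeLong s = new SafeLong();
        Thread w = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) s.set((i & 1) == 0 ? 0L : -1L);
        });
        w.start();
        boolean consistent = true;
        for (int i = 0; i < 100_000; i++) {
            long v = s.get();
            if (v != 0L && v != -1L) consistent = false;
        }
        w.join();
        return consistent;
    }
}
```

(On a 64-bit JVM plain long writes are atomic anyway; the lock makes the guarantee hold regardless of platform.)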

Simple lock model

We call a piece of code that needs to execute mutually exclusively a critical section.

Improved lock model

In the real world, there is a correspondence between locks and the resources that locks protect. For example, you use your lock to protect your stuff, and I use my lock to protect my stuff. In the world of concurrent programming, locks and resources should have the same relationship.

First, we mark out the resource R to be protected in the critical section; in the figure, a new element is added for it. Second, to protect R, we create a lock LR for it. Finally, we add lock and unlock operations on LR when entering and leaving the critical section. I have also deliberately drawn a line associating the lock LR with the protected resource R; this association is crucial. Many concurrency bugs occur because we ignore it and end up doing the equivalent of locking our own door to protect someone else's property, and such bugs are hard to diagnose because subconsciously we believe we have locked things correctly.

Java language lock technology: Synchronized

class X {
  // synchronized method
  synchronized void foo() {
    // critical section
  }
  // synchronized static method
  synchronized static void bar() {
    // critical section
  }
  // synchronized code block
  Object obj = new Object();
  void baz() {
    synchronized (obj) {
      // critical section
    }
  }
}

When synchronized modifies a static method, it locks the Class object of the current class, in this case X.class. When it modifies a non-static method, it locks the current instance, this.

For the example above, synchronized modifying a static method is equivalent to:

class X {
  // synchronized static method
  static void bar() {
    synchronized (X.class) {
      // critical section
    }
  }
}

And synchronized modifying a non-static method is equivalent to:

class X {
  // synchronized method
  void foo() {
    synchronized (this) {
      // critical section
    }
  }
}

Relationship between a lock and a protected resource

The relationship between a protected resource and its lock is very important. What should it be? A reasonable relationship is N:1: one lock may protect many resources, but each resource has exactly one lock. To use a ticketing analogy: one seat must be protected by exactly one ticket; if multiple tickets covered the same seat, there would be a fight over it. In the real world we can protect the same resource with multiple locks, but not in the concurrent world, where a program's lock does not map exactly to a real-world lock. Protecting multiple resources with the same lock, however, is fine; in the real world that's like booking out the whole venue.

If we change value to a static variable and addOne() to a static method, do get() and addOne() have concurrency problems?

class SafeCalc {
  static long value = 0L;
  synchronized long get() {
    return value;
  }
  synchronized static void addOne() {
    value += 1;
  }
}

Look closely and you'll see that this code protects one resource with two locks. The protected resource is the static variable value; the two locks are this and SafeCalc.class. We can visualize this relationship in the picture below. Because the critical sections get() and addOne() are guarded by different locks, they are not mutually exclusive, and addOne()'s changes to value are not guaranteed to be visible to get(), which is a concurrency problem.
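One straightforward fix (a sketch of my own, not from the text) is to make both methods static and synchronized, so that both critical sections are guarded by the same lock, SafeCalc.class:

```java
// One resource (the static value), one lock (SafeCalc.class).
class SafeCalc {
    static long value = 0L;
    synchronized static long get()    { return value; }
    synchronized static void addOne() { value += 1; }

    // Two threads each add 10,000; with a single shared lock the
    // result is exact and the final value is visible to get().
    static long runDemo() throws InterruptedException {
        value = 0L;
        Runnable task = () -> { for (int i = 0; i < 10_000; i++) addOne(); };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start(); t1.join(); t2.join();
        return get();
    }
}
```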

Protecting multiple unassociated resources

For example, banking supports withdrawals against the account balance (one resource) and changes to the account password (another resource). We can solve the concurrency problem by assigning a different lock to each. In the example code below, the Account class has two member variables: the account balance balance and the account password password. withdraw() and getBalance() access the balance, so we create a final object balLock as their lock (analogous to a football ticket); updatePassword() and getPassword() access the password, so we create a final object pwLock as their lock (analogous to a movie ticket). Different resources are protected by different locks.

class Account {
  // lock: protects the balance
  private final Object balLock = new Object();
  // account balance
  private Integer balance;
  // lock: protects the password
  private final Object pwLock = new Object();
  // account password
  private String password;

  // withdraw
  void withdraw(Integer amt) {
    synchronized (balLock) {
      if (this.balance > amt) {
        this.balance -= amt;
      }
    }
  }
  // check balance
  Integer getBalance() {
    synchronized (balLock) {
      return balance;
    }
  }
  // change password
  void updatePassword(String pw) {
    synchronized (pwLock) {
      this.password = pw;
    }
  }
  // view password
  String getPassword() {
    synchronized (pwLock) {
      return password;
    }
  }
}

Of course, we could also use a single mutex to protect multiple resources, for example using this to guard everything in the Account class: both the balance and the password. The implementation is trivial: just mark every method in the example synchronized.

The problem with a single lock, however, is poor performance: withdrawals, balance checks, password changes, and password checks all become serialized. With two locks, withdrawals and password changes can proceed in parallel. Using different locks for fine-grained management of protected resources improves performance; this is called fine-grained locking.

Protecting multiple associated resources

If multiple resources are related, the problem gets a little more complicated. Consider a bank transfer: account A decreases by 100 yuan and account B increases by 100 yuan. The two accounts are connected. How do we handle related operations like this? Let's put the problem into code. We declare an Account class with a member variable balance and a transfer() method. How do we make sure transfer() has no concurrency problems?

class Account {
  private int balance;
  // transfer
  void transfer(Account target, int amt) {
    if (this.balance > amt) {
      this.balance -= amt;
      target.balance += amt;
    }
  }
}

Instinct suggests a quick solution: just mark transfer() with the synchronized keyword, as shown below.

class Account {
  private int balance;
  // transfer
  synchronized void transfer(Account target, int amt) {
    if (this.balance > amt) {
      this.balance -= amt;
      target.balance += amt;
    }
  }
}

In this code, there are two resources in the critical section: the balance of the outgoing account, this.balance, and the balance of the incoming account, target.balance, but only one lock, this. Is that really enough? Unfortunately, this scheme only looks right. Why?

The problem lies in the lock this. It can protect your own balance, this.balance, but it cannot protect someone else's balance, target.balance, just as you cannot use your own lock to protect someone else's assets, or your own ticket to reserve someone else's seat.

Let's analyze a concrete case. Suppose there are three accounts A, B, and C, each with a balance of 200 yuan, and we use two threads to perform two transfers: account A transfers 100 yuan to account B, and account B transfers 100 yuan to account C. At the end, we expect the balance of account A to be 100 yuan, account B to be 200 yuan, and account C to be 300 yuan.

Assume thread 1 performs the transfer from A to B and thread 2 performs the transfer from B to C. When these two threads execute simultaneously on two CPUs, are they mutually exclusive? We expect so, but they are not: thread 1 locks instance A (A.this) while thread 2 locks instance B (B.this), so both threads can enter their transfer() critical sections at the same time. With what consequence? Both threads read account B's balance as 200, so B's final balance may be 300 (thread 1 writes B.balance after thread 2, overwriting thread 2's write) or 100 (thread 1 writes B.balance before thread 2, and thread 2 overwrites it), but it cannot be the expected 200.
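The bad interleaving is easiest to see replayed step by step in a single thread (an illustrative sketch of my own, not the original code): both transfers read B's balance as 200 before either one writes it back, so one update is lost.

```java
// Single-threaded replay of the lost-update interleaving on account B.
class LostUpdateReplay {
    // Returns {a, b, c} after the bad interleaving.
    static int[] replay() {
        int a = 200, b = 200, c = 200;
        // Both threads enter transfer() at the same time and read b == 200.
        int t1ReadB = b;   // thread 1 (A -> B)
        int t2ReadB = b;   // thread 2 (B -> C)
        // Thread 1: A -= 100, B = readB + 100
        a -= 100;
        b = t1ReadB + 100;   // b == 300
        // Thread 2: B = readB - 100, C += 100 (overwrites thread 1's write)
        b = t2ReadB - 100;   // b == 100, not the expected 200
        c += 100;
        return new int[] { a, b, c };
    }
}
```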

Use the lock correctly

The lock must cover all the resources it protects.

In the example above, this is an object-level lock, so accounts A and B each have their own lock. How do we make A and B share a single lock?

With a little thought you'll find quite a few options. For example, every Account object can hold a single shared object that is passed in when the Account is created. With that scheme the code is easy to write: we make the default Account constructor private and add a constructor that takes an Object lock parameter. When creating Account objects, we pass in the same lock object, so all Account instances share it.

class Account {
  private Object lock;
  private int balance;
  private Account() {}
  // pass in the same lock object when creating accounts
  public Account(Object lock) {
    this.lock = lock;
  }
  // transfer
  void transfer(Account target, int amt) {
    synchronized (lock) {
      if (this.balance > amt) {
        this.balance -= amt;
        target.balance += amt;
      }
    }
  }
}

This does solve the problem, but it has a small flaw: it requires the same lock object to be passed in whenever an Account is created. If a different object is passed in, we are back to locking our own door to protect someone else's property. In a real project, the code that creates Account objects is likely spread across many places, and passing in one shared lock everywhere is genuinely difficult.

So the scheme above lacks practical feasibility, and we need something better: use Account.class as the shared lock. Account.class is shared by all Account objects and is created when the Java virtual machine loads the Account class, so we don't need to worry about its uniqueness. Using Account.class as the shared lock means nothing has to be passed in when creating Account objects, and the code becomes simpler.

class Account {
  private int balance;
  // transfer
  void transfer(Account target, int amt) {
    synchronized (Account.class) {
      if (this.balance > amt) {
        this.balance -= amt;
        target.balance += amt;
      }
    }
  }
}

The following diagram visually illustrates how the shared lock Account.class protects the critical sections of different objects.
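A usage sketch of the shared-lock scheme (the constructor, getter, and runDemo() harness are additions for the demo, not from the text): with Account.class as the single lock, concurrent A→B and B→C transfers are serialized, and the final balances come out exactly as expected.

```java
// All transfers synchronize on the one Account.class object.
class Account {
    private int balance;
    Account(int balance) { this.balance = balance; }

    void transfer(Account target, int amt) {
        synchronized (Account.class) {
            if (this.balance > amt) {
                this.balance -= amt;
                target.balance += amt;
            }
        }
    }
    int getBalance() {
        synchronized (Account.class) { return balance; }
    }

    // Replays the A->B and B->C scenario from the text with two threads.
    static int[] runDemo() throws InterruptedException {
        Account a = new Account(200), b = new Account(200), c = new Account(200);
        Thread t1 = new Thread(() -> a.transfer(b, 100));
        Thread t2 = new Thread(() -> b.transfer(c, 100));
        t1.start(); t2.start(); t1.join(); t2.join();
        return new int[] { a.getBalance(), b.getBalance(), c.getBalance() };
    }
}
```

Whichever transfer runs first, the result is always A == 100, B == 200, C == 300; the lost-update interleaving can no longer occur.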

What is the essence of atomicity? It is not really indivisibility; indivisibility is only the outward appearance. The essence is that multiple resources must be kept consistent and the intermediate states of an operation must be invisible to the outside. For example, writing a long on a 32-bit machine has an intermediate state (only 32 of the 64 bits written), and a bank transfer has an intermediate state (account A has been debited 100 before account B has been credited). So the way to solve atomicity problems is to make sure intermediate states are invisible to the outside world.