The effects of volatile

  • Memory visibility
  • Prevents instruction reordering
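
As a minimal sketch of the visibility guarantee, the example below has one thread spin on a flag that another thread clears. The class and method names (VisibilityDemo, stopWorker) are illustrative assumptions, not taken from any library. Because the flag is volatile, the worker is guaranteed to see the write and exit; without volatile, the JIT compiler would be free to read the flag once and loop forever.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; the names are illustrative only.
class VisibilityDemo {
    // volatile: a write by one thread becomes visible to later reads in other threads.
    private static volatile boolean running = true;

    // Starts a spinning worker, clears the flag, and reports whether the worker stopped.
    static boolean stopWorker() {
        running = true;
        Thread worker = new Thread(() -> {
            while (running) {
                // Busy-wait. Each iteration re-reads the volatile flag.
            }
        });
        worker.setDaemon(true); // never keep the JVM alive because of the demo thread
        worker.start();
        try {
            TimeUnit.MILLISECONDS.sleep(100); // let the worker enter its loop
            running = false;                  // volatile write
            worker.join(5_000);               // wait up to 5 seconds for the worker to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + stopWorker());
    }
}
```

Removing the volatile keyword from the flag makes the outcome depend on the JVM and the JIT compiler, which is exactly the visibility problem volatile is meant to solve.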




How does volatile guarantee memory visibility?

CPU multi-level cache

The CPU runs very fast, while disk I/O is very slow; main memory was introduced to bridge that gap.

Even so, the CPU is still roughly 100 times faster than memory reads and writes, so several levels of storage sit between the CPU and memory: registers, then the L1, L2, and L3 caches.

[Figure: CPU three-level cache]

The smallest unit the CPU cache works with is the cache line, typically 64 bytes. Practice has shown 64 bytes to be a reasonable compromise: a larger line would slow down each read, while a smaller one would require more reads to fetch the same data.

When core 1 and core 2 operate on different elements that sit in the same cache line, the operations do not logically affect each other. However, as soon as any data in that line is modified, the cache coherence protocol invalidates the copies of the line held by the other cores, which must then reload the data from memory, and the program slows down. This is known as false sharing.

  • Clever trick: pad the hot variable so that it occupies a full 64-byte line by itself; updates to neighbouring data then no longer invalidate it through the coherence protocol.

  • Real-world uses:

    • The Disruptor concurrency framework
    • JDK 8: annotating a field or class with @sun.misc.Contended keeps it out of any cache line shared with other fields (for application classes this takes effect only with -XX:-RestrictContended).
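
The padding trick can be sketched roughly as follows, assuming 64-byte cache lines and 8-byte long fields. PaddedCounterDemo and PaddedCounter are hypothetical names used only for illustration. Note that the JVM is free to lay out fields in a different order, so this is a sketch of the idea and not a guaranteed layout; that uncertainty is one reason @sun.misc.Contended exists.

```java
// Hypothetical illustration of manual cache-line padding.
// Assumes 64-byte cache lines; the JVM does not guarantee this field layout.
class PaddedCounterDemo {

    static class PaddedCounter {
        long p1, p2, p3, p4, p5, p6, p7;   // 56 bytes of padding intended to sit before value
        volatile long value;               // the hot field, written by exactly one thread
        long q1, q2, q3, q4, q5, q6, q7;   // 56 bytes of padding intended to sit after value
    }

    // Two threads each increment their own counter; returns both final values.
    static long[] run(long iterations) {
        PaddedCounter a = new PaddedCounter();
        PaddedCounter b = new PaddedCounter();
        // Each counter has a single writer, so value++ is safe here even though it is not atomic.
        Thread t1 = new Thread(() -> { for (long i = 0; i < iterations; i++) a.value++; });
        Thread t2 = new Thread(() -> { for (long i = 0; i < iterations; i++) b.value++; });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { a.value, b.value };
    }

    public static void main(String[] args) {
        long[] result = run(1_000_000);
        System.out.println(result[0] + " " + result[1]);
    }
}
```

Timing this against a version without the padding fields is a simple way to see whether false sharing matters on a given machine; the difference varies with hardware and JVM.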




Why does the CPU rearrange instructions?

Before our program runs, the compiler converts it into instructions that the CPU can execute. Take the following program:

int a = 1;
int b = 1;

a = a + b;
a++;

A simple observation shows that lines 4 and 5 (a = a + b; and a++;) can be swapped without changing the final result: a ends up as 3 either way.

This is what makes instruction reordering by the CPU possible.

The root cause is that the CPU runs much faster than memory can be read. The statement on line 4 makes the CPU fetch the value of b from memory, which leaves the CPU idle while it waits. For the sake of efficiency, the CPU uses that idle time to execute the next statement first, which improves the program's overall running speed.


Cases where reordering is not allowed:
  • Happens-before: six rules

  • As-if-serial: the execution result within a single thread must not change

  • DCL (double-checked locking) singleton: volatile prevents another thread from receiving a reference to an object whose construction has not finished
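
The DCL case can be sketched as below; LazySingleton is a hypothetical class name. Creating an object involves allocating memory, running the constructor, and publishing the reference. Without volatile, the reference could be published before the constructor has finished, so another thread passing the first check might receive a half-built object. Declaring the field volatile forbids that reordering.

```java
// Hypothetical double-checked locking singleton; the class name is illustrative.
class LazySingleton {
    // volatile forbids publishing the reference before construction completes.
    private static volatile LazySingleton instance;

    private LazySingleton() {
    }

    static LazySingleton getInstance() {
        if (instance == null) {                      // first check, without locking
            synchronized (LazySingleton.class) {
                if (instance == null) {              // second check, under the lock
                    instance = new LazySingleton();  // volatile write publishes a fully built object
                }
            }
        }
        return instance;
    }

    public static void main(String[] args) {
        System.out.println(getInstance() == getInstance());
    }
}
```

The first check keeps the common path lock-free once the instance exists; the second check stops two threads that both saw null from each creating an instance.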


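HotSpot's implementation of the volatile memory barrier:

On x86, HotSpot emits a lock-prefixed instruction after a volatile write. Conceptually this locks the CPU bus: pending writes are flushed, copies of the data held in other CPUs' caches are invalidated, and instructions cannot be reordered across it. Modern CPUs usually lock only the affected cache line instead of making the other CPUs stop working.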