GitHub 2.2K-star "Java Engineer's Road to Mastery" — sure you don't want to check it out?

Next time someone asks you what the Java memory model is, send them this article where we’ve covered the ins and outs of the Java memory model.

As we mentioned in that article, there is a large gap in processing speed between the CPU and main memory. To bridge this gap and improve overall performance, designers added layers of caching between the CPU and main memory: each CPU core has its own L1, L2, and sometimes L3 cache. In a multi-core computer there are multiple CPUs and therefore multiple sets of caches, and the data held in these caches may become inconsistent. To solve this problem, the memory model was introduced. A memory model defines the specification for the read/write behavior of multithreaded programs in a shared-memory system. These rules regulate reads and writes to memory so as to guarantee the correctness of instruction execution.

How does the memory model guarantee cache consistency?

Let's try to answer that question. Cache consistency is a problem introduced by caches themselves, so it is an issue every CPU vendor must address. To solve the cache-data inconsistency described above, many schemes have been proposed; generally speaking, there are two:

1. Lock the bus with LOCK#.

2. Through the Cache Coherence Protocol.

In early CPUs, cache inconsistency was solved by asserting a LOCK# signal on the bus. Since the CPU communicates with other components over the bus, locking the bus prevents the other CPUs from accessing components such as memory, so that only one CPU can access the variable's memory at a time. While LOCK# is asserted on the bus, the other CPUs can read the variable from memory and operate on it only after the locking code completes. This solves the cache inconsistency problem.

However, because the other CPUs cannot access memory while the bus is locked, this approach is very inefficient. Hence the second solution: addressing cache consistency through a cache coherence protocol.

Cache consistency protocol

The best-known cache coherence protocol is Intel's MESI protocol, which ensures that the copies of shared variables held in each cache are consistent.

The central idea of MESI is: when a CPU writes data and finds that the variable is shared, that is, copies of the variable exist in other CPUs' caches, it sends a signal notifying the other CPUs to set the cache line holding that variable to the Invalid state. When another CPU later needs to read the variable, it finds that its cache line for the variable is invalid and re-reads the value from memory.
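Although MESI operates entirely in hardware, its effect can be observed from Java: writing a `volatile` field publishes the new value so that other cores' stale copies are invalidated and re-read. Below is a minimal sketch (the class and field names are my own, not from the original article):

```java
// Sketch: a volatile flag makes a write by one thread visible to another.
// Without volatile, the reader thread could spin forever on a stale cached value.
public class MesiVisibilityDemo {
    static volatile boolean flag = false;

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!flag) {
                // busy-wait until the writer's store becomes visible
            }
            System.out.println("reader saw flag");
        });
        reader.start();

        Thread.sleep(100);   // give the reader time to start spinning
        flag = true;         // the write is published; stale copies are invalidated
        reader.join();
        System.out.println("done");
    }
}
```

With `volatile` removed, the reader may never terminate on some JVMs, which is exactly the visibility problem the rest of this article is about.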

In the MESI protocol, each cache may have four states, which are:

M(Modified) : This line of data is valid. The data is Modified and inconsistent with the data in memory. The data exists only in the local Cache.

E(Exclusive) : This line of data is valid. The data is consistent with the data in the memory and exists only in the local Cache.

S(Shared) : This line of data is valid. The data is consistent with the data in the memory. The data is stored in many caches.

I(Invalid) : This line of data is Invalid.

More details on MESI are not covered here, but MESI is a common cache consistency protocol that can be used to resolve data consistency issues between caches.

However, it is worth noting that there are two behaviors in the traditional MESI protocol that have high execution costs.

One is to mark a Cache Line as Invalid. The other is to write new data to a Cache Line when its current state is Invalid. So the CPU uses Store Buffer and Invalidate Queue components to reduce latency for these operations.


When a CPU writes, it first sends an Invalid message to the other CPUs and places the value being written into its Store Buffer; the value is then asynchronously written to the Cache at some later point.

If the current CPU core wants to read data from the Cache, it must scan the Store Buffer before reading data from the Cache.

However, the contents of a core's Store Buffer are not visible to the other CPU cores until they have been flushed to the Cache.

When a CPU core receives an Invalid message, it writes the message to its Invalidate Queue and asynchronously sets it to the Invalid state.

Unlike with the Store Buffer, the CPU core does not scan its Invalidate Queue when reading from the Cache, so brief windows of dirty reads can occur.

Therefore, to solve the problem of cache consistency, a typical solution is the MESI cache consistency protocol.

The MESI protocol can guarantee cache consistency, but, because of the Store Buffer and Invalidate Queue, not in real time.

The memory model

Having covered cache coherence, let's look at the memory model. As we said, the memory model defines a set of specifications that guarantee visibility, ordering, and atomicity when multiple threads access shared variables. (For more on this, send people this article when they ask what the Java memory model is.)

By extension, the Memory Model is usually referred to as the Memory Consistency Model.

We mentioned cache consistency earlier, and memory consistency again, not to confuse the reader, but to make the comparison clearer.

Cache Coherence is the consistency of data across multiple Cache copies.

Memory Consistency guarantees what values can be read by multithreaded programs when accessing the Memory.

Let’s start with the following program:

```
Thread1:          Thread2:
S1: x = 1         S2: y = 2
L1: r1 = y        L2: r2 = x
```

S1, S2, L1, and L2 are statements (S stands for Store, L for Load). r1 and r2 are two registers, and x and y are two different memory variables, all initially 0. What might r1 and r2 be after both threads finish executing?

Note that threads execute concurrently and alternately, here is the possible order of execution and corresponding results:

```
S1 L1 S2 L2    r1 = 0, r2 = 1
S1 S2 L1 L2    r1 = 2, r2 = 1
S2 L2 S1 L1    r1 = 2, r2 = 0
```

These results are expected and reasonable. However, on the x86 architecture, the result may well be r1 = 0 and r2 = 0.
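The four-statement example above can be run directly on the JVM. The sketch below (class name and iteration count are my own) replays the race many times and prints the set of observed results; whether the r1 = 0, r2 = 0 outcome actually appears depends on the hardware and the JIT, so the sketch only collects what it sees:

```java
import java.util.Set;
import java.util.TreeSet;

// Sketch of the store/load litmus test from the text: two threads, two plain
// (non-volatile) int variables, run repeatedly to sample possible outcomes.
public class LitmusTest {
    static int x, y, r1, r2;

    public static void main(String[] args) throws InterruptedException {
        Set<String> seen = new TreeSet<>();
        for (int i = 0; i < 2_000; i++) {
            x = 0; y = 0;
            Thread t1 = new Thread(() -> { x = 1; r1 = y; }); // S1; L1
            Thread t2 = new Thread(() -> { y = 2; r2 = x; }); // S2; L2
            t1.start(); t2.start();
            t1.join();  t2.join();
            seen.add("r1=" + r1 + ",r2=" + r2);
        }
        // In every observed outcome, r1 can only be 0 or 2, and r2 only 0 or 1.
        System.out.println(seen);
    }
}
```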

Without Memory Consistency, the output of the program code written by the programmer is uncertain.

Therefore, Memory Consistency is a contract between the programmer (via the programming language), the compiler, and the CPU. This contract guarantees what values a program will read when it accesses memory.

To put it simply, memory consistency is the assurance that a concurrent program produces the results the programmer expects (when synchronization such as locking is used correctly). It involves atomicity, ordering, and visibility in concurrent programming. Cache coherence, by contrast, is only about visibility.
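As a concrete Java illustration of "results as expected through locking" (my own example, not from the original article): a `synchronized` counter gives the programmer atomicity, visibility, and ordering for the increment, so the final result is deterministic:

```java
// Sketch: synchronized provides atomicity and visibility, so four threads
// incrementing 25,000 times each always yield exactly 100,000.
public class SyncCounter {
    private int count = 0;

    public synchronized void inc() { count++; }
    public synchronized int get() { return count; }

    public static void main(String[] args) throws InterruptedException {
        SyncCounter c = new SyncCounter();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 25_000; j++) c.inc();
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        System.out.println(c.get()); // always prints 100000
    }
}
```

Without `synchronized`, `count++` is not atomic (it is a read, an increment, and a write), and the total would usually come out below 100,000.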

In many implementations of memory models, cache consistency is guaranteed through the hardware level cache consistency protocol. It’s important to note that the memory model I’m talking about here is the computer memory model, not the Java memory model.

conclusion

Cache consistency issues. The problem at the hardware level refers to the data inconsistency between caches caused by multiple sets of caches in a multi-core computer.

PS: Again, in Java multithreading, each thread has its own working memory and needs to interact with main memory. The working memory here is not the same thing as the cache on the computer hardware, but it is an analogy. Therefore, the visibility problems of concurrent programming are caused by inconsistencies in local memory data between threads, and have nothing to do with the computer cache.

Cache consistency protocol. The MESI protocol is commonly used to solve cache consistency problems.

Memory consistency model. It shields programmers from differences in computer hardware and is mainly used to solve the atomicity, ordering, and visibility problems of concurrent programming.

A cache coherence protocol may be used when implementing the memory consistency model.

Food for thought

Finally, I leave you with one more question:

At the hardware level, cache coherence protocols already exist to ensure cache consistency (i.e., visibility in concurrent programming), so why do programmers writing multithreaded code still need keywords such as volatile and synchronized to ensure visibility?

The answer to this question will be addressed in the next article, an in-depth introduction to volatile. Welcome to my blog (www.hollischuang.com) and public account (Hollis) to learn about it as soon as possible.

The resources

Talk about atomic variables, locks, memory barriers

Memory Consistency and Cache Coherence

With principles but no techniques, techniques can still be acquired; with techniques but no principles, one stops at mere technique. Welcome to follow "The Way of Java" public account and pursue both together.