Anyone who has been through technical interviews will already be familiar with these two concepts.

When I was interviewing for jobs, it is hardly an exaggeration to say that whenever distributed systems came up, interviewers almost invariably asked about these two theories.

These two theories are also the foundation for learning anything related to distributed systems.

So it is well worth understanding them and being able to explain them to others in your own words.

In this article, I will interpret these two concepts from my own perspective!

CAP Theory

The CAP theorem originated in 2000, when Professor Eric Brewer of the University of California, Berkeley proposed it at the Symposium on Principles of Distributed Computing (PODC). For this reason, the CAP theorem is also called Brewer’s Theorem.

Two years later, Seth Gilbert and Nancy Lynch of the Massachusetts Institute of Technology published a proof of Brewer’s conjecture, and CAP officially became a theorem in the field of distributed systems.

Introduction

CAP is an acronym for Consistency, Availability and Partition Tolerance.

Brewer, the originator of CAP theory, did not clearly define Consistency, Availability and Partition Tolerance when he put forward the CAP conjecture.

As a result, there are many informal interpretations of CAP. The one generally recommended is the following.

In theoretical computer science, the CAP theorem states that a distributed system can be designed so that its read and write operations satisfy only two of the following three properties:

  • Consistency: All nodes see the same, most recent copy of the data.
  • Availability: Non-failed nodes return reasonable responses (not error or timeout responses) in a reasonable amount of time.
  • Partition tolerance: The distributed system can still provide service when a network partition occurs.

What is a network partition?

In a distributed system, all nodes are normally connected to one another over the network. When a fault (such as a network problem on some nodes) cuts some of those connections, the network splits into several isolated areas. This is called a network partition.
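To make the idea concrete, here is a minimal sketch that treats the cluster as a graph and computes which nodes can still reach each other. The node names (`n1`, `n2`, `n3`) and the `partitions` helper are my own illustration, not part of any real framework.

```python
from collections import deque


def partitions(nodes, links):
    """Group nodes into areas that can still reach each other.

    nodes: list of node names (hypothetical, for illustration only)
    links: list of (a, b) pairs that are currently able to communicate
    """
    neighbours = {n: set() for n in nodes}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, groups = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        group, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            group.add(node)
            for nxt in neighbours[node] - seen:
                seen.add(nxt)
                queue.append(nxt)
        groups.append(group)
    return groups


# All links working: one area, so no partition.
healthy = partitions(["n1", "n2", "n3"], [("n1", "n2"), ("n2", "n3")])

# The link between n2 and n3 fails: the network splits into two areas.
broken = partitions(["n1", "n2", "n3"], [("n1", "n2")])
```

If the function returns more than one group, the network is partitioned.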

It’s not “choose two out of three.”

Most people explain this theorem simply as “of consistency, availability, and partition tolerance, you can achieve at most two at the same time, never all three.” In fact, this is a very misleading statement, and Brewer, the father of CAP, revisited his earlier paper in 2012, 12 years after CAP theory was born.

When a network partition occurs and we want to keep serving requests, we can only choose between strong consistency and availability. In other words, partition tolerance (P) is the premise: only once P is settled can we choose between C and A. Partition tolerance is therefore something we must implement.

In short, partition tolerance (P) in CAP theory must be satisfied. On that basis, only one of availability (A) or consistency (C) can be satisfied.

Therefore, it is theoretically impossible for a distributed system to choose a CA architecture. Only CP or AP architectures are possible.

Why can’t C and A both be guaranteed?

For example, suppose the system is “partitioned” while one node is writing data. To ensure C, read and write operations on the other nodes must be prohibited, which conflicts with A. Conversely, if read and write operations on the other nodes are allowed to proceed normally to ensure A, they may see outdated data, which conflicts with C.
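The conflict can be sketched with a toy two-node store. Everything here (`TwoNodeStore`, `Unavailable`, the node names `n1` and `n2`, the values `v1` and `v2`) is a hypothetical model I made up for illustration. It assumes `n1` is the node that accepts the write and that `n2` is cut off by the partition.

```python
class Unavailable(Exception):
    """Raised when a node refuses a request in order to protect consistency."""


class TwoNodeStore:
    """Toy store with replicas n1 and n2; n1 is the node that accepts writes."""

    def __init__(self, mode):
        self.mode = mode            # "CP" or "AP"
        self.partitioned = False    # True when n1 and n2 cannot talk
        self.values = {"n1": "v1", "n2": "v1"}

    def write(self, value):
        # The write always lands on n1; it reaches n2 only if the link is up.
        self.values["n1"] = value
        if not self.partitioned:
            self.values["n2"] = value

    def read(self, node):
        if self.partitioned and self.mode == "CP" and node != "n1":
            # CP: the cut-off node refuses rather than risk returning stale data.
            raise Unavailable(f"{node} cannot confirm it has the latest value")
        # AP (or no partition): always answer, even if the value may be stale.
        return self.values[node]


def demo(mode):
    """Write v2 during a partition, then read from the cut-off node n2."""
    store = TwoNodeStore(mode)
    store.partitioned = True
    store.write("v2")
    try:
        return store.read("n2")
    except Unavailable:
        return "unavailable"


cp_result = demo("CP")   # n2 refuses the read: consistent but not available
ap_result = demo("AP")   # n2 answers with the old value: available but stale
```

In CP mode the read on `n2` fails, and in AP mode it succeeds but returns the old `v1`. Neither mode gives both a successful response and the latest value while the partition lasts.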

Which one to choose depends on the business scenario at hand; there is no universal answer. For example, scenarios that require strong consistency, such as banking, generally choose CP.

Practical application of CAP

I will use the registry to explore how CAP applies in practice. Since many readers may not know what a registry is, here is a brief explanation using Dubbo as an example.

Consider the architecture of Dubbo. What role does the Registry play in it? What services does it provide?

The registry is responsible for registering and looking up service addresses, which makes it equivalent to a directory service. Service providers and consumers only interact with the registry at startup, and the registry does not forward requests, so it is under relatively little load.
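As a rough sketch of the “directory service” idea, the class below keeps a mapping from service names to provider addresses. It is an in-memory toy written for this article, not Dubbo's actual registry API. The service name and addresses are made up.

```python
class Registry:
    """Toy in-memory registry: it stores addresses but never forwards requests."""

    def __init__(self):
        self._providers = {}   # service name -> set of provider addresses

    def register(self, service, address):
        # Called by a provider when it starts up.
        self._providers.setdefault(service, set()).add(address)

    def unregister(self, service, address):
        # Called when a provider goes away; unknown entries are ignored.
        self._providers.get(service, set()).discard(address)

    def lookup(self, service):
        # Called by a consumer at startup to find where the providers are.
        return sorted(self._providers.get(service, set()))


registry = Registry()
registry.register("com.example.OrderService", "10.0.0.1:20880")
registry.register("com.example.OrderService", "10.0.0.2:20880")

# The consumer gets the address list and then calls a provider directly.
addresses = registry.lookup("com.example.OrderService")
```

Because consumers call providers directly once they have the addresses, the interesting CAP question for a registry is what `lookup` should do when the registry nodes themselves are partitioned.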

Common components that can be used as registries include ZooKeeper, Eureka, Nacos, and others.

  1. ZooKeeper guarantees CP. A read request to ZooKeeper returns a consistent result at any time. However, ZooKeeper does not guarantee the availability of every request. For example, the service is unavailable during Leader election or when more than half of the machines are unavailable.
  2. Eureka guarantees AP. Eureka is designed to ensure A (availability) first. There is no Leader node in Eureka; all nodes are equal peers. So, unlike ZooKeeper, Eureka does not become unavailable during an election or when more than half of the machines are unavailable. Eureka guarantees that the failure of most nodes will not affect normal service, as long as at least one node is available. The only caveat is that the data on that node may not be up to date.
  3. Nacos supports both CP and AP.
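The difference between the first two entries can be reduced to a simple availability rule. The two functions below are my own simplification of the behaviour described above, not code from ZooKeeper or Eureka. They ignore Leader election and only look at how many nodes are still alive.

```python
def majority_quorum_available(alive, total):
    """CP-style rule: serve requests only while a strict majority is alive."""
    return alive > total // 2


def peer_to_peer_available(alive):
    """AP-style rule: serve requests as long as at least one peer is alive."""
    return alive >= 1


# A hypothetical 5-node cluster that loses nodes one by one.
total = 5
report = [
    (alive, majority_quorum_available(alive, total), peer_to_peer_available(alive))
    for alive in range(total, -1, -1)
]
```

For a 5-node cluster, the majority rule stops serving once only 2 nodes are left, while the peer-to-peer rule keeps serving down to the last node, at the cost of possibly stale data.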

Conclusion

When designing and developing distributed systems, we should not focus only on CAP. We also need to pay attention to system scalability, availability, and so on.

When the system is “partitioned”, CAP theory says only CP or AP can be satisfied. Note that the premise here is that a partition has actually occurred.

If the system is not “partitioned”, that is, the network connections between nodes are working normally, then P does not come into play. In that case we can guarantee both C and A.

Summary: If the system is “partitioned”, we must decide whether to choose CP or AP. If the system is not “partitioned”, we need to think about how to guarantee both C and A.

Recommended reading

  1. CAP Theorem Simplified (English, with an interesting case)
  2. Where is the legendary CAP theory actually applied? (Chinese, with many practical examples)
  3. Please stop calling databases CP or AP

By Snailclimb. Title: CAP Theory Interpretation. Source: GitHub.