3. CAP Principles

Concept explanation

The CAP principle, also known as the CAP theorem, states that a distributed system can guarantee at most two of the following three properties at the same time: Consistency, Availability, and Partition tolerance.

  1. C: Consistency: data is consistent across multiple replicas. For example, suppose two users access two systems, A and B. When data on system A changes, the change is synchronized to system B in a timely manner, so that both users see the same data.
  2. A: Availability: the services the system exposes must always be available. Even when a fault occurs, the client receives a non-error response from the server within a reasonable time.
  3. P: Partition tolerance: the system continues to provide service despite any network partition that occurs in the distributed system. A network partition can be understood as follows: in a distributed system, nodes are spread across different subnets (a subnet may contain only a single node). Under normal operation all subnets can reach each other. If, for some reason, the links between subnets fail, the cluster is split into separate, isolated regions. That split is a network partition.
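
To make the idea of a network partition concrete, here is a minimal sketch that groups nodes by which other nodes they can still reach. The node names, the link lists, and the `partitions` helper are all hypothetical and exist only for illustration; real systems detect partitions through timeouts and failure detectors, not a global view of the links.

```python
from collections import defaultdict

def partitions(nodes, links):
    """Group nodes into sets that can still reach each other over working links."""
    neighbors = defaultdict(set)
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen, groups = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first walk over the working links starting from this node.
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbors[node] - group)
        seen |= group
        groups.append(group)
    return groups

nodes = ["A", "B", "C", "D"]
healthy = [("A", "B"), ("B", "C"), ("C", "D")]  # every link works
broken = [("A", "B"), ("C", "D")]               # assumed failure of the B-C link

print(partitions(nodes, healthy))  # one group: no partition
print(partitions(nodes, broken))   # two isolated groups: a network partition
```

With all links working, the four nodes form a single group. Once the assumed B-C link fails, the nodes fall into two groups that cannot exchange data, which is the situation the P in CAP refers to.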

Why only two

User 1 and user 2 access system A and system B, respectively. System A and system B synchronize data over the network. In the ideal case, user 1 accesses system A and modifies the data, changing data1 to data2, and user 2 then accesses system B and gets data2.

But in practice, distributed systems are subject to the eight fallacies of distributed computing, eight assumptions that are commonly made and are all false:

  1. The network is reliable
  2. Latency is zero
  3. Bandwidth is infinite
  4. The network is secure
  5. The topology doesn't change
  6. There is one administrator
  7. Transport cost is zero
  8. The network is homogeneous

In other words, as long as a system makes calls over a network, it must assume that the network is unreliable.

  1. When the network fails, system A and system B cannot synchronize data. If we give up P while both systems can still be accessed, then each one is effectively a stand-alone system, not part of a distributed system. Therefore, since we are building a distributed system, P must be satisfied.
  2. With P satisfied, suppose user 1 changes data1 to data2 through system A, and we also want user 2 to read data2 through system B. To satisfy C, we must wait for system A and system B to synchronize over the network, and during synchronization no one can access the system (the system is unavailable); otherwise the data would be inconsistent. In this case, CP is satisfied.
  3. With P satisfied, suppose user 1 changes data1 to data2 through system A and system B must continue to provide service. Then we have to accept that system A has not synchronized data2 to system B (sacrificing consistency). In this case, AP is satisfied.
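
The CP and AP cases above can be sketched with a toy model of two replicas. This is only an illustration under simplifying assumptions: the `Cluster`, `Replica`, and `Unavailable` names are hypothetical, replication is modeled as an instant copy, and "CP" is modeled by rejecting writes while the partition lasts.

```python
class Unavailable(Exception):
    """Raised when the system refuses a request to protect consistency."""

class Replica:
    def __init__(self, name, value):
        self.name = name
        self.value = value

class Cluster:
    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("A", "data1")
        self.b = Replica("B", "data1")
        self.partitioned = False  # True while A and B cannot synchronize

    def write_via_a(self, value):
        if self.partitioned and self.mode == "CP":
            # CP: give up availability rather than let A and B diverge.
            raise Unavailable("cannot synchronize with B; write rejected")
        self.a.value = value
        if not self.partitioned:
            self.b.value = value  # synchronization succeeds when the network is healthy
        # AP during a partition: A accepts the write, B keeps the old value.

    def read_via_b(self):
        return self.b.value

# CP: the write fails during the partition, so both replicas still agree.
cp = Cluster("CP")
cp.partitioned = True
try:
    cp.write_via_a("data2")
except Unavailable as err:
    print("CP:", err)
print("CP read from B:", cp.read_via_b())

# AP: the write succeeds, but user 2 reads stale data from B.
ap = Cluster("AP")
ap.partitioned = True
ap.write_via_a("data2")
print("AP read from B:", ap.read_via_b(), "while A holds", ap.a.value)
```

In both modes the partition is taken as a given. The only choice left is whether the write is refused (CP) or the read from system B is allowed to be stale (AP).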