CAP

CAP is short for Consistency, Availability, and Partition Tolerance.

C-Consistency

Distributed systems have many nodes. The C (Consistency) in CAP refers to keeping the data on these nodes consistent. As shown in the following figure, for example, the client sends a request to server node A to insert a piece of data. When the insertion succeeds, server A notifies the client and synchronizes the data to the other server nodes. The client then queries server B: if it finds the data, the system is consistent; if not, it is inconsistent.

Of course, there are several kinds of consistency: strong consistency, weak consistency, and final consistency (more commonly called eventual consistency).

  • Strong consistency

When data is updated, any subsequent read of that data, from any node, returns the updated value, as in the figure above. This is called strong consistency.

  • Weak consistency

When you update data on one node, there is no guarantee about when it will be synchronized to the other nodes. A read from another node might return nothing, or it might return the old value. This is called weak consistency.

  • Final consistency

When you modify or write data on one node, it is not immediately synchronized to the other nodes; it may arrive tens of seconds or even minutes late. But the system guarantees that the other nodes eventually catch up with the latest data. This is called final consistency, better known as eventual consistency.
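The difference between these consistency levels can be sketched with a toy two-node cluster. This is only an illustration under simplifying assumptions: the `Replica` and `Cluster` classes, the `pending` queue, and the sample record "li si" are hypothetical and not part of any real system.

```python
class Replica:
    """One node holding a copy of the data (hypothetical, in-memory)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Cluster:
    """Toy two-node cluster; `pending` holds writes not yet replicated to B."""

    def __init__(self):
        self.a = Replica("A")
        self.b = Replica("B")
        self.pending = []

    def write_strong(self, key, value):
        # Synchronous replication: acknowledge only after both nodes have the value.
        self.a.data[key] = value
        self.b.data[key] = value

    def write_async(self, key, value):
        # Asynchronous replication: acknowledge after node A, replicate later.
        self.a.data[key] = value
        self.pending.append((key, value))

    def sync(self):
        # Background replication catching up; afterwards B matches A.
        for key, value in self.pending:
            self.b.data[key] = value
        self.pending.clear()


cluster = Cluster()

cluster.write_strong(1, "zhang san")
strong_read = cluster.b.data.get(1)    # already visible on B

cluster.write_async(2, "li si")
stale_read = cluster.b.data.get(2)     # not replicated yet, so B has nothing
cluster.sync()
eventual_read = cluster.b.data.get(2)  # visible once replication has run
```

If `sync` were never guaranteed to run, the asynchronous path would model weak consistency; the guarantee that it eventually runs is what turns it into final (eventual) consistency.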

A-Availability

The A in CAP stands for Availability. It is easy to understand: it measures how often your distributed system is usable. If a customer's request to your service succeeds, the service is available; if it fails, the service is unavailable. Every such failure lowers the system's availability. Availability is commonly expressed as 99%, 99.9%, 99.99%, or 99.999%:

  • 99%: about 87.6 hours (roughly 3.65 days) of failure allowed per year
  • 99.9%: about 8.8 hours of failure allowed per year
  • 99.99%: about 53 minutes (less than an hour) of failure allowed per year
  • 99.999%: about 5 minutes of failure allowed per year
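The downtime budget behind each level is simple arithmetic: the number of hours in a year multiplied by the fraction of time the system is allowed to be down. A minimal sketch, assuming a 365-day year (the function name is illustrative):

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(availability_percent):
    """Hours per year a system may be down at the given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for level in (99.0, 99.9, 99.99, 99.999):
    hours = allowed_downtime_hours(level)
    print(f"{level}% -> {hours:.2f} hours ({hours * 60:.1f} minutes) per year")
```

Each extra "nine" cuts the allowed downtime by a factor of ten, which is why every step up is so much harder to achieve.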

Most projects in production today do not even reach 99% availability, and 99% can be considered a normal level. A 99.9% system is a better one. 99.99% or even 99.999% is about the best the industry achieves.

P-Partition Tolerance

The P in CAP stands for Partition Tolerance: the system continues to operate despite arbitrary message loss or failure of part of the system. In other words, a distributed system must tolerate network failures. For example, if the nodes cannot communicate with each other, that may cause problems, but it should not crash the whole system so that every request to every node fails. Each node keeps doing its own work even though it cannot reach the others. The distributed system is still running, and if you send a request to any node, it will still give you some response, even if only a default one.
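The idea that a node keeps answering, with a default response if necessary, can be sketched as follows. The `Node` class, the `link_up` flag that simulates a partition, and the default message are all assumptions made for illustration.

```python
class Node:
    """A node that keeps serving requests even when its peer is unreachable."""

    def __init__(self, name, data):
        self.name = name
        self.data = data
        self.peer = None
        self.link_up = True  # False simulates a network partition

    def ask_peer(self, key):
        if not self.link_up:
            raise ConnectionError(f"{self.name} cannot reach {self.peer.name}")
        return self.peer.data.get(key)

    def handle(self, key, default="no data available"):
        # Answer locally when possible.
        if key in self.data:
            return self.data[key]
        # Otherwise try the peer, falling back to a default response
        # instead of crashing when the network is partitioned.
        try:
            value = self.ask_peer(key)
        except ConnectionError:
            return default
        return default if value is None else value


node_a = Node("A", {1: "zhang san"})
node_b = Node("B", {})
node_a.peer, node_b.peer = node_b, node_a

healthy = node_b.handle(1)      # forwarded to A over the network
node_b.link_up = False          # partition: B can no longer reach A
partitioned = node_b.handle(1)  # B still answers, with the default
```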

CP and AP in CAP

Generally speaking, a distributed system can satisfy either CP or AP, because once the network is partitioned it is impossible to guarantee both C (consistency) and A (availability) at the same time. Partition tolerance itself is essential in distributed systems: when partitions, network jitter, packet loss, or node failures occur, the system as a whole must keep operating. That is why many distributed systems build in mechanisms to prevent split-brain.

CP consistency + partition tolerance

Suppose there is a network failure. P keeps the system running, but the server nodes cannot communicate with each other and cannot synchronize data. Now a client queries a record, say id=1, name=zhang san. The system is actually in an inconsistent state, because the nodes hold different data. To guarantee CP, every node must refuse the query and return a special (error) result to the client, such as an exception saying that the system is currently in an inconsistent state and cannot be queried. This way the client never sees inconsistent data, but A (availability) has been sacrificed: the request returns an exception, so the request fails, and the distributed system is temporarily unavailable. In other words, CP is guaranteed and availability is not. ZooKeeper, MongoDB, and HBase, for example, are usually classified as CP systems: they keep the data consistent, at the cost that some requests may fail while consistency cannot be guaranteed.

To guarantee CP, the C (consistency) part means that in every case, once a piece of data is written, a read from any node sees the same data. It is never possible for one node to show the old data while another shows the new data. That is what guaranteeing consistency means.
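The CP trade-off described above can be sketched as a node that refuses reads while it is cut off from its peers. The `CPNode` class, the `partitioned` flag, and the exception name are hypothetical and only illustrate the behaviour, not how ZooKeeper or any other real system implements it.

```python
class InconsistentStateError(Exception):
    """Raised when a node cannot prove its data is up to date."""


class CPNode:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.partitioned = False  # True when the node cannot reach its peers

    def read(self, key):
        # CP choice: refuse to answer rather than risk returning stale data.
        if self.partitioned:
            raise InconsistentStateError(
                "system is in an inconsistent state; cannot query"
            )
        return self.data.get(key)


node_b = CPNode("B")
node_b.data[1] = "zhang san"

before = node_b.read(1)     # normal operation: the read succeeds

node_b.partitioned = True   # network failure
try:
    node_b.read(1)
    rejected = False
except InconsistentStateError:
    rejected = True         # the request fails: availability is sacrificed
```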

AP availability + partition tolerance

Suppose there is a network failure, data is not synchronized, and the nodes are in an inconsistent state. To guarantee A (availability), every node must still answer queries from any client, so that the whole system stays available, but at the expense of C (consistency). For example, if the record id=1, name=zhang san was written to one node, a client querying another node will not find it.

No distributed system can have all three CAP properties at the same time. What this means is that during a network failure some data cannot be synchronized, and the system must choose either CP (consistency + partition tolerance) or AP (availability + partition tolerance). Sites such as 12306 and e-commerce platforms generally choose AP: the inventory of goods or train tickets you see may be wrong because it is old data. For example, when you shop for a product you can still add it to your cart, but the system will always check the inventory before the order is settled. In a flash sale, the page may still show 1,000 units, but when you place the order and the system checks the inventory, it finds that the stock is already zero and you cannot buy the item.
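The flash-sale scenario can be sketched as a read path that always answers from a possibly stale copy, combined with an authoritative check at order time. The class names and the numbers are assumptions chosen to mirror the example above, not a real e-commerce API.

```python
class OutOfStockError(Exception):
    """Raised when an order asks for more units than are really in stock."""


class InventoryService:
    """Authoritative stock count, consulted only when an order is placed."""

    def __init__(self, stock):
        self.stock = stock

    def reserve(self, quantity):
        if quantity > self.stock:
            raise OutOfStockError("not enough stock")
        self.stock -= quantity


class ProductPage:
    """AP-style read path: serves a cached count that may be stale."""

    def __init__(self, cached_stock):
        self.cached_stock = cached_stock

    def displayed_stock(self):
        return self.cached_stock  # always answers, even if out of date


inventory = InventoryService(stock=0)  # already sold out
page = ProductPage(cached_stock=1000)  # cache has not caught up yet

shown = page.displayed_stock()         # the user still sees 1,000 units
try:
    inventory.reserve(1)
    order_placed = True
except OutOfStockError:
    order_placed = False               # the order is rejected at checkout
```

Browsing stays available at all times, and correctness is enforced only at the one step where it matters, which is the design choice the paragraph above describes.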

BASE

Eventual consistency is usually discussed together with BASE: Basically Available, Soft State, and Eventual Consistency. BASE aims to achieve all three CAP properties to a reasonable degree at the same time, without requiring any of them to be 100% perfect: the system is basically available and finally (eventually) consistent.

Basically Available

Basically available means that when a failure occurs, the distributed system is allowed to lose some availability but remains usable. Under normal circumstances, queries are load-balanced across the nodes, and with many nodes the system can withstand highly concurrent queries. During a failure, the system can degrade so that all client queries are forced to the master node, which guarantees that everyone sees the same data for now. But if client traffic is too heavy, a single master node cannot carry the load, so the master node must also apply rate limiting: when traffic is too high, it directly returns a blank response and asks you to query again later. This is what basically available means: with degradation measures in place, the service is worse than when it is fully available, but it is still basically usable. For example, during an e-commerce promotion, some users may be directed to degraded pages to cope with the traffic surge, and the service layer may provide only degraded services. This is a partial loss of availability.
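The rate-limited master node described above might look roughly like this. It is a single-threaded sketch under stated assumptions: the `MasterNode` class, the `max_in_flight` limit, and the response shape are invented for illustration, and a real service would need thread-safe counters.

```python
class MasterNode:
    """Master node that sheds load instead of collapsing under heavy traffic."""

    def __init__(self, data, max_in_flight):
        self.data = data
        self.max_in_flight = max_in_flight
        self.in_flight = 0  # requests currently being served

    def query(self, key):
        # Degraded path: too much traffic, so ask the client to retry later.
        if self.in_flight >= self.max_in_flight:
            return {"status": "busy", "message": "please try again later"}
        # Normal path: serve the request from the master's data.
        self.in_flight += 1
        try:
            return {"status": "ok", "value": self.data.get(key)}
        finally:
            self.in_flight -= 1


master = MasterNode({1: "zhang san"}, max_in_flight=2)

normal = master.query(1)    # served normally

master.in_flight = 2        # simulate two requests already being served
degraded = master.query(1)  # rejected with a "come back later" response
```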

Soft State

Soft state means that the system allows a period during which data is still being synchronized between nodes and the copies on different nodes may be inconsistent. For example, if you query several nodes, some return the record id=1, name=zhang san, while others do not have it yet. This intermediate condition is the soft state in BASE.

Eventual Consistency

Eventual (final) consistency means that once the failure or delay is resolved, the data will be synchronized to the other nodes after a period of time, and all nodes will be consistent in the end. A soft state may exist for a while, but the data eventually becomes consistent.
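One simple way two nodes can converge after a failure is resolved is to exchange their data and keep the newer version of each record. The sketch below assumes every write carries a version number and that the higher version wins; this rule, the `merge` function, and the sample values are illustrative assumptions, and real systems use a variety of reconciliation strategies.

```python
def merge(local, remote):
    """Merge two {key: (version, value)} maps, keeping the higher version."""
    merged = dict(local)
    for key, (version, value) in remote.items():
        if key not in merged or version > merged[key][0]:
            merged[key] = (version, value)
    return merged


# State of the two nodes while they could not talk to each other (soft state).
node_a = {1: (2, "zhang san")}
node_b = {1: (1, "old name"), 2: (1, "li si")}

# Once the network recovers, each node merges in the other's data.
node_a, node_b = merge(node_a, node_b), merge(node_b, node_a)
```

After the exchange both nodes hold the same records, which is the "consistent in the end" condition described above.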

Resources

Ruape Technology Nest micro service System architecture 120-day training camp