The CAP theorem

C: Consistency, A: Availability, P: Partition tolerance

Two-phase commit (2PC)

Consistency protocol: a coordinator is elected to coordinate scheduling among the distributed nodes.

  1. After receiving the request, the coordinator enters the prepare phase: it sends the request to all participants, who execute the transaction but do not commit.
  2. If any participant fails to respond, the coordinator rolls the transaction back; if all participants respond normally, the coordinator instructs all participants to commit the transaction.
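The two steps above can be sketched as a coordinator loop. This is a minimal illustration, not a real implementation: the `Participant` class and its `prepare`/`commit`/`rollback` methods are hypothetical names chosen for the sketch.

```python
class Participant:
    """Hypothetical participant: executes the transaction but defers the commit."""
    def __init__(self, healthy=True):
        self.healthy = healthy
        self.state = "init"

    def prepare(self):
        # Phase 1: execute the transaction, hold locks, do not commit.
        if not self.healthy:
            return False          # simulates no response / failure
        self.state = "prepared"
        return True

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(participants):
    # Phase 1: send the prepare request to every participant.
    if all(p.prepare() for p in participants):
        # Phase 2a: everyone answered yes -> commit everywhere.
        for p in participants:
            p.commit()
        return "committed"
    # Phase 2b: a failure or missing response -> roll back everywhere.
    for p in participants:
        p.rollback()
    return "rolled_back"
```

Note how a single unhealthy participant forces a global rollback, which is exactly the all-or-nothing property 2PC provides.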

Three-phase commit (3PC)

The first phase of two-phase commit is split in two, giving three phases: CanCommit, PreCommit, and DoCommit

  1. CanCommit: the coordinator asks each participant whether it can execute the transaction normally, which also tests network communication
  2. PreCommit (case A): if every participant answers YES, each participant executes the transaction without committing and sends an ACK to the coordinator
  3. PreCommit (case B): if any participant answers NO or the wait times out, the coordinator sends abort to every participant
Analogy: a leader schedules a meeting and messages all the employees. Some do not reply, so the leader messages everyone again: XXX is absent, the meeting is cancelled.
  1. DoCommit (case A): the coordinator instructs participants to commit; each participant sends an ACK to the coordinator depending on whether its commit succeeded. Once all ACKs are received, the coordinator commits the transaction as well
  2. DoCommit (case B): following case B of PreCommit, participants send an abort ACK to the coordinator, which performs the rollback after receiving all responses

What problem does three-phase commit solve?

Synchronous blocking: in 2PC, participants block indefinitely while waiting for feedback from the coordinator; 3PC introduces timeouts to resolve this. Coordinator crash: a participant that hears nothing from the coordinator for a long time commits the transaction on its own. Network partitions, however, remain unsolved.
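The timeout behaviour can be sketched as a small decision rule. This is an assumed simplification of 3PC's participant-side defaults, with illustrative phase names: before entering PreCommit nothing has been promised, so a timeout aborts; after PreCommit succeeded, a timeout defaults to commit, which is what removes 2PC's indefinite blocking.

```python
def on_timeout(phase):
    """Decide what a participant does when the coordinator goes silent."""
    if phase in ("can_commit", "pre_commit_wait"):
        return "abort"      # no commitment has been made yet -> safe to abort
    if phase == "do_commit_wait":
        return "commit"     # already prepared; assume the majority will commit
    raise ValueError(f"unknown phase: {phase}")
```

The `do_commit_wait` default is also why 3PC can be wrong under a network partition: a cut-off participant commits while the rest of the cluster may have aborted.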

Paxos algorithm

Paxos is a message-passing algorithm designed to solve distributed data consistency. It has three roles: proposers (which receive client requests and issue proposals), acceptors (which vote on proposals), and learners. A proposal must be approved by more than half of the acceptors to proceed to the second phase, and the second phase again requires approval from more than half.

  1. Prepare phase

A proposer sends a prepare request numbered n to the acceptors. Each acceptor maintains a number maxN and responds only if n is greater than maxN.

  2. Accept phase

The proposer then sends an accept request (n, value), where value is the actual content of the proposal. Requiring approval from more than half of the acceptors in both rounds filters out competing proposals. If prepare fails, the proposer raises its n to maxN+1 and resubmits the proposal.
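The two acceptor-side rules above can be sketched as a pair of handlers. This is a minimal sketch assuming proposal numbers are unique integers; the class and method names are illustrative, not from any library.

```python
class Acceptor:
    def __init__(self):
        self.max_n = -1        # highest prepare number promised so far
        self.accepted = None   # (n, value) of the last accepted proposal

    def on_prepare(self, n):
        # Promise only if n beats every proposal number seen so far,
        # and report any previously accepted value back to the proposer.
        if n > self.max_n:
            self.max_n = n
            return ("promise", self.accepted)
        return ("reject", None)

    def on_accept(self, n, value):
        # Accept only if no higher-numbered prepare arrived in between.
        if n >= self.max_n:
            self.max_n = n
            self.accepted = (n, value)
            return "accepted"
        return "rejected"
```

Returning the previously accepted value in the promise is what forces a later proposer to adopt an already-chosen value instead of overwriting it.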

Paxos live lock problem

Proposer1 passes the prepare phase with id1; Proposer2 then passes the prepare phase with id2 > id1. Proposer1's accept request is rejected because id1 < maxN, so it re-enters the prepare phase with id3 > id2; this in turn gets Proposer2's accept rejected, so it re-enters prepare with id4, and the two proposers preempt each other indefinitely.

The solution

  1. Delay: after an acceptor updates its maxN, it waits a short while before accepting a new prepare request rather than switching immediately
  2. Elect a primary proposer, so that only one node issues prepare requests
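The delay idea can be sketched from the proposer's side: after a failed prepare, outbid the highest number seen and back off for a random interval so two competing proposers stop invalidating each other in lockstep. The function name and the 50 ms jitter bound are assumptions for illustration.

```python
import random

def next_attempt(my_n, observed_max_n, rng=random):
    """Pick the next proposal number and a randomized backoff delay."""
    n = max(my_n, observed_max_n) + 1   # outbid the highest number seen
    delay = rng.uniform(0.0, 0.05)      # random jitter breaks the lockstep
    return n, delay
```

Randomized backoff only makes the livelock improbable; electing a single primary proposer (solution 2) removes the competition entirely, which is the route Multi-Paxos-style systems take.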

ZAB protocol (Fast Paxos algorithm)

Three types of roles:

  • Leader: handles write requests and is responsible for initiating votes
  • Follower: handles read requests and forwards write requests to the Leader; participates in voting
  • Observer: a non-voting Follower. When load is high, adding more Followers would increase the communication pressure between the Followers and the Leader, so Observers are added instead to scale reads
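The division of labour among the three roles can be sketched as a routing rule. The role names come from the text above; the routing function itself is an illustrative simplification, not ZooKeeper's API.

```python
def handle(role, request):
    """How each ZAB role reacts to a read or write request (simplified)."""
    if request == "read":
        return "serve locally"             # all three roles serve reads
    if role == "leader":
        return "propose and start a vote"  # the Leader drives the write quorum
    if role == "follower":
        return "forward to leader"         # relays the write, then votes on it
    if role == "observer":
        return "forward to leader"         # relays the write but never votes
    raise ValueError(f"unknown role: {role}")
```

The point of the Observer role is visible here: it behaves like a Follower for clients, but since it never votes, adding Observers scales read capacity without widening the voting quorum.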

Raft algorithm

Followers wait for the Leader to send heartbeat messages.

Zookeeper election algorithm

  1. Update the logical clock
  2. Cast the initial vote (vote for itself)
  3. Send the initial vote
  4. Receive external votes
  5. Compare the epoch of each external vote
A: if the external vote's epoch is larger than the local one, update the logical clock, re-initialize the local vote, then PK and re-cast the vote.
B: if the external vote's epoch is smaller than the local one, discard it directly.
C: if the epochs are equal, PK by comparing the round number, ZXID and SID, and change the vote according to the PK result. After the PK, all votes are archived into the receive set (recvset) for counting; once one candidate holds more than half of the votes, the election can terminate.
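The "PK" between two votes reduces to a lexicographic comparison. This is a sketch under the assumption that a vote is modelled as an (epoch, zxid, sid) tuple, in that priority order; the tuple layout is an illustration, not ZooKeeper's wire format.

```python
def better_vote(mine, theirs):
    """Return the winning vote: compare epoch, then ZXID, then SID.

    Python tuple comparison is already lexicographic, so max() applies
    the three criteria in exactly the priority order described above.
    """
    return max(mine, theirs)
```

A node that loses the PK switches its vote to the winner and rebroadcasts, which is why the cluster converges on the peer with the newest data (highest ZXID) within an epoch.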

Raft algorithm

The election process, at startup and after a failure

  1. At startup, each node waits for its (randomized) timeout to expire and then sends vote requests to the other nodes.
  2. A node that receives more than half of the votes becomes the Leader and continuously sends heartbeats to the other nodes to check that they are alive. Each follower also maintains a timer; if it receives no heartbeat from the Leader within the timeout, a new election starts

Master slave replication process

  1. The Leader first appends the transaction to its own raft log, then replicates it to the followers via AppendEntries; once more than half of the cluster has responded, the Leader commits the transaction and instructs the followers to commit through subsequent AppendEntries.
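The commit rule above reduces to a majority count in which the Leader's own local append counts as one vote. A minimal sketch, with assumed function and parameter names:

```python
def leader_replicate(cluster_size, follower_acks):
    """Return True once the entry may be committed (strict majority)."""
    votes = 1 + follower_acks          # the Leader counts its own log append
    return votes > cluster_size // 2   # e.g. 3 of 5, or 2 of 3
```

This is why a 5-node cluster tolerates 2 failures: 2 follower acks plus the Leader still form a majority of 3.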

Primary/secondary replication in a network partition

A network partition failure occurred

Leader2 has the larger Term. When the network partition heals, Leader1 is automatically demoted to a follower, deletes the uncommitted entries in its raft log, and synchronizes from Leader2.
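The step-down rule can be sketched as a Term comparison. This is a simplified illustration (names and the list-based log are assumptions): a leader that meets a higher Term demotes itself and keeps only the entries it had committed, discarding the ones never replicated to a majority.

```python
def on_reconnect(my_term, my_log, peer_term, commit_index):
    """Return (new_role, log) for a leader that hears a higher Term."""
    if peer_term > my_term:
        # Drop entries never replicated to a majority; keep committed ones.
        return "follower", my_log[:commit_index]
    return "leader", my_log
```

Discarding only the uncommitted suffix is safe because, by the majority-commit rule, no client was ever acknowledged for those entries.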