Mike Burrows, author of Google Chubby, has said that there is only one consistency algorithm in the world and that is Paxos, and the rest of them are inferior.

Paxos algorithm has a history of nearly 30 years, and is currently recognized as one of the most effective algorithms to solve the problem of consistency in distributed scenarios. However, it is difficult to understand and engineering. This paper hopes to explain the Paxos algorithm clearly with legends and easy to understand examples.

The value of Paxos algorithms


In a distributed system, in the process of asynchronous communication, there will always be network fluctuations, machine downtime and other situations, so how to quickly and safely reach an agreement on a certain value in such a complex situation? Paxos algorithm is mainly to solve such problems, in the cloth lock, master-slave replication, naming service, distributed coordination and other common scenarios, Paxos algorithm has a wide range of applications.

The basic concept


Participate in the role

In the Paxos algorithm, all participants are divided into the following roles

role Division of labor Participation in decision making
Proposer Make a proposal, proposal :[Id, proposed Value] Square root
Acceptor Accept a proposal, approve/reject a proposal, and be Chosen when approved by a majority of Quorum acceptors Square root
Learner Learn the latest selected proposal x
  • Proposal: The proposal is composed of numbers and values. Paxos algorithm requires us to ensure that the number Id of the proposal is globally unique and orderly (there are many specific implementations, which are beyond the scope of this article).
  • Quorum: literally, Quorum, in Paxos means that any two Quorum systems contain at least one public member, which can be understood as containing the majority of the members of the Acceptor set. If there are 2F+ 1-bit acceptors, the Quorum number is F+ 1-bit.
  • A Proposer, Acceptor, or Learner is only a division of roles. In a concrete implementation, a process may perform more than one role.

Necessary conditions for Paxos algorithm to be correct

Now the participants of the algorithm are divided into such three roles, so what kind of work goal is to let them complete?

A distributed algorithm has two most important attributes: activity and security

  • Activity means “what is expected will happen in the end”, and consistency is activity in the end.
  • Security means that if security rules are violated, the system will lose.

We can examine the correctness of Paxos algorithm from these two aspects

Activity:

One proposal is guaranteed to be selected, and when the proposal is selected, the process eventually gets the proposal as well.

Security:

  • A value can only be approved if it is submitted by a proposer.
  • In an execution instance of the Paxos algorithm, only one value is chosen.
  • Learners can only acquire values chosen.

So let’s look at the specific algorithm flow

Algorithm process


The algorithm description is from The Distributed Consistency Principle and Practice from Paxos to ZooKeeper by Ni Chao.

Proposal presentation and approval
  • Phase one

    1. Proposer selects a proposal number N and sends a Prepare request numbered N to more than half of the acceptors.
    2. If an Acceptor receives a Prepare request numbered N with N greater than the number of Prepare requests it has responded to, it sends back to the Proposer the highest-numbered proposal, if any, it has accepted. The Acceptor also promises not to accept any proposal numbered less than N.
  • Phase two

    1. If the Proposer receives a response to a Prepare request numbered N from more than half of its acceptors, it sends an Accept request to the proposal numbered N to more than half of the acceptors. Note: V is the value of the highest-numbered proposal in the response received. If the response does not contain any proposals, then V is decided by the Proposer itself.

    2. If an Acceptor receives an Accept request for a proposal numbered N, it accepts the proposal as long as the Acceptor has not responded to a Prepare request numbered greater than N.

Release of proposal
  1. Acceptors need to send acceptors to a subset of learners, who then notify all learners.
  2. However, due to the uncertainty of message transmission, there may be no message that is approved by the resolution. When the learners need to know how a decision has been passed, they can ask a proposer to redo the proposal. Note that a learner may also be a proposer.

The dry algorithm description can be difficult to understand, so I borrowed a very concise diagram from Paxos, which illustrates the distributed consistency protocol.

An Acceptor only needs to store the maximum number (MaxN), maximum number (AcceptN), and Value (AcceptV) of the approved proposal, and then approve the proposal according to phase 1 (2) and Phase 2 (2) rules. That is to ensure the security of approval.

A Proposer requires a unique and monotonically increasing number of proposals at phase 1, and a phase 2 proposal with a sufficient number of guarantees (that is, a majority of acceptors with a Proposer) to secure the proposal.

So why can this satisfy the security of the distributed algorithm described above? This is from the derivation of the Paxos algorithm. The full derivation can be found at Wikipedia.

Let me talk about my understanding. There are several important constraints in the derivation:

P1: An acceptor must accept a proposal when it is first received.

P1a: An acceptor accepts a proposal numbered N if and only if it has not responded to a prepare request numbered greater than N.

P2: Once a proposal with value V is approved (chosen), subsequent proposals approved (chosen) must have value V.

P2a: Once a proposal with value V is chosen, any subsequent proposal accepted by acceptors must have value V.

P2b: Once a proposal with value V is chosen, then any subsequent proposals with a proposer must have value V.

P2c: If a proposal numbered N has value V, then there is a majority, and either none of them have accepted any proposal numbered less than n, or they have accepted all proposals numbered less than n with the largest proposal having value V.

The relationship between them can be illustrated in the following figure

When Acceptor accepts only one proposal, it is possible to accept only one Value depending on P1. However, in this case, it is possible to repeat the voting process several times to achieve a consistent state, which guarantees security at the expense of some activity. As shown below:

(a) if a Proposer does not approve a proposal with a numbered number of acceptors in the same machine, but if it receives a proposal with a numbered number of acceptors in other machine houses, it is difficult for it to receive a proposal with a numbered number of acceptors in other machine houses A Proposer at this time only generates a numbered proposal with the hope that it will be approved by a majority of acceptors (two). A lucky Dog may end up with a majority of acceptors in the future, but we are already waiting for it.

So in order to be able to quickly reach consistency, and introduced the P2c and P1a, lifted the Acceptor in the P1a restrictions can only approved a proposal, but increased the number of restrictions on approval of proposals, increased the proposals are put forward on the Proposer in P2 Value Value, the two limits of direct effect, there are two:

  1. No Proposer needs to respond to an Acceptor with a prepare request (Proposer) before it formally submits a proposal with a majority of acceptors to ensure that no proposals have already been accepted. If a Prepare request has been approved, then it represents a Proposer that has been informed of the decision in the Paxos execution instance. In addition, replace the Value of the proposal with that of the approved proposal to ensure security.
  2. If an Acceptor accepts more than one proposal, it guarantees that each proposal it approves has the same Value. Thus ensuring security.

This may be A little confusing, but let’s draw A diagram of the above example and see the process for selecting proposal A and proposal B.

  • P represents promises with acceptors to proposals
  • A represents acceptors’ acceptance of proposals with respect to proposals numbered numbered with no proposals
  • PE stands for guaranteed failure, as shown in Figure 1
  • AE stands for approval failure, as shown in Figure 1
  • The proposal number consists of the timestamp and machine Id. For example, 1.2 indicates the proposal made by MACHINE Id2 when the timestamp is 1.
  • The numbers to the right of the letters represent proposal numbers. For example, P1.1 represents Acceptor promises for proposals numbered 1.1
  • For example, P1.1[1.2:A] represents the Acceptor’s Promise for proposal No. 1.1 and responds, “I have approved the proposal No. 1.2 with the value A.”

As shown in Figure 4, the resulting proposal has A value of A.

As shown in Figure 5, the resulting proposal has a value of B.

Stop and think about it for a moment. Strictly speaking, the sacrificing-activity problem described above does not solve the problem, but only reduces the probability that it will occur, and in extreme cases a state similar to “live lock” can occur. As shown in the figure below

In extreme cases, the cycle goes on and on. So in order to solve this problem, and puts forward the Multi – Paxos algorithm, Multi – Paxos algorithm in here don’t do statements, it is in the Proposer and make the concept of a Leader, at an early stage, all the Proposer in the election of a Leader, Then only the Leader can submit a proposal to an Acceptor. If the Leader has a problem, a Leader is elected. With no competition with a Proposer, the phase changes to one, increasing efficiency and eliminating the problem of live locks. However, if you think about it carefully, there may be life locks in the process of running for Leader, which IS the real reason why Raft algorithm was proposed (dog head). After all, after a long circle, a single point Leader is finally created to manage. It is better to select the Leader using the P1+ retry mechanism above. Efficiency is usually about the same, but it is slower in the election of the Leader.

conclusion


This paper analyzes the application value and specific realization principle of Paxos algorithm, hoping to let everyone in the process of learning Paxos algorithm can lose a little hair. I will probably update my understanding of the Raft algorithm in the future.


Welcome to my blog: The Code memo of Wang Lao mo