Basic concept

Message Model

RcocketMQ consists of three parts: Producer, Broker and Consumer.

Being a Producer of messages Consumer C. Broker: Stores messages. One of these Broker services corresponds to one server in production. Each Broker can store messages for multiple topics, and topics can be fragmented among different brokers. Message Queues are used to store physical addresses of messages. Each Message address is stored in multiple Message queues. ConsumerGroup consists of several Consumer services.

Producer

The Producer is responsible for producing messages, which are typically produced by the business system and sent to the Broker server. Producer supports multiple message sending modes: synchronous sending (blocking), asynchronous sending, sequential sending, and single sending. Both synchronous and asynchronous require the Broker to return confirmation.

Producer group: REFERS to a set of producers in the same category. Such producers send messages logically. If it is a transactional message, the service crashes after sending the message, and the Broker notifies the Producer service to submit or rollback the consuming message.

Consumer

The Consumer is responsible for consuming messages, and typically the backend system is responsible for asynchronous consumption. A message consumer pulls messages from the Broker and provides them to the application. From an application perspective, a Consumer provides two forms of consumption: pull consumption and push consumption.

Pull mode: The application invokes the Consumer’s pull method to pull messages from the Broker server, with the application controlling the initiative. Once the message is retrieved, the application initiates the consumption process;

Push mode: The Broker will actively push the data to the consumer after receiving it. This consumption mode has high real-time performance.

Consumer group: A collection of consumers of the same type, usually consuming the same type of message and consuming logically. This makes it easy to achieve the goal of load balancing and fault tolerance in consuming messages. The premise is that the consumer group must subscribe to the same Topic.

RocketMQ supports two message modes:

Clustered consumption: Each Consumer instance of the same Consumer Group allocates messages equally;

Broadcast consumption: Each Consumer instance of the same Consumer Group receives the full number of messages;

Topic Topic

Represents a collection of a class of messages. Each Topic contains several messages, and each message can belong to only one Topic, which is RocketQM’s basic unit for message subscription.

Data from the same Topic is fragmented to different brokers, and the unit of each fragment is called a MessageQueue. A MessageQueue is the smallest unit where a producer sends a message and a consumer consumes a message.

Proxy server

A message transfer role that stores and forwards messages. The proxy server also stores message-related metadata, including consumer groups, consumption progress offsets, and topic message queues.

Broker Server is the business core of RocketMQ and contains several important sub-modules:

Remoting Module: The entity of the entire Broker that handles requests from clients;

Client Manager: Responsible for managing clients (Producer/Consumer) and maintaining Topic subscription information of consumers;

Store Service: provides encapsulated APIS to Store messages to physical disks and query messages.

HA Service: A high availability Service that provides data synchronization between Master brokers and slaves.

Index Service: Indexes messages delivered to the Broker based on a specific Message Key to provide quick lookup of messages.

RocketMQ’s high availability model has two architectures:

Average cluster

In this mode, each node is assigned a fixed role. The master responds to requests from the client and stores messages, while the slave synchronizes and saves data from the master and reads requests from the client.

Messages can be synchronized synchronously or asynchronously.

The salient disadvantage of this situation is that when the master dies, the roles cannot be switched, meaning that the set of brokers are not available.

Dledger High availability cluster

Dledger is a high availability clustering technology introduced in Version 4.5 of RocketMQ. In this mode, when the master fails, the cluster randomly elects a node as the master.

To do this, Dledger does the following:

1. Take over the CommitLog message store of the Broker.

2. The master node is elected from the cluster.

3. Messages are synchronized from the master node to the slave node.

Dledger’s election algorithm

Dledger’s core is also its master election. Raft algorithm is used for election. The process is roughly as follows:

In Dledger mode, each node has three states: Leader, follower, and candidate.

normal

A cluster has only one Leader and the rest are followers. Followers only respond to requests from the Leader and Candidate. All requests from the client are handled by the Leader. Even if a client requests a follower, the request will be sent to the Leader for processing.

Start the analysis

At the beginning of the cluster, all nodes are in the follower state. Then a timeout signal will be sent inside the cluster. After receiving the signal, all followers will change from follower to Candidate state and start to collect votes. There will be a new election. After the Leader is identified, the Leader node sends heartbeat signals to other nodes to confirm its Leader status.

After the Leader sends a heartbeat signal, the cluster starts a timer. If the followers do not receive the heartbeat signal from the Leader within a specified time, they become candidates and request other members to vote. If they receive more than half of the votes from the Leader, The Candidate becomes the new Leader. Followers.

Specifying time: In Raft protocol, time is broken up into arbitrary time segments called terms. Term is identified by a globally unique, continuously increasing number that acts as a logical clock.

In each term, a new election will be held, which means that the followers do not receive heartbeat confirmation from the leader before the arrival of one term, even if a new leader appears before the arrival of the next term, all nodes will become candidates. Call for a new election. This also ensures that there is only one leader in each term. That is, the leader may also degenerate into followers. In a cluster, the leader is constantly changing.

In some cases, the votes may be divided among nodes, and no majority party can be formed, leading to no leader until the end of the term, so the election will continue to the next term. There is no split brain problem in Zookeeper.

In each election process, each node will store the current term number. When the node communicates with each other, it will bring its own number. If it finds that its number is smaller than another one during the communication, it will update its own number to the larger one. If the leader or Candidate realizes that his or her id is not up to date, he or she automatically converts to follower. If the request term number received is less than its own number, term refuses to execute

The election process

During the election process, Raft protocol initiates the leader election through the heartbeat mechanism. At this time, the node state starts from the follower state. Each node increases its current term and changes its state to Candidate, and then initiates a voting request to other nodes with its own term number. Which means they all vote for themselves by default. The Candidate status may change in one of the following ways:


Win the election and become the leader: If a node receives a majority of votes during a term, it will become the leader for the rest of the term and then send heartbeat signals to establish its position. (Each server can only cast one vote per term, and the vote is on a first-come, first-served basis)


Other nodes become leaders: While waiting for the vote, heartbeat signals from other servers may be received, indicating that another leader has been generated. At this time, by comparing the term number of the leader with that of the RPC, if the term number of the leader is larger than that of the other party, it indicates that the leader’s term has expired, and the LEADER will reject THE RPC and continue to maintain the candidate’s identity. If the number of the other party is greater than or equal to its own number, the status of the other party is recognized and the other party changes from Candidate to follower


The votes are divided and the election is lost: if no Candidate wins a majority of votes, no leader is elected and the candidates wait out the timeout to initiate another election. But to prevent this from happening again in the next election, Raft uses a random election timeout, where nodes randomly hibernate to prevent votes from being continuously split up. By setting timeout to a random value on an interval, there is a high probability that a Candidate will run out first and win the majority of votes.

Dledger copy synchronization

Two-phase commit: Commited, uncommitted;

A brief introduction to the process:

The Dledger on the Leader Broker receives a piece of data, marks it in the uncommitted state, and sends the uncommitted data to the DledgerServer component of the Follower Broker through its DledgerServer component.

After receiving a message, the Follower Broker’s DledgerServer component returns an ACK to the Leader Broker’s Dledger. If the Leader Broker receives half of the acks returned by the Follower brokers, it marks the message committed. This completes the two-stage data synchronization.

Routing center Name Server

The Broker message server starts to register information with all NameserVers. When the Producer sends a message, it first retrieves a list of Broker server addresses from NameServer and then selects a Broker from the list to send according to the load balancing algorithm.

The Name Server keeps a long connection to each Broker Server, checks for the Broker to be alive at 30 seconds interval, and removes it from the routing table if it detects a Broker down. However, the removal does not immediately notify Producer. The purpose of this design is to reduce the complexity of NameServer implementation, and the logical implementation of this part is placed on the message sender to provide a fault tolerance mechanism to ensure the high availability of message sending. (Ensuring that messages are sent is also mentioned earlier)

The high availability of NameServer itself is achieved by deploying multiple Nameservers that do not communicate with each other.

The Message (the Message)

The physical carrier of messages/information, the smallest unit that produces and consumes data, and each message must belong to a Topic. Each message in RocketMQ has a unique ID and carries a key with a business identity. The system supports Message query by Message ID and key.

Message also has a Tag Tag. It is used to distinguish different types of messages under the same topic and realize multi-dimensional expansion of different modules under the same business. Tag tags can be thought of as second-dimension message filtering, and RocketMQ also provides SQL message filtering.

See docs/cn/best_practice.md for details on the specific fields of Message.