Preface

Long, long ago, people communicated by talking face to face. It was simple and reliable, but it also had serious drawbacks. Suppose you are the leader of an army and every subordinate reports to you as soon as something happens. With one subordinate that is fine, but with dozens or hundreds of them talking to you at all hours and wherever you go, you might not catch anything and your head might explode. So you shout: stop, stop! Anyone with something to report, line up at the door and come in one by one. This is called traffic peak clipping: nobody rushes in at once, and everyone obediently waits in line.

Then you listen to them one by one, for a full 24 hours. Aren't you sleepy? You realize this cannot go on, or you may work yourself into an early grave. So you tell them: bring pen and paper, write your report down, and tell Lv Xiucai when you have finished. Then follow Lv Xiucai's instructions and stack the papers neatly, at the position he indicates, along the base of the wall on the right side of the house. This is called asynchronous processing. You can finally control the pace at which you read the reports, and go to bed happily.

And the people who report can write their message on paper, stack it up, and then go off and do what they need to do instead of waiting at the door. That is called decoupling. Peak clipping, asynchronous processing, and decoupling are the three most common use cases for message queues. In the story, the subordinates play the role of message producers, the ground at the base of the wall on the right side of the house is message persistence, Lv Xiucai is the message scheduling center, and you are the message consumer. Where the subordinates should stack their reports, and where you should look for them, depends on Lv Xiucai's amazing memory, which is what lets messages be stored and consumed accurately.

The message dispatch center is the star of the day

In RocketMQ, there is a role similar to Lv Xiucai's, named NameServer, which is in overall control of distributed message scheduling. It is the soul of RocketMQ, and without it RocketMQ would fall apart and fail to work. So, how does it work? Let's start with a map of the RocketMQ house:

Does it look as tangled as a spider web? Don't be afraid. Let's put it another way and forget about the picture for now.

Let's take an analogy from real life. If a person wants to send a parcel to someone else, they check online which post offices exist, choose one of them, and hand the parcel over to it. The post office then delivers the parcel to the recipient.

To complete the whole business process, each post office first registers its own information with the satellite network, and the satellite is responsible for coordinating that information. The sender can then find out which post offices are available to choose from. The recipient can find out through the satellite network which post office holds the parcel and contact that post office to arrange a suitable delivery time. The post office itself is responsible for receiving, storing, and distributing parcels. The RocketMQ schematic diagram looks like this:

Producer: the message producer, which sends messages to the message server; it is the sender in the figure. NameServer: the routing registry; it is the satellite in the figure. Broker: the message storage server; it is the post office in the figure. Consumer: the message consumer; it is the recipient in the figure and is not the focus today.

Thus, NameServer, as the coordinator of distributed message queues, plays the role of information route registration and discovery.

Route registration

Once a post office has been built, it needs to connect to the satellite network and place itself under the satellite's management. This is equivalent to announcing to the public that the post office has opened and can receive and deliver mail. After the post office is connected, how can the satellite stay continuously and promptly aware of whether the post office is online and of any changes to its information, so that it can coordinate the post offices at any time? For this, the post office sends a regular message to the satellite: "Beep beep beep. I am post office C, number SHC, address XXXXX, part of the Shanghai cluster, China, online, and the time is now March 15, 2019, 13:21." On receiving the message, the satellite takes out a small notebook and records: "Post office B, BJB, Beijing, March 15, 2019, 13:10, alive..." "Post office A, BJA, Beijing, March 15, 2019, 13:15, alive..." "Post office C, SHC, Shanghai, March 15, 2019, 13:21, alive...".

The story above describes the basic principle of NameServer route registration. NameServer is the satellite: it maintains a Broker table that stores Broker information dynamically. The Broker is the post office. On startup it traverses the NameServer list, sends a registration request to each NameServer, keeps a long connection open, and then sends a heartbeat packet to NameServer every 30 seconds. The heartbeat packet contains the BrokerId, Broker address, Broker name, the name of the cluster the Broker belongs to, and so on. When NameServer receives a heartbeat packet, it updates a timestamp that records the last time the Broker was known to be alive. While NameServer is processing heartbeat packets, multiple Brokers may be operating on the same Broker table. To prevent unsafe simultaneous modification of the Broker table, the route registration operation uses a ReadWriteLock (read/write lock). This design allows multiple message producers to read the Broker table concurrently, so high concurrency is preserved when messages are sent, while NameServer applies only one Broker heartbeat packet at a time and processes multiple heartbeat packets sequentially. This is the classic use case for read/write locks: many reads and few writes.
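The read/write lock idea above can be sketched in a few lines of Java. This is only an illustration under assumptions, not the NameServer source: the class name `BrokerTableSketch`, the `BrokerInfo` fields, and the choice to key the table by Broker name are all made up for the example. Heartbeats go through the write lock one at a time, while lookups share the read lock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical, simplified Broker table; not the actual NameServer code.
class BrokerTableSketch {
    // Fields mirror what the text says a heartbeat packet carries.
    static final class BrokerInfo {
        final long brokerId;
        final String brokerName;
        final String clusterName;
        final String address;
        final long lastHeartbeatMillis;

        BrokerInfo(long brokerId, String brokerName, String clusterName,
                   String address, long lastHeartbeatMillis) {
            this.brokerId = brokerId;
            this.brokerName = brokerName;
            this.clusterName = clusterName;
            this.address = address;
            this.lastHeartbeatMillis = lastHeartbeatMillis;
        }
    }

    // Assumption: one entry per Broker name, which is enough for the sketch.
    private final Map<String, BrokerInfo> table = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Write path: only one heartbeat is applied at a time.
    void onHeartbeat(BrokerInfo info) {
        lock.writeLock().lock();
        try {
            table.put(info.brokerName, info);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Read path: many callers may hold the read lock at the same time.
    BrokerInfo lookup(String brokerName) {
        lock.readLock().lock();
        try {
            return table.get(brokerName);
        } finally {
            lock.readLock().unlock();
        }
    }

    int size() {
        lock.readLock().lock();
        try {
            return table.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        BrokerTableSketch brokers = new BrokerTableSketch();
        // "addr-c" is a placeholder address, not a real endpoint.
        brokers.onHeartbeat(new BrokerInfo(0L, "SHC", "Shanghai", "addr-c",
                System.currentTimeMillis()));
        System.out.println(brokers.lookup("SHC").clusterName);
    }
}
```

A repeated heartbeat from the same Broker simply overwrites its entry, which is how the "last alive" timestamp moves forward in this sketch.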

Route removal

Suddenly one day, a rat gets into the machine room of post office C and chews through the power cord, and the post office goes down. The satellite does not notice the failure and keeps handing senders (producers) a post office table that still contains post office C. A sender contacts post office C to send a parcel, but with its machine room down, the post office has suspended business and is paralyzed, so naturally it cannot accept the parcel. Meanwhile, the management of post office C is blamed for complaints from customers who have waited a long time for parcels that post office C never collected and therefore never delivered. So the postal service departments met to study the problem: how can the satellite sense that a post office is "lost", automatically remove it, and hand its delivery business over to other working post offices, so that the failure of one post office does not leave part of the business unhandled? After much discussion, they settled on a plan to have the satellite scan the list of post offices at regular intervals. If the difference between the time a post office last reported its information and the time of the scan exceeds a preset threshold, the post office is judged "lost" and its information is removed from the list. In this way, the post office table that senders query contains only post offices that are operating normally. After the new function went live, it worked well. We no longer had to worry about business stalling because of a post office failure, and we went back to a life of drinking tea and reading newspapers.

The same story plays out in RocketMQ. Normally, when a Broker is shut down, its long connection to NameServer is disconnected. Netty's channel-close listener picks up the disconnection event, and the Broker is then removed. For abnormal cases, NameServer has a timed task that scans the Broker table every 10 seconds. If a Broker's latest heartbeat timestamp is more than 120 seconds older than the current time, NameServer also judges that Broker to be invalid and removes it.

A careful observer will notice that NameServer does not actively notify producers after removing a failed Broker. A producer requests the latest routing table from NameServer every 30 seconds, which means it can take up to 30 seconds for a producer to realize that a Broker is down. During those 30 seconds the producer may still send messages to the failed Broker, so how is the high availability of message sending guaranteed? To answer this, start with the producer's load-balancing strategy. By default, the message queue to send to is chosen by polling (round robin), and an exception retry mechanism ensures that sending remains highly available. When a Broker breaks down, the sender cannot perceive it immediately. However, when sending a message to that Broker fails, the retry mechanism makes the producer select a message queue on another Broker, which steers it around the failed Broker. This clever approach to high availability of message sending also reduces the complexity of the NameServer implementation, because NameServer does not need to notify a large and constantly changing set of producers.

Another design choice that reduces the complexity of the NameServer implementation is that NameServer servers are independent of each other. This means the data on different NameServer servers may not be exactly the same at any given moment, but the exception retry mechanism makes this difference harmless.
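The timed scan can be illustrated with a minimal Java sketch. The 10-second scan interval and the 120-second threshold come from the text; the class name `BrokerLivenessSketch`, its methods, and the map keyed by Broker name are assumptions made for this example rather than the NameServer implementation. Timestamps are passed in explicitly so that a single scan pass is easy to follow.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of heartbeat-timeout eviction; not the NameServer source.
class BrokerLivenessSketch {
    static final long SCAN_INTERVAL_MILLIS = 10_000L; // scan period from the text
    static final long EXPIRE_MILLIS = 120_000L;       // timeout from the text

    // Broker name -> timestamp of its latest heartbeat.
    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();

    void onHeartbeat(String brokerName, long nowMillis) {
        lastHeartbeat.put(brokerName, nowMillis);
    }

    boolean isRegistered(String brokerName) {
        return lastHeartbeat.containsKey(brokerName);
    }

    // One scan pass: remove every Broker whose last heartbeat is too old.
    // Returns the names that were removed.
    List<String> scanAndEvict(long nowMillis) {
        List<String> removed = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastHeartbeat.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > EXPIRE_MILLIS) {
                removed.add(entry.getKey());
                it.remove();
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        BrokerLivenessSketch registry = new BrokerLivenessSketch();
        registry.onHeartbeat("SHC", 0L);       // last seen at t = 0 s
        registry.onHeartbeat("BJA", 100_000L); // last seen at t = 100 s
        // At t = 125 s only SHC has been silent for more than 120 s.
        System.out.println(registry.scanAndEvict(125_000L)); // prints [SHC]
    }
}
```

In a running service, something like a `ScheduledExecutorService` would call `scanAndEvict(System.currentTimeMillis())` every `SCAN_INTERVAL_MILLIS`; that wiring is left out to keep the example deterministic.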
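The polling-plus-retry behaviour described above might look roughly like the following Java sketch. It is an illustration under assumptions: `RetrySendSketch`, `MessageQueue`, and the `trySend` callback are hypothetical names, and the real producer has more moving parts. The point is only that a failed attempt causes the next pick to skip queues on the Broker that just failed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Hypothetical sketch of round-robin queue selection with failure avoidance.
class RetrySendSketch {
    static final class MessageQueue {
        final String brokerName;
        final int queueId;

        MessageQueue(String brokerName, int queueId) {
            this.brokerName = brokerName;
            this.queueId = queueId;
        }
    }

    private final List<MessageQueue> queues;
    private final AtomicInteger counter = new AtomicInteger();

    RetrySendSketch(List<MessageQueue> queues) {
        this.queues = queues;
    }

    private MessageQueue next() {
        return queues.get(Math.floorMod(counter.getAndIncrement(), queues.size()));
    }

    // Round-robin pick that skips the Broker that just failed, when possible.
    MessageQueue selectQueue(String lastFailedBroker) {
        for (int i = 0; i < queues.size(); i++) {
            MessageQueue candidate = next();
            if (lastFailedBroker == null || !candidate.brokerName.equals(lastFailedBroker)) {
                return candidate;
            }
        }
        // Every queue lives on the failed Broker: fall back to plain round robin.
        return next();
    }

    // trySend stands in for the network call; it returns true on success.
    // Returns the queue that accepted the message, or null if all attempts failed.
    MessageQueue send(Predicate<MessageQueue> trySend, int maxAttempts) {
        String lastFailedBroker = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            MessageQueue queue = selectQueue(lastFailedBroker);
            if (trySend.test(queue)) {
                return queue;
            }
            lastFailedBroker = queue.brokerName;
        }
        return null;
    }

    public static void main(String[] args) {
        RetrySendSketch producer = new RetrySendSketch(Arrays.asList(
                new MessageQueue("SHC", 0), new MessageQueue("SHC", 1),
                new MessageQueue("BJA", 0), new MessageQueue("BJA", 1)));
        // Pretend SHC is down but still present in the stale routing table.
        MessageQueue used = producer.send(q -> !q.brokerName.equals("SHC"), 3);
        System.out.println(used.brokerName + "-" + used.queueId); // prints BJA-0
    }
}
```

Because the producer recovers on its own within a retry or two, nothing has to push a "Broker is down" notice to it, which is the simplification for NameServer that the paragraph above points out.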

Route discovery

The satellites in the sky are few and fixed, while the senders on earth are many and ever-changing. So when a sender wants to know which post offices are available, the obvious approach is for the sender to send a request to the satellite and pull the post office list, rather than wait for the satellite to deliver it to everyone. Likewise, in RocketMQ, NameServer does not actively push routing information to clients. Instead, each client pulls the latest routing information for its topic.
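A pull-based client can be sketched as a local cache that is refreshed on demand. In the Java sketch below, `RoutePullSketch` and the `nameServerQuery` function are hypothetical stand-ins for a real client and a real NameServer request, and the 30-second constant is the producer pull interval mentioned earlier in this article. The sketch also makes the staleness window visible: the client keeps its old snapshot until the next pull.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of a client that pulls routes instead of receiving pushes.
class RoutePullSketch {
    static final long PULL_INTERVAL_MILLIS = 30_000L; // pull period from the text

    // Stand-in for the request "give me the Brokers for this topic".
    private final Function<String, List<String>> nameServerQuery;
    // Local snapshot: topic -> Broker names as of the last pull.
    private final Map<String, List<String>> localRoutes = new ConcurrentHashMap<>();

    RoutePullSketch(Function<String, List<String>> nameServerQuery) {
        this.nameServerQuery = nameServerQuery;
    }

    // One pull: ask for the topic's route and replace the local snapshot.
    void refresh(String topic) {
        List<String> brokers = nameServerQuery.apply(topic);
        if (brokers != null) {
            localRoutes.put(topic, Collections.unmodifiableList(new ArrayList<>(brokers)));
        }
    }

    // Reads never touch the network; they only see the last snapshot.
    List<String> brokersFor(String topic) {
        return localRoutes.getOrDefault(topic, Collections.emptyList());
    }

    public static void main(String[] args) {
        Map<String, List<String>> server = new ConcurrentHashMap<>();
        server.put("orders", Arrays.asList("BJA", "SHC"));
        RoutePullSketch client = new RoutePullSketch(server::get);

        client.refresh("orders");
        server.put("orders", Arrays.asList("BJA"));        // SHC removed on the server side
        System.out.println(client.brokersFor("orders"));   // still the old snapshot
        client.refresh("orders");                          // the next scheduled pull
        System.out.println(client.brokersFor("orders"));   // now reflects the removal
    }
}
```

In practice a timer would call `refresh` for each topic every `PULL_INTERVAL_MILLIS`; the example calls it by hand so the before and after states are easy to compare.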

CAP theorem

As the registration and discovery center, NameServer is the soul of the whole distributed message queue's scheduling. When it comes to distributed systems, we cannot escape the CAP theorem: C is Consistency, A is Availability, P is Partition Tolerance. In a distributed architecture, network conditions are not controllable and network partitions are inevitable, so the system must tolerate partitions. Therefore, NameServer must choose between AP and CP. Since NameServer servers are independent of each other, it is clearly an AP design.

Why ZooKeeper was replaced

ZooKeeper provides coordination services for distributed applications. So why did RocketMQ reinvent the wheel and develop its own cluster management program? ZooKeeper provides powerful features, including automatic Master election, but RocketMQ's architecture does not need Master election; it only needs a lightweight metadata server. Middleware needs to be very stable, and RocketMQ's NameServer has very little code and is easy to maintain, so there is no need to depend on yet another piece of middleware. This reduces the overall maintenance cost.

What did you learn?

  1. How heartbeats are implemented in a long-connection programming model

  2. The classic use of read/write locks in multithreaded programming

  3. The pursuit of simple, efficient and reliable implementation

Closing words

If you want to study the NameServer source code, follow this link: https://github.com/MrChiu/RocketMQ-Study/tree/release-4.3.2/namesrv I have added comments throughout to make the code easier to read.