Abstract: If you choose RabbitMQ, you need to know how it works. In general, a single-instance deployment in a local or test environment is enough for learning RabbitMQ or verifying a business prototype. For the reliability, concurrency, throughput, and message-stacking capability of the MQ middleware itself, however, production environments generally run RabbitMQ as a cluster. Setting up a message queue product as mature as RabbitMQ is not difficult, and many blog posts already describe how to build a RabbitMQ cluster, but some readers may still not understand the principles behind it, which will prevent them from tuning the cluster further when they run into performance problems. This article mainly introduces how a RabbitMQ cluster works, how to build a load-balanced small-to-medium-scale RabbitMQ cluster, and finally presents a design for a highly available, highly reliable, high-throughput RabbitMQ cluster for production environments.

1. The RabbitMQ cluster solution

RabbitMQ, the message queue middleware product itself, is written in Erlang, which is distributed by nature (nodes in an Erlang cluster authenticate to one another with a shared magic cookie), so RabbitMQ naturally supports clustering. This frees RabbitMQ from having to implement an HA scheme or store cluster metadata through an external coordinator such as ZooKeeper, as ActiveMQ and Kafka do, respectively. Clustering is a way to ensure reliability while scaling horizontally to increase message throughput. Let's first look at the RabbitMQ cluster as a whole:

(Figure: RabbitMQ cluster overview)

The metadata on the three nodes of the RabbitMQ cluster is identical on all of them. The complete data of a queue (the entity that holds messages) exists only on the node where the queue was created; the other nodes hold only the queue's metadata and a pointer to the queue's owner node.

(1) Synchronize RabbitMQ cluster metadata

RabbitMQ clusters always synchronize four kinds of internal metadata (similar to indexes):

a. Queue metadata: the queue name and its properties;
b. Exchange metadata: the exchange name, type, and properties;
c. Binding metadata: a simple table describing how to route messages to queues;
d. Vhost metadata: namespaces and security attributes for the queues, exchanges, and bindings within a vhost.

Therefore, no matter which RabbitMQ node a user connects to, the queue/user/exchange/vhost information queried through rabbitmqctl is the same.
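For example, on a running cluster the following rabbitmqctl queries should return identical results no matter which node they are executed on; this is a quick way to confirm that the metadata is synchronized:

rabbitmqctl list_vhosts
rabbitmqctl list_exchanges
rabbitmqctl list_bindings
rabbitmqctl list_queues name durable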

(2) Why RabbitMQ clusters synchronize only metadata

I'm sure many of you will ask: to implement an HA scheme, shouldn't the complete data of all queues in the RabbitMQ cluster be stored on every node? That way, if any node failed or became unavailable, clients could still publish and subscribe as long as they could connect to another node. I think the authors of RabbitMQ designed it this way mainly for the cluster's storage and performance. First, storage space: if every cluster node held a complete copy of every queue, each node's storage footprint would be very large, and the cluster's message-backlog capacity would be very weak (it could not be improved by adding cluster nodes). Second, performance: every published message would have to be replicated to every cluster node, and for persistent messages the network and disk overhead of synchronous replication grows significantly.

(3) The basic principle of RabbitMQ cluster to send/subscribe messages

The RabbitMQ cluster works as follows:

(Figure: how the RabbitMQ cluster sends and subscribes to messages)

Scenario 1. The client directly connects to the node where the queue resides

If a message producer or message consumer connects to node 1 through an AMQP client and publishes or subscribes there, then sending and receiving messages in the cluster involves only node 1, and there is no problem. But what if the client connects to node 2 or node 3 (where the data of queue 1 is not stored)?

Scenario 2. The client connects to a node that does not hold the queue's data

If the message producer connects to node 2 or node 3, where the complete data of queue 1 does not reside, these two nodes mainly play a routing-and-forwarding role in the sending process: based on their metadata (namely, the pointer to the queue's owner node mentioned above), they forward the message to node 1, where it is finally stored in queue 1. Similarly, if the message consumer connects to node 2 or node 3, those nodes also act as routing nodes and pull messages from queue 1 on node 1 for consumption.
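One way to observe this, assuming the RabbitMQ management plugin (and its rabbitmqadmin tool) is enabled and using the node hostnames from the setup section below, is to declare a queue on one node, publish through a second, and fetch through a third; the message arrives either way because the non-owner nodes forward to the owner:

# declare queue1 on node 1, which becomes its owner node
rabbitmqadmin -H rmq-broker-test-1 declare queue name=queue1 durable=true
# publish through node 2; the message is routed and forwarded to node 1
rabbitmqadmin -H rmq-broker-test-2 publish exchange=amq.default routing_key=queue1 payload="hello"
# fetch through node 3; the message is pulled from queue1 on node 1
rabbitmqadmin -H rmq-broker-test-3 get queue=queue1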

2. RabbitMQ cluster setup

(1) Components to be installed to set up the RabbitMQ cluster

To set up a RabbitMQ cluster, install the following component packages on each VM:

a. JDK 1.8;
b. The Erlang runtime environment, here otp_src_19.3.tar.gz (200 MB+);
c. The RabbitMQ server component, here rabbitmq-server-generic-unix-3.6.10.tar.gz.

The installation steps for these components are not covered in detail here; readers who need them can refer to existing guides to complete the installation.
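For reference, a condensed sketch of a generic-unix installation; the install prefixes here are assumptions and should be adapted to your environment:

# build and install the Erlang runtime from source
tar -xzf otp_src_19.3.tar.gz && cd otp_src_19.3
./configure --prefix=/usr/local/erlang && make && make install
export PATH=/usr/local/erlang/bin:$PATH

# unpack the generic-unix RabbitMQ server and put its sbin directory on the PATH
tar -xzf rabbitmq-server-generic-unix-3.6.10.tar.gz -C /usr/local
export PATH=/usr/local/rabbitmq_server-3.6.10/sbin:$PATH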

(2) Set up a RabbitMQ cluster consisting of 10 nodes

This section focuses on the cluster setup itself. Make sure the three components above are correctly installed on every machine and that the RabbitMQ instance on each VM can start properly.

a. Edit the Erlang cookie so that all nodes use the same value, scp-ing the cookie from one machine to all the other nodes (a sync sketch follows the hosts example below). The default cookie path is /var/lib/rabbitmq/.erlang.cookie or $HOME/.erlang.cookie; nodes use this cookie to decide whether they are allowed to communicate with each other.

b. Configure the hosts file on each node (vim /etc/hosts):

xxx.xxx.xxx.xxx rmq-broker-test-1
xxx.xxx.xxx.xxx rmq-broker-test-2
xxx.xxx.xxx.xxx rmq-broker-test-3
......
xxx.xxx.xxx.xxx rmq-broker-test-10
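A minimal sketch of step a, assuming the default cookie path and the hostnames above (repeat for each remaining node); note that the cookie file must be readable only by its owner, otherwise the node refuses to start:

scp /var/lib/rabbitmq/.erlang.cookie root@rmq-broker-test-2:/var/lib/rabbitmq/.erlang.cookie
ssh root@rmq-broker-test-2 "chmod 400 /var/lib/rabbitmq/.erlang.cookie"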

c. Start the RabbitMQ service one node at a time:

rabbitmq-server -detached

d. Check the running status of each node and of the cluster:

rabbitmqctl status
rabbitmqctl cluster_status

e. With rmq-broker-test-1 as the primary node, run the following on rmq-broker-test-2:

rabbitmqctl stop_app 
rabbitmqctl reset 
rabbitmqctl join_cluster rabbit@rmq-broker-test-1
rabbitmqctl start_app 

The steps on the remaining nodes are the same as on the rmq-broker-test-2 virtual machine.

f. A RabbitMQ cluster has only two types of nodes: RAM nodes and disk nodes. A single-node system runs a disk node only; in a cluster, you can choose to configure some nodes as RAM nodes. A RAM node keeps the metadata of all queues, exchanges, bindings, users, permissions, and vhosts in memory, while a disk node stores this information on disk, so RAM nodes offer higher performance. To keep the cluster highly available, make sure it contains at least two disk nodes, so that the cluster can still provide access services when one disk node crashes. In the operations above, a newly added node can be made a RAM node, or its type changed later, as follows:

[root@mq-testvm1 ~]# rabbitmqctl join_cluster rabbit@rmq-broker-test-1 --ram
# Change the type of an existing node as follows:
[root@mq-testvm1 ~]# rabbitmqctl change_cluster_node_type disc|ram
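Note that a node's type can only be changed while its Rabbit application is stopped; a minimal sequence, run on the node being converted:

rabbitmqctl stop_app
rabbitmqctl change_cluster_node_type ram
rabbitmqctl start_app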

g. Run rabbitmqctl cluster_status to check the cluster status. For the 10 nodes here (3 disk nodes and 7 RAM nodes), the status of the RabbitMQ cluster looks like this:

Cluster status of node 'rabbit@rmq-broker-test-1'
[{nodes,[{disc,['rabbit@rmq-broker-test-1','rabbit@rmq-broker-test-2',
                'rabbit@rmq-broker-test-3']},
         {ram,['rabbit@rmq-broker-test-9','rabbit@rmq-broker-test-8',
               'rabbit@rmq-broker-test-7','rabbit@rmq-broker-test-6',
               'rabbit@rmq-broker-test-5','rabbit@rmq-broker-test-4',
               'rabbit@rmq-broker-test-10']}]},
 {running_nodes,['rabbit@rmq-broker-test-10','rabbit@rmq-broker-test-5',
                 'rabbit@rmq-broker-test-9','rabbit@rmq-broker-test-2',
                 'rabbit@rmq-broker-test-8','rabbit@rmq-broker-test-7',
                 'rabbit@rmq-broker-test-6','rabbit@rmq-broker-test-3',
                 'rabbit@rmq-broker-test-4','rabbit@rmq-broker-test-1']},
 {cluster_name,<<"rabbit@mq-testvm1">>},
 {partitions,[]},
 {alarms,[{'rabbit@rmq-broker-test-10',[]},
          {'rabbit@rmq-broker-test-5',[]},
          {'rabbit@rmq-broker-test-9',[]},
          {'rabbit@rmq-broker-test-2',[]},
          {'rabbit@rmq-broker-test-8',[]},
          {'rabbit@rmq-broker-test-7',[]},
          {'rabbit@rmq-broker-test-6',[]},
          {'rabbit@rmq-broker-test-3',[]},
          {'rabbit@rmq-broker-test-4',[]},
          {'rabbit@rmq-broker-test-1',[]}]}]

(3) Configure HAProxy

HAProxy provides high availability, load balancing, and proxying for TCP- and HTTP-based applications, and supports virtual hosting; it is a free, fast, and reliable solution. According to official figures, it supports concurrency up to the 10G level. HAProxy can proxy from Layer 4 (TCP) through Layer 7 (HTTP), covering all TCP-based protocols; it can even load-balance MySQL. HAProxy is therefore a good choice for soft load balancing in front of a RabbitMQ cluster. How to install HAProxy has been covered in many earlier articles and is not repeated here; readers who need it can refer to existing guides. This section focuses on the configuration after the HAProxy component is installed. HAProxy uses a single configuration file to define all properties, from the front-end IPs to the back-end servers. The configuration below load-balances seven nodes of the RabbitMQ cluster (the other three disk nodes store the configuration and metadata of the cluster and take no load), and HAProxy itself runs on a separate machine. The HAProxy configuration is as follows:

global
    # log output
    log 127.0.0.1 local0 info
    # maximum number of connections
    maxconn 4096
    # change the current working directory
    chroot /apps/svr/haproxy
    # run the haproxy process with the specified UID/GID
    uid 99
    gid 99
    # pid file of the haproxy process
    pidfile /apps/svr/haproxy/haproxy.pid

defaults
    # apply the global logging configuration
    log global
    # default mode: mode {tcp|http|health}; tcp is layer 4, http is layer 7,
    # health only returns OK
    mode tcp
    # tcp log format
    option tcplog
    # do not log health-check connections
    option dontlognull
    # consider a server unavailable after three failed attempts
    retries 3
    # maximum number of connections per process
    maxconn 2000
    # connection / client / server timeouts
    timeout connect 5s
    timeout client 120s
    timeout server 120s

listen rabbitmq_cluster
    bind 0.0.0.0:5672
    # weighted round-robin load balancing
    balance roundrobin
    # ip1 to ip7 are the IP addresses of the RabbitMQ cluster nodes
    server rmq_node1 ip1:5672 check inter 5000 rise 2 fall 3 weight 1
    server rmq_node2 ip2:5672 check inter 5000 rise 2 fall 3 weight 1
    server rmq_node3 ip3:5672 check inter 5000 rise 2 fall 3 weight 1
    server rmq_node4 ip4:5672 check inter 5000 rise 2 fall 3 weight 1
    server rmq_node5 ip5:5672 check inter 5000 rise 2 fall 3 weight 1
    server rmq_node6 ip6:5672 check inter 5000 rise 2 fall 3 weight 1
    server rmq_node7 ip7:5672 check inter 5000 rise 2 fall 3 weight 1

# haproxy monitoring page
listen monitor
    bind 0.0.0.0:8100
    mode http
    option httplog
    stats enable
    stats uri /stats
    stats refresh 5s

In the configuration above, "listen rabbitmq_cluster" with "bind 0.0.0.0:5672" defines the front-end IP address and port that clients connect to, and the load-balancing algorithm configured here is roundrobin, i.e. weighted round-robin. A line such as "server rmq_node1 ip1:5672 check inter 5000 rise 2 fall 3 weight 1" identifies and defines a back-end RabbitMQ service. Its main parts mean the following:
(a) server: defines the identifier of the RabbitMQ service inside HAProxy;
(b) ip1:5672: the service address of the back-end RabbitMQ node;
(c) check inter: the interval, in milliseconds, at which the RabbitMQ service is checked for availability;
(d) rise: how many consecutive successful health checks are required before a failed RabbitMQ service is confirmed as available again;
(e) fall: how many consecutive failed health checks are required before HAProxy stops using the RabbitMQ service.

Start HAProxy with the configuration file:

[root@mq-testvm12 conf]# haproxy -f haproxy.cfg
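It can also be useful to validate the configuration before starting; the -c flag runs HAProxy in check mode, which parses the file and exits:

[root@mq-testvm12 conf]# haproxy -c -f haproxy.cfg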

After starting, you can see the HAProxy monitoring page, as shown below:

(Figure: HAProxy monitoring page for the RabbitMQ cluster)

(4) RabbitMQ cluster architecture design

With the 10-node RabbitMQ cluster and HAProxy soft load balancing configured above, a small-to-medium-scale RabbitMQ cluster is in place. To use it in a real production environment, however, you also need to monitor the performance metrics of each instance in the cluster according to the actual business requirements. Given its performance, throughput, and message-stacking capability, Kafka can serve as the queue for the monitoring data of the RabbitMQ cluster. A small-to-medium-scale RabbitMQ cluster architecture is therefore presented here:

(Figure: architecture design for a small RabbitMQ cluster, with the monitoring part attached)

Message producers and consumers go through HAProxy's soft load balancing, which distributes their requests across nodes 1 to 7 of the RabbitMQ cluster, while nodes 8 to 10 serve as disk nodes that store the cluster metadata and configuration information. The monitoring part is not covered here for reasons of space; how to use the RabbitMQ HTTP API for monitoring statistics will be explained in more detail in a later article.
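As a small preview of that HTTP API, assuming the management plugin is enabled on its default port 15672 with the default guest credentials (replace these in production), a single call already returns cluster-wide message rates and object counts:

curl -u guest:guest http://rmq-broker-test-1:15672/api/overview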

3. Summary

This article mainly introduced how a RabbitMQ cluster works and how to set up a load-balanced small-to-medium-scale RabbitMQ cluster, and finally presented a RabbitMQ cluster architecture design diagram. Given the author's limited knowledge, some points in this article may not be explained adequately; if anything is unclear or debatable, please leave a comment so we can discuss it together.