RocketMQ 5.0 is a cloud-native hyper-converged platform for messaging, events, and streaming. This talk has three parts. First, we review the history of RocketMQ, the role of RocketMQ 4 as the preferred platform for business messaging, and the evolution of the 4.x releases. Second, we cover what is new in RocketMQ 5.0. Finally, we present the RocketMQ 5.0 roadmap so that the community can take part in contributing to RocketMQ 5.0.

This article is adapted from the RocketMQ X EventMesh OpenDay Talk

The author | financial sense

Today’s topic is RocketMQ 5.0, a cloud-native hyper-converged platform for messaging, events, and streaming. The talk is divided into three parts:

First, I’ll take you through the history of RocketMQ, why RocketMQ 4 is the preferred choice for business messaging, and the evolution of the 4.x releases.

Second, we’ll cover RocketMQ 5.0 developments and some of the new features.

Finally, the RocketMQ 5.0 roadmap will be presented so that the community can participate in contributing to RocketMQ 5.0.

RocketMQ history

RocketMQ has gone through four generations of architectures since its inception, evolving with enterprise IT architectures from the SOA era to the microservices era to today’s cloud native era. RocketMQ was born out of Alibaba, which had some self-developed messaging engines in its early days, such as Notify on Taobao and Napoli on B2B. However, both Napoli and Notify are based on relational databases and bring some risks.

So in 2011, Alibaba developed MetaQ, which uses the file system for storage. After much exploration and the rewrite that produced MetaQ 2.0, the first generation of RocketMQ was born and named RocketMQ 3.0. Alibaba made RocketMQ open source in 2013 and donated it to the Apache community in 2016. In 2017, RocketMQ graduated from the Apache Incubator and became an Apache top-level project.

After entering the Apache Foundation, RocketMQ 4.x grew rapidly, with numerous releases and significant advances in multi-replica capabilities, message types, and message governance. At the same time, the community ecosystem is thriving, with nearly 500 contributors worldwide.

With the advent of cloud native and the rise of real-time computing, RocketMQ is getting a full upgrade with the release of RocketMQ 5.0. Together with the community, we define RocketMQ 5.0 as a cloud-native hyper-converged platform for messaging, events, and streaming.

A review of RocketMQ 4

Looking back at RocketMQ 4, we have always emphasized that RocketMQ is the first choice for business messaging. Many companies use RocketMQ for their core transaction links, and many even run two messaging systems: Kafka for data analysis and RocketMQ for business messaging.

Here are a few reasons why RocketMQ has become the consensus choice for so many businesses:

First, RocketMQ is a highly reliable, financial-grade product. Compared with other messaging middleware, it has been proven at very large scale. Almost all of Alibaba’s messaging links are built on RocketMQ, including the core transaction links. For example, RocketMQ has carried trillions of messages on Singles’ Day, and the messaging service on Alibaba Cloud serves tens of thousands of enterprises, many of them large enterprises with strict SLA requirements. This production experience, together with a large number of real customer scenarios, has been crucial to the stability of the messaging system.

Second, RocketMQ has a minimalist architecture and is easy to maintain. A cluster consists of NameServers, which handle route discovery, and Brokers, which store the actual data. As the architecture diagram shows, each Broker group is a master/slave pair in which the master replicates data to the slave, either synchronously or asynchronously. This deployment pattern ensures high availability of the service.

With multiple Broker groups deployed, even if the Master of one group becomes unavailable, messages can still be sent to the Masters of other groups, and consumers can keep reading from the Slave of the affected group. NameServer is completely stateless. Even if all NameServers go down, the storage service is not affected, because clients cache the routing information. RocketMQ is also easy to operate: scaling out only requires adding Broker groups, and if a Broker group becomes faulty, it can be taken offline and its routes are removed immediately without affecting other services.
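The failover behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ToyProducer`, `pick_master`, and the broker names are hypothetical and are not part of any RocketMQ client. The sketch assumes the producer holds a cached list of masters and tries them round-robin, skipping any that are down.

```python
class ToyProducer:
    """Picks a master from a locally cached route table (hypothetical helper)."""

    def __init__(self, cached_masters):
        # Routes are cached on the client, so no NameServer lookup is needed here.
        self.masters = list(cached_masters)
        self.next_index = 0

    def pick_master(self, is_up):
        """Try each broker group's master once, starting at the round-robin cursor."""
        count = len(self.masters)
        for attempt in range(count):
            index = (self.next_index + attempt) % count
            if is_up(self.masters[index]):
                self.next_index = (index + 1) % count
                return self.masters[index]
        raise RuntimeError("no master is reachable in any broker group")


producer = ToyProducer(["group-a-master", "group-b-master", "group-c-master"])
down = {"group-a-master"}  # simulate one failed master
picks = [producer.pick_master(lambda m: m not in down) for _ in range(3)]
```

With one master down, sends simply rotate over the remaining groups, which is the property the paragraph relies on.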

RocketMQ is also very simple to deploy. With the JAR package, two commands are enough to start RocketMQ. Deploying on K8s is even easier: with the RocketMQ Operator, the entire cluster can be brought up with a single kubectl apply command.

Third, RocketMQ offers a wealth of message types. It supports normal messages, ordered messages, delayed messages, retry messages, dead-letter messages, transactional messages, and more. In terms of message governance, RocketMQ supports message query, message replay, message trace, and ACL access control, in addition to the usual subscription, broadcast, and cluster modes. RocketMQ is also one of the few messaging products in the industry that natively supports server-side filtering, which gives users richer usage scenarios and lets them make use of server computing resources. RocketMQ supports not only Tag filtering but also SQL92 filtering. Tag filtering already meets most filtering requirements, and SQL92 filtering can be considered for particularly complex scenarios. Compared with other messaging middleware, RocketMQ has the richest message types and message governance.
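Tag filtering can be pictured as a set-membership check that the server runs before delivery. The following minimal sketch assumes a subscription expression that is either `*` or a list of tags joined by `||`; the helper names and the message dictionaries are hypothetical and do not reproduce RocketMQ's implementation.

```python
def parse_tag_expression(expression):
    """Return None for 'match everything', else the set of accepted tags."""
    expression = expression.strip()
    if expression in ("", "*"):
        return None
    return {tag.strip() for tag in expression.split("||") if tag.strip()}


def filter_by_tag(messages, expression):
    """Keep only the messages whose tag is accepted by the expression."""
    accepted = parse_tag_expression(expression)
    if accepted is None:
        return list(messages)
    return [m for m in messages if m.get("tag") in accepted]


messages = [
    {"tag": "TagA", "body": "order created"},
    {"tag": "TagB", "body": "order paid"},
    {"tag": "TagC", "body": "order shipped"},
]
selected = filter_by_tag(messages, "TagA || TagB")
```

SQL92 filtering would replace the set lookup with an expression evaluated over message properties, which is why it suits the more complex cases.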

Finally, RocketMQ features high throughput and low latency. During Alibaba’s Singles’ Day, RocketMQ handled traffic peaks at the trillion-message level while maintaining millisecond-level response times.

Next, I’ll take you through the evolution of the RocketMQ 4.x releases. Since its early open source days, RocketMQ has supported normal, ordered, and delayed messages, which cover most business scenarios.

In version 4.3.0, RocketMQ released transactional messages, which use a two-phase approach to resolve data inconsistencies between upstream and downstream systems.
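A rough way to picture the two-phase flow is a broker that holds a message invisibly until the producer reports the outcome of its local transaction. The sketch below is a toy model, not RocketMQ's API: `ToyBroker`, `send_in_transaction`, and the check-back step are assumptions made for illustration.

```python
from enum import Enum


class TxState(Enum):
    COMMIT = "commit"
    ROLLBACK = "rollback"
    UNKNOWN = "unknown"


class ToyBroker:
    """Holds half messages until the producer resolves them."""

    def __init__(self):
        self.half = {}     # message id -> body, invisible to consumers
        self.visible = []  # bodies that consumers may read

    def send_half(self, msg_id, body):
        self.half[msg_id] = body

    def end_transaction(self, msg_id, state):
        if state is TxState.UNKNOWN:
            return  # keep the half message for a later check
        body = self.half.pop(msg_id)
        if state is TxState.COMMIT:
            self.visible.append(body)  # phase two: message becomes deliverable

    def check_back(self, checker):
        """Ask the producer about every unresolved half message."""
        for msg_id in list(self.half):
            self.end_transaction(msg_id, checker(msg_id))


def send_in_transaction(broker, msg_id, body, local_tx):
    broker.send_half(msg_id, body)  # phase one: store but do not deliver
    try:
        state = local_tx()
    except Exception:
        state = TxState.ROLLBACK
    broker.end_transaction(msg_id, state)


broker = ToyBroker()
send_in_transaction(broker, "m1", "debit account", lambda: TxState.COMMIT)
send_in_transaction(broker, "m2", "refund", lambda: TxState.ROLLBACK)
send_in_transaction(broker, "m3", "ship order", lambda: TxState.UNKNOWN)
broker.check_back(lambda msg_id: TxState.COMMIT)  # producer confirms m3 later
```

Only messages whose local transaction committed ever reach consumers, which is how upstream and downstream stay consistent.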

In version 4.4.0, RocketMQ added the message trace feature, which lets users follow the path of each message from sending to consumption and helps with troubleshooting. The release also added ACL access control to improve RocketMQ management and security.

In version 4.5.0, RocketMQ introduced multi-replica support, also known as Raft mode. In Raft mode, if the Master of a Broker group dies, the remaining replicas in the group elect a new Master. This enables automatic failover within a Broker group and addresses issues such as high availability for ordered messages, further improving RocketMQ availability.

In version 4.6.0, we introduced a lightweight Pull Consumer that gives users a more stream-friendly API. This release also introduced a new Request-Reply message type, which lets RocketMQ make synchronous, RPC-style calls and makes it easier to call across isolated networks. RocketMQ also added IPv6 support in this release and was the first messaging middleware to support IPv6.
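Request-Reply over a message queue is usually built on a correlation ID: the caller tags the request, blocks, and is woken when a reply with the same ID arrives. The sketch below shows that general pattern with in-process queues; `ToyRequestReply` and its methods are hypothetical, and the source does not describe how RocketMQ implements the feature internally.

```python
import itertools
import queue
import threading


class ToyRequestReply:
    """Synchronous call on top of asynchronous message passing."""

    def __init__(self):
        self.request_queue = queue.Queue()
        self.pending = {}  # correlation id -> one-slot reply queue
        self.ids = itertools.count(1)
        self.lock = threading.Lock()

    def request(self, body, timeout=1.0):
        reply_box = queue.Queue(maxsize=1)
        with self.lock:
            correlation_id = next(self.ids)
            self.pending[correlation_id] = reply_box
        self.request_queue.put((correlation_id, body))
        try:
            return reply_box.get(timeout=timeout)  # raises queue.Empty on timeout
        finally:
            with self.lock:
                self.pending.pop(correlation_id, None)

    def serve_one(self, handler):
        """Consume one request and route the reply back by correlation id."""
        correlation_id, body = self.request_queue.get()
        reply = handler(body)
        with self.lock:
            reply_box = self.pending.get(correlation_id)
        if reply_box is not None:  # the caller may already have timed out
            reply_box.put(reply)


bus = ToyRequestReply()
server = threading.Thread(target=bus.serve_one, args=(str.upper,))
server.start()
answer = bus.request("ping")
server.join()
```

Because both sides only talk to the queue, caller and responder never need a direct network path to each other.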

In version 4.7.0, RocketMQ refactored the master/slave synchronous replication process. Synchronous replication and flushing were pipelined by making the threads asynchronous, which improved synchronous double-write performance several times over.

In version 4.8.0, Raft mode improved qualitatively. Performance increased several times over through asynchronization and batch replication. For stability, OpenChaos was used to run tests covering machine downtime, killed processes, OOM, various network partitions, and network latency, and important bugs were fixed. In terms of features, a Preferred Leader is supported so that a Broker group can preferentially elect a designated node as Master, and batch messages are supported.

In version 4.9.0, observability improved, including support for OpenTracing and message trace for transactional messages and the Pull Consumer.

As you can see, RocketMQ has improved over the years in performance, stability, reliability, and observability. Along the way, companies other than Alibaba have made remarkable code contributions, which shows that RocketMQ has become a diverse and thriving community.

Beyond the main RocketMQ repository, we are also encouraged by the growth of the RocketMQ ecosystem projects, especially those that integrate with popular cloud-native technologies, such as the RocketMQ Operator and RocketMQ Docker projects for cloud-native deployments. For microservice development frameworks, the RocketMQ community built the RocketMQ Spring Boot Starter so that open source users can quickly integrate their microservice systems with the RocketMQ message queue. The RocketMQ implementations of Spring Cloud Stream Binder and Spring Cloud Bus are built on this foundation.

In Service Mesh, RocketMQ was one of the first messaging products to integrate with Envoy, and it now also integrates with Dapr. In Serverless, the RocketMQ Knative Source repository has been open sourced, and the community has adapted to CloudEvents.

In terms of observability, RocketMQ supports OpenTracing, OpenTelemetry, Prometheus Exporter, and more.

In the Eventing space, we have our own RocketMQ Connector, which handles data exchange and synchronization between RocketMQ and external components such as MySQL and Elasticsearch, and also serves as a data channel between MQ clusters. In Streaming, RocketMQ 5.0 will release RocketMQ Streams, a native lightweight real-time computing framework, and RocketMQ is actively integrating with existing big data frameworks such as Flink, Storm, and Spark.

RocketMQ can now be seen not only as a channel for business messages, but also as a channel for event streams, for offline computation over business data, and for lightweight real-time computing. With messages, events, and streams, RocketMQ has developed into a complete, self-contained ecosystem and is gradually becoming a hyper-converged processing platform for all three.

RocketMQ 5.0 overview

Before formally introducing RocketMQ 5.0, we need to answer one question: why do we need RocketMQ 5.0? After talking to many contributors and drawing on extensive experience operating and maintaining RocketMQ, we see two main reasons:

First, the demands of the open source community keep growing. RocketMQ has been adopted by a large number of enterprises, each with its own rich business scenarios. RocketMQ 4.x mainly serves business messaging, so how to process this high-value data with real-time computation on RocketMQ has become an important direction for enterprises to explore next. This is why RocketMQ is actively extending from messaging into stream computing.

Second, the service quality expected of cloud messaging keeps rising. Alibaba Cloud, a deep participant in and contributor to RocketMQ, now serves tens of thousands of enterprises with its messaging services. As customers’ service requirements rise and Alibaba’s own business grows, the demands on RocketMQ increase as well. One of the key questions for RocketMQ 5.0 is how to create an architecture that meets the needs of different users and scenarios.

As a result, RocketMQ 5.0 is a new architecture that was born and grew up in the cloud, shaped by a wide range of real business scenarios. After continuous exploration and practice, RocketMQ 5.0 has the following characteristics:

1. High SLA and low cost: availability consistent with the cloud, high performance, and low cost

2. Schedulable: any component can be reshaped and recombined to adapt to diverse scenarios

3. Extensible: an open and rich ecosystem

4. Elastic: automatic scale-out and scale-in taken to the extreme

5. Standardized: community standards that are in line with industry standards

RocketMQ 5.0 is a cloud-native hyper-converged platform for messages, events, and streams. Based on the architecture diagram, its main points are as follows:

1. Lightweight SDK

RocketMQ 5.0 provides a lightweight client that is easy both to integrate with other systems and to be integrated into them. Complex logic such as load balancing and consumer offset management moves to the server, which makes the client stateless. In terms of protocol, in addition to the original protocol, RocketMQ 5.0 fully supports gRPC, the standard communication protocol of the cloud-native world.

2. Minimalist architecture

RocketMQ 5.0 still introduces no external dependencies, which keeps the operational burden extremely low. Nodes are loosely coupled, so any service node can be migrated at any time. RocketMQ 5.0 is designed for failure: the failure or migration of any service node is tolerated.

3. Separable storage and computing architecture

Broker nodes become truly stateless service nodes with no Topic binding. In other words, messages can be sent and consumed on any compute node, a single access point can proxy all traffic, and the compute layer and storage layer can scale independently. Once storage and compute are separated, compute nodes can handle different protocols, including Remoting, gRPC, MQTT, and AMQP. ACL, subscription relationships, and multi-tenancy control are also placed on the compute nodes. Most importantly, the architecture can be split or combined: it supports both small clusters and very large clusters and adapts to many business scenarios, which reduces the operation and maintenance burden.

4. Multi-mode storage

RocketMQ Raft mode uses three replicas. Combined with a cloud disk that already keeps three replicas of its own, this yields nine copies of the data (3 × 3). While nine copies provide greater reliability, they also waste a significant amount of money. RocketMQ 5.0 solves this problem with multi-mode storage. For example, on ordinary block storage devices, a two-replica or three-replica deployment can be chosen based on availability requirements. On the cloud, a single replica is used on top of the cloud disk, which makes full use of cloud infrastructure and reduces operation and maintenance costs.

5. Extensive use and integration of cloud native infrastructure

RocketMQ 5.0 supports projects such as OpenTelemetry and Prometheus to enhance RocketMQ observability. It also better supports the K8s ecosystem: for example, the RocketMQ Operator can bring up a RocketMQ cluster with a single command and handle full life cycle management, including canary (gray) releases and automatic elastic scaling.

Core feature 1: Separable storage and computing architecture

Next, the separable storage and computing architecture is described in detail. Separable means that RocketMQ can run compute and storage in the same Broker process, as it does now, or deploy them separately. When deployed separately, compute nodes can be truly stateless. RocketMQ is cautious about introducing storage and computing separation, because integrated deployment brings many benefits: it provides computing close to the data and reduces bandwidth costs in big data scenarios, and it reduces latency in business messaging scenarios. Storage and computing separation also has many advantages, such as flexible scaling, where compute resources or storage resources can be expanded independently as needed.

RocketMQ 5.0 will therefore provide a separable storage and computing architecture that can be adapted to a variety of scenarios. Compute nodes are completely stateless. Control functions such as protocol adaptation, traffic management, and tenant management are implemented on compute nodes. In addition, through the POP consumption mode, client-side load balancing and offset management are moved up to the compute node. There is no queue binding, and any compute node can send and receive messages. Because the nodes hold no state, they can scale elastically in seconds without triggering any Rebalance.

At the same time, RocketMQ 5.0 optimizes the storage cluster. The storage cluster retains native storage support for multiple message types, including transactional messages, timed messages, retry messages, and dead-letter messages. For replica selection, different scenarios are supported, including multiple replicas on local block devices and a single replica on cloud disks. Multi-mode storage makes it possible to take advantage of cloud infrastructure and reduce costs.

Another important point is support for multiple kinds of indexes. Today the RocketMQ store is a CommitLog, and background threads dispatch messages to build indexes such as the ConsumeQueue and the Index. RocketMQ 5.0 will provide comprehensive indexing enhancements to support more diverse indexes. For example, adding a batch index allows messages to be sent, stored, and received in batches, which strengthens RocketMQ’s batch capabilities. RocketMQ 5.0 also combines messages with KV more closely, building query indexes to enhance KV capabilities. With one copy of the data and multiple indexes, RocketMQ 5.0 can accommodate different scenarios.
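The idea of one copy of the data with several indexes built behind it can be sketched as follows. This is a toy in-memory model under stated assumptions: `ToyStore`, its field names, and the synchronous `dispatch` call are hypothetical, whereas in RocketMQ the dispatch runs in background threads over files on disk.

```python
from collections import defaultdict


class ToyStore:
    """One append-only log plus indexes derived from it."""

    def __init__(self):
        self.commit_log = []                     # every message, in arrival order
        self.consume_queues = defaultdict(list)  # (topic, queue id) -> log positions
        self.key_index = defaultdict(list)       # message key -> log positions
        self.dispatched = 0                      # next log position to index

    def append(self, topic, queue_id, key, body):
        self.commit_log.append(
            {"topic": topic, "queue_id": queue_id, "key": key, "body": body}
        )
        return len(self.commit_log) - 1

    def dispatch(self):
        """Build index entries for messages appended since the last call."""
        while self.dispatched < len(self.commit_log):
            position = self.dispatched
            msg = self.commit_log[position]
            self.consume_queues[(msg["topic"], msg["queue_id"])].append(position)
            self.key_index[msg["key"]].append(position)
            self.dispatched += 1

    def read(self, topic, queue_id, queue_offset):
        """Resolve a per-queue offset to the message body in the shared log."""
        position = self.consume_queues[(topic, queue_id)][queue_offset]
        return self.commit_log[position]["body"]


store = ToyStore()
store.append("orders", 0, "order-1", "created")
store.append("payments", 0, "order-1", "paid")
store.append("orders", 0, "order-2", "created again")
store.dispatch()
second_order = store.read("orders", 0, 1)
```

Adding a new kind of index in this model means adding one more structure inside `dispatch`, without touching the stored data.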

Core feature 2: Stream and batch integrated data access mode

First, we introduce a new consumption mode: POP consumption. The diagram in the upper left shows the existing consumer-side load balancing architecture of RocketMQ 4.x. For example, a topic is spread across three Brokers with nine queues in total, and in cluster mode a consumer group has three consumers. With the queues divided evenly, each consumer is assigned three queues.
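The arithmetic of that assignment (nine queues, three consumers, three queues each) can be sketched as an even, contiguous split. The sentence naming the allocation strategy is incomplete in the source, so treating it as an averaging strategy is an assumption, and `allocate_averagely` and the queue names are hypothetical.

```python
def allocate_averagely(queues, consumers, current):
    """Return the contiguous slice of queues owned by `current`."""
    consumers = sorted(consumers)  # every client must see the same order
    index = consumers.index(current)
    base, extra = divmod(len(queues), len(consumers))
    # The first `extra` consumers each take one additional queue.
    start = index * base + min(index, extra)
    size = base + (1 if index < extra else 0)
    return queues[start:start + size]


queues = [f"broker-{b}:q{q}" for b in "abc" for q in range(3)]  # 3 brokers x 3 queues
consumers = ["c1", "c2", "c3"]
assignment = {c: allocate_averagely(queues, consumers, c) for c in consumers}
```

Each consumer computes its own slice from the same inputs, which is exactly what ties a queue to a single consumer until the next rebalance.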

However, this has some problems. For example, a consumer may suddenly hang: it can no longer consume messages, but it is still connected to the Broker, so no rebalance is triggered to remove it. The queues assigned to it get stuck.

This is essentially a binding problem: once a Rebalance completes, each queue is bound to one Consumer. RocketMQ 5.0 addresses this by introducing a new consumption method called POP consumption, which removes the binding created by Rebalance. A queue can be consumed by any number of consumers, and concurrency is controlled by a queue lock on the Broker side.

In POP consumption, clients send consumption requests directly to the queues of each Broker, and the Broker hands messages back to the waiting clients. The client then reports the consumption result to the Broker with an ACK, and the Broker marks the result for the message. If there is no response before the timeout, or if consumption fails, the message is retried.

The Broker does three things for each POP request:

1. Lock the corresponding queue and fetch messages from the Store layer.

2. Write a CK (checkpoint) message, indicating that these messages are being consumed by POP.

3. Commit the current offset and release the lock.

CK messages are timed messages that record the offset of the popped message. If the client times out without responding, the Broker consumes the CK message when its timer fires and writes the message at the recorded offset to the retry queue. If the Broker receives an ACK with the client’s consumption result, it deletes the corresponding CK message and decides whether to retry based on the result.
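The POP flow described above (lock, fetch, write a CK record, commit the offset, then ACK or time out) can be sketched with a toy queue and a logical clock. Everything here is an illustrative assumption: `ToyPopQueue`, the `invisible_time` parameter, and the `revive` method are hypothetical names, and a real Broker stores CK records as timed messages rather than in a dictionary.

```python
import threading


class ToyPopQueue:
    """One broker-side queue that any number of consumers may pop from."""

    def __init__(self, messages, invisible_time=30):
        self.messages = list(messages)
        self.offset = 0        # next offset to hand out
        self.checkpoints = {}  # offset -> deadline by which an ACK must arrive
        self.retry = []        # messages scheduled for redelivery
        self.invisible_time = invisible_time
        self.lock = threading.Lock()  # stands in for the broker-side queue lock

    def pop(self, now, batch=1):
        with self.lock:  # 1. lock the queue and fetch messages
            end = min(self.offset + batch, len(self.messages))
            popped = [(o, self.messages[o]) for o in range(self.offset, end)]
            for o, _ in popped:  # 2. write a CK record per message
                self.checkpoints[o] = now + self.invisible_time
            self.offset = end  # 3. commit the offset; lock is released on exit
            return popped

    def ack(self, offset, success=True):
        """Delete the CK record; a failed result goes to the retry queue."""
        if self.checkpoints.pop(offset, None) is not None and not success:
            self.retry.append(self.messages[offset])

    def revive(self, now):
        """Move messages whose CK record expired into the retry queue."""
        for o, deadline in sorted(self.checkpoints.items()):
            if deadline <= now:
                del self.checkpoints[o]
                self.retry.append(self.messages[o])


q = ToyPopQueue(["m0", "m1", "m2"], invisible_time=30)
first = q.pop(now=0)            # consumer A takes m0
second = q.pop(now=1, batch=2)  # consumer B takes m1 and m2 from the same queue
q.ack(0)
q.ack(2)
q.revive(now=40)                # m1 was never acknowledged
```

Two consumers read the same queue without any assignment step, and the unacknowledged message is redelivered instead of blocking the queue.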

As the overall process shows, POP consumption does not need Rebalance, which avoids the consumption delay that Rebalance causes. Meanwhile, a client can consume from all queues of the Broker, which avoids the message backlog caused by a hung machine.

With POP, PUSH, PULL, and other modes, RocketMQ offers unified stream and batch data access. For example, in Streaming scenarios, the original PUSH mode guarantees good ordered consumption. In batch processing and other scenarios with low ordering requirements, POP consumption can read the same queue with high concurrency and speed up data reading. The POP consumption model also makes the client more lightweight: much of the logic lives on the server side, which makes it easier to write clients in multiple languages.

Core feature 3: Extreme elastic scaling

The diagram above shows RocketMQ’s existing architecture. Today, for example, we can let the traffic of Broker 1001 drain naturally to Broker 1002 by disabling writes on 1001. In Streaming, however, upper-layer services generally require the set of storage queues to stay fixed in order to guarantee the ordering and integrity of stream processing, which means that scaling out or in must not change the number of queues. RocketMQ 5.0 Preview therefore provides the concept of a logical queue, which stitches existing physical queues together and allows one logical queue to span different Brokers. For example, offsets 0 to 100 on Broker 1001, 100 to 1000 on Broker 1002, and 1000 to 2000 on Broker 1003 can be combined into one large logical queue.

Building a logical queue is a very lightweight operation because it only involves binding, with no data replication at all, so scaling completes in seconds. As soon as a new Broker is added and bound, traffic is allocated to it. We also provide dual-mode queue compatibility: the default is the original physical queue, and logical queues are used only for Topics that are explicitly configured to use them.
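The offset ranges in the example (0 to 100 on Broker 1001, 100 to 1000 on Broker 1002, 1000 to 2000 on Broker 1003) can be modelled as a sorted list of segments, where binding a new Broker only appends a mapping entry. The sketch assumes half-open ranges and physical queues that start at offset 0; `LogicalQueue`, `bind`, and `locate` are hypothetical names.

```python
import bisect


class LogicalQueue:
    """Maps one logical offset space onto physical queues on several brokers."""

    def __init__(self):
        self.starts = []    # first logical offset of each segment, ascending
        self.segments = []  # (broker, physical start offset) per segment

    def bind(self, logical_start, broker, physical_start=0):
        """Attach a physical queue; no message data is copied."""
        if self.starts and logical_start <= self.starts[-1]:
            raise ValueError("segments must be bound in ascending order")
        self.starts.append(logical_start)
        self.segments.append((broker, physical_start))

    def locate(self, logical_offset):
        """Return (broker, physical offset) for a logical offset."""
        i = bisect.bisect_right(self.starts, logical_offset) - 1
        if i < 0:
            raise KeyError(logical_offset)
        broker, physical_start = self.segments[i]
        return broker, physical_start + (logical_offset - self.starts[i])


lq = LogicalQueue()
lq.bind(0, "broker-1001")
lq.bind(100, "broker-1002")
lq.bind(1000, "broker-1003")
where = [lq.locate(o) for o in (99, 100, 1500)]
```

Since `bind` only records a boundary, scaling out is a metadata change, which is consistent with the second-level scaling the text describes.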

Core feature 4: Lightweight real-time computing

RocketMQ 5.0 will also feature RocketMQ Streams, a lightweight real-time computing framework. It is designed to help users do the lightweight data processing and computing required for most business scenarios using only RocketMQ resources without relying on external heavyweight computing products.

RocketMQ Streams has few dependencies and is easy to deploy. It scales horizontally using RocketMQ’s Rebalance mechanism and supports common operators such as Map and Filter, as well as Window, Join, and dimension tables. Compared with other message-based real-time computing platforms, RocketMQ Streams not only runs natively without external dependencies but is also compatible with the Flink SQL standard and provides UDF/UDAF/UDTF capabilities.
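To make the operator names concrete, here is what Filter, Map, and a tumbling Window count do to a small list of events. This is plain Python for illustration only and does not use or imitate the RocketMQ Streams API; the event fields and `tumbling_window_counts` are hypothetical.

```python
from collections import Counter


def tumbling_window_counts(events, window_size, key):
    """Count events per key in fixed, non-overlapping time windows."""
    counts = {}
    for event in events:
        window_start = event["ts"] - event["ts"] % window_size
        counts.setdefault(window_start, Counter())[key(event)] += 1
    return counts


events = [
    {"ts": 1, "user": "a", "amount": 5},
    {"ts": 4, "user": "b", "amount": 50},
    {"ts": 7, "user": "a", "amount": 70},
    {"ts": 12, "user": "a", "amount": 90},
    {"ts": 13, "user": "b", "amount": 3},
]
large = filter(lambda e: e["amount"] >= 50, events)              # Filter
slim = map(lambda e: {"ts": e["ts"], "user": e["user"]}, large)  # Map
result = tumbling_window_counts(slim, window_size=10, key=lambda e: e["user"])
```

A streaming framework applies the same operators continuously to an unbounded topic instead of a finite list.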

In the wider real-time computing ecosystem, RocketMQ is actively working with other big data frameworks, including Flink and Spark. In particular, the RocketMQ-Flink Connector, which is based on the latest standards, will also graduate in the near future.

RocketMQ 5.0 roadmap

RocketMQ 5.0 will be released this year. The 5.0 Preview version has been discussed in the community, and the code is in the GitHub repository. The Preview version features logical queues and unified stream and batch access. With RocketMQ 5.0 we will release RocketMQ Streams, a real-time stream computing framework, along with batch processing and batch indexing capabilities. Among its milestones, RocketMQ 5.0 will complete gRPC protocol support, a new lightweight client, full AMQP and MQTT protocol support, and a separable storage and computing architecture.

We look forward to more partners joining the Apache RocketMQ community to build the next generation of cloud-native messaging engine and create a hyper-converged processing platform for Messaging, Eventing, and Streaming.


This article is original content from Alibaba Cloud (Aliyun) and may not be reproduced without permission.