Editor’s recommendation:

Cloud native was born to solve the problems that traditional applications face in architecture, fault handling, system iteration, and other areas, while open source provides the core strength with which enterprises build cloud native architectures. Fully devoted to open source and engaged with cloud native daily, the author has developed his own thinking and practice around the open source industry and cloud native streaming system solutions.

The following article comes from CSDN and was written by Penghui Li.

This article is from “New Programmer · Cloud Native and Full Digital Practice”. Penghui Li is an Apache Pulsar PMC member and a Principal Engineer at StreamNative. Edited by Tang Xiaoyin, CSDN.

As business and the environment change, the trend toward cloud native has become more and more evident, and enterprises are now moving from cloud computing to cloud native. After several years of practice, the concept of cloud native has been widely accepted by enterprises, and cloud application management has become a necessity in enterprise digital transformation. It is fair to say that today’s developers either use products and tools built on cloud native technology architectures or are themselves building such products and tools.

Cloud native becomes a foundational strategy

So what is cloud native? Everyone interprets it differently. In my view, cloud native is, first of all, about applications developed to run on the cloud: a solution that lets enterprises deliver business continuously, quickly, reliably, and at scale. The keywords attached to cloud native, such as containerization, continuous delivery, DevOps, and microservices, all describe its characteristics and capabilities as a solution, while Kubernetes laid the foundation of cloud native with its pioneering declarative API and controller (reconciler) pattern.

Second, cloud native is a strategy. Cloud native was born to solve the problems traditional applications face in architecture, fault handling, system iteration, and so on. Moving from traditional applications to the cloud should be seen as a strategic shift rather than a mere technology upgrade. Going to the cloud confronts an enterprise with the comprehensive integration of application development, system architecture, organizational structure, and even commercial products. Whether to join the cloud native tide is a strategic decision that affects an enterprise’s long-term development in every respect.

Cloud native against an open source background

In recent years, most architecture-related open source projects have adopted cloud native designs, and open source contributes the backbone strength with which enterprises build cloud native architectures.

Open source technologies and their ecosystems earn users’ trust, while the cloud provides scalability and reduces resource waste. The relationship between cloud native and open source can also be seen in the continued development of cloud native under the CNCF (Cloud Native Computing Foundation): many open source projects are built for cloud native architectures from the start, and they are the foundational software that users prioritize when moving to the cloud.

Take the Apache Software Foundation, a neutral incubation and governance platform for open source software. The Apache Way, distilled by the foundation over its long history of open source governance, is widely summarized as “community over code”: a project without a community cannot last. An open source project with a highly active community and codebase, honed across many scenarios by developers around the world, can be continuously refined and iterated, and can grow a rich ecosystem that meets different users’ needs. The combination of the cloud native wave and today’s open source environment lets excellent technologies renew themselves and stand out as the technical environment keeps advancing, while technologies that fail to adapt to the times gradually fall behind or are even eliminated. As I said above, cloud native is a strategic decision, and the most advanced and reliable technology will be the first choice in an enterprise’s strategic decisions.

Messaging and streaming data systems for the cloud

Having covered the importance of open source in the cloud native environment, how should a cloud native open source project be designed, planned, and evolved? And how should an enterprise choose a messaging and streaming system for digital transformation in the cloud native era? In this article I take the design and planning of Apache Pulsar, the open source cloud native messaging and streaming data system to which I have devoted myself, as an example, hoping to offer reference ideas and some inspiration for your search for messaging and streaming data solutions.

Reviewing history: the dual track of messaging and streaming

Message queues are typically used to build core business application services, while streams are typically used to build real-time data services such as data pipelines. Message queues have a longer history than streams: they are known to developers as messaging middleware, rooted in the communications industry, with common systems such as RabbitMQ and ActiveMQ. Streaming systems, by contrast, are a newer concept, mostly used for moving and processing large volumes of data. Operational data such as logs and click events are presented as streams; common streaming systems include Apache Kafka and AWS Kinesis.

For historical and technical reasons, messaging and streaming have been treated as two separate models. Enterprises have had to build different systems to support the two kinds of business scenarios (see Figure 1), producing a widespread “dual track” in infrastructure that leads to data isolation and data silos, blocked data flow, far more difficult governance, and high architectural complexity and operations costs.

Figure 1. The “dual track” created by the separate systems enterprises build to support different business scenarios

Apache Pulsar grew out of the need for a unified real-time data infrastructure that integrates message queue and stream semantics. A message is stored only once on a Pulsar topic, yet it can be consumed in different ways through different subscription modes (see Figure 2), which resolves many of the problems caused by the traditional “dual track” of messaging and streaming.

Figure 2. Apache Pulsar integrates message queue and stream semantics
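
To make the subscription modes concrete, here is a minimal sketch using the Pulsar Java client. The service URL and the topic name “orders” are illustrative assumptions, not from the article. The same stored messages are consumed once with a Shared subscription (queue semantics) and once with a Failover subscription (stream semantics):

```java
import org.apache.pulsar.client.api.*;

public class SubscriptionModesExample {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650") // illustrative address
                .build();

        // Shared subscription: messages are distributed among the attached
        // consumers, giving classic work-queue semantics.
        Consumer<byte[]> queueConsumer = client.newConsumer()
                .topic("orders")
                .subscriptionName("work-queue")
                .subscriptionType(SubscriptionType.Shared)
                .subscribe();

        // Failover subscription: a single active consumer reads the topic
        // in order, giving stream semantics over the same stored data.
        Consumer<byte[]> streamConsumer = client.newConsumer()
                .topic("orders")
                .subscriptionName("stream-reader")
                .subscriptionType(SubscriptionType.Failover)
                .subscribe();

        Message<byte[]> msg = queueConsumer.receive();
        queueConsumer.acknowledge(msg);

        queueConsumer.close();
        streamConsumer.close();
        client.close();
    }
}
```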

Key elements of being naturally cloud native

As mentioned above, the cloud native era gives developers the ability to scale capacity in and out rapidly, reduce resource waste, and accelerate business delivery. With a naturally cloud native messaging and streaming data infrastructure such as Apache Pulsar, developers can focus on application and microservice development instead of wasting time maintaining complex infrastructure.

Why is Apache Pulsar “naturally cloud native”? The answer lies in the architecture it was designed around from the very beginning: storage-compute separation with layered, segmented storage. This cloud native architecture greatly eases the scaling and operations difficulties users encounter with messaging systems, and it can deliver high-quality service at lower cost on cloud platforms, meeting the needs of messaging and streaming data systems in the cloud native era.

Biology has a principle that structure adapts to function: from single-celled protists to mammals, life’s structures grow more complex and its functions more advanced. Similarly, the fit between architecture and function shows in Apache Pulsar in the following ways:

  • The storage-compute separation architecture ensures high scalability and takes full advantage of the cloud’s elasticity.
  • Cross-region replication meets cross-cloud data backup requirements.
  • Tiered storage can make full use of cloud native storage such as AWS S3, effectively reducing data storage costs.
  • Pulsar Functions, a lightweight function compute framework similar to the AWS Lambda platform, brings FaaS to Pulsar, while Function Mesh, a Kubernetes operator, lets users run Pulsar Functions and connectors natively on Kubernetes, taking full advantage of Kubernetes resource allocation, elastic scaling, and flexible scheduling.

Infrastructure: storage-compute separation and layered sharding

As mentioned above, Pulsar was designed for the cloud from the start, with a storage-compute separated architecture built on BookKeeper, an open source project of the Apache Software Foundation. BookKeeper provides a strongly consistent, distributed, append-only log abstraction. Messaging and streaming workloads, where new messages are continuously appended, map naturally onto this abstraction.
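
To illustrate the abstraction, here is a minimal sketch using BookKeeper’s Java client; the ZooKeeper address and payload are illustrative assumptions. The ledger is created with an ensemble of three Bookies, a write quorum of three, and an ack quorum of two, matching the “two of three replicas acknowledge” write path discussed below:

```java
import org.apache.bookkeeper.client.BookKeeper;
import org.apache.bookkeeper.client.LedgerHandle;

public class LedgerAppendExample {
    public static void main(String[] args) throws Exception {
        // Connect through ZooKeeper (address is illustrative).
        BookKeeper bk = new BookKeeper("localhost:2181");

        // ensemble=3, writeQuorum=3, ackQuorum=2: each entry is written to
        // three Bookies and acknowledged once two of them confirm it.
        LedgerHandle ledger = bk.createLedger(3, 3, 2,
                BookKeeper.DigestType.CRC32, "secret".getBytes());

        // Entries can only be appended, never rewritten in place.
        long entryId = ledger.addEntry("hello, bookkeeper".getBytes());
        System.out.println("appended entry " + entryId);

        ledger.close();
        bk.close();
    }
}
```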

In the Pulsar architecture, data serving and data storage are two separate layers (see Figure 3). The serving layer consists of stateless Broker nodes, and the storage layer consists of Bookie nodes; within each layer, the nodes are peers. A Broker is responsible only for serving messages and stores no data, which gives the serving layer and the storage layer independent scaling and high availability and greatly reduces service unavailability. BookKeeper’s peer storage nodes allow multiple replicas to be read concurrently, and service remains available even when only one replica is reachable.

Figure 3. Pulsar architecture

In this layered architecture, the serving layer and the storage layer scale independently, providing flexible elasticity. Especially in elastic environments such as clouds and containers, the system can automatically scale in and out to adapt to traffic peaks. Layering also significantly reduces the complexity of cluster expansion and upgrades and improves the system’s availability and manageability. The design is container friendly as well.

Pulsar stores topic partitions at the much finer granularity of segments (see Figure 4). These segments are evenly spread across the Bookie nodes of the storage layer. This segment-centric approach treats the topic partition as a logical concept, splitting it into multiple smaller segments that are evenly distributed and stored in the storage layer. Such a design brings better performance, more flexible scalability, and higher availability.

Figure 4. Segment-based storage model

As Figure 5 shows, most message queue and streaming systems (including Apache Kafka) adopt a monolithic architecture in which message processing and message persistence (if provided) sit on the same node in the cluster. Such architectures were designed for small environments, so traditional message queue and streaming systems run into performance, scalability, and flexibility problems when used at large scale. As network bandwidth grows and storage latency drops significantly, the advantages of storage-compute separation become ever more apparent.

Figure 5. Traditional monolithic architecture vs. storage-compute layered architecture

Differences in reads and writes

Next, let’s look at the differences in writing, reading, and related behavior.

Let’s start with writes. The left side of Figure 6 shows a monolithic application: data is written to the leader, and the leader replicates it to the other followers. This is the typical design without storage-compute separation. The right side of Figure 6 shows the storage-compute separated application: data is written to the Broker, which writes to multiple storage nodes in parallel. If three replicas are required and one opts for strong consistency with low latency, the write is considered successful once two replicas have acknowledged. In the leader-based design, by contrast, a write can only be confirmed after the leader returns, so it is limited by the resources of the leader machine.

Figure 6. Write path: monolithic architecture vs. layered architecture

In the peer-to-peer layered architecture on the right, a write succeeds as soon as any two of the three nodes acknowledge it. In performance tests we ran on AWS, we also saw a latency gap of a few milliseconds between the two architectures when flushing to disk: topics that landed on the leader in the monolithic system showed added latency, while topics in the layered architecture were far less affected by flush latency.

In real-time data processing, real-time (tail) reads account for 90% of read scenarios (see Figure 7). In the layered architecture, tail reads are served directly from the Broker’s tail cache for the topic, without touching the storage nodes, which greatly improves the efficiency and timeliness of data reads.

Figure 7. Real-time reads: monolithic architecture vs. layered architecture

Architecture also makes a difference when reading historical data. As Figure 8 shows, in the monolithic architecture a replay request goes straight to the leader, which reads the messages from disk. In the storage-compute separated architecture, data must be loaded into the Broker and then returned to the client to guarantee ordered reads. When strict ordering is not required, Apache Pulsar can read segments of data from multiple storage nodes in parallel, so even reading a single topic can draw on the resources of several storage nodes to raise read throughput. Pulsar SQL reads data in this way as well.

Figure 8. Historical reads: monolithic architecture vs. layered architecture
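
As a sketch of replaying historical data, Pulsar’s Java Reader API can start from the earliest retained message; the service URL and topic name are illustrative assumptions:

```java
import org.apache.pulsar.client.api.*;

public class ReplayExample {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650") // illustrative address
                .build();

        // Replay the topic from the earliest retained message; behind the
        // scenes, segments may be fetched from several storage nodes.
        Reader<byte[]> reader = client.newReader()
                .topic("orders")
                .startMessageId(MessageId.earliest)
                .create();

        while (reader.hasMessageAvailable()) {
            Message<byte[]> msg = reader.readNext();
            System.out.println(new String(msg.getData()));
        }

        reader.close();
        client.close();
    }
}
```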

IO isolation

BookKeeper provides good IO isolation between data writes and reads. BookKeeper can be given two types of storage devices: in Figure 9, the journal disk on the left stores the write-ahead log, and the actual data is stored on the disks on the right. Even while historical data is being read, write latency is affected as little as possible.

Figure 9. IO isolation in BookKeeper

Pulsar’s IO isolation lets users pick different resource types when running on cloud platform resources. Because journal disks do not need to hold large amounts of data, many cloud users configure them according to their own requirements to achieve low cost with high quality of service: for example, using devices with modest capacity but high throughput and low latency for the journal disks, and high-capacity devices for the data disks.
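
As a hedged illustration, this device split is configured in bookkeeper.conf. The paths below are placeholders, and the exact key names can vary across BookKeeper versions:

```
# bookkeeper.conf (illustrative paths)
# Journal (write-ahead log): small, fast, low-latency device
journalDirectories=/mnt/fast-ssd/bk-journal
# Ledger storage: large-capacity device holding the actual data
ledgerDirectories=/mnt/big-disk/bk-ledgers
```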

Scaling out and in

Storage-compute separation lets the Brokers and BookKeeper scale out and in independently. Consider how topics are scaled: suppose N topics are distributed across the Brokers. When new Brokers join, topic ownership can be transferred within about a second, which can be viewed as transferring ownership of a group of stateless topics. In this way, some topics can be moved to the new Brokers quickly.

For the storage nodes, a topic’s segments are spread across multiple BookKeeper nodes, so expansion simply adds a new BookKeeper node and triggers no recopying of historical data. After some period of writing, each topic rolls over to its next segment; when a segment rolls, Bookies are selected anew to place the data, so the cluster rebalances gradually. If a BookKeeper node fails, BookKeeper automatically restores the replica count, and topics are unaffected throughout.

Cross-cloud data backup

Pulsar supports cross-cloud data backup (see Figure 10): clusters in different regions can be joined into a cross-datacenter cluster with bidirectional data synchronization. Many users abroad deploy clusters across different cloud vendors and can quickly switch to another cluster when one runs into trouble. Asynchronous replication introduces only a tiny data synchronization gap while providing higher quality of service, and subscription state can also be synchronized across clusters.

Figure 10. Cross-cloud data backup
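
Operationally, once the clusters have been registered with each other, geo-replication is enabled per namespace through the pulsar-admin CLI. The tenant, namespace, and cluster names below are illustrative assumptions:

```
# Replicate a namespace between two clusters (names are examples)
bin/pulsar-admin namespaces set-clusters my-tenant/my-namespace \
  --clusters us-east,eu-west
```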

Entering the era of serverless architecture

Pulsar Functions and Function Mesh bring Pulsar into the era of serverless architecture. Pulsar Functions is a lightweight compute framework designed to provide a very simple platform for deployment and operations. Pulsar Functions are lightweight and simple: they can handle simple ETL (extract, transform, load) jobs, real-time aggregation, event routing, and the like, covering more than 90% of stream processing scenarios. Following the serverless and FaaS (function-as-a-service) philosophy, Pulsar Functions lets data be processed “close to where it lives” and its value mined in real time (see Figure 11).

Figure 11. Message flow through a single Pulsar Function
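
As an example, here is the classic “exclamation” function written against the Pulsar Functions Java SDK; the class name is illustrative:

```java
import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;

// A minimal Pulsar Function: it consumes a string from the input topic,
// appends "!", and the returned value is published to the output topic.
public class ExclamationFunction implements Function<String, String> {
    @Override
    public String process(String input, Context context) {
        context.getLogger().info("processing: {}", input);
        return input + "!";
    }
}
```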

A Pulsar Function is just a single application function. Function Mesh (also open source) was created to connect multiple functions together to accomplish a data processing goal. Function Mesh likewise adopts a serverless architecture: it is a Kubernetes operator that lets developers use Pulsar Functions and the various Pulsar connectors natively on Kubernetes, taking full advantage of Kubernetes resource allocation, elastic scaling, flexible scheduling, and other capabilities. For example, Function Mesh relies on Kubernetes’ scheduling to make functions failure-tolerant and ensure they can be scheduled properly at any time.

Function Mesh consists of a Kubernetes operator and a Function Runner. The operator watches Function Mesh CRDs and creates Kubernetes resources (StatefulSets) to run functions, connectors, and meshes on Kubernetes. The Function Runner invokes the function and connector logic, processes events received from input streams, and sends the results to output streams. At present, the Function Runner is implemented on top of the Pulsar Functions runner.

When a user creates a Function Mesh CRD (see Figure 12), the Function Mesh controller receives the submitted CRD from the Kubernetes API server, processes it, and generates the corresponding Kubernetes resources. For example, when handling a Function CRD, the controller creates a StatefulSet, and each of its pods starts a runner that invokes the corresponding function.

Figure 12. Function Mesh CRD processing flow

The Function Mesh API is built on the existing Kubernetes API, so Function Mesh resources are compatible with other native Kubernetes resources, and cluster administrators can manage them with the Kubernetes tools they already use. Function Mesh adopts Kubernetes Custom Resource Definitions (CRDs); cluster administrators can define custom resources and build event streaming applications through them.

Instead of sending function requests to a Pulsar cluster with the pulsar-admin CLI, users can submit CRDs directly to the Kubernetes cluster with the kubectl CLI. The Function Mesh controller watches these CRDs and creates the Kubernetes resources that run the custom Function, Source, Sink, or Mesh. The advantage of this approach is that Kubernetes itself stores and manages the function metadata and running state, avoiding the metadata and state inconsistencies that can arise in Pulsar’s existing solution.
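
As a hedged sketch of what such a submission looks like (the field names follow the Function Mesh v1alpha1 CRD as I recall it; consult the project documentation before relying on them), a Function resource wrapping the earlier exclamation example might be:

```yaml
apiVersion: compute.functionmesh.io/v1alpha1
kind: Function
metadata:
  name: exclamation
spec:
  className: example.ExclamationFunction   # illustrative class name
  replicas: 1
  input:
    topics:
      - persistent://public/default/in-topic
  output:
    topic: persistent://public/default/out-topic
  java:
    jar: /pulsar/examples/exclamation.jar  # illustrative path
```

Applying it with kubectl (for example, kubectl apply -f exclamation-function.yaml) hands the rest to the Function Mesh controller.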

Conclusion

In this article I have shared my thoughts on the open source industry in the cloud native environment and the technical practice behind a cloud native streaming platform solution. As a dedicated open source practitioner, I am glad to see the open source industry thriving as more and more people embrace the concept and become open source developers and contributors. I hope to remain one of the countless developers marching along the open source path and to help more enterprises accelerate their cloud native and digital journeys.

About the author

Penghui Li is an Apache Pulsar PMC member and Committer at the Apache Software Foundation, currently working at StreamNative as Chief Architect. He has long worked on messaging systems, microservices, and Apache Pulsar. In 2019 he drove the adoption of Pulsar at zhaopin.com to build a unified internal messaging service, and later joined StreamNative, the company commercializing Apache Pulsar, completing the transition from open source project user to open source project developer. He and his team at StreamNative are responsible for supporting users who deploy Apache Pulsar in massive-scale messaging scenarios.