Background

We have recently been selecting technology for a new line of business, which includes choosing a message-oriented middleware. Given our actual situation, we want it to meet the following requirements:

  • Good cloud-native support: our main language is now Go, and operations and maintenance should be simple enough.
  • Official multi-language SDKs: we also have some Python and Java code to maintain.
  • Preferably some convenient features, such as delayed messages, dead-letter queues, and multi-tenancy.

Horizontal scaling, high throughput, and low latency matter too, of course, but almost every mature message-oriented middleware already provides them.

Based on these criteria, Pulsar came into view.

Pulsar is a top-level Apache project, and it supports all of the features above well.

Let’s see what makes it special.

Architecture

The official architecture diagram shows that Pulsar consists of the following components:

  1. Broker: a stateless component that can be scaled horizontally and mainly handles producer and consumer connections. It is similar to Kafka's broker but does not store data, so it is easier to scale.
  2. BookKeeper cluster: mainly used for persistent storage of data.
  3. ZooKeeper: used to store the metadata of the brokers and BookKeeper.

Overall, Pulsar relies on more components than Kafka does, which does add system complexity; but the benefits are equally clear.

Pulsar separates storage from compute: when more serving capacity is needed, you simply add brokers, with nothing else to worry about.

When storage becomes the bottleneck, only BookKeeper needs to be expanded, and there is no need to rebalance it manually.

Doing the same with Kafka is much more complicated.

Features

Multi-tenancy

Multi-tenancy is a must-have feature for us: it isolates the data of different services and teams within the same cluster.

persistent://core/order/create-order

Take this topic name as an example: under the tenant core there is a namespace order, and the topic itself is named create-order.

In practice, tenants are generally divided by business team, and namespaces correspond to the different services of that team. This makes topics easy to manage.
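The tenant/namespace/topic hierarchy can be made concrete with a small parser. This is only a sketch: TopicName and ParseTopicName are hypothetical helpers written for this article, not part of the Pulsar SDK, and the validation is deliberately minimal.

```go
package main

import (
	"fmt"
	"strings"
)

// TopicName holds the parts of a fully qualified topic name.
// Hypothetical type for illustration only.
type TopicName struct {
	Domain    string // e.g. "persistent"
	Tenant    string
	Namespace string
	Topic     string
}

// ParseTopicName splits a name of the form domain://tenant/namespace/topic.
func ParseTopicName(s string) (TopicName, error) {
	head := strings.SplitN(s, "://", 2)
	if len(head) != 2 || head[0] == "" {
		return TopicName{}, fmt.Errorf("missing domain in %q", s)
	}
	// Split into at most three parts so the topic keeps any further slashes.
	parts := strings.SplitN(head[1], "/", 3)
	if len(parts) != 3 || parts[0] == "" || parts[1] == "" || parts[2] == "" {
		return TopicName{}, fmt.Errorf("expected tenant/namespace/topic in %q", s)
	}
	return TopicName{
		Domain:    head[0],
		Tenant:    parts[0],
		Namespace: parts[1],
		Topic:     parts[2],
	}, nil
}

func main() {
	tn, err := ParseTopicName("persistent://core/order/create-order")
	if err != nil {
		panic(err)
	}
	// Prints: tenant=core namespace=order topic=create-order
	fmt.Printf("tenant=%s namespace=%s topic=%s\n", tn.Tenant, tn.Namespace, tn.Topic)
}
```

A helper like this can be handy for enforcing a team's naming convention (for example, rejecting topics outside the team's own tenant) before a producer or consumer is created.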

The value is clearest by comparison. How would this be handled in message-oriented middleware without multi-tenancy?

  1. Not dividing things so finely, so that all lines of business are mixed together. This may not be a problem when the team is small, but it becomes very hard to manage once the business grows.
  2. Building your own abstraction layer on top of topics, which essentially means implementing multi-tenancy yourself.
  3. Having each business team maintain its own cluster, which solves the problem but also increases operational complexity.

The comparison above makes the importance of multi-tenancy very clear.

Function computing

Pulsar also supports lightweight function computing (Pulsar Functions), for example when messages need to be cleaned and transformed and then published to another topic.

A simple function can be written for such requirements: Pulsar provides an SDK that makes it easy to process the data, and the function is then deployed to the broker using the official tools.

Previously, even such simple requirements might have called for a dedicated stream-processing engine.
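As a rough sketch of what such a function body might look like, the example below cleans a JSON order message and returns the bytes to publish. The Order type, its fields, and CleanOrder are made up for illustration. The handler signature func(context.Context, []byte) ([]byte, error) is, as far as I know, one of the shapes accepted by pf.Start in the Go Functions SDK, but treat that as an assumption; the code here uses only the standard library so it runs on its own.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"strings"
)

// Order is a hypothetical payload used only for this example.
type Order struct {
	ID       string `json:"id"`
	Customer string `json:"customer"`
}

// CleanOrder trims and normalizes an incoming order and returns the JSON
// that would be published to the output topic.
func CleanOrder(ctx context.Context, in []byte) ([]byte, error) {
	var o Order
	if err := json.Unmarshal(in, &o); err != nil {
		return nil, fmt.Errorf("decode order: %w", err)
	}
	o.ID = strings.TrimSpace(o.ID)
	o.Customer = strings.ToLower(strings.TrimSpace(o.Customer))
	if o.ID == "" {
		return nil, fmt.Errorf("order has no id")
	}
	return json.Marshal(o)
}

func main() {
	// Locally we just call the handler; in a real deployment the function
	// runtime would call it once per message (assumption, see above).
	out, err := CleanOrder(context.Background(), []byte(`{"id":" 42 ","customer":" Alice "}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // {"id":"42","customer":"alice"}
}
```

Keeping the transformation a plain function like this also makes it easy to unit-test without a running cluster.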

Application

Beyond that, at the application level, concepts such as producers and consumers, and the way they are used, are much the same as in other systems.

For example, Pulsar supports four consumption modes:

  • Exclusive: exclusive mode. Only one consumer can start and consume data (consumers are recognized as the same one through SubscriptionName); its range of use is narrow.
  • Failover: failover mode. Building on exclusive mode, several consumers can be started at the same time; once one consumer goes down, another quickly takes over, but only one consumer consumes at any given time. It is useful in some scenarios.
  • Shared: shared mode. N consumers can run at the same time, and messages are delivered to each consumer in round-robin fashion. When a consumer goes down without acking, its messages are delivered to other consumers. This mode increases consumption capacity, but message ordering is not guaranteed.
  • KeyShared: key-shared mode. Based on shared mode; it is equivalent to grouping the messages in a topic (by key), so that messages in the same group are consumed, in order, by the same consumer.

The third mode, Shared, is probably the most commonly used; when messages must be consumed in order, KeyShared can be used instead.
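The difference between Shared and KeyShared can be illustrated with a toy dispatcher. This is a simplified model, not Pulsar's implementation: the Message type and both functions are hypothetical, and the FNV hash is an assumption chosen for brevity (the broker uses its own key-hashing scheme).

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Message is a simplified stand-in for a Pulsar message.
type Message struct {
	Key     string
	Payload string
}

// dispatchShared models Shared mode: messages go to consumers in turn,
// so messages with the same key can land on different consumers.
func dispatchShared(msgs []Message, consumers int) [][]Message {
	out := make([][]Message, consumers)
	for i, m := range msgs {
		c := i % consumers
		out[c] = append(out[c], m)
	}
	return out
}

// dispatchKeyShared models KeyShared mode: the key alone decides the
// consumer, so per-key order is preserved on that consumer.
func dispatchKeyShared(msgs []Message, consumers int) [][]Message {
	out := make([][]Message, consumers)
	for _, m := range msgs {
		h := fnv.New32a()
		h.Write([]byte(m.Key)) // Write on a hash.Hash never returns an error
		c := int(h.Sum32() % uint32(consumers))
		out[c] = append(out[c], m)
	}
	return out
}

func main() {
	msgs := []Message{
		{Key: "order-1", Payload: "created"},
		{Key: "order-2", Payload: "created"},
		{Key: "order-2", Payload: "paid"},
		{Key: "order-1", Payload: "paid"},
	}
	fmt.Println("shared:    ", dispatchShared(msgs, 2))
	fmt.Println("key-shared:", dispatchKeyShared(msgs, 2))
}
```

In the shared output, the two events of order-1 end up on different consumers, so nothing guarantees that "created" is handled before "paid". In the key-shared output, every event of a given order goes to a single consumer in its original order.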

SDK

The officially supported SDKs are quite rich; I also wrapped the official SDK in a package for internal use.

Since we use dig, a lightweight dependency injection library, usage looks something like this:

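	// Connect to the cluster (SetUpPulsar and NewConsumer belong to the internal wrapper).
	// Errors returned by Provide and Invoke are ignored here for brevity.
	SetUpPulsar(lookupURL)
	container := dig.New()

	// Register the consumer for the create-order topic.
	container.Provide(func() ConsumerConfigInstance {
		return NewConsumer(&pulsar.ConsumerOptions{
			Topic:            "persistent://core/order/create-order",
			SubscriptionName: "order-sub",
			Type:             pulsar.Shared,
			Name:             "consumer01",
		}, ConsumerOrder)
	})

	// Register the consumer for the update-order topic.
	container.Provide(func() ConsumerConfigInstance {
		return NewConsumer(&pulsar.ConsumerOptions{
			Topic:            "persistent://core/order/update-order",
			SubscriptionName: "order-sub",
			Type:             pulsar.Shared,
			Name:             "consumer02",
		}, ConsumerInvoice)
	})

	// Resolve the registered consumers and start them.
	container.Invoke(StartConsumer)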

The two container.Provide() calls register the consumer objects with the container.

container.Invoke(StartConsumer) fetches all consumer objects from the container and starts them consuming at the same time.

At this point, with my limited Go development experience, I found myself wondering: is dependency injection needed in Go?

Let’s start with the benefits of using a library like dig:

  • Objects are managed by the container, which makes it easy to implement singletons.
  • When objects have complex dependencies, it removes a lot of code for creating and obtaining them, and the dependencies become clearer.

There are also disadvantages:

  • When tracing through the code, you can no longer see at a glance how a dependency was created, which is less intuitive.
  • It is at odds with the simplicity that Go espouses.

For Java developers who have used Spring, this will feel very familiar and appealing. For Gophers who have never been exposed to such needs, however, it does not seem to be a must-have.

There are now all kinds of Go dependency injection libraries, many of them produced by large companies, which shows that there is still demand.

I’m sure many Gophers would hate to bring some of Java’s complexity into Go. But I think dependency injection itself is not tied to any language, and each language can have its own implementation. Spring in Java is more than a dependency injection framework; it carries a lot of complex functionality, which is what intimidates many developers.

If dependency injection alone is the requirement, it is not complicated to implement and does not add much complexity. If you take the time to read the source code of such a library, you can quickly grasp how it works once you understand the concept.
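To show how little is needed for the core idea, here is a toy container built on reflection. It is a sketch for illustration only and not how dig is implemented: Container, NewContainer, Config, and Client are hypothetical names, the URL is a placeholder, and there is no handling of cycles, variadic constructors, or constructors that return errors.

```go
package main

import (
	"fmt"
	"reflect"
)

// Container maps a result type to its constructor and caches instances.
type Container struct {
	providers map[reflect.Type]reflect.Value
	instances map[reflect.Type]reflect.Value
}

func NewContainer() *Container {
	return &Container{
		providers: map[reflect.Type]reflect.Value{},
		instances: map[reflect.Type]reflect.Value{},
	}
}

// Provide registers a constructor of the form func(deps...) T.
func (c *Container) Provide(ctor interface{}) error {
	v := reflect.ValueOf(ctor)
	if v.Kind() != reflect.Func || v.Type().NumOut() != 1 {
		return fmt.Errorf("constructor must be a func returning one value")
	}
	c.providers[v.Type().Out(0)] = v
	return nil
}

// resolve builds (or reuses) the value of type t, dependencies first.
func (c *Container) resolve(t reflect.Type) (reflect.Value, error) {
	if inst, ok := c.instances[t]; ok {
		return inst, nil // singleton: constructed at most once
	}
	ctor, ok := c.providers[t]
	if !ok {
		return reflect.Value{}, fmt.Errorf("no provider for %v", t)
	}
	args, err := c.args(ctor.Type())
	if err != nil {
		return reflect.Value{}, err
	}
	inst := ctor.Call(args)[0]
	c.instances[t] = inst
	return inst, nil
}

// args resolves every parameter of the given function type.
func (c *Container) args(fn reflect.Type) ([]reflect.Value, error) {
	args := make([]reflect.Value, fn.NumIn())
	for i := range args {
		v, err := c.resolve(fn.In(i))
		if err != nil {
			return nil, err
		}
		args[i] = v
	}
	return args, nil
}

// Invoke calls fn with all of its parameters supplied by the container.
func (c *Container) Invoke(fn interface{}) error {
	v := reflect.ValueOf(fn)
	if v.Kind() != reflect.Func {
		return fmt.Errorf("Invoke needs a func")
	}
	args, err := c.args(v.Type())
	if err != nil {
		return err
	}
	v.Call(args)
	return nil
}

type Config struct{ URL string }
type Client struct{ cfg *Config }

func main() {
	c := NewContainer()
	c.Provide(func() *Config { return &Config{URL: "pulsar://localhost:6650"} })
	c.Provide(func(cfg *Config) *Client { return &Client{cfg: cfg} })
	err := c.Invoke(func(cl *Client) { fmt.Println("client connects to", cl.cfg.URL) })
	if err != nil {
		panic(err)
	}
}
```

The whole mechanism is a map from types to constructors plus a recursive lookup. The trade-off mentioned earlier is visible here too: nothing in main says explicitly that Client receives Config, since the wiring is inferred from parameter types.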

Going back to the SDK itself: the Go SDK currently has fewer features than the Java version (to be precise, the Java version is the most feature-complete), but it has all the core features, so daily use is not affected.

Conclusion

This article introduced some basic concepts and advantages of Pulsar, and touched on dependency injection in Go. If, like us, you are in the middle of technology selection, consider Pulsar.

We will continue to share Pulsar-related content in the future; readers with relevant experience are welcome to leave their thoughts in the comments section.