Apache Pulsar has been getting a lot of attention lately and is often described as the next generation of messaging middleware. Today, let's take a look at what makes it stand out.

Overview

Apache Pulsar is a pub/sub messaging platform that uses Apache BookKeeper for persistence. It is server-side messaging middleware, originally developed at Yahoo and open-sourced in 2016. It later entered the Apache Incubator and is now a top-level project of the Apache Software Foundation. It provides the following features:

  • Cross-region replication
  • Multi-tenancy
  • Zero data loss
  • Zero rebalancing time
  • Unified queuing and streaming models
  • High scalability
  • High throughput
  • Pulsar Proxy
  • Pulsar Functions

Architecture

Pulsar uses a layered architecture that separates the storage layer from the brokers. This architecture gives Pulsar the following benefits:

  • Brokers can be scaled independently
  • Storage (bookies) can be scaled independently
  • ZooKeeper, brokers, and bookies are easier to containerize
  • ZooKeeper provides configuration and state storage for the cluster

In a Pulsar cluster, one or more brokers process and load-balance incoming messages from producers, dispatch messages to consumers, communicate with the Pulsar configuration store to handle various coordination tasks, store messages in BookKeeper instances (aka bookies), rely on a cluster-specific ZooKeeper cluster for certain tasks, and so on.

  • A BookKeeper cluster consisting of one or more bookies handles persistent storage of messages.
  • A ZooKeeper cluster specific to that cluster handles coordination tasks between Pulsar clusters.

For more on Pulsar's architecture, see: https://pulsar.apache.org/doc…

Four subscription models

There are four subscription modes in Pulsar: exclusive, shared, failover, and key_shared. These modes are shown in the figure below.

For details, see: https://pulsar.apache.org/doc…
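To make the difference between the four modes concrete, here is a small toy simulation in plain Python. It is only a sketch of the dispatch behavior, not Pulsar's actual implementation: the `dispatch` function, the consumer names, and the CRC32-based key hashing are all assumptions made for illustration, and the failover case simply treats the first consumer in the list as the active one.

```python
from collections import defaultdict
from itertools import cycle
from zlib import crc32


def dispatch(mode, consumers, messages):
    """Return {consumer: [payloads]} for a toy model of one subscription.

    messages is a list of (key, payload) tuples; consumers is a list of
    consumer names. This is a simplification, not Pulsar's own logic.
    """
    received = defaultdict(list)
    if mode == "exclusive":
        # Only one consumer may attach; a second one is rejected.
        if len(consumers) > 1:
            raise ValueError("exclusive subscription allows a single consumer")
        for _, payload in messages:
            received[consumers[0]].append(payload)
    elif mode == "failover":
        # Several consumers may attach, but only the active one receives
        # messages (assumption: the first in the list is the active one).
        for _, payload in messages:
            received[consumers[0]].append(payload)
    elif mode == "shared":
        # Messages are spread round-robin across all consumers.
        for consumer, (_, payload) in zip(cycle(consumers), messages):
            received[consumer].append(payload)
    elif mode == "key_shared":
        # All messages with the same key go to the same consumer.
        for key, payload in messages:
            index = crc32(key.encode()) % len(consumers)
            received[consumers[index]].append(payload)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return dict(received)


msgs = [("k1", "m0"), ("k2", "m1"), ("k1", "m2"), ("k2", "m3")]
for mode in ("failover", "shared", "key_shared"):
    print(mode, dispatch(mode, ["c1", "c2"], msgs))
```

In the shared case the messages alternate between `c1` and `c2`, while in the key_shared case `m0` and `m2` (both keyed `k1`) always end up on the same consumer.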

Performance is better than Kafka

Performance is one of Pulsar's biggest selling points. According to the benchmark cited below, Pulsar is 2.5 times faster than Kafka and has 40% lower latency.

Source: https://streaml.io/pdf/Gigaom…

Note: The comparison uses 1 topic with 1 partition and 100-byte messages; under these conditions Pulsar can send 220,000+ messages per second.
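To put the figure in the note into more familiar units, the message rate can be converted to a data rate. The short calculation below uses only the two numbers from the note and treats 220,000 as a lower bound; the variable names are just for illustration.

```python
MESSAGES_PER_SECOND = 220_000   # lower bound quoted in the note ("220,000+")
MESSAGE_SIZE_BYTES = 100        # message size used in the comparison

bytes_per_second = MESSAGES_PER_SECOND * MESSAGE_SIZE_BYTES
megabytes_per_second = bytes_per_second / 1_000_000       # decimal MB
mebibytes_per_second = bytes_per_second / (1024 * 1024)   # binary MiB

print(f"{bytes_per_second} B/s = {megabytes_per_second:.1f} MB/s "
      f"= {mebibytes_per_second:.2f} MiB/s")
# prints: 22000000 B/s = 22.0 MB/s = 20.98 MiB/s
```

So the quoted rate corresponds to at least about 22 MB of payload per second on a single partition, not counting protocol overhead.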

Installation

Installing Pulsar from the binary release

# Download the official binary package
[root@centos7 ~]# wget https://archive.apache.org/dist/pulsar/pulsar-2.8.0/apache-pulsar-2.8.0-bin.tar.gz
# Extract it
[root@centos7 ~]# tar zxf apache-pulsar-2.8.0-bin.tar.gz
[root@centos7 ~]# cd apache-pulsar-2.8.0
[root@centos7 apache-pulsar-2.8.0]# ll
total 72
drwxr-xr-x 3 root root   225 Jan 22  2020 bin
drwxr-xr-x 5 root root  4096 Jan 22  2020 conf
drwxr-xr-x 3 root root   132 Jul  6 11:47 examples
drwxr-xr-x 4 root root    66 Jul  6 11:47 instances
drwxr-xr-x 3 root root 16384 Jul  6 11:47 lib
-rw-r--r-- 1 root root 31639 Jan 22  2020 LICENSE
drwxr-xr-x 2 root root  4096 Jan 22  2020 licenses
-rw-r--r-- 1 root root  6612 Jan 22  2020 NOTICE
-rw-r--r-- 1 root root  1269 Jan 22  2020 README

Docker installation (main focus)

[root@centos7 ~]# docker run -it \
  -p 6650:6650 \
  -p 8080:8080 \
  --mount source=pulsardata,target=/pulsar/data \
  --mount source=pulsarconf,target=/pulsar/conf \
  apachepulsar/pulsar:2.8.0 \
  bin/pulsar standalone

Port 8080 is used for HTTP access, and port 6650 is used for the Pulsar protocol spoken by the client libraries (Java, Python, etc.).
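As a quick sanity check after starting the container, the two ports can be turned into the URLs that tools and clients expect, and probed with a plain TCP connection. This is a minimal sketch using only the Python standard library; `service_urls` and `port_open` are hypothetical helper names, and a successful TCP connect only shows that something is listening, not that Pulsar is fully started.

```python
import socket

HTTP_PORT = 8080      # HTTP access, as mapped in the docker command
BINARY_PORT = 6650    # Pulsar protocol used by the client libraries


def service_urls(host):
    """Build the two URLs an admin tool or client would use for this host."""
    return {
        "http": f"http://{host}:{HTTP_PORT}",
        "pulsar": f"pulsar://{host}:{BINARY_PORT}",
    }


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print(service_urls("localhost"))
```

Calling `port_open("localhost", 6650)` and `port_open("localhost", 8080)` on the Docker host should both return `True` once the standalone service is up.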

The official visualization tool, Pulsar Manager, lets you manage multiple Pulsar clusters through a web UI. https://pulsar.apache.org/doc…

[root@centos7 ~]# docker pull apachepulsar/pulsar-manager:v0.2.0
[root@centos7 ~]# docker run -it \
  -p 9527:9527 -p 7750:7750 \
  -e SPRING_CONFIGURATION_FILE=/pulsar-manager/pulsar-manager/application.properties \
  apachepulsar/pulsar-manager:v0.2.0

Set the administrator user and password

[root@centos7 ~]# CSRF_TOKEN=$(curl http://localhost:7750/pulsar-manager/csrf-token)
[root@centos7 ~]# curl \
  -H "X-XSRF-TOKEN: $CSRF_TOKEN" \
  -H "Cookie: XSRF-TOKEN=$CSRF_TOKEN;" \
  -H "Content-Type: application/json" \
  -X PUT http://localhost:7750/pulsar-manager/users/superuser \
  -d '{"name": "admin", "password": "admin123", "description": "test", "email": "[email protected]"}'
{"message":"Add super user success, please login"}
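The same two-step exchange (fetch a CSRF token, then send it back in both a header and a cookie) can be sketched in Python with the standard library. The helper names `fetch_csrf_token` and `build_superuser_request` are made up for this example, and `admin@example.com` is a placeholder address; the URLs and JSON fields are the ones used in the curl call above.

```python
import json
import urllib.request

MANAGER_URL = "http://localhost:7750/pulsar-manager"


def fetch_csrf_token(base_url=MANAGER_URL):
    """GET the CSRF token; this needs a running Pulsar Manager."""
    with urllib.request.urlopen(f"{base_url}/csrf-token") as resp:
        return resp.read().decode().strip()


def build_superuser_request(token, name, password, email, base_url=MANAGER_URL):
    """Build (but do not send) the PUT request that creates the superuser."""
    body = {"name": name, "password": password,
            "description": "test", "email": email}
    return urllib.request.Request(
        f"{base_url}/users/superuser",
        data=json.dumps(body).encode(),
        method="PUT",
        headers={
            # The same token goes in both the header and the cookie.
            "X-XSRF-TOKEN": token,
            "Cookie": f"XSRF-TOKEN={token};",
            "Content-Type": "application/json",
        },
    )


# Build the request with a dummy token just to inspect it (placeholder values).
request = build_superuser_request(
    "example-token", "admin", "admin123", "admin@example.com")
print(request.get_method(), request.full_url)
```

Against a running Pulsar Manager, you would pass the result of `fetch_csrf_token()` as the token and send the request with `urllib.request.urlopen(request)`.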

Open http://server_ip:9527 in your browser to reach the login page:

Log in with the user and password you just created, then configure the server to be managed:

The list of

Topic list

Topic details

Client configuration

Java client

Here is an example of a Java consumer configuration using a shared subscription:

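import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.PulsarClientException;
import org.apache.pulsar.client.api.SubscriptionType;

public class SharedConsumerExample {
    private static final String SERVICE_URL = "pulsar://localhost:6650";
    private static final String TOPIC = "persistent://public/default/mq-topic-1";
    private static final String SUBSCRIPTION = "sub-1";

    public static void main(String[] args) throws PulsarClientException {
        // Connect to the broker over the Pulsar protocol port
        PulsarClient client = PulsarClient.builder()
                .serviceUrl(SERVICE_URL)
                .build();

        // Without an explicit schema, the consumer receives raw bytes
        Consumer<byte[]> consumer = client.newConsumer()
                .topic(TOPIC)
                .subscriptionName(SUBSCRIPTION)
                .subscriptionType(SubscriptionType.Shared)
                // If you'd like to restrict the receiver queue size
                .receiverQueueSize(10)
                .subscribe();
    }
}

Python client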

Here is an example of a Python consumer configuration using a shared subscription:

from pulsar import Client, ConsumerType

SERVICE_URL = "pulsar://localhost:6650"
TOPIC = "persistent://public/default/mq-topic-1"
SUBSCRIPTION = "sub-1"

client = Client(SERVICE_URL)
consumer = client.subscribe(
    TOPIC,
    SUBSCRIPTION,
    # If you'd like to restrict the receiver queue size
    receiver_queue_size=10,
    consumer_type=ConsumerType.Shared)

C++ client

Here is an example of a C++ consumer configuration using shared subscriptions:

#include <pulsar/Client.h>

#include <string>

using namespace pulsar;

int main() {
    std::string serviceUrl = "pulsar://localhost:6650";
    std::string topic = "persistent://public/default/mq-topic-1";
    std::string subscription = "sub-1";

    Client client(serviceUrl);

    ConsumerConfiguration consumerConfig;
    consumerConfig.setConsumerType(ConsumerShared);
    // If you'd like to restrict the receiver queue size
    consumerConfig.setReceiverQueueSize(10);

    Consumer consumer;

    Result result = client.subscribe(topic, subscription, consumerConfig, consumer);
    // ResultOk means the subscription was created successfully
    return result == ResultOk ? 0 : 1;
}
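The configuration examples above stop once the subscription is created. As a rough sketch of what usually comes next, the Python function below receives messages and acknowledges each one. It relies only on `receive(timeout_millis=...)`, `acknowledge(msg)`, and `msg.data()`, which the Python client's consumer and message objects provide; the `drain` helper and the `FakeConsumer`/`FakeMessage` stand-ins are hypothetical and exist only so the loop can be tried without a broker.

```python
def drain(consumer, max_messages, timeout_millis=1000):
    """Receive up to max_messages and acknowledge each after handling it."""
    payloads = []
    for _ in range(max_messages):
        try:
            msg = consumer.receive(timeout_millis=timeout_millis)
        except Exception:
            # The client raises when no message arrives before the timeout.
            break
        payloads.append(msg.data().decode("utf-8", errors="replace"))
        # Acknowledge so the broker does not redeliver this message.
        consumer.acknowledge(msg)
    return payloads


class FakeMessage:
    """Minimal stand-in for a received message."""

    def __init__(self, payload):
        self._payload = payload

    def data(self):
        return self._payload


class FakeConsumer:
    """In-memory stand-in so the loop can run without a broker."""

    def __init__(self, payloads):
        self._pending = [FakeMessage(p) for p in payloads]
        self.acked = []

    def receive(self, timeout_millis=None):
        if not self._pending:
            raise Exception("timeout")  # mimic the client timing out
        return self._pending.pop(0)

    def acknowledge(self, msg):
        self.acked.append(msg.data())


print(drain(FakeConsumer([b"hello", b"world"]), max_messages=5))
# prints: ['hello', 'world']
```

With a real broker, the `consumer` returned by `client.subscribe(...)` in the Python example can be passed to `drain` in place of `FakeConsumer`.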

For more configuration options and operational guidance, see the official documentation, which is clearly written: https://pulsar.apache.org/docs/

Conclusion

As a next-generation distributed message queue, Pulsar has a number of attractive features that make up for some of the weaknesses of competing products, such as geo-replication, multi-tenancy, scalability, and read/write isolation.