Introduction: This article briefly introduces the design and implementation of RocketMQ-Exporter. Readers will learn how RocketMQ-Exporter is implemented and how to use it to build their own RocketMQ monitoring system. The RocketMQ online interactive tutorial is now available in the KnowingDo Hands-on Lab; open start.aliyun.com on a PC to try it.

Authors | Chen Houdao, Feng Qing
Source | Alibaba Cloud Native official account

RocketMQ Cloud Native Series:

  • How Alibaba's RocketMQ achieved zero failures at the Double 11 peak
  • What sparks fly when RocketMQ meets Serverless?
  • RocketMQ Operator: a powerful tool for RocketMQ operation and maintenance in the cloud native era
  • Evolution of message-oriented middleware in the cloud native era
  • Customizing DevOps Platform based on RocketMQ Prometheus Exporter

RocketMQ-Exporter project address: https://github.com/apache/rocketmq-exporter

The main content of the article includes the following aspects:

  1. RocketMQ introduction
  2. Prometheus overview
  3. A concrete implementation of RocketMQ-Exporter
  4. RocketMQ-Exporter monitoring indicators and alarm indicators
  5. RocketMQ-Exporter usage example

RocketMQ introduction

RocketMQ is a distributed messaging and streaming data platform with low latency, high performance, high reliability, trillion-scale capacity, and flexible scalability. In simple terms, it consists of Broker servers and clients. There are two kinds of client: the message publisher (Producer), which sends messages to the Broker server, and the message Consumer. Multiple consumers can form a consumer group to subscribe to and pull messages stored on the Broker server.

Because of its high performance, high reliability, and real-time characteristics, RocketMQ is increasingly combined with other protocol components, such as MQTT, in various messaging scenarios, and is used more and more widely. Yet for such a powerful messaging middleware platform, a monitoring and management platform is still lacking in actual use.

Prometheus is currently the most widely used monitoring solution in the open source world. Compared with traditional monitoring systems, Prometheus is easy to manage, can monitor the internal running state of a service, and offers a powerful data model, the powerful query language PromQL, efficient data processing, scalability, easy integration, visualization, and openness. With Prometheus, a platform for monitoring RocketMQ can be built quickly.

Prometheus overview

The following diagram shows the basic architecture of Prometheus:

  1. Prometheus Server

Prometheus Server is the core component of Prometheus. It is responsible for collecting, storing, and querying monitoring data. First, Prometheus Server can manage monitoring targets through static configuration, or manage them dynamically through service discovery, and it collects data from these targets. Second, Prometheus Server stores the collected monitoring data: it is itself a time series database and stores the data on local disk as time series. Finally, Prometheus Server provides its own query language, PromQL, for querying and analyzing the data.

  2. Exporters

An Exporter exposes an endpoint for monitoring data collection to Prometheus Server through an HTTP service, and Prometheus Server obtains the monitoring data by accessing that endpoint. RocketMQ-Exporter is such an Exporter: it first collects data from the RocketMQ cluster, then uses the third-party client library provided by Prometheus to normalize the collected data into the form required by Prometheus. Prometheus then only needs to fetch the data from the Exporter at regular intervals.
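To make the idea of an HTTP metrics endpoint concrete, here is a minimal sketch in Python (standard library only). It is not the exporter's actual Java code. The sample values, the label names, and the helper names render_metrics and serve are assumptions made for illustration; port 5557 is the exporter port used in the Prometheus configuration later in this article. The sketch renders samples in the Prometheus text exposition format and serves them on /metrics.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory samples: (metric name, labels, value).
SAMPLES = [
    ("rocketmq_producer_tps", {"cluster": "demo", "topic": "orders"}, 12.5),
    ("rocketmq_consumer_tps",
     {"cluster": "demo", "topic": "orders", "group": "billing"}, 11.0),
]


def render_metrics(samples):
    """Render samples in the Prometheus text exposition format.

    Label values are written as-is; escaping of quotes and backslashes
    is left out to keep the sketch short.
    """
    lines = []
    declared = set()
    for name, labels, value in samples:
        if name not in declared:
            # Declare each metric once; every sample here is a gauge.
            lines.append(f"# TYPE {name} gauge")
            declared.add(name)
        label_text = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_text}}} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(SAMPLES).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=5557):
    """Serve /metrics on localhost until interrupted."""
    HTTPServer(("localhost", port), MetricsHandler).serve_forever()
```

Calling serve() starts the endpoint, after which Prometheus (or a browser) can fetch http://localhost:5557/metrics. A real exporter would normally rely on a Prometheus client library, which also takes care of label escaping.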

RocketMQ Exporter is now officially included by Prometheus. The project address is: https://github.com/apache/rocketmq-exporter.

A concrete implementation of RocketMQ-Exporter

The current implementation principle of the Exporter is shown in the figure below:

The whole system is implemented on the Spring Boot framework. Since MQ itself provides fairly comprehensive statistics, the Exporter only needs to pull the statistics provided by the MQ cluster and process them. The basic logic of RocketMQ-Exporter is therefore to start several scheduled tasks internally that periodically pull data from the MQ cluster, then normalize the data and expose it to Prometheus through the endpoint. It consists mainly of the following three functional parts:

  • The MQAdminExt module retrieves statistics from inside the MQ cluster by wrapping the interfaces provided by the MQ client.
  • The MetricService module processes the results returned by the MQ cluster into the formatted data required by Prometheus.
  • The Collector module stores the normalized data. When Prometheus periodically pulls data from the Exporter, the Exporter exposes the data held by the Collector over HTTP on the /metrics endpoint.
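The division of labour between the three parts can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the project's Java code: fetch_topic_stats stands in for MQAdminExt, normalize for MetricService, and the Collector class for the Collector module. The raw field name putTps and all function names are hypothetical.

```python
import threading


def fetch_topic_stats():
    """Stand-in for MQAdminExt: return raw per-topic statistics.

    A real implementation would query the MQ cluster; the field names
    used here are hypothetical.
    """
    return [{"cluster": "demo", "topic": "orders", "putTps": 12.5}]


def normalize(raw_stats):
    """Stand-in for MetricService: map raw stats to (name, labels, value)."""
    return [
        ("rocketmq_producer_tps",
         (("cluster", s["cluster"]), ("topic", s["topic"])),
         float(s["putTps"]))
        for s in raw_stats
    ]


class Collector:
    """Stand-in for the Collector module: keep the latest value per series."""

    def __init__(self):
        self._lock = threading.Lock()
        self._series = {}

    def update(self, samples):
        with self._lock:
            for name, labels, value in samples:
                self._series[(name, labels)] = value

    def snapshot(self):
        """Return a copy that a /metrics handler could render."""
        with self._lock:
            return dict(self._series)


def run_once(collector, fetch=fetch_topic_stats):
    """One pass of a scheduled task: pull, normalize, store."""
    collector.update(normalize(fetch()))


def run_periodically(collector, interval_seconds, stop_event):
    """Repeat run_once until stop_event (a threading.Event) is set."""
    while not stop_event.is_set():
        run_once(collector)
        stop_event.wait(interval_seconds)
```

Keeping only the latest value per series means that a scrape always sees the result of the most recent pull, independent of how the pull interval and the scrape interval are aligned.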

RocketMQ-Exporter monitoring indicators and alarm indicators

RocketMQ-Exporter is mainly used together with Prometheus for monitoring. The following are the monitoring indicators and alarm indicators currently defined in the Exporter.

  • Monitoring indicators

rocketmq_message_accumulation is an aggregate indicator that has to be computed from other reported indicators.

  • Alarm indicators

The value threshold is not fixed for each consumer. Currently it is set based on the number of messages produced by the producer in the past five minutes, and users can also set it according to their actual situation. The values set for the alarm indicators are only illustrative thresholds and should be adjusted to actual RocketMQ usage. The focus here is on the consumer accumulation alarm. In earlier monitoring systems, which lacked a query language as powerful as Prometheus's PromQL, an alarm had to be set up for each consumer: either the maintainers of the RocketMQ system added one for each consumer by hand, or the system added one automatically in the background when it detected a new consumer. In Prometheus, this can be done with a single statement:

(sum(rocketmq_producer_offset) by (topic) - on(topic)  group_right  sum(rocketmq_consumer_offset) by (group,topic)) 
- ignoring(group) group_left sum (avg_over_time(rocketmq_producer_tps[5m])) by (topic)*5*60 > 0

With this PromQL statement, you can not only create a consumption accumulation alarm for any consumer, but also tie the accumulation threshold to the sending speed of the producer. This greatly improves the accuracy of the consumption accumulation alarm.
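The arithmetic behind the statement can be worked through on plain numbers. The Python sketch below is only an illustration of the calculation, not something Prometheus runs: the function name consumer_lag_alerts and the sample offsets are assumptions. For each (group, topic) pair it computes the lag as producer offset minus consumer offset, subtracts the number of messages produced in the last five minutes (average TPS times 5*60 seconds), and reports the pair when the result is above zero.

```python
def consumer_lag_alerts(producer_offset, consumer_offset, avg_producer_tps,
                        window_seconds=5 * 60):
    """Return {(group, topic): excess} for consumers that are falling behind.

    producer_offset:  {topic: summed producer offset}
    consumer_offset:  {(group, topic): summed consumer offset}
    avg_producer_tps: {topic: average producer TPS over the window}
    """
    alerts = {}
    for (group, topic), consumed in consumer_offset.items():
        # Like PromQL vector matching, skip series with no matching topic.
        if topic not in producer_offset or topic not in avg_producer_tps:
            continue
        lag = producer_offset[topic] - consumed
        threshold = avg_producer_tps[topic] * window_seconds
        excess = lag - threshold
        if excess > 0:
            alerts[(group, topic)] = excess
    return alerts


# Hypothetical sample data: one topic, two consumer groups.
example = consumer_lag_alerts(
    producer_offset={"orders": 100000},
    consumer_offset={("billing", "orders"): 99000, ("audit", "orders"): 90000},
    avg_producer_tps={"orders": 10},
)
```

With these numbers the threshold is 10 * 300 = 3000 messages. The billing group lags by 1000 messages and stays below it, while the audit group lags by 10000 and exceeds it by 7000, so only the audit group would raise the alarm.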

RocketMQ-Exporter usage example

  1. Start the NameServer and Broker

To verify the RocketMQ Spring Boot client, first make sure that the RocketMQ service has been downloaded and started correctly. Refer to the Quick Start on the RocketMQ main site, and ensure that both the NameServer and the Broker have started correctly.

  2. Compile RocketMQ-Exporter

At present, users need to download the source code from git and compile it themselves:

git clone https://github.com/apache/rocketmq-exporter
cd rocketmq-exporter
mvn clean install

  3. Configure and run

RocketMQ-Exporter has the following run options:

The above run options can either be changed in the configuration file after downloading the code or set from the command line.

The compiled JAR package is called rocketmq-exporter-0.0.1-SNAPSHOT.jar and can be run as follows.

java -jar rocketmq-exporter-0.0.1-SNAPSHOT.jar [--rocketmq.config.namesrvAddr="127.0.0.1:9876" ...]

  4. Install Prometheus

First, download the Prometheus installation package from the official Prometheus download address. This section uses Linux as an example. The installation package is prometheus-2.7.0-rc.1.linux-amd64.tar.gz. Run the following commands to start the Prometheus process.

tar -xzf prometheus-2.7.0-rc.1.linux-amd64.tar.gz
cd prometheus-2.7.0-rc.1.linux-amd64/
./prometheus --config.file=prometheus.yml --web.listen-address=:5555

The default listening port of Prometheus is 9090. To avoid conflicts with other processes on the system, the startup parameters reset the listening port to 5555. Then visit http://<server IP address>:5555 in a browser to verify that Prometheus has been installed successfully. The following page is displayed:

Since the RocketMQ-Exporter process is already running, Prometheus can now scrape its data. You only need to change the configuration file that Prometheus is started with.

The overall configuration file is as follows:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

scrape_configs:
  # Prometheus scrapes itself on the port set by --web.listen-address.
  - job_name: 'prometheus'
    static_configs:
    - targets: ['localhost:5555']

  # The RocketMQ-Exporter endpoint.
  - job_name: 'exporter'
    static_configs:
    - targets: ['localhost:5557']

After changing the configuration file, restart the service. After the restart, you can query the indicators reported by RocketMQ-Exporter in the Prometheus interface. For example, querying rocketmq_broker_tps gives the following result:
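The same indicator can also be read outside the web interface through the Prometheus HTTP API, whose instant query endpoint is /api/v1/query. The following Python sketch uses only the standard library. The helper names build_query_url, parse_instant_vector, and query are assumptions for illustration, and the base URL assumes the Prometheus instance started above on port 5555.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, promql):
    """Build the instant-query URL for a PromQL expression."""
    return base_url.rstrip("/") + "/api/v1/query?" + urlencode({"query": promql})


def parse_instant_vector(payload):
    """Turn an instant-query response into a list of (labels, value) pairs."""
    if payload.get("status") != "success":
        raise ValueError("query failed: %s" % payload.get("error"))
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected a vector, got %s" % data["resultType"])
    # Each result item carries its labels and a [timestamp, "value"] pair.
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


def query(base_url, promql, timeout=5):
    """Run an instant query against a running Prometheus server."""
    with urlopen(build_query_url(base_url, promql), timeout=timeout) as resp:
        return parse_instant_vector(json.load(resp))
```

For example, query("http://localhost:5555", "rocketmq_broker_tps") would return one (labels, value) pair per series, provided Prometheus is running and has already scraped the exporter.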

  5. Add alarm rules

Once Prometheus can display RocketMQ-Exporter indicators, you can configure RocketMQ alarm indicators in Prometheus. Add the following alarm configuration item to the Prometheus configuration file. Here *.rules means that multiple files with the .rules suffix can be matched.

rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
  - /home/prometheus/prometheus-2.7.0-rc.1.linux-amd64/rules/*.rules

The current alarm configuration file is warn.rules, and its content is shown below. The thresholds are only examples; set the specific thresholds according to your actual situation.

###
# Sample prometheus rules/alerts for rocketmq.
#
###
# Galera Alerts

groups:
- name: GaleraAlerts
  rules:
  - alert: RocketMQClusterProduceHigh
    expr: sum(rocketmq_producer_tps) by (cluster) >= 10
    for: 3m
    labels:
      severity: warning
    annotations:
      description: '{{$labels.cluster}} Sending tps too high.'
      summary: cluster send tps too high
  - alert: RocketMQClusterProduceLow
    expr: sum(rocketmq_producer_tps) by (cluster) < 1
    for: 3m
    labels:
      severity: warning
    annotations:
      description: '{{$labels.cluster}} Sending tps too low.'
      summary: cluster send tps too low
  - alert: RocketMQClusterConsumeHigh
    expr: sum(rocketmq_consumer_tps) by (cluster) >= 10
    for: 3m
    labels:
      severity: warning
    annotations:
      description: '{{$labels.cluster}} consuming tps too high.'
      summary: cluster consume tps too high
  - alert: RocketMQClusterConsumeLow
    expr: sum(rocketmq_consumer_tps) by (cluster) < 1
    for: 3m
    labels:
      severity: warning
    annotations:
      description: '{{$labels.cluster}} consuming tps too low.'
      summary: cluster consume tps too low
  - alert: ConsumerFallingBehind
    expr: (sum(rocketmq_producer_offset) by (topic) - on(topic)  group_right  sum(rocketmq_consumer_offset) by (group,topic)) - ignoring(group) group_left sum (avg_over_time(rocketmq_producer_tps[5m])) by (topic)*5*60 > 0
    for: 3m
    labels:
      severity: warning
    annotations:
      description: 'consumer {{$labels.group}} on {{$labels.topic}} lag behind
        and is falling behind (behind value {{$value}}).'
      summary: consumer lag behind
  - alert: GroupGetLatencyByStoretime
    expr: rocketmq_group_get_latency_by_storetime > 1000
    for: 3m
    labels:
      severity: warning
    annotations:
      description: 'consumer {{$labels.group}} on {{$labels.broker}}, {{$labels.topic}} consume time lag behind message store time
        and (behind value is {{$value}}).'
      summary: message consumes time lag behind message store time too much 

Finally, you can view the alarm display in Prometheus, where red indicates the items currently in the alarm state and green indicates the normal state.

  6. Grafana dashboard for RocketMQ

Prometheus's own display of indicators is not as good as that of Grafana, the currently popular visualization platform. To display RocketMQ indicators better, Grafana can be used to display the indicators collected by Prometheus.

First, download Grafana from the official website: https://grafana.com/grafana/download. Installation from the binary package is again used as the example.

wget https://dl.grafana.com/oss/release/grafana-6.2.5.linux-amd64.tar.gz
tar -zxvf grafana-6.2.5.linux-amd64.tar.gz
cd grafana-6.2.5/

In the defaults.ini file in the conf directory, change the Grafana listening port to 55555 and run the following command:

./bin/grafana-server web

Then visit http://<server IP address>:55555 in a browser to verify that Grafana has been installed successfully. The default user name and password are admin and admin. You are required to change the password at first login. After the password is changed, the following page is displayed:

Clicking the Add Data Source button will ask you to select the data source.

Select the data source as Prometheus and set the address of the data source to the address of the Prometheus started in the previous step.

Back on the home screen, you are asked to create a new dashboard.

Click Create Dashboard. You can create a dashboard manually or by importing a configuration file. The RocketMQ dashboard configuration file has been uploaded to the Grafana website.

Click the New Dashboard dropdown button.

Select Import Dashboard.

At this point, you can go to the Grafana website and download the dashboard configuration file already created for RocketMQ. The address is https://grafana.com/dashboards/10477/revisions, as shown in the figure below:

Click Download to download the configuration file, then copy its content and paste it into the paste area shown in the figure above.

Finally, the configuration file is imported into Grafana as described above.

The final result looks like this:

About the authors

Chen Houdao formerly worked for Tencent, Shanda, Douyu, and other Internet companies, and currently works at Suntech, responsible for the design and development of infrastructure, with in-depth study of distributed message queues, microservice architecture and its adoption, DevOps, and monitoring platforms.

Feng Qing formerly worked at Huawei and currently works at Suntech, responsible for the development of basic components in the infrastructure team. This article is original content of Alibaba Cloud and may not be reproduced without permission.