Usage Scenarios:

Based on personal experience and practice, the conclusions are as follows:

1. Monitoring floating values with a natural (physical) upper bound, such as physical memory usage or the size of collections and maps.
2. Monitoring floating values with a logical upper bound, such as backlogged messages or backlogged tasks (e.g. in a thread pool); essentially this is again monitoring the size of a collection or map.

For a more practical example, suppose you want to send a message or push notification to a logged-in user. The message is placed on a blocking queue, and a consumer thread takes messages off the queue for further processing:

```java
public class GaugeMain {

    private static final MeterRegistry MR = new SimpleMeterRegistry();
    private static final BlockingQueue<Message> QUEUE = new ArrayBlockingQueue<>(500);
    private static final BlockingQueue<Message> REAL_QUEUE;

    static {
        REAL_QUEUE = MR.gauge("messageGauge", QUEUE, Collection::size);
    }

    public static void main(String[] args) throws Exception {
        consume();
        Message message = new Message();
        message.setUserId(1L);
        message.setContent("content");
        REAL_QUEUE.put(message);
    }

    private static void consume() {
        new Thread(() -> {
            while (true) {
                try {
                    Message message = REAL_QUEUE.take();
                    // handle the message
                    System.out.println(message);
                } catch (InterruptedException e) {
                    // no-op
                }
            }
        }).start();
    }
}
```

The example above is deliberately simplistic and only demonstrates the usage; do not use it as-is in a production environment.

TimeGauge

TimeGauge is a specialization of Gauge. Compared to Gauge, its builder takes an additional TimeUnit parameter that specifies the base time unit of the value produced by the ToDoubleFunction:

```java
public class TimeGaugeMain {

    private static final SimpleMeterRegistry R = new SimpleMeterRegistry();

    public static void main(String[] args) throws Exception {
        AtomicInteger count = new AtomicInteger();
        TimeGauge.Builder<AtomicInteger> timeGauge =
                TimeGauge.builder("timeGauge", count, TimeUnit.SECONDS, AtomicInteger::get);
        timeGauge.register(R);
        count.addAndGet(10086);
        print();
        count.set(1);
        print();
    }

    private static void print() {
        Search.in(R).meters().forEach(each -> {
            StringBuilder builder = new StringBuilder();
            builder.append("name:").append(each.getId().getName())
                   .append(",tags:").append(each.getId().getTags())
                   .append(",type:").append(each.getId().getType())
                   .append(",value:").append(each.measure());
            System.out.println(builder.toString());
        });
    }
}
```

The output is:

```
name:timeGauge,tags:[],type:GAUGE,value:[Measurement{statistic='value', value=10086.0}]
name:timeGauge,tags:[],type:GAUGE,value:[Measurement{statistic='value', value=1.0}]
```

DistributionSummary

Summary is primarily used to track the distribution of events; in Micrometer, the corresponding class is DistributionSummary. It is used in much the same way as a Timer, but the values it records are not tied to a time unit.

A common usage scenario: use DistributionSummary to measure the payload size of requests hitting the server. A DistributionSummary instance can be created directly from a MeterRegistry:

```java
DistributionSummary summary = registry.summary("response.size");
```

Or created via the fluent builder:

```java
DistributionSummary summary = DistributionSummary.builder("response.size")
        .description("a description of what this summary does") // optional
        .baseUnit("bytes")      // optional
        .tags("region", "test") // optional
        .scale(100)             // optional
        .register(registry);
```
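As a quick illustration of the API described above, here is a hedged sketch of recording values and reading back the aggregated statistics (the payload sizes are invented illustration values):

```java
import io.micrometer.core.instrument.DistributionSummary;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class SummaryQuickstart {

    public static void main(String[] args) {
        SimpleMeterRegistry registry = new SimpleMeterRegistry();
        DistributionSummary summary = registry.summary("response.size");
        // record the sizes in bytes of three simulated response payloads
        summary.record(256);
        summary.record(512);
        summary.record(1024);
        // count/totalAmount/max expose the aggregated distribution
        System.out.println(summary.count() + " " + summary.totalAmount() + " " + summary.max());
    }
}
```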

Many of the build parameters in DistributionSummary are related to scaling and histogram representation, as shown in the next section.

Usage Scenarios:

Based on personal experience and practice, the conclusions are as follows:

1. Recording values that are independent of time units, such as server payload sizes, cache hit ratios, etc.

For a more concrete example (using the statistics of a Guava LoadingCache):

```java
public class DistributionSummaryMain {

    private static final DistributionSummary DS = DistributionSummary.builder("cacheHitPercent")
            .register(new SimpleMeterRegistry());

    private static final LoadingCache<String, String> CACHE = CacheBuilder.newBuilder()
            .maximumSize(1000)
            .recordStats()
            .expireAfterWrite(60, TimeUnit.SECONDS)
            .build(new CacheLoader<String, String>() {
                @Override
                public String load(String s) throws Exception {
                    return selectFromDatabase();
                }
            });

    public static void main(String[] args) throws Exception {
        String key = "doge";
        String value = CACHE.get(key);
        record();
    }

    private static void record() {
        CacheStats stats = CACHE.stats();
        BigDecimal hitCount = new BigDecimal(stats.hitCount());
        BigDecimal requestCount = new BigDecimal(stats.requestCount());
        DS.record(hitCount.divide(requestCount, 2, BigDecimal.ROUND_HALF_DOWN).doubleValue());
    }

    private static String selectFromDatabase() {
        // placeholder for a real database lookup
        return "value";
    }
}
```

Histogram and percentile configuration

Histogram and percentile configuration apply to both Summary and Timer. They are relatively complex and will be supplemented after more thorough research.
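Until then, here is a brief hedged sketch of the builder options involved, assuming a `registry` as in the earlier examples (the concrete percentile values are placeholders):

```java
DistributionSummary summary = DistributionSummary.builder("response.size")
        .publishPercentiles(0.5, 0.95, 0.99) // percentiles computed client-side
        .publishPercentileHistogram()        // ship histogram buckets so the backend can aggregate percentiles
        .register(registry);
```

The same two methods exist on Timer's builder.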

Integration based on SpringBoot, Prometheus and Grafana

JVM applications that integrate the Micrometer framework collect metrics in memory via the Micrometer API. Therefore, an additional storage system is needed to persist these metrics, a monitoring system to collect and process the data, and a UI tool to display it; after all, most people only want to look at polished charts and dashboards.

A common choice of storage system is a time-series database; mainstream options include InfluxDB, Datadog, etc. The dominant monitoring system, mainly responsible for data collection and processing, is Prometheus. The most widely used UI tool at the moment is Grafana.

In addition, Prometheus already ships with a built-in time-series database, so a reasonably complete metric monitoring system needs only the target JVM application, the Prometheus component and the Grafana component. Let's take a moment to build one from scratch, using CentOS 7.

Using Micrometer in SpringBoot

The spring-boot-starter-actuator dependency in SpringBoot integrates Micrometer, and its metrics endpoint relies on Micrometer for much of its functionality; the prometheus endpoint is also enabled by default. In fact, the spring-boot-actuator-autoconfigure module that the actuator depends on ships with out-of-the-box Micrometer integration for many frameworks.

The micrometer-registry-prometheus package adds Prometheus support, so that an application using the actuator can easily expose a prometheus endpoint to act as a client for Prometheus to scrape. Through this endpoint, the Prometheus server can collect the application's Micrometer metrics.
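For reference, a hedged sketch of what the prometheus endpoint's scrape output looks like, in the Prometheus text exposition format (the metric name follows Micrometer's convention of mapping a counter named `order.count` to `order_count_total`; the tag value and count are illustrative):

```
# HELP order_count_total
# TYPE order_count_total counter
order_count_total{order_channel="app",} 10.0
```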

We first introduce spring-boot-starter-actuator and spring-boot-starter-web, and implement a Counter and a Timer as examples. The dependencies:

```xml
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-dependencies</artifactId>
            <version>2.1.0.RELEASE</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-aop</artifactId>
    </dependency>
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <version>1.16.22</version>
    </dependency>
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-registry-prometheus</artifactId>
        <version>1.1.0</version>
    </dependency>
</dependencies>
```

```java
// entities
@Data
public class Message {

    private String orderId;
    private Long userId;
    private String content;
}

@Data
public class Order {

    private String orderId;
    private Long userId;
    private Integer amount;
    // channel is used as a counter tag below
    private String channel;
    private LocalDateTime createTime;
}

@RestController
public class OrderController {

    @Autowired
    private OrderService orderService;

    @PostMapping(value = "/order")
    public ResponseEntity<Boolean> createOrder(@RequestBody Order order) {
        return ResponseEntity.ok(orderService.createOrder(order));
    }
}

@Slf4j
@Service
public class OrderService {

    private static final Random R = new Random();

    @Autowired
    private MessageService messageService;

    public Boolean createOrder(Order order) {
        int ms = R.nextInt(50) + 50;
        try {
            TimeUnit.MILLISECONDS.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        log.info("Simulated order saving took {} ms...", ms);
        Metrics.counter("order.count", "order.channel", order.getChannel()).increment();
        Message message = new Message();
        message.setContent("simulated SMS...");
        message.setOrderId(order.getOrderId());
        message.setUserId(order.getUserId());
        messageService.sendMessage(message);
        return true;
    }
}

@Slf4j
@Service
public class MessageService implements InitializingBean {

    private static final BlockingQueue<Message> QUEUE = new ArrayBlockingQueue<>(500);
    private static final BlockingQueue<Message> REAL_QUEUE;
    private static final Executor EXECUTOR = Executors.newSingleThreadExecutor();
    private static final Random R = new Random();

    static {
        REAL_QUEUE = Metrics.gauge("message.gauge", Tags.of("message.gauge", "message.queue.size"), QUEUE, Collection::size);
    }

    public void sendMessage(Message message) {
        try {
            REAL_QUEUE.put(message);
        } catch (InterruptedException e) {
            // no-op
        }
    }

    @Override
    public void afterPropertiesSet() throws Exception {
        EXECUTOR.execute(() -> {
            while (true) {
                try {
                    Message message = REAL_QUEUE.take();
                    log.info("Simulated SMS sending, orderId:{}, userId:{}, content:{}, cost:{} ms",
                            message.getOrderId(), message.getUserId(), message.getContent(), R.nextInt(50));
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
            }
        });
    }
}

@Component
@Aspect
public class TimerAspect {

    @Around(value = "execution(* club.throwable.smp.service.*Service.*(..))")
    public Object around(ProceedingJoinPoint joinPoint) throws Throwable {
        Signature signature = joinPoint.getSignature();
        MethodSignature methodSignature = (MethodSignature) signature;
        Method method = methodSignature.getMethod();
        Timer timer = Metrics.timer("method.cost.time", "method.name", method.getName());
        ThrowableHolder holder = new ThrowableHolder();
        Object result = timer.recordCallable(() -> {
            try {
                return joinPoint.proceed();
            } catch (Throwable e) {
                holder.throwable = e;
            }
            return null;
        });
        if (null != holder.throwable) {
            throw holder.throwable;
        }
        return result;
    }

    private static class ThrowableHolder {

        Throwable throwable;
    }
}
```

The YAML configuration is as follows:

```yaml
server:
  port: 9091
management:
  server:
    port: 10091
  endpoints:
    web:
      exposure:
        include: '*'
      base-path: /management
```

Note that since SpringBoot 2.x, the way web endpoint exposure is configured differs considerably from 1.x.

Endpoints must be exposed as web endpoints before they can be accessed. To enable or disable support for an endpoint, use: management.endpoint.${endpointId}.enabled=true/false

You can check the Actuator API documentation for the features of all supported endpoints; this article references the official documentation for version 2.1.0.RELEASE (the link may break in the future). An endpoint must be both enabled and exposed as a web endpoint before it can be accessed at http://{host}:{management.port}/{management.endpoints.web.base-path}/{endpointId}.

The configuration for exposing monitoring endpoints as web endpoints looks like:

```
management.endpoints.web.exposure.include=info,health
management.endpoints.web.exposure.exclude=prometheus
```

management.endpoints.web.exposure.exclude specifies endpoints that should NOT be exposed as web endpoints; separate multiple endpoints with commas. management.endpoints.web.exposure.include defaults to exposing only the info and health endpoints; to expose all endpoints directly, use management.endpoints.web.exposure.include=* (if you use YAML configuration, remember to quote the asterisk: '*'). Exposing all web monitoring endpoints is dangerous. If you must do this in a production environment, make sure http://{host}:{management.port} cannot be reached from the public network (that is, the management port is accessible only from the intranet), which also makes it easier for the Prometheus server to scrape this port.

Installation and configuration of Prometheus

The latest version of Prometheus is 2.5. Since I have not yet explored Docker in depth, we simply download the precompiled package and unpack it:

```
wget github.com/prometheus/…
tar xvfz prometheus-*.tar.gz
cd prometheus-*
```

Next, edit prometheus.yml and configure a scrape job under scrape_configs:

```yaml
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    # Configure the URL path to pull metrics from; here we choose the application's prometheus endpoint.
    metrics_path: /management/prometheus
    static_configs:
      # Configure the host and port here.
      - targets: ['localhost:10091']
```

The configured scrape path is therefore localhost:10091/management/prometheus. Remember that the application from the previous section has already been started on the virtual machine. Next, start the Prometheus application:

The parameter --storage.tsdb.path= specifies where data is stored (./data is the default):

```
./prometheus --config.file=prometheus.yml
```

Prometheus starts on port 9090 by default; a successful startup is confirmed in the logs.

Now visit http://${vm host}:9090/targets to see the jobs Prometheus is currently scraping.

Go to http://${vm host}:9090/graph to find the Meters we defined, as well as some meters already defined for the JVM and Tomcat by spring-boot-starter-actuator.

Let's call the application's /order interface and then look at the order_count_total and method_cost_time_seconds_sum meters defined earlier.

As you can see, the Meter information has been collected and displayed, but it is clearly not detailed or attractive enough yet, so next we use Grafana's UI to polish it.

Installation and use of Grafana

The Grafana installation process is as follows:

```
wget s3-us-west-2.amazonaws.com/grafana-rel…
sudo yum localinstall grafana-5.3.4-1.x86_64.rpm
```

After the installation completes, start the server with `service grafana-server start`. The default port is 3000; visit http://${host}:3000 to open the UI. The initial username and password are both admin, with administrator privileges. Next, add a data source in the Home panel to connect to the Prometheus server so that Grafana can pull metrics from it. The data source panel looks like this:

The data source must point to the host and port of the Prometheus server. After that you can add whatever panels you like; here we add a Graph panel for the order count metric.

When configuring the panel, specify a Title under General:

Next in importance is the Metrics configuration, which specifies the data source and the PromQL query:

It is best to consult the official Prometheus documentation to learn the basics of its query language, PromQL; a single panel can contain multiple PromQL queries.
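For instance, hedged PromQL sketches against the metric names used earlier (the per-second order rate over a five-minute window, and the average method cost):

```
rate(order_count_total[5m])
method_cost_time_seconds_sum / method_cost_time_seconds_count
```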

The two items above are the basic configuration; the other configuration items are mostly auxiliary features, such as alerting and chart display options. We will not expand on them here; their usage can be found on Grafana's official website. Then we call the order interface again; after a while, the chart data updates automatically:

Finally, add the Timer meter used in the project to monitor method execution time. When finished, it looks roughly like this: