Preface

To optimize the performance of our Node services, we first need to collect application performance data, so we went looking for an APM (application performance monitoring) tool suited to developers and surveyed several that support Node.js. Before covering each tool, here is what we want from one: it should detect slow HTTP requests and show the state of the Node service (including memory and CPU usage) and of the database (MongoDB in our case), so that developers can pinpoint why a request is slow.

The trial reports below evaluate each tool against this goal.

Software list

The APM tools we tried are:

| Name | Description | Open source / pricing |
| --- | --- | --- |
| atatus | Supports multiple languages, including the front end | Not open source, paid |
| newrelic | Supports multiple languages, including the front end | Open source, paid |
| keymetrics | PM2, Node application manager | Open source, free quota |
| Pandora.js | Made by Alibaba, Node application manager | Open source, free (self-hosted service) |
| alinode | Low-level Node monitoring | Open source, free (instance configured on Alibaba Cloud) |
| statsd + graphite + grafana | Three-piece monitoring stack, very flexible to build on | Open source, self-hosted service |

1. Atatus

Atatus provides an npm package, atatus-node, for collecting monitoring data. For details, see the official documentation.

Trial report: the sampling rate is not 100%, and the agent did not collect the range of dimensions advertised on the official website. With the Koa framework, we could see little more than the total HTTP response time.

2. New Relic

New Relic (newrelic.com) is basically the same as Atatus and also provides an npm package, newrelic.

3. Keymetrics

Keymetrics leans towards application life-cycle management. It shows the CPU and memory usage of the Node service and can collect errors (crashes), but access from within China is very slow.

4. Pandora.js

Pandora.js is an application monitoring and management tool for Node.js. It integrates capabilities such as monitoring, link tracing, debugging, and process management.

It is an interesting project. In short, it is powerful but immature, and that applies both to Pandora.js itself and to the Node ecosystem around it.

(1) Application management

This is comparable to PM2, so we will not go into it here.

(2) Application metrics

Operating system metrics include load, CPU, memory, disk, network, and TCP. Node.js metrics cover memory usage. You can also define custom metrics, much as StatsD does, but the storage and presentation that follow are very basic and not as convenient as StatsD + Graphite + Grafana.
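To make these metric categories concrete, here is a minimal sketch that reads a few of them with Node's built-in os and process modules. It does not use Pandora.js's own API; the function name collectMetrics and the field names are made up for illustration.

```javascript
const os = require('os');

// Collect a snapshot of OS-level and Node.js process-level metrics
// using only Node built-ins (this is NOT Pandora.js's API).
function collectMetrics() {
  const mem = process.memoryUsage();
  return {
    // 1-, 5- and 15-minute load averages (always [0, 0, 0] on Windows)
    load: os.loadavg(),
    // System-wide memory, in bytes
    freeMemBytes: os.freemem(),
    totalMemBytes: os.totalmem(),
    // Memory of this Node.js process, in bytes
    rssBytes: mem.rss,
    heapUsedBytes: mem.heapUsed,
    heapTotalBytes: mem.heapTotal,
  };
}

console.log(collectMetrics());
```

Calling collectMetrics() on a timer and shipping the result to a metrics backend is the usual pattern. Disk, network, and TCP figures are not exposed this directly by Node and would need extra tooling.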

(3) Link tracing

This is an “advanced” feature provided by Pandora.js. In theory, link tracing can show where an HTTP request spends its time. For example, a GET request may query the database and call third-party services, and link tracing records the time spent in each of those steps.

The implementation of link tracing relies on an experimental Node feature called async_hooks. Because the feature is new, it still has many problems. For example, the Promise implementations in mongoose and superagent break async_hooks. For details, see the issue we filed against Pandora.js. In other words, if you happen to use these two libraries, Pandora.js link tracing will not work.
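The idea behind this kind of tracing can be sketched with AsyncLocalStorage, a Node built-in that is itself built on async_hooks. This is not Pandora.js's implementation. The helpers span, queryDb, callThirdParty, and handleRequest are hypothetical, and the timers merely stand in for real I/O.

```javascript
const { AsyncLocalStorage } = require('async_hooks');

// One store per incoming request; it follows the request across awaits.
const traceStore = new AsyncLocalStorage();

// Run `fn` as a named span and record its duration on the current trace.
async function span(name, fn) {
  const start = process.hrtime.bigint();
  try {
    return await fn();
  } finally {
    const ms = Number(process.hrtime.bigint() - start) / 1e6;
    const trace = traceStore.getStore();
    if (trace) trace.spans.push({ name, ms });
  }
}

// Hypothetical stand-ins for a database query and a third-party call.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
const queryDb = () => sleep(20);
const callThirdParty = () => sleep(10);

// Handle one "request" inside its own trace context.
function handleRequest(url) {
  const trace = { url, spans: [] };
  return traceStore.run(trace, async () => {
    await span('db.query', queryDb);
    await span('http.thirdParty', callThirdParty);
    return trace;
  });
}

handleRequest('/users').then((trace) => console.log(trace));
```

Note that span never receives the trace as an argument; it finds it through the async context. That lookup is what fails when a library schedules callbacks in a way the runtime cannot follow, which is the kind of breakage described above.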

5. alinode

alinode is the runtime behind the Node.js Performance Platform (https://cn.aliyun.com/product/nodejs). It collects information at the runtime level rather than at the application layer like the tools above, so it can monitor process data, heap snapshots, heap timelines, CPU profiles, GC traces, and other very low-level information. This is very helpful if your performance bottleneck is in the Node service itself. If the bottleneck is in the database, use a database monitoring tool instead.
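For a feel of what runtime-level data looks like, stock Node exposes some of it through the built-in v8 module. The sketch below is not alinode's API, and the function name heapReport and its field names are assumptions for illustration.

```javascript
const v8 = require('v8');

// Summarize V8 heap usage overall and per heap space (new space, old space, ...).
function heapReport() {
  const total = v8.getHeapStatistics();
  const spaces = v8.getHeapSpaceStatistics().map((s) => ({
    name: s.space_name,
    usedBytes: s.space_used_size,
    sizeBytes: s.space_size,
  }));
  return {
    usedHeapBytes: total.used_heap_size,
    heapLimitBytes: total.heap_size_limit,
    spaces,
  };
}

console.log(heapReport());
```

Stock Node can also write a heap snapshot with v8.writeHeapSnapshot() and record a CPU profile with the --cpu-prof flag. A platform like alinode adds value by collecting this kind of data continuously and presenting it in one place.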

6. StatsD + Graphite + Grafana

The interesting thing about this three-piece stack is that you define the monitoring metrics yourself. Koala uses it to monitor request processing time, recording how long each request takes. With that data in Grafana, you can quickly see the application's current request volume and trend, and quickly work out which interfaces are slow.
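Recording per-request processing time can be sketched as a Koa-style middleware that emits StatsD timer lines in the form name:value|ms. This is an illustration, not the setup described in this post. The metric naming scheme and the helpers formatTiming, metricName, timingMiddleware, and udpSender are assumptions; 127.0.0.1:8125 is StatsD's default listening address.

```javascript
const dgram = require('dgram');

// Format a StatsD timer metric: "<name>:<value>|ms".
function formatTiming(name, ms) {
  return `${name}:${Math.round(ms)}|ms`;
}

// Build a metric name such as "api.GET.users_42" from method and path.
// The naming scheme is an assumption; adapt it to your Graphite layout.
function metricName(method, path) {
  const cleaned = path.replace(/^\/+|\/+$/g, '').replace(/[^A-Za-z0-9]+/g, '_');
  return `api.${method}.${cleaned || 'root'}`;
}

// Koa-style middleware that times each request and hands the metric to `send`.
function timingMiddleware(send) {
  return async (ctx, next) => {
    const start = process.hrtime.bigint();
    try {
      await next();
    } finally {
      const ms = Number(process.hrtime.bigint() - start) / 1e6;
      send(formatTiming(metricName(ctx.method, ctx.path), ms));
    }
  };
}

// Fire-and-forget UDP sender for StatsD lines.
function udpSender(host = '127.0.0.1', port = 8125) {
  const socket = dgram.createSocket('udp4');
  socket.on('error', () => {}); // metrics must never crash the service
  socket.unref(); // do not keep the process alive just for metrics
  return (line) => socket.send(line, port, host);
}

// Usage in a Koa app (not executed here):
//   app.use(timingMiddleware(udpSender()));
```

Paths that contain IDs produce one metric per ID, so in practice you would probably name metrics after the route pattern rather than the raw path.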

(Figure: trend in request volume per interface)

(Figure: slowest interfaces)

The advantage of this stack is that Grafana provides very intuitive charts. We will cover how to set it up in detail in a later post.

Conclusion

Coming back to our goal of “helping developers figure out exactly why requests are processed slowly”: Pandora.js is the ideal fit in principle, but we will have to wait for it to mature. The simplest and most powerful option is the StatsD + Graphite + Grafana stack, and there is no need to consider Atatus, New Relic, and the like. alinode can be useful if you run high-concurrency services on Node.


Copyright belongs to the author of this article. Please do not reprint without authorization.