introduce

This article is an introduction to using ELK to monitor mysql, basic monitoring of some key indicators, of course, according to the different business, there may be different indicator requirements, but using this method to monitor, the principle will not change, very suitable for entry.

ELK is a very powerful software combination, which is open source on Github and has a surprisingly large number of stars. Interested friends can understand that the learning curve of this tool is quite steep. It is recommended to use simple tools like mysqlBeat mentioned in this paper as a collection tool to start, and do not use the official beat at first. On the one hand, the default configuration reports all data and wastes storage space. On the other hand, the complex nested table structure (Document) makes learning more difficult. More specific reasons will be mentioned later. If you enter a key:value, for example INNODB_PAGE_SIZE:16384, all records with a pagesize of 16KB will be listed. But that doesn’t mean you won’t fall into the trap. There’s still a lot of learning to do with ELK.

Monitoring tools

mysqlbeat

Mysqlbeat is a highly customisable mysql monitoring agent that queries information_schema.global_status and reports it to ElasticSearch for storage and visualization through Kibana. Small amount of code, recommended reading, github address.

The data reported

After installation mainly to the/etc/mysqlbeat/mysqlbeat yml file configured (different platforms path may have differences), has the following Settings:

Output: Address of ElasticSearch cluster (you can also output it to logstash). You can set multiple addresses at the same time, such as hosts: [” 192.168.1.1:9200 192.168.1.2 instead: “, “9200”] the template: ElasticSearch mapping template path, the default is/etc/mysqlbeat/mysqlbeat template. Json, defines the document fields (beginners can be understood as a relational database table structure), if you want to add or modify occasionally fields, Please set the overwrite: The most important part of this template configuration is the Queries field, which defines a series of SQL statements. Mysqlbeat generates a table by executing these statements. This table is the data you want to monitor. It has only two fields, VARIABLE_NAME and VARIABLE_VALUE, which represent the name and value of the monitor you want to monitor, respectively. Value has two types

Type, the first is a difference because the global status of some of the data is continuously accumulate, so to get data in 1 s, need to use the current time to take the value minus the previous interval to take to the value, and then divided by the number of seconds between, of course, these don’t need you to complete, you only need to add a suffix behind monitoring of __DELTA can: CONCAT(VARIABLE_NAME, ‘__DELTA’) The second type is a type like memory values that does not require differential operations. Anyway, it’s easy to use, look at the configuration, and try it right away

The data show

Using this tool, it is easy to configure the visualization of the exported data. I currently use the standard Kibana AS the UI. Here is a comparison of the mysqlBeat visual configuration and the official Beat visual configuration

The X-axis on the left is the time axis, and the Y-axis is the average QPS, which is very clear. Look at the picture on the right, I only show the X axis, the X axis is an external time axis, and there is a filter nested inside. This effect is a division of vertical space, which is not intuitive for beginners. You can imagine that I spent a lot of time to realize this display, but even so, There’s no denying the fact that ElasticSearch itself is pretty powerful.

Monitoring indicators

QPS and TPS

QPS is the number of queries per second, i.e. QUESTIONS field in information_schema.global_status TPS is the number of transactions per second, It is the sum of COM_ROLLBACK and COM_COMMIT in information_schema.global_status

There are two reasons why a “mysql connection error” may occur when using a database

The number of connections reaches the maximum configured memory or the number of threads is insufficient (one thread for each connection), so the following monitoring needs to be set

THREADS_CONNECTED: Specifies the current number of connected connections, as compared to MAX_CONNECTIONS. ABORTED_CONNECTS is used when the number of connections exceeds 80%, or when the number of connections increases abruptly. In this case, one or both of the following indicators will increase. CONNECTION_ERRORS_MAX_CONNECTIONS: The connection fails because the number of current connections exceeds the upper limit. CONNECTION_ERRORS_INTERNAL: The parameter cache is used to check whether the connection fails due to insufficient memory or threads

The importance of caching in the Internet age is immeasurable, the two mainstream database engines InnoDB and MyISAM cache function is different, the former cache includes index and actual data, while MyISAM only cache index, it gives the data cache to the operating system, here we have the same monitoring principle, but the field is different:

Monitoring cache usage Monitoring cache hits Two parameters are required for cache usage, cache usage size and total cache size

MyISAM: KEY_BLOCKED_USED/(KEY_BLOCKED_UNUSED + KEY_BLOCKED_USED) InnoDB: INNODB_BUFFER_POOL_PAGES_DATA/(INNODB_BUFFER_POOL_PAGES_FREE + INNODB_BUFFER_POOL_PAGES_DATA) Cache hits only require cache visits and disk visits. This set of fields is not easy to remember. Remember that the variable name for cache reads is multiple requests suffix than the variable name for disk reads.

MyISAM: read KEY_READ_REQUESTS/(KEY_READS + KEY_READ_REQUESTS); Write hit KEY_WRITE_REQUESTS/(KEY_WRITE_REQUESTS + KEY_WRITES) InnoDB: Cache hit INNODB_BUFFER_POOL_READ_REQUESTS/(INNODB_BUFFER_POOL_READ_REQUESTS + INNODB_BUFFER_POOL_READS) The read/write hit ratio of the cache should be based on the most recent period (say, 10s) (the authors use a cumulative value), so that the data is more realistic, while a large base will flatten the data and make it difficult to detect outbreaks.

TODO

Mysql > update mysqlBeat (‘ mysqlBeat ‘, ‘mysqlBeat’, ‘mysqlBeat’, ‘mysqlBeat’, ‘mysqlBeat’, ‘mysqlBeat’, ‘mysqlBeat’); Use ElastAlert spike monitoring sudden increase or decrease of rule, rule sets the threshold frequency, alarm latency in these circumstances, the priority is lower, generally the upper interface of the monitor latency, because in general, mysql is the bottleneck.