
Author: Lao Zhang

Original: http://blog.51cto.com/sumongodb/1955184

Every MySQL DBA needs a good set of tools. They make it much easier to manage our databases and make our work far more efficient.

The tool in question is Percona Toolkit (a sharp Swiss Army knife), a collection of command-line utilities from Percona, the company behind an important MySQL branch. Below are the ones we use most often in production.

Toolkit download address: https://www.percona.com/downloads/percona-toolkit/LATEST/


Installation is simple: since this is a binary package, just decompress it with tar -zxvf percona-toolkit-3.0.3_x86_64.tar.gz and run the commands directly from the percona-toolkit-3.0.3/bin directory.
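If you are setting up a fresh host, keep in mind that the toolkit commands are Perl scripts and need the Perl MySQL driver. A minimal sketch, assuming a CentOS/RHEL-style host (package names may differ on your distribution):

# the toolkit commands are Perl scripts, so install the Perl MySQL driver first
yum install -y perl-DBI perl-DBD-MySQL perl-Time-HiRes perl-IO-Socket-SSL
tar -zxvf percona-toolkit-3.0.3_x86_64.tar.gz
cd percona-toolkit-3.0.3/bin
./pt-online-schema-change --version    # quick sanity check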

Tip 1:

pt-online-schema-change

Function: perform online table-structure changes, such as reclaiming fragmentation and adding columns or indexes to large tables, while avoiding the table locks that would block reads and writes. (On MySQL 5.7 much DDL can already be done online without this command.)

The demonstration goes like this. Since this is a test environment I won't create a table with a huge amount of data; the point is to show the process. Here are the row count and the structure of the table:

mysql> select count(*) from su;
+----------+
| count(*) |
+----------+
|   100000 |
+----------+
1 row in set (0.03 sec)

mysql> desc su;
+-------+------------------+------+-----+-------------------+-----------------------------+
| Field | Type             | Null | Key | Default           | Extra                       |
+-------+------------------+------+-----+-------------------+-----------------------------+
| id    | int(10) unsigned | NO   | PRI | NULL              | auto_increment              |
| c1    | int(11)          | NO   |     | 0                 |                             |
| c2    | int(11)          | NO   |     | 0                 |                             |
| c3    | int(11)          | NO   |     | 0                 |                             |
| c4    | int(11)          | NO   |     | 0                 |                             |
| c5    | timestamp        | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| c6    | varchar(200)     | NO   |     |                   |                             |
+-------+------------------+------+-----+-------------------+-----------------------------+

Procedure for adding fields online:

[root@node3 bin]# ./pt-online-schema-change --user=root --password=root123 --host=localhost --alter="ADD COLUMN city_id INT" D=test,t=su --execute
No slaves found.  See --recursion-method if host node3 has slaves.
Not checking slave lag because no slaves were found and --check-slave-lag was not specified.
Operation, tries, wait:
  analyze_table, 10, 1
  copy_rows, 10, 0.25
  create_triggers, 10, 1
  drop_triggers, 10, 1
  swap_tables, 10, 1
  update_foreign_keys, 10, 1
Altering `test`.`su`...
Creating new table...
Created new table test._su_new OK.
Altering new table...
Altered `test`.`_su_new` OK.
2017-08-10T14:53:59 Creating triggers...
2017-08-10T14:53:59 Created triggers OK.
2017-08-10T14:53:59 Copying approximately 100163 rows...
2017-08-10T14:54:00 Copied rows OK.
2017-08-10T14:54:00 Analyzing new table...
2017-08-10T14:54:00 Swapping tables...
2017-08-10T14:54:00 Swapped original and new tables OK.
2017-08-10T14:54:00 Dropping old table...
2017-08-10T14:54:00 Dropped old table `test`.`_su_old` OK.
2017-08-10T14:54:00 Dropping triggers...
2017-08-10T14:54:00 Dropped triggers OK.
Successfully altered `test`.`su`.

Checking the table structure again, you can see the city_id column has been added:

mysql> desc su;
+---------+------------------+------+-----+-------------------+-----------------------------+
| Field   | Type             | Null | Key | Default           | Extra                       |
+---------+------------------+------+-----+-------------------+-----------------------------+
| id      | int(10) unsigned | NO   | PRI | NULL              | auto_increment              |
| c1      | int(11)          | NO   |     | 0                 |                             |
| c2      | int(11)          | NO   |     | 0                 |                             |
| c3      | int(11)          | NO   |     | 0                 |                             |
| c4      | int(11)          | NO   |     | 0                 |                             |
| c5      | timestamp        | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| c6      | varchar(200)     | NO   |     |                   |                             |
| city_id | int(11)          | YES  |     | NULL              |                             |
+---------+------------------+------+-----+-------------------+-----------------------------+
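Before running a change like this with --execute on a production table, it is worth validating it first with --dry-run, which creates and alters the temporary copy table but does not copy any rows or swap the tables. A minimal sketch against the same test table:

[root@node3 bin]# ./pt-online-schema-change --user=root --password=root123 --host=localhost --alter="ADD COLUMN city_id INT" D=test,t=su --dry-run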

Tip 2:

pt-query-digest 

Function: capture the slowest SQL statements (for example the TOP 10) currently running online. We all know that most database performance problems come down to SQL, so we need to hunt the offenders down like criminals and optimize them in time.

The process is as follows. You can sample the slow SQL over a time window; the --since value can be adjusted as needed.

[root@node3 bin]# ./pt-query-digest --since=24h /data/mysql/slow.log > 1.log

View the report to see which statements are slow and how much time each of them consumes.

Only part of the report is shown below:

cat 1.log
# Profile
# Rank Query ID           Response time Calls R/Call  V/M   Item
# ==== ================== ============= ===== ======= ===== ==============
#    1 0x040ADBE3A1EED0A2 16.8901 87.2%     1 16.8901  0.00 CALL insert_su
#    2 0x8E44F4ED46297D4C  1.3013  6.7%     3  0.4338  0.18 INSERT SELECT test._su_new test.su
#    3 0x12E7CAFEA3145EEF  0.7431  3.8%     1  0.7431  0.00 DELETE su
# MISC 0xMISC              0.4434  2.3%     3  0.1478   0.0 <3 ITEMS>

# Query 1: 0 QPS, 0x concurrency, ID 0x040ADBE3A1EED0A2 at byte 19060 ____
# Scores: V/M = 0.00
# Time range: all events occurred at 2017-08-02 12:12:07
# Attribute    pct   total     min     max     avg     95%  stddev  median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count          2       1
# Exec time     47     18s     18s     18s     18s     18s       0     18s
# Lock time      0   103us   103us   103us   103us   103us       0   103us
# Rows sent      0       0       0       0       0       0       0       0
# Rows examine   0       0       0       0       0       0       0       0
# Query size     0      21      21      21      21      21       0      21
# String:
# Databases    test
# Hosts        localhost
# Users        root
# Query_time distribution
#   1us
#  10us
# 100us
#   1ms
#  10ms
# 100ms
#    1s
#  10s+  ################################################################
call insert_su(50000)\G

As you can see, the report lists each statement's share of the total response time as well as its execution time, so you can tell at a glance which statements are problematic. (Only one SQL statement is shown here.)
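A few other pt-query-digest invocations also come in handy. This is just a sketch; adjust the log path and thresholds to your environment:

# analyze a fixed time window instead of the last 24 hours
./pt-query-digest --since='2017-08-10 00:00:00' --until='2017-08-10 12:00:00' /data/mysql/slow.log > window.log

# keep only the top 5 statements in the report
./pt-query-digest --limit=5 /data/mysql/slow.log > top5.log

# only analyze queries from a given user (--filter takes Perl code run against each event)
./pt-query-digest --filter '($event->{user} || "") =~ m/^root/' /data/mysql/slow.log > root.log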

Tip 3:

pt-heartbeat function: monitors master-slave replication delay, i.e. how far the slave lags behind the master. Environment: 192.168.56.132 is the master, 192.168.56.133 is the slave.

Run the following on the master:

[root@node3 bin]# ./pt-heartbeat --database test --update --create-table --daemonize -uroot -proot123

Here test is the database used for monitoring; a heartbeat table is created under it, and the background daemon updates that table at regular intervals.
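If you want to see what the daemon is writing, you can query the table directly. A minimal sketch (the exact column set can vary slightly between toolkit versions):

mysql> select ts, server_id from test.heartbeat;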

Then run the monitor on the slave (node4). --master-server-id is the server-id of the master, and -h is the master's IP address:

[root@node4 bin]# ./pt-heartbeat --master-server-id=1323306 --monitor --database test -uzs -p123456 -h 192.168.56.132
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]
0.00s [  0.00s,  0.00s,  0.00s ]

The delay is 0s, so there is currently no replication lag.
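If you only want a one-off reading rather than a continuously refreshing display, --check prints the current delay once and exits, and --stop asks a daemonized updater to quit. A sketch using the same connection options as above:

# one-shot check of the current delay
./pt-heartbeat --master-server-id=1323306 --check --database test -uzs -p123456 -h 192.168.56.132

# run this on the master to stop the daemonized --update process started earlier
./pt-heartbeat --stop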

Tip 4:

pt-table-checksum function: checks master-slave replication consistency.

How it works: checksum queries are executed on the master and replicate to the slave; the this_crc/this_cnt values computed on the slave are then compared with the master_crc/master_cnt values recorded from the master to determine whether replication is consistent.

To check the test database for differences, run this on the master:

[root@node3 bin]# ./pt-table-checksum --no-check-binlog-format --nocheck-replication-filters --databases=test --replicate=test.checksums --host=192.168.56.132 -uzs -p123456
            TS ERRORS DIFFS ROWS SKIPPED  TIME TABLE
08-10T16:01:02      0     0   11       0 0.013 test.heartbeat
08-10T16:01:02      0     0    1       0 0.015 test.su
08-10T16:01:02      0     0    1       0 0.011 test.t

DIFFS is 0, so the test database is identical on the master and the slave.

To see which tables in the test database differ (you need to add --replicate-check-only), run this on the master:

[root@node3 bin]# ./pt-table-checksum --no-check-binlog-format --nocheck-replication-filters --databases=test --replicate=test.checksums --replicate-check-only --host=192.168.56.132 -uzs -p123456
Differences on node4
TABLE CHUNK CNT_DIFF CRC_DIFF CHUNK_INDEX LOWER_BOUNDARY UPPER_BOUNDARY
test.t 1 1 1

Table t in the test database is inconsistent between the master and the slave.
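When you already suspect a particular table, the check can be narrowed with --tables so only that table is checksummed. A sketch with the same connection options as above:

[root@node3 bin]# ./pt-table-checksum --no-check-binlog-format --nocheck-replication-filters --databases=test --tables=t --replicate=test.checksums --host=192.168.56.132 -uzs -p123456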

Tip 5:

pt-slave-restart function: watches the slave for replication errors and tries to restart replication. Note: skipping errors is only a fix when the slave has extra rows (error code 1062, duplicate key). If the slave is missing rows (error code 1032), skipping the error does not solve the root cause of the synchronization problem; you have to find out which data is missing, and if too much is missing it is better to rebuild the replication environment.

The slave reports error 1062:

Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Errno: 1062
Last_Error: Could not execute Write_rows event on table test.t; Duplicate entry '1' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000006, end_log_pos 757482

Run the following on the slave:

[root@node4 bin]# ./pt-slave-restart -uroot -proot123 --error-numbers=1062
2017-08-10T16:28:12 p=...,u=root node4-relay-bin.000002 751437 1062

After skipping the error, check the replication status again:

Slave_IO_Running: Yes
Slave_SQL_Running: Yes

The synchronization status is consistent again.
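For unattended use the tool can also be left running in the background so it keeps skipping the listed error codes as they appear. A sketch; --daemonize and --log are documented pt-slave-restart options, but verify them against your toolkit version:

[root@node4 bin]# ./pt-slave-restart -uroot -proot123 --error-numbers=1062 --daemonize --log=/tmp/pt-slave-restart.log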

Tip 6:

pt-ioprofile function: helps locate I/O problems by breaking down a process's I/O throughput per file.

[root@node3 bin]# ./pt-ioprofile
Thu Aug 10 16:33:47 CST 2017
Tracing process ID 3907
     total       read     pwrite      write      fsync filename
 13.949355   0.839006   0.000000   0.286556  12.823793 /data/mysql/mysql-bin.000006
  7.454844   0.000000   2.913702   0.000000   4.541142 /data/mysql/ib_logfile0
  0.000193   0.000000   0.000000   0.000193   0.000000 /data/mysql/slow.log

read: reads data from a file; the file is identified by a file descriptor and the data is read into a pre-allocated buffer.
write: writes data from the buffer to a file.
pread: equivalent to calling lseek and then read as one bound, atomic operation; this avoids the synchronization problem of the kernel possibly suspending the process between separate lseek and read calls.
pwrite: equivalent to calling lseek and then write as one bound, atomic operation, avoiding the same synchronization problem between lseek and write.
fsync: ensures that all changes to the file have been correctly flushed to disk; the call blocks until the device reports that the I/O has completed.
filename: the name of the file interacting with the disk.

From this report you can see which file spends the most time in I/O and interacts with the disk the most, which makes it easy to pinpoint I/O problems.
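By default pt-ioprofile samples mysqld for a short period and aggregates time per file; the target process and the sampling window can be adjusted. A sketch; --profile-pid, --run-time and --cell are documented pt-ioprofile options, but check --help on your version:

# profile a specific PID for 60 seconds and aggregate by bytes instead of time
./pt-ioprofile --profile-pid=3907 --run-time=60 --cell=sizes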

The toolkit contains many more commands; today I have only introduced the more commonly used ones, and you can explore the rest on your own. Official documentation: https://www.percona.com/doc/percona-toolkit/LATEST/index.html
