First acquaintance with MGR

For many of you who are unfamiliar with MGR, MGR(MySQL Group Replication) is a new high availability and high scale solution released by Oracle MySQL in December 2016 with MySQL 5.7.17. Has the following features:

  • High consistency, based on native replication and Paxos protocol group replication technology, and provided in the way of plug-ins, to provide consistent data security;

  • High fault tolerance, as long as not most of the nodes are broken can continue to work, there is an automatic detection mechanism, when different nodes produce resource contention conflict, there will be no error, according to the first come first principle for processing, and built-in automatic brain crack protection mechanism;

  • After a new node is added, the status of the new node is automatically synchronized from other nodes until the status of the new node is consistent with that of other nodes. If a node is removed, other nodes automatically update group information and maintain new group information.

  • High flexibility, with single master mode (Figure 1) and multiple master mode (Figure 2). In single master mode, the master is automatically selected and all update operations are carried out on the master. In multi-master mode, all servers can process updates simultaneously.

Single master mode (Figure 1)

Multi-master mode (Figure 2)

The ARCHITECTURE diagram of MGR is shown as follows: it mainly consists of APIs layer, component layer, responsible for protocol module and API+Paxos engine layer.

Ii. Evolution of MGR Technology

2.1 Primary/Secondary Replication

Traditional MySQL replication provides a simple master-slave replication method by default. This architecture has one master and one or more slaves. When a master node commits a transaction and then asynchronously sends it to another slave node, the slave library reexecutes the relay log content to achieve a consistent master copy. By default, data on all nodes in the cluster is consistent.

MySQL asynchronous replication

2.2 Semi-Synchronous Replication

Asynchronous replication has certain risk of data loss. In version 5.6, MySQL introduced semi-synchronous replication and added a synchronization operation to the synchronous data protocol. In this way, the primary node needs to confirm that at least one secondary node has received and returned an ACK during the COMMIT operation, so that the primary node can correctly commit data.

MySQL semi-synchronous replication

2.3 set of copy

MySQL MGR cluster distributed cluster consisting of at least 3 server nodes, a share-nothing replication scheme, each server node has a complete copy.

MySQL group replication protocol

Iii. Technical characteristics of MGR

3.1 Fault Detection

Group replication comes with a fault detection mechanism that reports which group member is unresponsive and how to determine whether that member is excluded from the cluster group. Fault detection is a distributed service in group replication. Assume that server A does not receive any message from server B within A predetermined period of time. If other members in the group also do not receive any message from server B, confirm that SERVER B is faulty. In this way, other members decide to remove the missing group member from the cluster.

In this case, server B cannot contact other service nodes. As the minimum number of arbitration members cannot be reached, it is in an independent state and cannot provide services externally.

3.2 fault-tolerant

MySQL group replication is built on the Paxos distributed algorithm to provide distributed coordination between different servers. Therefore, it requires the majority of servers to be active to reach the quorum member count to make a decision. This has a direct impact on the number of failures a system can tolerate without affecting itself or its overall function. Number of servers required to tolerate F faults (n) N = 2 x f + 1.

In practice, this means that in order to tolerate a failure, the group must have three servers. If one server fails, there are still two servers that form the majority (two-thirds) to allow the system to continue operating automatically. However, if the second server unexpectedly goes down, the group is locked (with only one server) because there is no majority to reach the election (cannot elect itself). Here is a small table illustrating the above formula:

3.3 Member Management

Group replication manages members based on the Group View (later referred to as the View). The View generally shows the member status of the Group in a period of time. If no member changes during this period, that is, no member joins or exits the Group, the View changes once a member joins or exits the Group. And use the view ID to track the changes and distinguish the time sequence. Let’s take a look at a diagram to demonstrate:

For the ordinal part, the ordinal number of the first view starts from 1 during initialization, and there is only one member for the boot master. In order to initialize the node, the ordinal number of any member joining or exiting in the future needs to be increased by 1. You can query the current view in replication_group_MEMBER_STATS of the Performance_SCHEMA system library.

Iv. MGR Installation experience

Learn about any new technology from deployment, installation is relatively simple, we prepare the following test nodes:

10.10.1.214

10.10.1.217

10.10.6.91

The installation version is 8.0.24. Decompress the installation package and initialize it:

su - mysql
 
wget http:/ / mirrors.ustc.edu.cn/mysql-ftp/Downloads/MySQL-8.0/mysql-8.0.24-linux-glibc2.12-x86_64.tar
 
tar -xf mysql-8.024.-linux-glibc212.-x86_64.tar
 
cd mysql-8.024.-linux-glibc212./bin/mysqld --initialize --datadir=/home/mysql/mysql-8.024.-linux-glibc212.-x86_64/data --basedir=/home/mysql/mysql-8.024.-linux-glibc212.-x86_64
 
./bin/mysqld_safe --defaults-file=conf/my.cnf &
Copy the code

4.1 General Configuration Description

Configuration code

[mysqld]
bind-address=0.0. 0. 0
datadir=/home/mysql/mysql-8.024.-linux-glibc212.-x86_64/data
basedir=/home/mysql/mysql-8.024.-linux-glibc212.-x86_64
port=3306
socket=/home/mysql/mysql-8.024.-linux-glibc212.-x86_64/data/mysqld.sock user=mysql #1
gtid_mode=ON
enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
binlog_checksum=NONE
log_slave_updates=ON
log_bin=binlog
binlog_format=ROW
innodb_buffer_pool_size=1g
# 8.0Transaction_write_set_extraction =XXHASH64 # Start loading group replication plugin_load_add='group_replication.so'Group_replication_group_name ="8d3cebd8-b132-11eb-8529-0242ac130003"Recommended value :off group_replication_STARt_on_boot =off # Port following the local IP address33061Group_replication_local_address = is a user-defined cluster communication port"10.10.1.214:33061"Group_replication_group_seeds = initializes the cluster member list"10.10.1.214:33061,10.10. 1.217:33061,10.10. 6.91:33061"Group_replication_bootstrap_group =off Otherwise, the communication port cannot be accessed. Loose-group_replication_ip_whitelist ='10.10.1.214, 10.10.1.217, 10.10.6.91'
Copy the code

4.2 Single-active Deployment mode

4.2.1 Starting Node Initialization

Mysql > SET SQL_LOG_BIN=0;
 
Query OK, 0 rows affected (0.00 sec) 
 
mysql> CREATE USER rpl_user@'%' IDENTIFIED BY 'password';
 
Query OK, 0 rows affected (0.01 sec) 
 
mysql> GRANT REPLICATION SLAVE ON *.* TO rpl_user@'%';
 
Query OK, 0 rows affected (0.00 sec) 
 
mysql> GRANT BACKUP_ADMIN ON *.* TO rpl_user@'%';
 
Query OK, 0 rows affected (0.00 sec) 
 
mysql> FLUSH PRIVILEGES;
 
Query OK, 0 rows affected (0.00 sec) 
 
mysql> SET SQL_LOG_BIN=1;
 
Query OK, 0 rows affected (0.00 sec)
  
mysql> CHANGE REPLICATION SOURCE TO SOURCE_USER='rpl_user', SOURCE_PASSWORD='password' FOR CHANNEL 'group_replication_recovery';
 
Query OK, 0 rows affected, 2 warnings (0.05 sec)
  
mysql> INSTALL PLUGIN group_replication SONAME 'group_replication.so'; mysql> SHOW PLUGINS; +---------------------------------+----------+--------------------+----------------------+---------+ | Name | Status | Type | Library | License | +---------------------------------+----------+--------------------+----------------------+---------+ | group_replication  | ACTIVE | GROUP REPLICATION | group_replication.so | GPL | + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- + mysql > # start guide node SET  GLOBAL group_replication_bootstrap_group=ON; Query OK,0 rows affected (0.00 sec)
 
mysql> START GROUP_REPLICATION;
 
Query OK, 0 rows affected, 1 warning (2.33 sec)
 
mysql> SET GLOBAL group_replication_bootstrap_group=OFF;
 
Query OK, 0 rows affected (0.00 sec)
 
mysql> SELECT * FROM performance_schema.replication_group_members; +---------------------------+--------------------------------------+-------------------------------------------------+-- -----------+--------------+-------------+----------------+ | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | +---------------------------+--------------------------------------+-------------------------------------------------+-- -----------+--------------+-------------+----------------+ | group_replication_applier | 4cf69361-b22b-11eb-a2c9-fa163ebefc6a |10-10-1-214                                     |        3306 | ONLINE       | PRIMARY     | 8.024.| +---------------------------+--------------------------------------+-------------------------------------------------+-- -----------+--------------+-------------+----------------+Copy the code

4.2.2 Adding a Secondary Node

mysql> SET SQL_LOG_BIN=0;
 
Query OK, 0 rows affected (0.00 sec)
 
mysql> CREATE USER rpl_user@'%' IDENTIFIED BY 'password';
 
Query OK, 0 rows affected (0.01 sec)
 
mysql> GRANT REPLICATION SLAVE ON *.* TO rpl_user@'%';
 
Query OK, 0 rows affected (0.03 sec) 
 
mysql> GRANT BACKUP_ADMIN ON *.* TO rpl_user@'%';
 
Query OK, 0 rows affected (0.00 sec) 
 
mysql> FLUSH PRIVILEGES;
 
Query OK, 0 rows affected (0.00 sec)
 
mysql> SET SQL_LOG_BIN=1;
 
Query OK, 0 rows affected (0.00 sec)
 
mysql> CHANGE REPLICATION SOURCE TO SOURCE_USER='rpl_user', SOURCE_PASSWORD='password' FOR CHANNEL 'group_replication_recovery';
 
Query OK, 0 rows affected, 2 warnings (0.05 sec)
 
mysql> START GROUP_REPLICATION;
 
Query OK, 0 rows affected, 1 warning (3.33Mysql > SELECT * FROM performance_schema.replication_group_members; +---------------------------+--------------------------------------+-------------------------------------------------+-- -----------+--------------+-------------+----------------+ | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | +---------------------------+--------------------------------------+-------------------------------------------------+-- -----------+--------------+-------------+----------------+ | group_replication_applier | 4cf69361-b22b-11eb-a2c9-fa163ebefc6a |10-10-1-214                                     |        3306 | ONLINE       | PRIMARY     | 8.024.         |
 
| group_replication_applier | 53f39dba-b22b-11eb-bfdb-fa163e42784d | 10-10-1-217                                     |        3306 | ONLINE       | SECONDARY   | 8.024.| +---------------------------+--------------------------------------+-------------------------------------------------+-- -----------+--------------+-------------+----------------+2 rows in set (0.00 sec)Mysql > SELECT * FROM performance_schema.replication_group_members; +---------------------------+--------------------------------------+-------------------------------------------------+-- -----------+--------------+-------------+----------------+ | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | +---------------------------+--------------------------------------+-------------------------------------------------+-- -----------+--------------+-------------+----------------+ | group_replication_applier | 4cf69361-b22b-11eb-a2c9-fa163ebefc6a |10-10-1-214                                     |        3306 | ONLINE       | PRIMARY     | 8.024.         |
 
| group_replication_applier | 53f39dba-b22b-11eb-bfdb-fa163e42784d | 10-10-1-217                                     |        3306 | ONLINE       | SECONDARY     | 8.024.         |
 
| group_replication_applier | 56779526-b22b-11eb-a28e-fa163e1f9809 | 10-10-6-91                                      |        3306 | ONLINE       | SECONDARY     | 8.024.| +---------------------------+--------------------------------------+-------------------------------------------------+-- -----------+--------------+-------------+----------------+3 rows in set (0.00 sec)
Copy the code

4.3 Multi-Active Deployment mode

The multi-active deployment mode is similar to the single-active deployment mode. The following operations are performed only when the cluster is joined:

set global group_replication_single_primary_mode=off;
Copy the code

Monotonic are all ON.

4.3.1 Starting Node Initialization

mysql> set global group_replication_single_primary_mode=off;
 
Query OK, 0 rows affected (0.00 sec) 
 
mysql> SET GLOBAL group_replication_bootstrap_group=ON;
 
Query OK, 0 rows affected (0.00 sec) 
 
mysql> start group_replication;
 
Query OK, 0 rows affected, 1 warning (2.16 sec) 
 
mysql> SET GLOBAL group_replication_bootstrap_group=OFF;
 
Query OK, 0 rows affected (0.00 sec) 
 
mysql> select * from  performance_schema.replication_group_members; +---------------------------+--------------------------------------+-------------------------------------------------+-- -----------+--------------+-------------+----------------+ | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | +---------------------------+--------------------------------------+-------------------------------------------------+-- -----------+--------------+-------------+----------------+ | group_replication_applier | 4cf69361-b22b-11eb-a2c9-fa163ebefc6a |10-10-1-214                                     |        3306 | ONLINE       | PRIMARY     | 8.024.| +---------------------------+--------------------------------------+-------------------------------------------------+-- -----------+--------------+-------------+----------------+1 row in set (0.00 sec)
Copy the code

4.3.2 Adding Other Nodes

mysql> set global group_replication_single_primary_mode=off;
 
Query OK, 0 rows affected (0.00 sec) 
 
mysql> START GROUP_REPLICATION;
 
Query OK, 0 rows affected, 1 warning (3.26sec) mysql> select * from performance_schema.replication_group_members; +---------------------------+--------------------------------------+-------------------------------------------------+-- -----------+--------------+-------------+----------------+ | CHANNEL_NAME | MEMBER_ID | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | +---------------------------+--------------------------------------+-------------------------------------------------+-- -----------+--------------+-------------+----------------+ | group_replication_applier | 4cf69361-b22b-11eb-a2c9-fa163ebefc6a |10-10-1-214                                     |        3306 | ONLINE       | PRIMARY     | 8.024.         |
 
| group_replication_applier | 53f39dba-b22b-11eb-bfdb-fa163e42784d | 10-10-1-217                                     |        3306 | ONLINE       | PRIMARY     | 8.024.         |
 
| group_replication_applier | 56779526-b22b-11eb-a28e-fa163e1f9809 | 10-10-6-91                                      |        3306 | ONLINE       | PRIMARY     | 8.024.| +---------------------------+--------------------------------------+-------------------------------------------------+-- -----------+--------------+-------------+----------------+3 rows in set (0.00 sec)
Copy the code

4.4 Test Experience

Mysql > create database; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | sys  | +--------------------+4 rows in set (0.00 sec) 
 
mysql> create database test;
 
Query OK, 1 row affected (0.01 sec)Mysql > show databases; +--------------------+ | Database | +--------------------+ | information_schema | | mysql | | performance_schema | | sys  | | test | +--------------------+5 rows in set (0.00 sec)
Copy the code

5. Application scenarios

  • In a 2N+1 cluster, as long as N+1 nodes of the cluster are still alive, the database can stably provide services to the outside. It is suitable for financial scenarios, where data must be lost at zero and the availability is at four or even five n9.

  • It is applicable to replace the current master/slave high availability version to solve the single point of write problem.

  • Infrastructure environments that respond to business needs with flexible extension nodes, such as private clouds.

Six, summarized

Although MySQL was launched in 2016 MGR this feature, at the same time we also know there are a lot of benefits, and there are bold company USES to test even deployment of online environment, according to public information netease, drops are used, some domestic commercial Banks have used, but there are still many people are in a state of wait-and-see, basically have the following reasons:

The demand is not particularly strong

  • Semi-synchronous and asynchronous replication with MySQL is sufficient to meet service requirements in many service scenarios, and MHA third-party components meet the requirements in most scenarios.

Distributed new thing

  • The concept of self-distribution has existed for many years, but there are still many unsolved bugs when we search the official bug library due to the short launch of MGR. It is difficult for users to troubleshoot problems, and the difficulty of reoccurrence of problems due to distributed design is also an obstacle.

Ecological immaturity

  • Few official solutions are fully developed for building the entire high availability architecture, and a more mature ecosystem is needed for large-scale use.

Anything new has a process of acceptance. It just takes time to filter and polish.

Reference documentation

Dev.mysql.com/doc/refman/…

Author: Liu Shilin, Vivo Internet Database Team