Depots table

An overview of the

  • Relational database itself is relatively easy to become the system bottleneck, single storage capacity, connection number, processing capacity are limited.
  • When the amount of data in a single table reaches 1000W or 100G, the performance of many operations deteriorates seriously even after adding slave libraries and optimizing indexes due to the large number of query dimensions. At this point, it is necessary to consider the shard, the purpose of the shard is to reduce the burden of the database, shorten the query time.
  • The core content of database distribution is nothing more than data Sharding, and the positioning and integration of data after Sharding.
  • Data sharding is to store data in multiple databases, reducing the amount of data in a single database, alleviating the performance problem of a single database by expanding the number of hosts, and thus improving the operation performance of the database.

Data segmentation

Vertical (vertical) segmentation

  • There are two kinds of vertical segmentation: vertical library and vertical table.
  • Vertical branch is to store different tables with low correlation in different databases according to the business coupling. The method is similar to splitting a large system into several small systems, which are independently divided according to service categories.
  • Vertical split table is based on the “column” in the database. If a table has many fields, you can create an extended table and split the fields that are not often used or whose field length is large into the extended table.
  • Advantages of vertical segmentation:
    • Solve the business system level coupling, business clarity
    • Hierarchical management, maintenance, monitoring, and expansion of data of different services
    • In high concurrency scenarios, vertical sharding improves the bottleneck of I/O, database connections, and hardware resources on a single machine
  • Disadvantages of vertical segmentation:
    • Some tables cannot join and can only be solved through interface aggregation, which improves the complexity of development
    • Distributed transaction processing is complex
    • There is still the problem of large amount of data in a single table (horizontal segmentation is required)

Horizontal (horizontal) segmentation

  • When an application is difficult to fine-grained vertical segmentation, or the number of data lines after segmentation is huge, there is a single library read/write and storage performance bottleneck, then horizontal segmentation is needed.
  • Horizontal segmentation is divided into sub-table in the database and sub-table in the database. According to the internal logical relationship of the data in the table, the same table is dispersed to multiple databases or multiple tables according to different conditions. Each table contains only part of the data, so that the amount of data in a single table becomes smaller and the distributed effect is achieved.
  • The database partition table only solves the problem of large amount of data in a single table, but does not distribute the table to the library of different machines, so for reducing the pressure of MySQL database, help is not very big, we still compete with the same physical machine CPU, memory, network IO, best through the partition table to solve.
  • After horizontal shard, the same table will appear in multiple databases/tables with different contents in each library/table.
  • Advantages of horizontal segmentation:
    • There is no performance bottleneck caused by large amount of single database data and high concurrency, which improves system stability and load capacity
    • The application end does not need to split service modules
  • Disadvantages of horizontal segmentation:
    • Transaction consistency across shards is difficult to guarantee
    • Cross-library join associated query performance is poor
    • Data expansion for many times is difficult and maintenance is very large

MyCAT

An overview of the

  • Mycat is a Java-based distributed database system middleware, which provides a solution for distributed storage in high concurrency environment

  • Suitable for massive data writing storage requirements

  • Supports MySQL, Oracle, Sqlserver, and Mongodb

  • Provides data read and write separation services

  • Provides data sharding service

  • Open source software developed based on Alibaba Cobar

Fragmentation rules

  • Sharding – by – intfile enumeration method
  • Fixed fragment rule
  • The scope is auto-sharding-long
  • The mod – long modulus method
  • Date column partitioning sharding-by-date
  • Sharding -by-pattern is used for wildmatching
  • ASCII pattern matching sharding-by-Prefixpattern
  • Programmatically specify sharding-by-substring
  • String split hash resolves sharding-by-stringhash
  • It’s a hash sharding-by-murmur

The working process of the

  1. Parse the tables involved in SQL commands

  2. Then look at the configuration of the table, if there is a sharding rule, then get the value of the SHARding field in the SQL command, and match the sharding function, get the sharding list

  3. The SQL command is then sent to the corresponding database server for execution

  4. Finally, all sharding result data is collected and processed, and returned to the client

Deploy the MyCAT service

Graph LR c(client:192.168.1.10)--> MC (mycat:192.168.1.15) MC -->db1(db1:192.168.1.11) MC --> DB2 (db2:192.168.1.12) graph LR C (client:192.168.1.10)--> MC (mycat:192.168.1.15) MC --> DB1 (db1:192.168.1.11) MC --> DB2 (DB2 :192.168.1.12) MC - > db3 (db2:192.168.1.13)

Configure the MyCat server

  • Deploy the MyCat runtime environment
[root@mycat1 ~]# yum -y install java-1.8.0-openJDK

[root@mycat1 ~]# which java
/usr/bin/java
[root@mycat1 ~]# java -versionOpenjdk version "1.8.0_222-EA" OpenJDK Runtime Environment (build 1.8.0_222-EA-b03 25.222 - b03, mixed mode)Copy the code
  • Installing software Packages
[root@mycat1 ~]# tar xf Mycat - server - 1.6 - RELEASE - 20161028204710 - Linux. Tar. Gz
[root@mycat1 ~]# mv mycat /usr/local/

Set the PATH environment variable
[root@mycat1 ~]# vim /etc/profile.d/mycat.sh
export PATH=$PATH:/usr/local/mycat/bin
[root@mycat1 ~]# source /etc/profile.d/mycat.sh
Copy the code
  • Modify configuration file: Set connection account and logical library
Keep the default Settings
[root@mycat1 ~]# vim /usr/local/mycat/conf/server.xml
The description of the user and logical library in this file is as follows:
<user name="root">		<!-- User name used to connect to mycat service -->
     <property name="password">123456</property>   <!-- password used by user to connect to mycat user -->
     <property name="schemas">TESTDB</property>    <!-- Logical library name -->
</user>
<user name="user">
                <property name="password">user</property>
                <property name="schemas">TESTDB</property>
                <property name="readOnly">true</property>	  <!Mycat is read-only. If you connect to mycat, you can only read the records. If you do not write the records, you can read the records.
</user>
Copy the code
  • Modifying a configuration file: Configure data fragmentation
[root@mycat1 ~]# vim /usr/local/mycat/conf/schema.xml<? The XML version = "1.0"? > <! DOCTYPE mycat:schema SYSTEM "schema.dtd"> <mycat:schema xmlns:mycat="http://io.mycat/"> <schema name="TESTDB" checkSQLschema="false" sqlMaxLimit="100"> <!SQL > select * from TESTDB;
		<!-- auto sharding by id (long) -->
		<table name="travelrecord" dataNode="dn1,dn2,dn3" rule="auto-sharding-long" />  <!SQL > select * from travelRecord;

		<!-- global table is auto cloned to all defined data nodes ,so can join
			with any table whose sharding node is in the same data node -->
		<table name="company" primaryKey="ID" type="global" dataNode="dn1,dn2,dn3" />
		<table name="goods" primaryKey="ID" type="global" dataNode="dn1,dn2,dn3"/ > <!-- random sharding using mod sharind rule -->
		<table name="hotnews" primaryKey="ID" autoIncrement="true" dataNode="dn1,dn2,dn3"
			   rule="mod-long"/ > <!-- <table name="dual" primaryKey="ID" dataNode="dnx,dnoracle2" type="global"
			needAddLimit="false"/> <table name="worker" primaryKey="ID" dataNode="jdbc_dn1,jdbc_dn2,jdbc_dn3"
			rule="mod-long" /> -->
		<table name="employee" primaryKey="ID" dataNode="dn1,dn2,dn3"
			   rule="sharding-by-intfile" />
		<table name="customer" primaryKey="ID" dataNode="dn1,dn2,dn3"
			   rule="sharding-by-intfile">
			<childTable name="orders" primaryKey="ID" joinKey="customer_id"
						parentKey="id">
				<childTable name="order_items" joinKey="order_id"
							parentKey="id" />
			</childTable>
			<childTable name="customer_addr" primaryKey="ID" joinKey="customer_id"
						parentKey="id" />
		</table>
		<!-- <table name="oc_call" primaryKey="ID" dataNode="dn1$0-743" rule="latest-month-calldate"
			/> -->
	</schema>
	<!-- <dataNode name="dn1$0-743" dataHost="localhost1" database="db$0-743"
		/> --
	
        
        
        
        <!Define the host name of the database and the library to store the data.
	*<dataNode name="dn1" dataHost="mysql1" database="db1" />
	<dataNode name="dn2" dataHost="mysql2" database="db2" />
	<dataNode name="dn3" dataHost="mysql3" database="db3"/ > <!Mysql1 host name = mysql1 host name = mysqL1
	<dataHost name="mysql1" maxCon="1000" minCon="10" balance="0"
			  writeType="0" dbType="mysql" dbDriver="native" switchType="1"  slaveThreshold="100">
		<heartbeat>select user()</heartbeat>
		<writeHost host="hostM1" url="192.168.1.11:3306" user="mycatadmin"
				   password="[email protected]">
		</writeHost>
	</dataHost>

	<dataHost name="mysql2" maxCon="1000" minCon="10" balance="0"
			  writeType="0" dbType="mysql" dbDriver="native" switchType="1"  slaveThreshold="100">
		<heartbeat>select user()</heartbeat>
		<writeHost host="hostM2" url="192.168.1.12:3306" user="mycatadmin"
				   password="[email protected]">
		</writeHost>
	</dataHost>

	<dataHost name="mysql3" maxCon="1000" minCon="10" balance="0"
			  writeType="0" dbType="mysql" dbDriver="native" switchType="1"  slaveThreshold="100">
		<heartbeat>select user()</heartbeat>
		<writeHost host="hostM3" url="192.168.1.13:3306" user="mycatadmin"
				   password="[email protected]">
		</writeHost>
	</dataHost>
</mycat:schema>
Copy the code

Configuring the Database Server

[root@mysql1 ~]# mysql -uroot [email protected]
mysql> create database db1 default charset utf8mb4;
mysql> grant all on*. *to mycatadmin@The '%' identified by '[email protected]';

[root@mysql2 ~]# mysql -uroot [email protected]
mysql> create database db2 default charset utf8mb4;
mysql> grant all on*. *to mycatadmin@The '%' identified by '[email protected]';

[root@mysql3 ~]# mysql -uroot [email protected]
mysql> create database db3 default charset utf8mb4;
mysql> grant all on*. *to mycatadmin@The '%' identified by '[email protected]';
Copy the code

Start the MyCat

  • Test connection accounts to 3 database servers on MyCat server
Install the mysql client software
[root@mycat1 ~]# yum install -y mysql-community-client

[root@mycat1 ~]# mysql-h192.168.1.11-umycatadmin [email protected]
[root@mycat1 ~]# mysql-h192.168.1.12 -umycatadmin [email protected]
[root@mycat1 ~]# mysql-h192.168.1.13 -umycatadmin [email protected]
Copy the code
  • Start the service
[root@mycat1 ~]# mycat start
[root@mycat1 ~]# netstat -tlnp | grep :8066
tcp6       0      0 :::8066                 :::*                    LISTEN      13835/java
Copy the code
  • Client connection test
[root@node10 ~]Mysql -h192.168.1.15-p8066 -uroot -p123456
mysql> show databases;
+----------+
| DATABASE |
+----------+
| TESTDB   |
+----------+
1 row in set (0.00 sec)

mysql> use TESTDB;
mysql> show tables;
+------------------+
| Tables in TESTDB |
+------------------+
| company          |
| customer         |
| customer_addr    |
| employee         |
| goods            |
| hotnews          |
| orders           |
| order_items      |
| travelrecord     |
+------------------+
9 rows in set (0.00 sec)
Copy the code

Fragmentation rules

The enumeration method

An overview of the

  • Use the sharding-by-intfile rule
  • Determine the table to use enumeration
  • Find the data shard field name
  • View the function name used by the data sharding
  • Find the configuration file for the function call
  • Modify the function configuration file

configuration

  • View the table using enumeration
[root@mycat1 ~]# grep -B1 sharding-by-intfile /usr/local/mycat/conf/schema.xml
		<table name="employee" primaryKey="ID" dataNode="dn1,dn2,dn3"
			   rule="sharding-by-intfile" />
		<table name="customer" primaryKey="ID" dataNode="dn1,dn2,dn3"
			   rule="sharding-by-intfile">
Copy the code
  • Check the rule file to see the functions used by sharding-by-intfile sharding rules
[root@mycat1 ~]# vim /usr/local/mycat/conf/rule.xml
<tableRule name="sharding-by-intfile">
                <rule>
                        <columns>sharding_id</columns>    <!-- Shard field name -->
                        <algorithm>hash-int</algorithm>   <!-- function name used -->
                </rule>
</tableRule>

<function name="hash-int"
                class="io.mycat.route.function.PartitionByFileMap">
                <property name="mapFile">partition-hash-int.txt</property>   <!-- Function call configuration file -->
</function>
Copy the code
  • Modify the function configuration file and add the DN3 data node
[root@mycat1 ~]# vim /usr/local/mycat/conf/partition-hash-int.txt10000=0 // When the value of sharding_id is 10000, data is stored in data node Dn1. 10010=1 // When the value of sharding_id is 10010, data is stored in data node Dn1. Data is stored in data node DN2. 10020=2 // When sharding_ID is 10020, data is stored in data node DN3Copy the code
  • Restart the service
[root@mycat1 conf]# mycat restart
Copy the code
  • The client connects to the shard server to access data
[root@node10 ~]Mysql -h192.168.1.15-p8066 -uroot -p123456
mysql> use TESTDB;
mysql> create table employee(id int primary key, sharding_id int.name varchar(20), birth_date date);

You must specify a list of field names when storing data
mysql> insert into employee
    -> (id, sharding_id, name, birth_date)
    -> values- > (1.10000.'nb'.'2000-01-01'),
    -> (2.10010.'wk'.'1998-10-01'),
    -> (3.10020.'plj'.'2002-05-04'),
    -> (4.10020.'dmy'.'1990-08-02');
Query OK, 4 rows affected (0.21 sec)

mysql> select * from employee;
+----+-------------+------+------------+
| id | sharding_id | name | birth_date |
+----+-------------+------+------------+| 1 | 10000 | nb | 2000-01-01 | | 2 | 10010 | wk | 1998-10-01 | | 3 | 10020 | plj | 2002-05-04 | | 4 | 10020 | dmy | The 1990-08-02 | +----+-------------+------+------------+
4 rows in set (0.09 sec)

# Check the records on three servers
[root@mysql1 ~]# mysql -uroot [email protected]
mysql> use db1;
mysql> select * from employee;
+----+-------------+------+------------+
| id | sharding_id | name | birth_date |
+----+-------------+------+------------+
|  1 |       10000 | nb   | 2000-01-01 |
+----+-------------+------+------------+
1 row in set (0.00 sec)

[root@mysql2 ~]# mysql -uroot [email protected]
mysql> use db2;
mysql> select * from employee;
+----+-------------+------+------------+
| id | sharding_id | name | birth_date |
+----+-------------+------+------------+
|  2 |       10010 | wk   | 1998-10-01 |
+----+-------------+------+------------+
1 row in set (0.00 sec)

[root@mysql3 ~]# mysql -uroot [email protected]
mysql> use db3;
mysql> select * from employee;
+----+-------------+------+------------+
| id | sharding_id | name | birth_date |
+----+-------------+------+------------+
|  3 |       10020 | plj  | 2002-05-04 |
|  4 |       10020 | dmy  | 1990-08-02 |
+----+-------------+------+------------+
2 rows in set (0.00 sec)
Copy the code

Modulus method

An overview of the

  • Use the mod-long rule
  • The segmentation rule is based on the value n entered in the configuration
  • Sharding rules divide data into N pieces (usually DN nodes are also N), so that data is evenly distributed on all nodes
  • This strategy is a good way to distribute database write stress. It is suitable for single point query scenarios

configuration

  • Check the configuration file to see the name of the table that uses the mod-long sharding rule
[root@mycat1 ~]# vim /usr/local/mycat/conf/schema.xml. . <table name="hotnews" primaryKey="ID" dataNode="dn1,dn2,dn3" rule="mod-long" /> ... .Copy the code
  • Check the rule file to see which functions the mod-long sharding rule uses
[root@mycat1 ~]# vim /usr/local/mycat/conf/rule.xml
<tableRule name="mod-long">
                <rule>
                        <columns>id</columns>             <!-- Data fragment field -->
                        <algorithm>mod-long</algorithm>   <!-- function name -->
                </rule>
        </tableRule>
<function name="mod-long" class="io.mycat.route.function.PartitionByMod">
                <!-- how many data nodes -->
                <property name="count">3</property>       <!--> < span style = "box-sizing: border-box;
</function>
Copy the code
  • Restart the service
[root@mycat1 ~]# mycat restart
Stopping Mycat-server...
Stopped Mycat-server.
Starting Mycat-server...
Copy the code
  • The client connects to the shard server to access data
[root@node10 ~]Mysql -h192.168.1.15-p8066 -uroot -p123456
mysql> use TESTDB;
mysql> create table hotnews(id int primary key, title varchar(50), content text);
Query OK, 0 rows affected (0.59 sec)

mysql> insert into hotnews
    -> (id, title, content) - >values- > (1.'Python Instructor Change'.'Mr Pang has spoken Python'),
    -> (2.'Mysql instructor change'.'Miss Zhang talks about mysql'),
    -> (3.Mysql > update mysql > update mysql > update mysql.'Add all SQL syntax'),
    -> (4.'the mysql time'.'Course lasts for 11 days');
Query OK, 4 rows affected (0.14 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from hotnews;
+----+--------------------+-----------------------+
| id | title              | content               |
+----+--------------------+-----------------------+Lecturer | change | 1 | python pang teacher speak the python | | | | mysql 4 time courses, a total of 11 days | | 2 | mysql instructor teaches mysql | change | teacher zhang update | | 3 | mysql course add all | + SQL grammar----+--------------------+-----------------------+
4 rows in set (0.07 sec)

# Check the records on three servers
[root@mysql1 ~]# mysql -uroot [email protected]
mysql> use db1;
mysql> select * from hotnews;
+----+-------------------+-----------------------+
| id | title             | content               |
+----+-------------------+-----------------------+| 3 | mysql course update | add all | + SQL grammar----+-------------------+-----------------------+
1 row in set (0.00 sec)

[root@mysql2 ~]# mysql -uroot [email protected]
mysql> use db2;
mysql> select * from hotnews;
+----+--------------------+-----------------------+
| id | title              | content               |
+----+--------------------+-----------------------+Lecturer | change | 1 | python pang teacher speak the python | | | 4 mysql long | | + courses, a total of 11 days----+--------------------+-----------------------+
2 rows in set (0.00 sec)

[root@mysql3 ~]# mysql -uroot -p'[email protected]'
mysql> use db3;
mysql> select * from hotnews;
+----+-------------------+-------------------+
| id | title             | content           |
+----+-------------------+-------------------+| 2 | mysql instructor teaches + mysql | change | teacher zhang----+-------------------+-------------------+
1 row in set (0.00 sec)
Copy the code

Global table

An overview of the

  • Tables of type global are stored in all libraries

configuration

  • View global table
[root@mycat1 ~]# grep goods /usr/local/mycat/conf/schema.xml
		<table name="goods" primaryKey="ID" type="global" dataNode="dn1,dn2,dn3" />
Copy the code
  • The client connects to the shard server to access data
[root@node10 ~]Mysql -h192.168.1.15-p8066 -uroot -p123456
mysql> use TESTDB;
mysql> create table goods(id int primary key auto_increment, name varchar(10));
Query OK, 0 rows affected (0.61 sec)

mysql> insert into goods(id.name) - >values- > (1.'tea'),
    -> (2.'Mineral water'),
    -> (3.'peanuts'),
    -> (4.'seeds');
Query OK, 4 rows affected (0.16 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> select * from goods;
+----+-----------+
| id | name      |
+----+-----------+| 1 | tea | | 2 | mineral water | | 3 | peanut | | | | melon seeds + 4----+-----------+
4 rows in set (0.01 sec)


# Check the records on three servers
[root@mysql1 ~]# mysql -uroot [email protected]
mysql> use db1;
mysql> select * from goods;
+----+-----------+
| id | name      |
+----+-----------+| 1 | tea | | 2 | mineral water | | 3 | peanut | | | | melon seeds + 4----+-----------+
4 rows in set (0.00 sec)

[root@mysql2 ~]# mysql -uroot [email protected]
mysql> use db2;
mysql> select * from goods;
+----+-----------+
| id | name      |
+----+-----------+| 1 | tea | | 2 | mineral water | | 3 | peanut | | | | melon seeds + 4----+-----------+
4 rows in set (0.00 sec)

[root@mysql3 ~]# mysql -uroot [email protected]
mysql> use db3
mysql> select * from goods;
+----+-----------+
| id | name      |
+----+-----------+| 1 | tea | | 2 | mineral water | | 3 | peanut | | | | melon seeds + 4----+-----------+
4 rows in set (0.00 sec)
Copy the code