One, why split tables at all

Once a table holds millions of rows, queries take noticeably longer, and a query that joins several such tables may well grind to a halt. The purpose of splitting a table is to reduce the load on the database and shorten query time.

In my experience, MySQL handles an SQL statement in these steps: 1, receive the SQL; 2, put the SQL in the execution queue; 3, execute the SQL; 4, return the result. Where does most of the time go? First the queuing time, second the execution time. These are really the same thing: while one statement waits in the queue, another one is executing. So what we need to shorten is the SQL execution time.

MySQL has two locking mechanisms, table locking and row locking, that protect data integrity. For example, what happens when two SQL statements want to modify the same data in the same table at the same time? MySQL handles this with table locks (MyISAM storage engine) or row locks (InnoDB storage engine). A table lock means nobody else can operate on the table until I have finished with it. A row lock works the same way at the level of a single row: other statements must wait until I am done with that row before they can touch it. The more data there is, the longer each statement takes and the longer everything else waits, which is why we split tables.
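The difference between the two lock granularities can be sketched with a toy model. This is only an illustration in Python, not how MySQL schedules work internally: it assumes every statement takes exactly one time unit, and the function names are made up for this example.

```python
from collections import Counter

def finish_time_table_lock(ops):
    """One lock for the whole table: every statement runs one after another."""
    return len(ops)

def finish_time_row_lock(ops):
    """One lock per row: only statements that touch the same row queue up."""
    per_row = Counter(ops)
    return max(per_row.values(), default=0)

# Each entry is the id of the row a statement wants to modify.
ops = [1, 2, 3, 1, 4, 2]
print(finish_time_table_lock(ops))  # 6 time units
print(finish_time_row_lock(ops))    # 2 time units
```

Under these assumptions the table lock serializes all six statements, while row locks only serialize the statements that collide on rows 1 and 2.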

Two, how to split tables

1. Build a MySQL cluster, for example with MySQL Cluster, MySQL Proxy, MySQL Replication, DRBD, and so on

Someone may ask what a MySQL cluster has to do with splitting tables. It is not table splitting in the strict sense, but it has the same effect. What is the point of a cluster? To lighten the load on any one database, or put simply, to reduce the number of statements in its SQL queue. For example, if 10 SQL requests sit in the queue of a single database server, the wait is long. If those 10 requests are spread over the queues of 5 database servers, each server has only 2 in its queue, so the wait is much shorter. That is why I count clustering as a form of table splitting. I have written up some MySQL cluster setups:

Installing and configuring MySQL Proxy on Linux, with read/write splitting

Installing and configuring MySQL Replication, with data synchronization
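The queue arithmetic in the cluster example above can be written out as a small sketch. It assumes requests are spread evenly in round-robin order, which is a simplification (a real proxy or load balancer may distribute them differently), and `queue_depths` is a name invented for this illustration.

```python
def queue_depths(num_requests, num_servers):
    """Spread requests round-robin and return the queue length on each server."""
    depths = [0] * num_servers
    for i in range(num_requests):
        depths[i % num_servers] += 1
    return depths

print(queue_depths(10, 1))  # [10]
print(queue_depths(10, 5))  # [2, 2, 2, 2, 2]
```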

Advantages: good scalability, and none of the complicated application logic (PHP code) that comes with having many sub-tables.

Disadvantages: the amount of data in each table stays the same, so a single operation takes just as long as before, and the hardware cost is high.

2. Estimate in advance which tables will be large and frequently accessed, and split them into several tables

This kind of estimate is usually not far off. Take the post table of a forum: over time it is bound to grow large, to hundreds of thousands or millions of rows. Or the message table of a chat room: with dozens of people chatting all evening, day after day, it will also become very large. There are many cases like these. So any table that can be predicted to grow large can be split into N tables in advance, with N chosen to fit the actual situation. Take the chat message table as an example:

I create 100 such tables in advance: message_00, message_01, message_02, ..., message_98, message_99. The user ID then decides which table holds a user's chat messages. You can derive the table with a hash, with a remainder (modulo), or in many other ways; pick whichever you like. Here the table name is derived with a hash:


      
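<?php
// Map a user ID to one of the pre-created tables, e.g. message_10.
function get_hash_table($table, $userid) {
    $str = crc32($userid);
    if ($str < 0) {
        // crc32() can return a negative integer on 32-bit PHP builds
        $hash = "0" . substr((string) abs($str), 0, 1);
    } else {
        $hash = substr((string) $str, 0, 2);
    }

    return $table . "_" . $hash;
}

echo get_hash_table('message', 'user18991');     // Result is message_10
echo get_hash_table('message', 'user34523');    // Result is message_13
?>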

This tells us that user18991's messages are stored in message_10 and user34523's messages in message_13, so to read a user's messages we only need to query that one table.
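The remainder method mentioned above can be sketched in a few lines. This is an illustration in Python rather than PHP, and it assumes the user ID is a plain integer (unlike the string IDs in the hash example); `get_mod_table` and `num_tables` are names chosen for this sketch.

```python
def get_mod_table(table, user_id, num_tables=100):
    """Pick a sub-table by taking the numeric user id modulo the table count."""
    if num_tables <= 0:
        raise ValueError("num_tables must be positive")
    # Zero-pad the suffix so that 100 tables give message_00 .. message_99.
    width = len(str(num_tables - 1))
    return f"{table}_{user_id % num_tables:0{width}d}"

print(get_mod_table("message", 18991))  # message_91
print(get_mod_table("message", 34500))  # message_00
```

With sequential integer IDs the remainder spreads users evenly over all 100 tables, which is the main reason to prefer it when IDs are numeric.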

Advantages: avoids tables with millions of rows and shortens the execution time of each SQL statement.

Disadvantages: once the rule is fixed, changing it is troublesome. The example above uses the CRC32 hash. If I later switch to MD5, the same user's messages will be routed to a different table and the data will be scattered across tables. Poor scalability.
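How badly a change of hash function scatters the data can be checked with a short sketch. It is written in Python for illustration and reduces each hash modulo 100 instead of taking the leading digits as the PHP function does; the user IDs and function names are made up for this example.

```python
import hashlib
import zlib

NUM_TABLES = 100

def table_by_crc32(table, user_id):
    """Route by CRC32 of the user id, reduced modulo the table count."""
    return f"{table}_{zlib.crc32(user_id.encode()) % NUM_TABLES:02d}"

def table_by_md5(table, user_id):
    """Route by the MD5 digest of the user id, reduced modulo the table count."""
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return f"{table}_{int(digest, 16) % NUM_TABLES:02d}"

users = [f"user{i}" for i in range(1000)]
moved = [u for u in users
         if table_by_crc32("message", u) != table_by_md5("message", u)]
print(f"{len(moved)} of {len(users)} users would land in a different table")
```

Since the two hashes are unrelated, nearly every user ends up in a different table, so switching the rule means migrating almost all existing rows.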

3. Use the MERGE storage engine to split tables

I think this approach suits the situation where nobody planned ahead and queries have already become slow. Splitting an existing large table at that point is painful, and the most painful part is changing the code: the SQL statements in the program are already written, and now one table becomes dozens or even hundreds of tables, so don't all those statements have to be rewritten? Let me give an example.

If you run mysql> show engines; you will see that MRG_MYISAM is in fact the MERGE engine.

mysql> CREATE TABLE IF NOT EXISTS `user1` (
    ->   `id` int(11) NOT NULL AUTO_INCREMENT,
    ->   `name` varchar(50) DEFAULT NULL,
    ->   `sex` int(1) NOT NULL DEFAULT '0',
    ->   PRIMARY KEY (`id`)
    -> ) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
Query OK, 0 rows affected (0.05 sec)

mysql> CREATE TABLE IF NOT EXISTS `user2` (
    ->   `id` int(11) NOT NULL AUTO_INCREMENT,
    ->   `name` varchar(50) DEFAULT NULL,
    ->   `sex` int(1) NOT NULL DEFAULT '0',
    ->   PRIMARY KEY (`id`)
    -> ) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO `user1` (`name`, `sex`) VALUES('Zhang Ying', 0);
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO `user2` (`name`, `sex`) VALUES('tank', 1);
Query OK, 1 row affected (0.00 sec)

mysql> CREATE TABLE IF NOT EXISTS `alluser` (
    ->   `id` int(11) NOT NULL AUTO_INCREMENT,
    ->   `name` varchar(50) DEFAULT NULL,
    ->   `sex` int(1) NOT NULL DEFAULT '0',
    ->   INDEX(id)
    -> ) TYPE=MERGE UNION=(user1,user2) INSERT_METHOD=LAST AUTO_INCREMENT=1 ;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> select id,name,sex from alluser;
+----+------------+-----+
| id | name       | sex |
+----+------------+-----+
|  1 | Zhang Ying |   0 |
|  1 | tank       |   1 |
+----+------------+-----+
2 rows in set (0.00 sec)

mysql> INSERT INTO `alluser` (`name`, `sex`) VALUES('tank2', 0);
Query OK, 1 row affected (0.00 sec)

mysql> select id,name,sex from user2
    -> ;
+----+-------+-----+
| id | name  | sex |
+----+-------+-----+
|  1 | tank  |   1 |
|  2 | tank2 |   0 |
+----+-------+-----+
2 rows in set (0.00 sec)

Note: TYPE=MERGE is the old spelling. From MySQL 5.5 on, write ENGINE=MERGE.

Did you notice anything in the session above? Suppose I have a user table, user, with 500,000 rows, and I want to split it into two tables, user1 and user2, with 250,000 rows each:

INSERT INTO user1 (id, name, sex) SELECT id, name, sex FROM user WHERE id <= 250000;

INSERT INTO user2 (id, name, sex) SELECT id, name, sex FROM user WHERE id > 250000;
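Before dropping the original table it is worth checking that no rows were lost in the split. The sketch below does this in Python with an in-memory SQLite database, used here only so the example runs on its own; it is not MySQL, the table has 10 rows instead of 500,000, and the helper names are hypothetical.

```python
import sqlite3

COLUMNS = "(id INTEGER PRIMARY KEY, name TEXT, sex INTEGER NOT NULL DEFAULT 0)"

def split_user_table(conn, boundary):
    """Copy rows with id <= boundary into user1 and the rest into user2."""
    for name in ("user1", "user2"):
        conn.execute(f"CREATE TABLE {name} {COLUMNS}")
    conn.execute('INSERT INTO user1 (id, name, sex) '
                 'SELECT id, name, sex FROM "user" WHERE id <= ?', (boundary,))
    conn.execute('INSERT INTO user2 (id, name, sex) '
                 'SELECT id, name, sex FROM "user" WHERE id > ?', (boundary,))
    conn.commit()

def row_count(conn, table):
    """Return the number of rows in the given table."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute(f'CREATE TABLE "user" {COLUMNS}')
conn.executemany('INSERT INTO "user" (id, name, sex) VALUES (?, ?, ?)',
                 [(i, f"name{i}", i % 2) for i in range(1, 11)])
split_user_table(conn, boundary=5)
print(row_count(conn, "user1"), row_count(conn, "user2"))  # 5 5
```

The same check in MySQL would compare COUNT(*) on user with the sum of the counts on user1 and user2.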

That splits the user table into two tables. But now there is a problem: what about the SQL statements in the code? They used to target one table and now there are two, so the code changes a lot and the programmers get a lot of extra work. Is there a good way around this? Yes: back up the original user table and drop it, then rename the alluser merge table created above to user. However, not every MySQL operation works on a merge table:

A, If you use ALTER TABLE to change a merge table to another table type, the mapping to the underlying tables is lost. Instead, the rows from the underlying MyISAM tables are copied into the altered table, which is then given the new type.

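B, I read online that REPLACE does not work on merge tables, so I tried it, and it worked, which left me puzzled at first. Note that the test below uses the REPLACE() string function inside an UPDATE, not the REPLACE statement, so it does not really settle the question.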

mysql> UPDATE alluser SET sex=REPLACE(sex, 0, 1) where id=2;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> select * from alluser;
+----+------------+-----+
| id | name       | sex |
+----+------------+-----+
|  1 | Zhang Ying |   0 |
|  1 | tank       |   1 |
|  2 | tank2      |   1 |
+----+------------+-----+
3 rows in set (0.00 sec)

C, A merge table cannot maintain unique constraints across the whole table. When you perform an INSERT, the data goes into the first or last MyISAM table (depending on the INSERT_METHOD option). MySQL ensures that unique key values stay unique within that MyISAM table, but not across all tables in the collection.

D, When you create a merge table, nothing checks that the underlying tables exist and have the same structure. When the merge table is used, MySQL checks that the record length of every mapped table is equal, but this is not very reliable. If you build a merge table from dissimilar MyISAM tables, you are very likely to run into strange problems.

I read C and D online and have not tested them. Feel free to try them yourself.

Advantages: good scalability, and the program code needs few changes.

Disadvantages: performance is somewhat worse than with the second method.

Three, to sum up

I have actually used two of the three methods above, the first and the second. I have not used the third in practice, which is why I described it in more detail. Ha ha. Everything has its limits, and overdoing it backfires. Do not blindly build database server clusters, since hardware costs money. Do not blindly split tables either: MySQL ultimately stores its data as files on disk, and a MyISAM table corresponds to three files, so 1000 sub-tables means 3000 files, which can make lookups slow. Here is my advice:

Combine method 1 and method 2 to split tables

Combine method 1 and method 3 to split tables

These two suggestions fit different situations, so choose according to your own circumstances. I think many people will choose the combination of method 1 and method 3.


Original text: blog.51yip.com/mysql/949.h…