Preface:

Int (string, string, string); int (string, string); int (string, string, string); Recently, I paid more attention to the article in this respect, and decided to finish the article I had been dragging for a long time. This article will introduce the differences between the string types char and varchar.

In this paper, the experimental environment is MySQL 5.7.23, storage engine is Innodb, SQL_mode uses strict mode, character set is UTF8.

1.CHAR type introduction

When we define a field using the char type, we usually specify its length M, which is char(M). In fact, M refers to the number of characters, that is, the maximum number of characters stored in this field. M is optional. The default value of M is 1 and the range is 0,255. The next Chinese character in the UTF8 character set occupies 3 bytes. Here’s a simple test:

CREATE TABLE 'char_tb1' (' col1 'char DEFAULT NULL,' col2 'char(5) DEFAULT NULL, `col3` char(10) DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8; Char (M) M char(M) M The default is 1 mysql > show create table char_tb1 \ G * * * * * * * * * * * * * * * * * * * * * * * * * * * 1. Row * * * * * * * * * * * * * * * * * * * * * * * * * * * the table: char_tb1 Create Table: CREATE TABLE `char_tb1` ( `col1` char(1) DEFAULT NULL, `col2` char(5) DEFAULT NULL, Col3 'char(10) DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8 1 row in set (0.00 SEC) Mysql > insert into char_tb1 (col1) values ('a',('1'),(' king '),(']'); insert into char_tb1 (col1) values ('a',('1'),(' king '),(']'); Query OK, 4 rows affected (0.01 SEC) mysql> insert into char_tb1 (col1) VALUES ('aa'),('12'); ERROR 1406 (22001): Data too long for column 'col1' at row 1 mysql> select * from char_tb1; + -- -- -- -- -- - + -- -- -- -- -- - + -- -- -- -- -- -- + | col1 | col2 | col3 | + -- -- -- -- -- - + -- -- -- -- -- - + -- -- -- -- -- -- + | a | NULL | NULL | | 1 | NULL | NULL | | | NULL | | NULL | |] NULL | NULL | + -- -- -- -- -- - + -- -- -- -- -- - + -- -- -- -- -- - + 4 rows in the set (0.00 SEC) mysql > insert into char_tb1 (col2) values ('abcd'),(' wang-123 '),('^*123'),('12'),(' 1, 2, 3, 4, 5 '); Query OK, 5 rows affected (0.01sec) mysql> insert into char_tb1 (col2) values ('abcdef'); ERROR 1406 (22001): Data too long for column 'col2' at row 1 mysql> select * from char_tb1; +------+-----------------+------+ | col1 | col2 | col3 | +------+-----------------+------+ | a | NULL | NULL | | 1 | King NULL | NULL | | | NULL | NULL | |] | NULL | NULL | | NULL | abcd | NULL | | | NULL NULL | king - 123 | | NULL | | ^ * 123 12 | NULL NULL | | NULL | | | | NULL NULL | 12345 | + -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- - + 9 rows in the set # (0.00 SEC) Mysql > alter table char_tb1 add column COL4 char(0); Query OK, 0 rows affected (0.10 SEC) Records: 0 Duplicates: 0 Warnings: 0 mysql> alter table char_tb1 add column col5 char(255); Query OK, 0 rows affected (0.11 SEC) Records: 0 Duplicates: 0 Warnings: 0 mysql> alter table char_tb1 add column col5 char(256); ERROR 1074 (42000): Column length too big for column 'col5' (max = 255); use BLOB or TEXT insteadCopy the code

2.VARCHAR type introduction

Similarly, M in vARCHar (M) indicates the maximum number of characters that can be saved. A single letter, digit, or Chinese character can all occupy one character. Varchar can store the length from 0 to 65535 bytes. In addition, VARCHAR requires 1 or 2 extra bytes to record the length of the string: 1 byte if the maximum column length is 255 bytes or less, 2 bytes otherwise. For Innodb engine, utf8 character set, single Chinese character takes 3 bytes, so the maximum M in varchar(M) cannot be 21845, that is, the range of M is [0,21845], and M must be specified. The maximum length of a single field is 65535 bytes, and the maximum length of a single line is 65535 bytes, excluding TEXT and BLOB fields. That is, the sum of all vARCHAR fields defined in a single table cannot be greater than 65535. Therefore, M in all VARCHAR (M) fields can not be 21844.

CREATE TABLE 'varchar_tb1' (' col1 'varchar(0) DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8; # to check the build table statements Increase field found M must specify mysql > show create table varchar_tb1 \ G * * * * * * * * * * * * * * * * * * * * * * * * * * * 1. Row *************************** Table: varchar_tb1 Create Table: CREATE TABLE 'varchar_tb1' (' col1 'varchar(0) DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8 1 row in set (0.00 SEC)  mysql> alter table varchar_tb1 add column col2 varchar; ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1 # Mysql > CREATE TABLE 'varchar_tb2' (col1 varchar(21844)); Query OK, 0 rows affected (0.04 SEC) mysql> CREATE TABLE 'varchar_tb3' (col1 varchar(218445)); ERROR 1074 (42000): Column length too big for column 'col1' (max = 21845); Mysql > CREATE TABLE 'varchar_tb3' (col1 varchar(10)); Query OK, 0 rows affected (0.04 SEC) mysql> alter table varchar_tb3 add column COL2 varchar(21844); ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs mysql> alter table varchar_tb3 add column col2 varchar(21834); ERROR 1118 (42000): Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs mysql> alter table varchar_tb3 add column col2 varchar(21833); Query OK, 0 rows affected (0.09 SEC) Records: 0 Duplicates: 0 Warnings: 0 mysql> show create table varchar_tb3\G *************************** 1. row *************************** Table: varchar_tb3 Create Table: CREATE TABLE `varchar_tb3` ( `col1` varchar(10) DEFAULT NULL, 'col2' varchar(21833) DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8 1 row in set (0.00 SEC)Copy the code

3.CHAR and VARCHAR comparison

The CHAR type is fixed length, MySQL always allocates enough space based on the length of the string defined. When CHAR values are saved, Spaces are padded to the right of them to the specified length, and trailing Spaces are removed when CHAR values are retrieved.

The VARCHAR type is used to store variable-length strings that are not followed by a space if the character does not reach the specified number of digits. However, because the row is lengthened, the UPDATE may make the row longer than it originally was, resulting in additional work. If a row takes up more space and there is no more space to store within the page, InnoDB needs to split the page so that the row can fit within the page, which increases fragmentation.

Here’s a quick summary of the applicable scenarios for CHAR and VARCHAR fields:

CHAR is good for storing very short strings, or all values close to the same length. For example, CHAR is good for storing the MD5 value of a password because it is a fixed-length value. CHAR is also better than VARCHAR for frequently changing data because fixed-length CHAR types are less prone to fragmentation. For very short columns, CHAR is also more storage efficient than VARCHAR. For example, using CHAR(1) to store only Y and N values requires only one byte if the single-byte character set is used, but VARCHAR(1) requires two bytes because there is an extra byte of record length.

The use of VARCHAR is appropriate when the string is very long or the length of the string to be stored varies greatly; The maximum length of the string column is much larger than the average length; Column updates are rare, so fragmentation is not a problem.

As an extra note, when defining the maximum field length, we should allocate it on demand and make an estimate in advance. Especially for VARCHAR fields, some people think that VARCHAR data types are allocated according to the actual needs of the length, it is better to give a larger. However, this is not the case. For example, if we need to store an address information, according to the evaluation, only 100 characters are needed. We can use VARCHAR(100) or VARCHAR(200) to store the address information. Longer columns consume more memory because MySQL typically allocates fixed-size chunks of memory to hold internal values, which is especially bad when using temporary in-memory tables for arrangement or manipulation. So we still can’t be too generous when assigning VARCHAR data types. Again, evaluate the actual length required and choose the longest field to set the character length. If you want to consider redundancy, you can leave around 10% of the character length. You should never assume that VARCHAR allocates storage space based on the actual length, arbitrarily allocating the length, or simply using the maximum character length.

Conclusion:

This article introduces the use of CHAR and VARCHAR field types respectively, and gives the comparison and application scenarios of the two. In the actual production situation, need specific situation specific analysis, appropriate is the best, I hope this article can give you reference.