This is the 10th day of my participation in the November Gwen Challenge. Check out the event details: The last Gwen Challenge 2021

Field type Precautions

Numeric types

1. The integer

Storage space occupied by MySQL integer types and value range:

type What occupy a space Range (signed) Range (unsigned)
tinyint 1 – 128 ~ 127 0 ~ 255
smallint 2 – 32768 ~ 32767 0 ~ 65535
mediumint 3 – 8388608 ~ 8388607 0 ~ 16777215
int 4 – 2147483648 ~ 2147483647 0 ~ 4294967295
bigint 8 – 9223372036854775808 – – 9223372036854775807 0 ~ 18446744073709551615

2. Note the unsigned attribute

  • MySQL requires unsigned values to remain unsigned when subtracted, otherwise an error will be reported

(BIGINT UNSIGNED value is out of range in...)

  • To avoid this error set sql_mode to NO_UNSIGNED_SUBTRACTION and add signed results.
SET sql_mode='NO_UNSIGNED_SUBTRACTION';
Copy the code

Floating point type and high precision type

  1. Starting with MySQL version 8.0.17, MySQL will not recommend the use of floating point types Float or Double, and high-precision DECIMAL types can be used.

  2. We do not recommend DECIMAL type for the amount field, but integer type for use in large and concurrent Internet business.

  • Funds are stored in sub-units instead of units
  • The type DECIMAL is an encoding implemented through binary, which is far less computationally efficient than integer. Therefore, it is recommended to use BIG INT to store fields related to the amount.

Use BIGINT instead of INT as primary key

  1. INT ranges up to 4.2 billion, but for massive data stores, the limit for INT types will soon be reached. Do not use ints to save 4 bytes, otherwise it will be costly to modify the table structure later.
  2. When the INT limit is reached, a repeat error is reported when auto-increment inserts again.

String type

Char or varchar

  • CHAR(N) is used to hold fixed-length characters. N ranges from 0 to 255. Note that N represents a character, not a byte.

  • VARCHAR(N) stores variable-length characters. The value of N ranges from 0 to 65536, where N indicates a character.

  • For more than 65536 characters, you can use the larger character types TEXT or BLOB, which have a maximum storage length of 4G. The difference is that BLOBS have no character set attributes and are purely binary storage.

Character set

  • Common character sets include GBK and UTF8. The default character set is UTF8. However, some emoji characters cannot be stored in the UTF8 character set, so it is recommended to set MySQL’s default character set to UTF8MB4.

  • Commands to modify the column character set should be used

ALTER TABLE ... CONVERT TO CHARSET ...

To modify the character set of an existing column.

collation

A Collation is a rule that compares and sorts strings. Each character set has a default Collation rule. You can use the SHOW CHARSET command to view the default Collation rule.

mysql> SHOW CHARSET LIKE 'utf8%';
+---------+---------------+--------------------+--------+
| Charset | Description   | Default collation  | Maxlen |
+---------+---------------+--------------------+--------+
| utf8    | UTF-8 Unicode | utf8_general_ci    |      3 |
| utf8mb4 | UTF-8 Unicode | utf8mb4_0900_ai_ci |      4 |
+---------+---------------+--------------------+--------+


mysql> SHOW COLLATION LIKE 'utf8mb4%';
+----------------------------+---------+-----+---------+----------+---------+---------------+
| Collation                  | Charset | Id  | Default | Compiled | Sortlen | Pad_attribute |
+----------------------------+---------+-----+---------+----------+---------+---------------+
| utf8mb4_0900_ai_ci         | utf8mb4 | 255 | Yes     | Yes      |       0 | NO PAD        |
| utf8mb4_0900_as_ci         | utf8mb4 | 305 |         | Yes      |       0 | NO PAD        |
| utf8mb4_0900_as_cs         | utf8mb4 | 278 |         | Yes      |       0 | NO PAD        |
| utf8mb4_0900_bin           | utf8mb4 | 309 |         | Yes      |       1 | NO PAD        |
| utf8mb4_bin                | utf8mb4 |  46 |         | Yes      |       1 | PAD SPACE     |
......
Copy the code
  • Collation ends with _ci for Case insensitive (Case Insentive)
  • The collation uses _cs to indicate case sensitivity
  • The collation is denoted by _bin for comparison by storing binary characters

other

CHECK Constraint function

You are advised to use the CHECK function for gender or status fields.

  • Starting with MySQL 8.0.16, the database provides the CHECK constraint function natively.
  • The use of tinyint avoids the generation of ambiguous and dirty data (which may be stored in other values).
  • As follows, the constraint defines the range of the column sex, which can only be M or F. At the same time, MySQL explicitly throws a warning for illegal constraints when inserting illegal data (Check constraint 'user_chk_1' is violated.).
CREATE TABLE User (
    id BIGINT NOT NULL AUTO_INCREMENT,
    name VARCHAR(255) NOT NULL,
    sex CHAR(1) NOT NULL.password VARCHAR(1024) NOT NULL.CHECK (sex = 'M' OR sex = 'F'),
    PRIMARY KEY(id));Copy the code

JSON type

JSON is supported from MySQL 5.7. No predefined fields are required, which makes it easy to describe products.

  • The JSON type is good for storing less modified, relatively static data.
  • MySQL 8.0.17 now supports multi-valued Indexes, which can be used to create Indexes on JSON arrays and to quickly retrieve index data using the member of, jSON_CONTAINS, and jSON_OVERLaps functions.
  • The maximum value of the STORED JSON content is 1 GB.

The date type

DATETIME

As of MySQL 5.6, the DATETIME type supports milliseconds. N in DATETIME(N) indicates the precision of milliseconds.

TIMESTAMP

  • TIMESTAMP Indicates the number of milliseconds since 1970-01-01 00:00:00.
  • In MySQL, because the type TIMESTAMP occupies 4 bytes, the upper limit of its storage time can only be 2038-01-19 03:14:07.
  • With milliseconds, the type TIMESTAMP takes up 7 bytes, while DATETIME takes up 8 bytes regardless of whether it stores millisecond information.

choose

DATETIME rather than TIMESTAMP and INT is recommended for date types;

  1. The value of TIMESTAMP is the same as that of INT.
  2. We are close to the maximum value available for TIMESTAMP ‘2038-01-19 03:14:07’. Using TIMESTAMP in business is risky.
  3. To use TIMESTAMP, you must explicitly set the time zone. Do not use the default system time zone. Otherwise, performance problems may occurtime_zone = '+08:00'.
  • Performance: The underlying system functions of the operating system are invoked each time the time zone is used to calculate the time__tz_convert()This function requires additional locking to ensure that the operating system time zone has not changed at this time. Therefore, when large-scale concurrent access occurs, due to hot resource competition. The performance is inferior to DATETIME.

Three normal forms and anti-three normal forms

First Normal Form (1NF)

Concept: Each column of a data table should retain its atomic properties, that is, columns cannot be split.

Second normal Form (2NF)

Concept: Attributes must be completely dependent on primary keys. Taking the first normal form one step further and addressing partial dependencies, the goal is to ensure that every column in the table is associated with a primary key.

Third normal Form (3NF)

Concept: All non-primary attributes do not depend on other non-primary attributes. Taking the second normal form one step further and addressing transitive dependencies, the goal is to ensure that columns in a table are directly related to primary keys, not indirectly.

The Fan Shihua

We should design the table structure which conforms to the normal form criterion from the business point of view.

  • Anticanonization refers to the use of time to increase efficiency by adding redundant or duplicate data, violating the second and third paradigms.
  • De-canonization can reduce the number of join tables in associated queries.
  • In some scenarios, anti-paradigm design can be implemented using JSON data types to improve storage efficiency.