Mysql into god’s way —- first know Mysql

Everything is C-S, and mysql is also divided into two parts: client and server. Generally, the client command is mysql, and the service point command is mysqld.

  1. Mysql -ulocalhost -uusername -ppassword
  2. Common commands on the server :mysqld –skip-networking Disable TCP/IP remote connections
  3. Common commands on the server :mysqld –default-storage-engine=MyISAM Modify the default mysql storage engine
  4. Common commands on the server :mysqld –bind-address binds accessible IP addresses. –skip-networking is disabled
  5. We can use mysql –help to query more commands ourselves

Commonly used system variables

Mysql system variables have a concept of global(global all user client) and session(current session, the current user can also be called local)

  • SHOW VARIABLES LIKE ‘default_storage_engine’; Querying the current storage engine (current session by default)
  • SHOW VARIABLES like ‘max_connections’; Query the maximum number of connections
  • SHOW VARIABLES LIKE ‘default%’;
  • SET GLOBAL default_storage_engine = MyISAM; SET SESSION default_storage_engine = MyISAM; SET default_storage_engine = MyISAM; (No scope, default is current session level command)
  • Some state variables such as Threads_connected how much connection to query the current client and server, Handler_update says it has updated the how many rows SHOW [GLOBAL | SESSION] STATUS [] patterns that match the LIKE; SHOW SESSION STATUS LIKE ‘Threads_connected’;

Introduction to character set comparison rules

Character sets and comparison rules are closely related.

  • Character set comparison rules have a process of encoding and decoding. The mapping of a character set to a binary is called encoding, and the mapping of a binary to a character set is called decoding.
  • The difference between UTF8MB3 (UTF8) and UTF8MB4, utF8MB3 requires 1-3 bytes, UTF8MB4 requires 1-4 bytes, and some special characters require 4 bytes. If we set it to UTF8MB3, there will be problems, so we usually set it to UTF8MB4

The character_set_client, character_set_connection, and character_set_results system variables are set to the character_set_client, character_set_connection, and character_set_results system variables are set to the character_set_client character set

conclusion

  1. A character set is an encoding rule for a range of characters.
  2. A comparison rule is a rule that compares the size of characters in a character set.
  3. In MySQL, a character set can have several comparison rules, among which there is a default comparison rule, and each comparison rule must correspond to one character set.
  4. Check MySQL view support CHARACTER sets and rules of comparative statement is as follows: the SHOW (CHARACTER SET | CHARSET) [] patterns that match the LIKE; SHOW COLLATION [LIKE matching pattern];
  5. MySQL has four levels of character sets and comparison rules: server level, database level, table level, and column level
  6. The character set conversion that occurs from sending a request to receiving a result
    • The client uses the operating system’s character set to encode the request string and sends an encoded byte string to the server.
    • The server decodes the character string sent by the client using the Character_set_client character set, and encodes the character string based on the Character_set_connection character set.
    • If the character set represented by character_set_CONNECTION is the same as the character set used by the column for the operation, the operation is performed directly. Otherwise, you need to convert the string in the request from the character_set_connection character set to the character set used by the column for the operation.
    • Sends a byte string fetched from a column from the character set used by the column to the character set represented by CHARACTER_set_RESULTS.
    • The client uses the operating system’s character set to parse the received result set byte strings.
System variables describe
character_set_client The character set used by the server to decode the request
character_set_connection The server will change the request string from character_set_client to CHARACTER_set_connection when processing the request
character_set_results The character set used by the server to return data to the client

The general rule is to keep the values of these three variables the same as the character set used by the client.

  1. Comparison rules are usually used in expressions that compare string sizes and sort a string column

InnoDB record storage structure

  1. InnoDB reads and writes data: data is divided into several pages, and pages are used as the basic unit of interaction between disk and memory. In InnoDB, the size of pages is usually 16KB, and all reads and writes are 16KB data disk and memory interaction (this is similar to the operating principle of OS, which also works in 4K memory pages).
  2. Compact row format

  1. A page is usually 16KB. When there is too much data in a record to fit on the current page, excess data is stored on other pages. This phenomenon is called row overflow. (The value of a column becomes an overflow column if it is too large)
  2. There are four row formats :COMPACT row,Redundant row,Dynamic, and Compressed. The current version of MySQL I’m using is 5.7 and Dynamic is the default row format

InnoDB data structure

  • Delete_mask: indicates whether data will be deleted. 0- No 1- Yes. The deleted data will not be flushed to the disk immediately, which affects the disk performance
  • Min_rec_mask: This flag is added to the smallest record in each non-leaf node of the B+ tree
  • Heap_no: This property indicates the position of the current record on the page. Mysql default infimum (minimum record) and supremum (maximum record) are stored separately
  • Record_type: This attribute indicates the type of the current record. There are four types of records: 0 for common record, 1 for B+ tree non-leaf record, 2 for minimum record, and 3 for maximum record.
  • Next_record: This represents the address offset from the real data in the current record to the real data in the next record. For example, the next_record value of the first record is 32, which means that 32 bytes back from the address of the real data of the first record is the real data of the next record. If you’re familiar with data structures, you immediately understand that this is actually a linked list, and you can go from one record to the next. But notice again, the next record is not the next record in the order in which we insert it, it’s the next record in the order in which the primary key goes up. And it says that the next Infimum record (also known as the minimum record) is the user record with the smallest primary key on this page, and the next user record with the largest primary key on this page is the Supremum record.

As you can see from the figure, our records form a singly linked list in ascending order of primary keys. The next_record of the largest record has a value of 0, which means that the largest record has no next record and is the last node in the single list. If one entry is deleted from the list, the linked list will also change. For example, if we delete the second entry:

As can be seen from the figure, the main changes before and after deleting the second record are as follows:

  • The second record is not removed from the storage space, but its delete_mask value is set to 1.
  • The next_record value for the second record changes to 0, meaning that there is no next record for that record.
  • The next_record for record 1 points to record 3.
  • Another thing you may have overlooked is that the n_owned value for the largest record changed from 5 to 4, which we’ll explain in more detail later.

So, no matter how we add, delete or change records on a page, InnoDB always maintains a single linked list of records, in which nodes are connected in ascending order of primary key values. Another interesting thing is that since the primary key value of 2 was deleted, but the storage space was not reclaimed, what happens if we insert the primary key into the table again?

As you can see from the figure, InnoDB does not apply for new storage space for the new record because of insertion, but directly reuse the storage space of the original deleted record.(Garbage linked list, reuse function)

Conclusion:

  1. InnoDB designs different types of pages for different purposes. We call the pages for storing records data pages.

  2. A data page can be roughly divided into seven sections, respectively

    • The File Header, which represents some general information for the page, is a fixed 38 bytes.
    • Page Header, which represents information specific to the data Page, is a fixed 56 bytes.
    • Infimum + Supremum, two virtual pseudo-records that represent the minimum and maximum records on a page, respectively, and occupy a fixed 26 bytes.
    • User Records: The part that actually stores the Records we insert, of varying size.
    • Free Space: Unused portion of a page of uncertain size.
    • Page Directory: The relative position of some records on a Page, that is, the offset of each slot on the Page. The size is not fixed. The more records are inserted, the more space this part occupies.
    • File Trailer: Part used to verify that the page is complete and takes up a fixed 8 bytes.
  3. Each record has a next_record attribute in its header, which concatenates all the records on the page into a single linked list.

  4. InnoDB divides the records in a Page into several groups. The address offset of the last record in each group is stored as a slot in the Page Directory, so it is very fast to find records in a Page by primary key in two steps:

    • The slot in which the record resides is determined by dichotomy.

    • Each record in the group in which the slot is located is traversed through the record’s next_record property.

  5. The File Header section of each data page has the previous and next page numbers, so all data pages form a double-linked list.

  6. To ensure the integrity of pages synchronized from the memory to the disk, the checksum of data in the page and the LSN value corresponding to the last modification of the page are stored in the front and back of the page. If the checksum and LSN value of the front and back fail to be checked, it indicates that the synchronization process is faulty.