MySQL Database

Logical architecture

Top layer

Connection handling, authentication, security, and so on

Middle layer

Core server functionality

Query parsing, analysis, optimization, caching, built-in functions, and all cross-storage-engine features: stored procedures, triggers, views

Bottom layer

Storage engines, responsible for storing and retrieving the data in MySQL

The server communicates with the storage engines through APIs
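Because the engine sits below that API, it can be chosen per table. A small illustration (the table name t is hypothetical):

```sql
-- List the storage engines this server supports
SHOW ENGINES;

-- Pick the engine when creating a table (InnoDB is the default in modern MySQL)
CREATE TABLE t (
  id  INT PRIMARY KEY,
  val VARCHAR(64)
) ENGINE = InnoDB;

-- Check which engine an existing table uses
SHOW TABLE STATUS LIKE 't';
```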

Connection management

Optimization and execution

Transaction isolation

Concepts

The following phenomena are what transaction isolation levels are designed to address

Dirty read

A dirty read means reading data that another transaction has not yet committed. Uncommitted data may still be rolled back, so it may never end up in the database at all. In other words, a dirty read may return data that never actually exists.
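For instance (a sketch assuming a hypothetical accounts table with id and balance columns), a dirty read can be reproduced under the READ UNCOMMITTED level:

```sql
-- Session B
START TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;  -- not committed yet

-- Session A
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
START TRANSACTION;
SELECT balance FROM accounts WHERE id = 1;  -- already sees B's uncommitted change: a dirty read
COMMIT;

-- Session B
ROLLBACK;  -- the value A just read never officially existed
```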

Repeatable read

Repeatable read means that the data read at the start of a transaction stays the same every time it is read again, right up to the end of the transaction. This mainly concerns **UPDATE** operations.

Non-repeatable read

The opposite of repeatable read: within the same transaction, the same batch of data read at different times may differ, because it has been affected by other transactions, for example other transactions modifying that data and committing. This also mainly concerns **UPDATE** operations.
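Under READ COMMITTED the effect is easy to see (same hypothetical accounts table; the starting balance of 500 is made up):

```sql
-- Session A
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
SELECT balance FROM accounts WHERE id = 1;  -- returns 500

-- Session B
UPDATE accounts SET balance = 400 WHERE id = 1;
COMMIT;

-- Session A
SELECT balance FROM accounts WHERE id = 1;  -- now returns 400: a non-repeatable read
COMMIT;
```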

Phantom read

Phantom reads concern INSERT operations. Suppose transaction A updates some rows but has not yet committed, while transaction B inserts rows that match the condition of A's update and commits before A does. When A then queries inside its own transaction, it finds rows that its update apparently failed to touch; in fact those rows were only just inserted by B. To the user this looks like an illusion, which is why it is called a phantom read.
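A rough sketch of the phenomenon (hypothetical accounts table again; READ COMMITTED is used so that B's committed insert becomes visible to A):

```sql
-- Session A
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
UPDATE accounts SET balance = 0 WHERE balance < 100;  -- changes the rows that exist right now

-- Session B
START TRANSACTION;
INSERT INTO accounts (id, balance) VALUES (99, 50);   -- a new row matching A's condition
COMMIT;                                               -- commits before A

-- Session A
SELECT * FROM accounts WHERE balance < 100;
-- the freshly inserted row (balance 50) appears, seemingly missed by A's UPDATE: a phantom row
COMMIT;
```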

Understanding the levels

Whether the first three phenomena occur comes down to whether the data a transaction touches can be read or written by other transactions

When a transaction's uncommitted changes can be read by other transactions, dirty reads may occur

When uncommitted changes cannot be read, but the data a transaction has read can still be written by other transactions, non-repeatable reads may occur

When the data a transaction has read can no longer be written by other transactions, repeatable read is achieved

Serializable locks every row it touches, completely eliminating interference from other transactions

From top to bottom, the degree of locking therefore increases step by step: consistency gets stronger while performance gets worse
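For reference, the isolation level can be inspected and switched per session (MySQL syntax; the variable name differs slightly between versions):

```sql
-- Show the current isolation level (MySQL 8.0; older servers use @@tx_isolation)
SELECT @@transaction_isolation;

-- Move the session to a looser or stricter level
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;  -- InnoDB's default
SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE;
```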

Deadlock

Two or more transactions each occupy a resource and request a lock on a resource the other already occupies, so each ends up waiting for the other to release it.

Multiple transactions try to lock resources in a different order, and each one grabs first the very resource the other is about to request

Multiple transactions can also lock the same resource at the same time. Typically, user A's transaction holds a shared lock on a row and tries to upgrade it to an exclusive lock (FOR UPDATE), while user B's transaction holds a shared lock on the same row and tries to do the same. B cannot get its exclusive lock until A releases its shared lock, and A cannot get its exclusive lock until B releases its shared lock, so neither can proceed: a deadlock
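A minimal sketch of the lock-ordering case (hypothetical accounts table with rows id 1 and 2):

```sql
-- Session A
START TRANSACTION;
UPDATE accounts SET balance = balance - 10 WHERE id = 1;  -- A locks row 1

-- Session B
START TRANSACTION;
UPDATE accounts SET balance = balance - 10 WHERE id = 2;  -- B locks row 2

-- Session A
UPDATE accounts SET balance = balance + 10 WHERE id = 2;  -- A waits for B's lock on row 2

-- Session B
UPDATE accounts SET balance = balance + 10 WHERE id = 1;  -- B waits for A's lock on row 1: a cycle
-- InnoDB detects the cycle and rolls one transaction back with
-- ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
```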

The transaction log

When data is changed, the change is first appended to the transaction log on disk (persistent, and fast because it is sequential), the copy of the data in memory is updated (not yet persistent), and the modified data is then flushed back to disk slowly in the background

This avoids having to persist the data itself to disk on every single change.
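In InnoDB the trade-offs involved are exposed through a few standard settings, which can be inspected like this (a sketch; the variable names are stock InnoDB):

```sql
-- How aggressively the redo (transaction) log is flushed to disk at commit
-- (1 = flush on every commit; 0 and 2 trade durability for speed)
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';

-- Size of the in-memory buffer pool holding the modified ("dirty") pages
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Size of each redo log file, which bounds how far background flushing may lag
SHOW VARIABLES LIKE 'innodb_log_file_size';
```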

Multi-version Concurrency Control (MVCC)

A variant of row-level locking; there is no single standard implementation

The goal is to reduce locking: most implementations provide non-blocking reads, and writes lock only the rows that are necessary

It works by keeping a snapshot of the data as of the start of the transaction, and the transaction operates only on that snapshot

InnoDB's MVCC keeps two hidden columns with each row, recording the row's creation time and expiration (deletion) time; both are stored as system version numbers rather than real timestamps

InnoDB:

  • SELECT: only rows are returned whose creation version is no later than the current transaction's version, and whose deletion version is either undefined or greater than the current transaction's version
  • INSERT: InnoDB stores the current system version number as the creation version of each newly inserted row
  • DELETE: InnoDB stores the current system version number in each deleted row as its deletion marker
  • UPDATE: InnoDB inserts a new row whose creation version is the current system version number, and writes the current system version number into the original row as its deletion marker
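Conceptually, a snapshot read then behaves roughly like the query below. The hidden columns are not actually queryable; create_version, delete_version and @current_version are made-up names for illustration:

```sql
-- Pseudo-SQL sketch of the visibility rule applied by a snapshot read
SELECT *
FROM   some_table
WHERE  create_version <= @current_version     -- the row existed when the snapshot was taken
  AND (delete_version IS NULL                 -- and it has not been deleted,
       OR delete_version > @current_version); -- or was deleted only after the snapshot
```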

Storage engine

Benchmarking

The importance of benchmarking

The only convenient and effective way to find out how the system will behave under a given workload

  • Verify assumptions about the system
  • Reproduce abnormal behaviour
  • Measure performance
  • Plan for future business growth

The test strategy

Overall testing (integrated)

  • Tests the overall performance of the web server, application code, network, and database together
  • The database is not necessarily the source of the performance bottleneck
  • Exposes the effects of the caches sitting between the different components
  • Reveals the true overall performance of the application

Individual testing (single component)

  • Used when you need to compare the performance of different schemas or queries
  • Targets a specific problem in the application
  • Short tests with a quick cycle, so adjustments can be made rapidly

Metrics

Throughput

The number of transactions per unit of time

Relevant for online transaction processing and interactive multi-user applications

Response time

The total time the test task takes; percentile response times are commonly reported

Concurrency

The number of operations in flight at the same time

Test performance at different concurrency levels

For example, by running the benchmark with 128 client threads; the concurrency actually reached on the server can be checked as sketched below
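While the load tool drives those client threads, the database-side concurrency can be watched with standard status counters (counter names are stock MySQL):

```sql
-- Connections currently open vs. statements actually executing right now
SHOW GLOBAL STATUS LIKE 'Threads_connected';
SHOW GLOBAL STATUS LIKE 'Threads_running';

-- High-water mark of simultaneous connections since server start
SHOW GLOBAL STATUS LIKE 'Max_used_connections';
```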

Scalability

Ideally, doubling the work given to the system doubles the result (that is, throughput doubles)

But most systems cannot achieve such ideal linear scalability

Testing methodology

Ask questions and define goals

Standard test

Select an appropriate standard benchmark specification

Design the test

  • Obtain a snapshot of production data
  • Run representative queries against that data
  • Test the queries at different levels of load

Test plan

Record test data, system configuration steps, how to measure and analyze results, warm-up plan, etc.

Document the above

Test duration

The benchmark should run long enough

Make sure the system has reached a steady state and allow enough time for observation

Most systems keep some headroom for emergencies: they can absorb performance spikes and postpone some work until after the peak. But once the machine has been under pressure long enough, that headroom is exhausted and the system can no longer sustain its short-term peak performance

Collect test data

Create a directory for the benchmark. Each round of test execution gets its own subdirectory, which stores the test results, configuration files, measurement data, scripts, and any related notes

System status and evaluation metrics to collect include CPU usage, disk I/O, network traffic statistics, and the SHOW GLOBAL STATUS counters
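A minimal way to capture the server-side counters during a run (which counters, and at what interval, is up to the test plan):

```sql
-- Snapshot a few commonly watched counters; run this at a fixed interval
-- (for example every 10 seconds) and save the output into the per-round
-- results directory alongside the CPU / disk / network statistics.
SHOW GLOBAL STATUS LIKE 'Questions';         -- statements executed
SHOW GLOBAL STATUS LIKE 'Com_select';        -- SELECT statements executed
SHOW GLOBAL STATUS LIKE 'Threads_running';   -- statements executing right now
SHOW GLOBAL STATUS LIKE 'Innodb_rows_read';  -- rows read by InnoDB
```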

Result checking mechanism

Make sure the test results are repeatable. If the system needs warming up, use the same warm-up time every round (in other words, keep the test environment and test contents identical).

Watch out, for example, for external load, test script parameters, and the system's scheduled tasks

Testing tools

Integrated (full-stack)

Single-component

Testing with real cases

Server Performance Profiling

Data type optimization