This article looks at MySQL InnoDB internals, with reference to the source code.

A few points to understand first:

  1. The real physical data is stored on disk, whether solid-state or mechanical.
  2. A SQL query returns individual rows, but the unit the server reads from disk is the page. A page contains more than one record.
  3. Inserts, deletes, updates and queries: the server performs all of these operations in memory.

Why does this matter? The data lives on disk, but it is operated on in memory, and the unit in which data is loaded into memory is the page.

In everyday business systems, Redis is used to avoid the I/O latency a service pays when it fetches data from the database, by copying data from the DB into Redis. If InnoDB wants to cache data, it likewise needs a cache pool that manages the corresponding pages and improves the efficiency of the database.

This is the buffer pool. Of course, introducing this middle layer makes management more complicated:

  • Memory management [warm-up, adding pages, eviction]
  • Concurrent access
  • Consistency between data modified in memory and the data on disk
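
The caching idea can be sketched in a few lines of C++. This is not InnoDB code: `PageId`, `ToyBufferPool` and `read_page_from_disk` are hypothetical names made up for illustration, and the "disk" is simulated. The only real-world value is the 16 KB page size, which is InnoDB's default.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical names throughout; an illustration, not InnoDB code.
constexpr std::size_t kPageSize = 16 * 1024;  // InnoDB's default page size

using PageId = std::pair<std::uint32_t, std::uint32_t>;  // (space id, page number)
using Page = std::vector<unsigned char>;

// Stand-in for a real disk read: returns a page filled with a marker byte.
inline Page read_page_from_disk(const PageId &id) {
  return Page(kPageSize, static_cast<unsigned char>(id.second & 0xFF));
}

class ToyBufferPool {
 public:
  // Return the cached page, reading it from "disk" only on a miss.
  Page &get(const PageId &id) {
    auto it = pages_.find(id);
    if (it == pages_.end()) {
      ++disk_reads_;
      it = pages_.emplace(id, read_page_from_disk(id)).first;
    }
    assert(it->second.size() == kPageSize);  // whole pages only
    return it->second;
  }

  std::size_t disk_reads() const { return disk_reads_; }

 private:
  std::map<PageId, Page> pages_;  // lookup table keyed by page id
  std::size_t disk_reads_ = 0;
};
```

Asking for the same page twice costs one simulated disk read; the second call is served from memory. The sketch never evicts anything and ignores concurrency and dirty pages, which are exactly the complications listed above.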

Basic knowledge

Github.com/mysql/mysql…

This is the structure definition for the buffer pool. Here’s a full view of the pool:

struct buf_pool_t {
  // Mutex protecting access to the underlying chunk memory.
  BufListMutex chunks_mutex;
  // The flush list and the LRU list are each protected by their own mutex.
  BufListMutex LRU_list_mutex;
  ...
  // Buffer pool instance index. Multiple instances may exist.
  ulint instance_no;

  ulint curr_pool_size;
  // Ratio of the old sublist of the LRU list; this value is dynamic.
  ulint LRU_old_ratio;
  ...
  // Hash table -> how to find a page in the buffer pool.
  // key: "tablenum+pagenum", value: page
  hash_table_t *page_hash;
  ...
  // The underlying memory units.
  buf_chunk_t *chunks;
  ...
  // Flush list base node.
  UT_LIST_BASE_NODE_T(buf_page_t) flush_list;
  ...
  // Free list base node.
  UT_LIST_BASE_NODE_T(buf_page_t) free;
  ...
  // LRU list base node.
  UT_LIST_BASE_NODE_T(buf_page_t) LRU;
  // Pointer to the head of the old sublist.
  // Set only once the LRU list is longer than BUF_LRU_OLD_MIN_LEN.
  buf_page_t *LRU_old;
};

struct buf_chunk_t {
  // Number of frames and control blocks in this chunk.
  ulint size;
  // Memory area allocated for the frames.
  unsigned char *mem;
  // Auxiliary structure describing mem.
  ut_new_pfx_t mem_pfx;
  // Array of control blocks -> each control block points to a buffer page.
  buf_block_t *blocks;
  ...
};

// Control block for an uncompressed page.
struct buf_block_t {
  // Note: must always be the first member, so that
  // buf_block_t and buf_page_t pointers can be converted to each other.
  buf_page_t page;

#ifndef UNIV_HOTBACKUP
  // Read-write lock protecting the frame.
  BPageLock lock;
#endif /* UNIV_HOTBACKUP */

  // Points to the actual data page, the core of what the buffer pool serves:
  // data pages, undo pages, and so on.
  byte *frame;
  ...
};

The control block for compressed pages is not covered here.

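buf_page_t is the common page descriptor, which is why the conversion was mentioned above: because it is the first member of buf_block_t, a pointer to one can be converted to a pointer to the other. The data page itself is reached through buf_block_t, because buf_block_t describes an uncompressed page.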
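
The conversion between the two structs relies on buf_page_t being the first member of buf_block_t. Below is a minimal sketch of that pattern with toy stand-ins; `toy_page_t`, `toy_block_t` and the two helper functions are hypothetical and heavily simplified, not the real InnoDB definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy stand-in for buf_page_t (hypothetical, simplified fields).
struct toy_page_t {
  std::uint32_t space;
  std::uint32_t page_no;
};

// Toy stand-in for buf_block_t.
struct toy_block_t {
  toy_page_t page;       // must stay the first member
  unsigned char *frame;  // points at the uncompressed page data
};

// The cast below is only valid while `page` sits at offset 0.
static_assert(offsetof(toy_block_t, page) == 0,
              "page must be the first member of toy_block_t");

// Block -> page: just take the address of the embedded member.
inline toy_page_t *block_to_page(toy_block_t *block) { return &block->page; }

// Page -> block: valid only if `page` really is embedded in a toy_block_t.
inline toy_block_t *page_to_block(toy_page_t *page) {
  return reinterpret_cast<toy_block_t *>(page);
}
```

Both structs are standard-layout, so a pointer to the block and a pointer to its first member refer to the same address. The page-to-block direction is the risky one: it is only meaningful for page descriptors that actually live inside a block.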

At this point, the basic data structures have all been introduced.
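
One way to picture how a chunk ties control blocks to page frames is the sketch below. It is an assumption-laden simplification: `ToyChunk` and `ToyBlock` are hypothetical, and the control blocks are kept in a separate vector, which is not how InnoDB lays out a chunk's memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kToyPageSize = 16 * 1024;  // InnoDB's default page size

// Hypothetical control block: only the frame pointer is modeled.
struct ToyBlock {
  unsigned char *frame = nullptr;
};

// Hypothetical chunk: one contiguous memory area plus one control block
// per page-sized frame inside that area.
struct ToyChunk {
  std::vector<unsigned char> mem;  // memory for all frames
  std::vector<ToyBlock> blocks;    // blocks[i] describes frame i

  explicit ToyChunk(std::size_t n_pages)
      : mem(n_pages * kToyPageSize), blocks(n_pages) {
    assert(n_pages > 0);
    for (std::size_t i = 0; i < n_pages; ++i) {
      blocks[i].frame = mem.data() + i * kToyPageSize;
    }
  }

  // Copying would leave the frame pointers aimed at the old memory.
  ToyChunk(const ToyChunk &) = delete;
  ToyChunk &operator=(const ToyChunk &) = delete;
};
```

Each control block is small metadata, while the frame it points to holds the full page; consecutive frames are exactly one page size apart.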

Logical linked lists

  • free_list (the free member above)
  • lru_list (the LRU member above)
  • flu_list (the flush_list member above)

There are also several linked lists of compressed pages, which this article does not cover.

free_list:

Every node is free: it points to a buffer page that has not been allocated yet.

InnoDB needs to make sure that free_list has enough nodes for user threads; otherwise it has to evict some nodes from flu_list or lru_list.
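
The fallback from the free list to eviction can be sketched as follows. `ToyPool` and `get_frame` are hypothetical names, frames are plain integers, and dirty pages are not modeled, so this toy evicts without ever flushing.

```cpp
#include <cassert>
#include <list>

// Hypothetical toy pool: frames are identified by plain integers.
struct ToyPool {
  std::list<int> free_list;  // frames that hold no page
  std::list<int> lru_list;   // frames in use; front = most recent, back = least recent

  // Hand out a frame: prefer free_list, otherwise evict the LRU tail.
  // Returns -1 if there is nothing free and nothing to evict.
  int get_frame() {
    if (free_list.empty()) {
      if (lru_list.empty()) return -1;
      free_list.push_back(lru_list.back());  // evict the least recently used frame
      lru_list.pop_back();
    }
    assert(!free_list.empty());
    int frame = free_list.front();
    free_list.pop_front();
    lru_list.push_front(frame);  // the frame is now in use and most recent
    return frame;
  }
};
```

With two free frames, the first two calls drain free_list; the third call finds it empty and reuses the frame at the tail of lru_list.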

lru_list:

Every page newly read from disk is linked into it. The list as a whole follows the LRU algorithm: the least recently used node sits at the tail, so the tail is evicted first.

When eviction happens: if free_list has no free nodes, existing pages must be evicted (or dirty pages in flu_list must be flushed).

lru_list also contains compressed pages that have not been decompressed yet: they have just been read from disk and there has not been time to decompress them.
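
As a rough illustration of the ordering rule, here is a plain LRU in C++. It is a textbook LRU under assumed names (`ToyLru`, `access`), not InnoDB's implementation: it ignores the old sublist that LRU_old points to, as well as compressed pages.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <unordered_map>

// Hypothetical textbook LRU over page numbers (non-negative values).
class ToyLru {
 public:
  explicit ToyLru(std::size_t capacity) : capacity_(capacity) {
    assert(capacity > 0);
  }

  // Record an access to page_no.
  // Returns the page number that was evicted, or -1 if nothing was evicted.
  long access(long page_no) {
    auto it = pos_.find(page_no);
    if (it != pos_.end()) {
      // Already cached: move the node to the head of the list.
      order_.splice(order_.begin(), order_, it->second);
      return -1;
    }
    long evicted = -1;
    if (order_.size() == capacity_) {
      evicted = order_.back();  // the tail is the least recently used
      pos_.erase(evicted);
      order_.pop_back();
    }
    order_.push_front(page_no);
    pos_[page_no] = order_.begin();
    return evicted;
  }

 private:
  std::size_t capacity_;
  std::list<long> order_;  // front = most recent, back = least recent
  std::unordered_map<long, std::list<long>::iterator> pos_;
};
```

With a capacity of two, accessing pages 1, 2, 1 and then 3 evicts page 2, because the second access to page 1 moved it back to the head.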

flu_list:

Every node is a dirty page, that is, a page that has been modified but not yet flushed to disk.

A page in flu_list must also be in lru_list, but not the other way around.

A page may be modified many times. No matter how many times it is modified, only the first modification links the page's control block into flu_list; later modifications find it already there. The dirty pages in flu_list are therefore ordered by the time each page was first modified. The control block in each node has the following fields:

  • oldest_modification: the start LSN of the transaction that modifies this page for the first time is written to this value
  • newest_modification: each time the page is modified after that, the LSN of the modifying transaction is written to this value
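
The bookkeeping for these two fields can be sketched as below. `ToyPage`, `ToyFlushList` and `note_modification` are hypothetical names. The sketch assumes LSNs only ever grow and treats an oldest_modification of 0 as "clean", so inserting at the head keeps the list ordered by first modification.

```cpp
#include <cassert>
#include <cstdint>
#include <list>

using lsn_t = std::uint64_t;

// Hypothetical page descriptor with only the two LSN fields modeled.
struct ToyPage {
  int page_no = 0;
  lsn_t oldest_modification = 0;  // 0 means clean, not in the flush list
  lsn_t newest_modification = 0;
};

// Hypothetical flush list: front = dirtied most recently, back = dirtied first.
struct ToyFlushList {
  std::list<ToyPage *> pages;

  // Record a modification of `page` at `lsn` (assumed to be increasing).
  void note_modification(ToyPage &page, lsn_t lsn) {
    assert(lsn > 0);
    if (page.oldest_modification == 0) {
      // First modification: the page becomes dirty and joins the list once.
      page.oldest_modification = lsn;
      pages.push_front(&page);
    }
    // Every modification, first or not, updates the newest LSN.
    page.newest_modification = lsn;
  }
};
```

Modifying page A at LSN 100, page B at 200 and page A again at 300 leaves two nodes in the list, with A still at the tail: its position is fixed by its first modification, while only its newest_modification moves to 300.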

To be continued…