SQLite has a modular design. Its architecture breaks down into the following components.

Interface

The interface consists of the SQLite C API. External programs may connect to SQLite through a wrapper such as JDBC, but in the end every operation is carried out by calling these C functions.
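As a small illustration of a wrapper sitting on top of the C API, the sketch below uses Python's standard sqlite3 module, which is a binding over the same C library (much as a JDBC driver would be). The table name `notes` and its contents are made up for the example.

```python
import sqlite3


def roundtrip():
    # connect() opens a database handle through the underlying C library;
    # ":memory:" keeps the example self-contained (no file is created).
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table, used only for this illustration.
        conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
        conn.execute("INSERT INTO notes (body) VALUES (?)", ("hello",))
        # Each execute() call hands the SQL text to SQLite, which compiles
        # it and then runs the compiled program.
        return conn.execute("SELECT id, body FROM notes").fetchall()
    finally:
        conn.close()


rows = roundtrip()
print(rows)
```

Whatever the host language, the wrapper only forwards the SQL text and bound parameters; all of the actual work happens inside the C library.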

Compiler

In the compiler, the tokenizer and the parser analyze the SQL text and convert it into a syntax tree, a hierarchical data structure that is easier for the lower layers to process. The syntax tree is passed to the code generator, which produces SQLite bytecode, an assembly-like instruction sequence that is executed by the virtual machine.

  • Tokenizer

When a string containing an SQL statement is to be executed, the interface passes the string to the tokenizer. The tokenizer's job is to split the original string into tokens and pass those tokens to the parser. The tokenizer is hand-written in the C file tokenize.c.
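The idea of splitting a statement into tokens can be sketched with a toy tokenizer. This is not SQLite's tokenizer (which is hand-written C in tokenize.c); the token classes, the tiny keyword set, and the function name `tokenize` are all assumptions chosen for illustration.

```python
import re

# Toy token classes; a real SQL tokenizer recognizes many more.
TOKEN_RE = re.compile(r"""
    (?P<SPACE>\s+)
  | (?P<NUMBER>\d+(?:\.\d+)?)
  | (?P<STRING>'(?:[^']|'')*')
  | (?P<NAME>[A-Za-z_][A-Za-z_0-9]*)
  | (?P<OP><=|>=|<>|!=|==|[-+*/=<>(),;.])
""", re.VERBOSE)

# Deliberately tiny keyword set, for illustration only.
KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR"}


def tokenize(sql):
    """Split an SQL string into (kind, text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(sql):
        match = TOKEN_RE.match(sql, pos)
        if match is None:
            raise ValueError(f"unrecognized character at offset {pos}: {sql[pos]!r}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SPACE":
            continue
        # Names that appear in the keyword set are reclassified.
        if kind == "NAME" and text.upper() in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens


tokens = tokenize("SELECT name FROM users WHERE age >= 18;")
for kind, text in tokens:
    print(kind, text)
```

The output is a flat sequence with no structure yet; deciding what the tokens mean together is left to the parser.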

  • Parser

The parser's job is to assign meaning to tokens based on their context. SQLite's parser is generated by the Lemon LALR(1) parser generator. Lemon does the same job as YACC/BISON, but uses a different input syntax that is less error-prone. Lemon also produces a parser that is reentrant and thread-safe, and it defines the concept of a non-terminal destructor so that memory is not leaked when a syntax error is encountered. The grammar file that drives Lemon is parse.y.
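One observable effect of the parser is that a token sequence that does not fit the grammar is rejected before anything runs. The sketch below shows this through Python's sqlite3 module; the helper name `try_compile` is made up, and the exact error text comes from the SQLite library and may differ between versions.

```python
import sqlite3


def try_compile(sql):
    """Return None if SQLite accepts the statement, else the error text."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(sql)
        return None
    except sqlite3.Error as exc:
        # A grammar violation surfaces here as an exception from the library.
        return str(exc)
    finally:
        conn.close()


ok = try_compile("SELECT 1 + 2")
bad = try_compile("SELEC 1")  # misspelled keyword, so the grammar does not match
print(ok)
print(bad)
```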

  • Code Generator

After the parser assembles the tokens into a complete SQL statement, it calls the code generator to produce the virtual machine code that carries out the statement.

Virtual Machine

The most central part of the architecture is the virtual machine, also called the Virtual Database Engine (VDBE). It is similar to the Java virtual machine in that it interprets bytecode. The VDBE bytecode consists of more than a hundred opcodes (the exact number varies between SQLite versions), all focused on database operations. Each instruction either performs a specific database operation (such as opening a cursor on a table) or prepares the operands for such an operation (such as loading parameters; early versions pushed them onto a stack, while current versions use registers). In short, every instruction exists to satisfy the requirements of SQL commands.
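The bytecode that the code generator produces can be inspected by prefixing a statement with EXPLAIN, which returns one row per VDBE instruction. A minimal sketch, assuming a made-up table `t`; the helper name `bytecode` is hypothetical, and the exact instruction listing depends on the SQLite version.

```python
import sqlite3


def bytecode(sql):
    """Return the VDBE program for `sql` as (addr, opcode) pairs."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical table so that the query has something to compile against.
        conn.execute("CREATE TABLE t (a INTEGER, b TEXT)")
        # Each EXPLAIN row starts with the instruction address and the opcode
        # name, followed by the instruction's operands.
        return [(row[0], row[1]) for row in conn.execute("EXPLAIN " + sql)]
    finally:
        conn.close()


program = bytecode("SELECT b FROM t WHERE a = 1")
for addr, opcode in program:
    print(addr, opcode)
```

The listing typically includes an instruction that opens a read cursor on the table (OpenRead) and ends the program with Halt, which matches the description of opcodes as small database-oriented steps.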

Back End

The back end consists of the B-tree, the page cache (pager), and the operating system interface (system calls). The B-tree and the page cache manage data together. The B-tree's main function is indexing: it maintains the relationships between pages so that the required data can be found quickly. The pager's main role is to move pages between the B-tree and the disk through the OS interface.

  • B-tree

An SQLite database is stored on disk using B-trees, implemented in the source file btree.c. Each table and each index in the database uses a separate B-tree, and all B-trees are stored in the same disk file. The details of the file format are documented in a large comment at the beginning of btree.c. The interface to the B-tree subsystem is defined in the header file btree.h.
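That each table and index gets its own B-tree can be observed from the schema table sqlite_master, whose rootpage column records the page on which each object's B-tree starts. A small sketch; the table and index names are invented for the example.

```python
import sqlite3


def root_pages():
    """List (type, name, rootpage) for every object in a fresh database."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schema: two tables and one index.
        conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        conn.execute("CREATE INDEX users_by_name ON users (name)")
        conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)")
        return conn.execute(
            "SELECT type, name, rootpage FROM sqlite_master ORDER BY rootpage"
        ).fetchall()
    finally:
        conn.close()


objects = root_pages()
for kind, name, rootpage in objects:
    print(kind, name, rootpage)
```

Every object reports a different root page, even though all of them live in the same database.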

  • Page Cache

The B-tree module requests information from disk in fixed-size blocks (pages). The default page size was 1024 bytes in older releases (4096 bytes since version 3.12.0), and it can be set to any power of two between 512 and 65536 bytes. The page cache is responsible for reading, writing, and caching these pages. It also provides the rollback and atomic commit abstractions and manages locking of the database file. The B-tree driver requests specific pages from the page cache, and it notifies the page cache when it wants to modify a page or to commit or roll back the current changes. The page cache handles all the messy details needed to process these requests quickly, safely, and efficiently. The page cache is implemented in a single C source file, pager.c, and its interface is defined in the header file pager.h.
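Both the page size and the rollback behaviour can be observed from the outside. The sketch below creates a throwaway database file with a non-default page size, then inserts a row and rolls it back. The file name `demo.db`, the table `t`, and the helper name `page_cache_demo` are assumptions made for the example.

```python
import os
import sqlite3
import tempfile


def page_cache_demo(page_size=8192):
    """Create a database with a chosen page size, then roll back a change."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")  # hypothetical file name
        conn = sqlite3.connect(path)
        try:
            # The page size has to be chosen before the database is first written.
            conn.execute(f"PRAGMA page_size = {page_size}")
            conn.execute("CREATE TABLE t (x INTEGER)")
            conn.commit()

            conn.execute("INSERT INTO t VALUES (1)")
            conn.rollback()  # the uncommitted insert is discarded

            actual = conn.execute("PRAGMA page_size").fetchone()[0]
            count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
        finally:
            conn.close()
        file_size = os.path.getsize(path)
    return actual, count, file_size


actual, count, file_size = page_cache_demo()
print(actual, count, file_size)
```

The file size comes out as a whole number of pages, which reflects that the file is read and written one page at a time.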

  • OS Interface

To provide portability between POSIX and Win32 operating systems, SQLite uses an abstraction layer for the operating system interface. The interface to the OS abstraction layer is defined in os.h, and each supported operating system has its own implementation: Unix uses os_unix.c, Windows uses os_win.c, and so on. Each operating-system-specific implementation usually has its own header file as well, for example os_unix.h or os_win.h.
