

Recently I have been organizing my knowledge of operating systems, mainly to clarify some details of how a program actually runs.

This article takes a look at memory management in Linux.

Process address space layout

Under Linux, each process has its own virtual address space. From the lowest address to the highest, it is laid out as follows:

  • Text Segment: stores the program's binary code.
  • Data Segment: stores initialized global data.
  • BSS Segment: stores uninitialized global data.
  • Heap: heap memory, growing toward higher addresses.
  • Memory Mapping Region: the address range used for memory mapped by the mmap system call.
  • Stack: stack memory, growing toward lower addresses.
  • Undefined Region: undefined address range for future expansion of the 64-bit address space.
  • Kernel Space: kernel address space.
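
One way to see this layout for a real process is to read /proc/<pid>/maps, which lists one mapped region per line. The sketch below parses such lines; the names Region, parse_maps_line, and read_own_maps are made up for illustration, and the code assumes a Linux system with /proc mounted.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """One mapped region of a process's virtual address space."""
    start: int
    end: int
    perms: str
    name: str


def parse_maps_line(line: str) -> Region:
    """Parse one line of /proc/<pid>/maps.

    Line format: address-range perms offset dev inode [pathname]
    """
    fields = line.split(maxsplit=5)
    start_hex, end_hex = fields[0].split("-")
    # The pathname column is empty for anonymous mappings.
    name = fields[5].strip() if len(fields) > 5 else ""
    return Region(int(start_hex, 16), int(end_hex, 16), fields[1], name)


def read_own_maps(path: str = "/proc/self/maps") -> list:
    """Return the regions of the current process (Linux only)."""
    with open(path) as f:
        return [parse_maps_line(line) for line in f if line.strip()]
```

On Linux, calling read_own_maps() and filtering for the names "[heap]" and "[stack]" shows where the heap and stack of the running interpreter sit; the text and data segments appear as regions named after the executable.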

Virtual addresses

As mentioned above, the process address space is a virtual address space; this is what we mean by memory virtualization.

Each process runs as if it had a large address range all to itself. When it reads or writes an address, however, the data comes from a real physical address: the memory the process uses is actually scattered across different areas of physical memory, and part of it may even be on disk (that is what virtualization is).

Here comes the key operation: how is a virtual address converted into a physical address?

In short, each process has its own page table, and virtual addresses are translated into physical addresses by looking them up in that table.

Page table translation

To convert a virtual address into a physical address, we first need to know how physical memory is organized. Physical memory is managed in pages: it is divided into fixed-size regions, and a typical page size is 4 KB.

Translating a virtual address into a physical address therefore means mapping the address of a virtual memory page to the address of a physical memory page, while the offset within the page stays the same.

Every mapping for a process lives in that process's page table. Let's calculate the memory footprint of the simplest scheme, a direct single-level lookup:
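
The page-to-page translation can be sketched in a few lines. This is a simplified model, not how the kernel or the MMU implements it: the page table is a plain dictionary from virtual page number to physical frame number, and the names split_address and translate are hypothetical.

```python
PAGE_SHIFT = 12                 # 4 KB pages: 2**12 bytes
PAGE_SIZE = 1 << PAGE_SHIFT
OFFSET_MASK = PAGE_SIZE - 1


def split_address(vaddr: int):
    """Split a virtual address into (virtual page number, page offset)."""
    return vaddr >> PAGE_SHIFT, vaddr & OFFSET_MASK


def translate(page_table: dict, vaddr: int) -> int:
    """Translate a virtual address using a flat page table.

    page_table maps virtual page number -> physical frame number
    (a simplification; real entries also carry permission and status bits).
    """
    vpn, offset = split_address(vaddr)
    if vpn not in page_table:
        # In a real system this is where a page fault would be raised.
        raise LookupError(f"virtual page {vpn:#x} is not mapped")
    # The offset is copied unchanged; only the page number is replaced.
    return (page_table[vpn] << PAGE_SHIFT) | offset
```

For example, with the mapping {0x12345: 0xabc}, the virtual address 0x12345678 splits into page 0x12345 and offset 0x678, and translates to the physical address 0xabc678.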

For a 32-bit virtual address, the page offset takes 12 bits (2^12 B = 4 KB), leaving 20 bits for the virtual page number. A flat, one-dimensional page table therefore needs 2^20 entries; at 4 B per entry, it occupies 2^20 * 4 B = 4 MB.

That is 4 MB per process for the page table alone, so a hundred processes would need 400 MB just for page tables. On a 32-bit system, where the maximum memory is only 4 GB, this direct mapping is clearly impractical.
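
The same arithmetic can be written as a small helper, which makes it easy to try other page or entry sizes. The function name flat_page_table_bytes and its defaults are illustrative only.

```python
def flat_page_table_bytes(address_bits: int = 32,
                          page_size: int = 4096,
                          entry_size: int = 4) -> int:
    """Size in bytes of a single-level page table covering the whole space."""
    # Page size must be a power of two so it maps onto a whole number of bits.
    assert page_size > 0 and page_size & (page_size - 1) == 0
    offset_bits = page_size.bit_length() - 1      # 12 for 4 KB pages
    entries = 1 << (address_bits - offset_bits)   # one entry per virtual page
    return entries * entry_size


if __name__ == "__main__":
    size = flat_page_table_bytes()
    print(size, "bytes =", size // (1024 * 1024), "MB per process")
```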

To solve this problem, Linux uses a “multilevel page table” structure.

A multilevel page table puts a page directory in front of the page tables: the upper bits of the virtual page number index the directory, and the remaining bits index the page table that the directory entry points to. A process only needs to create the top-level table at the start; the lower-level tables and their mappings are established through page faults when a virtual address is actually accessed, which significantly reduces the memory spent on page table entries for address ranges that are never used.
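
To make the lazy allocation concrete, here is a toy two-level page table. It assumes the classic 32-bit split of 10 directory bits, 10 table bits, and 12 offset bits; the class TwoLevelPageTable and its trivial frame allocator are hypothetical, and a real kernel does far more when handling a page fault.

```python
PAGE_SHIFT = 12                       # 12-bit page offset (4 KB pages)
INDEX_BITS = 10                       # 10 bits per level (assumed split)
ENTRIES = 1 << INDEX_BITS
INDEX_MASK = ENTRIES - 1
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


class TwoLevelPageTable:
    def __init__(self):
        # Only the top-level directory exists at process start.
        self.directory = [None] * ENTRIES
        self.next_free_frame = 0      # toy allocator: hand out frames in order

    def translate(self, vaddr: int) -> int:
        dir_index = (vaddr >> (PAGE_SHIFT + INDEX_BITS)) & INDEX_MASK
        table_index = (vaddr >> PAGE_SHIFT) & INDEX_MASK
        table = self.directory[dir_index]
        if table is None or table[table_index] is None:
            # Simulated page fault: build the missing mapping on demand.
            self._handle_page_fault(dir_index, table_index)
            table = self.directory[dir_index]
        return (table[table_index] << PAGE_SHIFT) | (vaddr & OFFSET_MASK)

    def _handle_page_fault(self, dir_index: int, table_index: int) -> None:
        if self.directory[dir_index] is None:
            # Allocate the second-level table only when first needed.
            self.directory[dir_index] = [None] * ENTRIES
        self.directory[dir_index][table_index] = self.next_free_frame
        self.next_free_frame += 1

    def allocated_tables(self) -> int:
        """Number of second-level tables that actually exist."""
        return sum(1 for table in self.directory if table is not None)
```

A process that touches only a few pages ends up with only a few second-level tables, rather than the full 2^20 entries of the flat layout.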


Next we'll talk about page faults, the page cache, and writing back dirty pages…