In Linux, when a child process is created with the fork system call, the kernel does not copy all the memory pages used by the parent process. Instead, the child shares the same memory pages with the parent, and a page is copied only when the child or the parent writes to it. This mechanism is known as copy-on-write.

Let’s look at how Linux’s copy-on-write mechanism works.

Virtual memory versus physical memory

Process memory can be divided into virtual memory and physical memory.

  • Physical memory: the memory installed in the computer. If the computer has 2GB of memory installed, the system has a physical memory space of 0 ~ 2GB.
  • Virtual memory: memory that is virtualized by software. In a 32-bit operating system, each process has its own 4GB virtual memory space.

Application programs use virtual memory. For example, the address returned by the C address-of operator & is a virtual memory address. A virtual memory address must be mapped to a physical memory address before it can be used; accessing an unmapped virtual memory address results in a page fault.

Virtual memory addresses map to physical memory addresses as shown in the figure below:

As shown in the figure above, the same virtual memory address in process A and process B is mapped to different physical memory addresses, which is why accesses to the same virtual memory address in different processes do not affect each other.
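This isolation can be observed from user space. The following is a minimal sketch, assuming a Unix-like system where `os.fork` is available; the function name `same_address_different_data` is made up for illustration. It forks, lets the child overwrite a value, and reports the virtual address and the value seen by each process.

```python
import ctypes
import os


def same_address_different_data():
    """Fork, write in the child, and compare what each process sees."""
    # A C int stored at some virtual address in this process.
    value = ctypes.c_int(1)
    address = ctypes.addressof(value)

    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the virtual address is inherited unchanged, but the write
        # below only affects the child's own memory.
        os.close(read_end)
        value.value = 2
        report = f"{ctypes.addressof(value)} {value.value}"
        os.write(write_end, report.encode())
        os._exit(0)

    # Parent: collect the child's report, then look at its own copy.
    os.close(write_end)
    child_address, child_value = os.read(read_end, 64).decode().split()
    os.close(read_end)
    os.waitpid(pid, 0)
    return address, value.value, int(child_address), int(child_value)


if __name__ == "__main__":
    p_addr, p_val, c_addr, c_val = same_address_different_data()
    print(f"parent: address={p_addr:#x} value={p_val}")
    print(f"child:  address={c_addr:#x} value={c_val}")
```

Both processes should report the same address but different values, since after the child's write the two identical virtual addresses are backed by different physical memory.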

Copy-on-write principle

Having introduced the concepts of virtual memory and physical memory, we can now look at how Linux copy-on-write works.

As mentioned earlier, virtual memory needs to be mapped to physical memory to be used. If virtual memory addresses in different processes are mapped to the same physical memory address, the result is shared memory, as shown in the figure below:

Because process A’s virtual memory M maps to the same physical memory G as process B’s virtual memory M’, when the data of process A’s virtual memory M is modified, the data of process B’s virtual memory M’ is also changed.
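A small sketch of this sharing, assuming a Unix-like system: an anonymous `mmap` region created with `MAP_SHARED` stays mapped to the same physical pages in parent and child after fork, so a write in one process is visible in the other. The function name `shared_mapping_demo` is hypothetical.

```python
import mmap
import os


def shared_mapping_demo():
    """Return what the parent reads after the child writes to a shared mapping."""
    # Anonymous mapping; MAP_SHARED means parent and child keep using the
    # same physical pages after fork instead of getting private copies.
    shared = mmap.mmap(-1, mmap.PAGESIZE, flags=mmap.MAP_SHARED)
    shared[:5] = b"hello"

    pid = os.fork()
    if pid == 0:
        # Child: modify the shared page, then exit immediately.
        shared[:5] = b"HELLO"
        os._exit(0)

    os.waitpid(pid, 0)  # wait until the child has finished writing
    seen_by_parent = bytes(shared[:5])
    shared.close()
    return seen_by_parent


if __name__ == "__main__":
    print(shared_mapping_demo())
```

Here the parent plays the role of process A and the child the role of process B: the parent reads back the bytes the child wrote.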

Linux implements copy-on-write in order to speed up the process of creating child processes and save memory usage.

The principle of copy-on-write is as follows:

  • When a child process is created, the parent process’s mapping from virtual memory to physical memory is copied to the child process, and the memory is set to read-only. (It is set to read-only so that a write to the memory triggers a page fault.)
  • When the child or the parent process modifies data in memory, the copy-on-write mechanism is triggered: the original memory page is copied to a new one, the memory mapping is updated to point to the copy, and the memory permissions of the parent and child processes are set back to read-write.
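The two steps can be modeled with a toy simulation. This is only an illustrative sketch, not how the kernel stores its state: `ToyKernel`, `Process`, the reference counts, and the page-table layout are all invented for the example. One simplification to note: the sketch makes the other process's mapping writable again lazily, on that process's own next write, rather than immediately.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    # Maps a virtual page name to [physical page id, writable flag].
    page_table: dict = field(default_factory=dict)


class ToyKernel:
    """A toy model of copy-on-write (hypothetical, for illustration only)."""

    def __init__(self):
        self.frames = {}      # physical page id -> bytearray contents
        self.refcount = {}    # physical page id -> number of mappings
        self.next_frame = 0

    def alloc(self, contents=b""):
        frame = self.next_frame
        self.next_frame += 1
        self.frames[frame] = bytearray(contents)  # copies the contents
        self.refcount[frame] = 1
        return frame

    def fork(self, parent):
        # Step 1: copy the mappings (not the pages) and mark both read-only.
        child = Process()
        for vpage, entry in parent.page_table.items():
            entry[1] = False
            child.page_table[vpage] = [entry[0], False]
            self.refcount[entry[0]] += 1
        return child

    def handle_write_fault(self, proc, vpage):
        # Step 2: a write hit a read-only page; copy it if it is still shared.
        frame = proc.page_table[vpage][0]
        if self.refcount[frame] > 1:
            self.refcount[frame] -= 1
            frame = self.alloc(self.frames[frame])
        proc.page_table[vpage] = [frame, True]

    def write(self, proc, vpage, contents):
        if not proc.page_table[vpage][1]:
            self.handle_write_fault(proc, vpage)
        self.frames[proc.page_table[vpage][0]][:] = contents

    def read(self, proc, vpage):
        return bytes(self.frames[proc.page_table[vpage][0]])


if __name__ == "__main__":
    kernel = ToyKernel()
    parent = Process()
    parent.page_table["M"] = [kernel.alloc(b"data"), True]
    child = kernel.fork(parent)
    kernel.write(child, "M", b"new!")
    print(kernel.read(parent, "M"), kernel.read(child, "M"))
```

After `fork`, both page tables point at the same physical page; only the child's write allocates a second page, and the parent's data is left untouched.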

The copy-on-write process is shown in the following figure:

When a child process is created, the parent and child processes point to the same physical memory, rather than making a copy of the physical memory occupied by the parent process. The benefits of this are twofold:

  • Speed up the creation of child processes.
  • Reduce physical memory usage by processes.

As shown above, when the parent calls fork to create the child, virtual memory page M of the parent and virtual memory page M of the child both map to the same physical memory page G, and virtual memory page M is set to read-only in both processes. (It is set to read-only so that a write to the page triggers a page fault, which lets the kernel copy the physical memory page in the page fault handler.)

A page fault is raised when the child process writes to virtual memory page M (because virtual memory page M has been set to read-only). In the page fault handler, the contents of physical memory page G are copied to a new physical memory page G’, virtual memory page M of the child process is mapped to G’, and virtual memory page M of the parent and child processes is set to read-write.
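The faults caused by these writes can be counted from user space. The sketch below assumes Linux, where a copy-on-write fault is expected to be reported as a minor page fault in the `ru_minflt` field of `getrusage`; the function name `count_child_write_faults` and the default of 64 pages are arbitrary choices for illustration. The exact count depends on the kernel and on the interpreter's own memory activity, so treat it as a rough indication only.

```python
import mmap
import os
import resource


def count_child_write_faults(pages=64):
    """Fork, write to every inherited page in the child, count minor faults."""
    size = pages * mmap.PAGESIZE
    # Private anonymous mapping: after fork it is shared copy-on-write.
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE)
    buf[:] = b"p" * size  # touch every page so it is backed by physical memory

    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(read_end)
        before = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
        for offset in range(0, size, mmap.PAGESIZE):
            buf[offset] = ord("c")  # first write to each inherited page
        after = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
        os.write(write_end, str(after - before).encode())
        os._exit(0)

    os.close(write_end)
    faults = int(os.read(read_end, 32).decode())
    os.close(read_end)
    os.waitpid(pid, 0)
    # The child wrote to its own copies, so the parent's pages are unchanged.
    parent_unchanged = buf[:] == b"p" * size
    buf.close()
    return faults, parent_unchanged


if __name__ == "__main__":
    faults, unchanged = count_child_write_faults()
    print(f"minor faults in child: {faults}, parent unchanged: {unchanged}")
```

On a typical Linux system the reported fault count should be at least about the number of pages written, while the parent's buffer keeps its original contents.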

Conclusion

This article focuses on the principle of copy-on-write in Linux. Copy-on-write is the key to the efficient creation of child processes in Linux, as well as saving on physical memory. We will examine the copy-on-write implementation in detail in the next article.
