Preface

Only a bald head can be strong

When reading Redis Design and Implementation about the expansion of hash tables, I found this passage:

To execute the BGSAVE or BGREWRITEAOF command, Redis creates a child of the current server process. Most operating systems use copy-on-write to optimize the use of the child process, so the server raises the threshold of the load factor while the child process exists. In this way, hash table expansion operations can be avoided during the existence of the child process, and unnecessary memory writing operations can be avoided to maximize memory saving.

This hit a blind spot in my knowledge, so I went looking for what copy-on-write actually is. It turned out to touch on a lot of things and was hard to understand, so I wrote this note to record what I learned about copy-on-write.

This article tries to explain copy-on-write as briefly as possible. I hope you get something out of it.

Copy-on-write in Linux

Before explaining the copy-on-write mechanism under Linux, we first need to know about two functions: fork() and exec(). It is important to note that exec() is not a specific function. It is a collective name for a set of functions including execl(), execlp(), execv(), execle(), execve(), and execvp().

1.1 Simple use of fork

First let’s see what fork() is:

fork is an operation whereby a process creates a copy of itself.

Fork is the primary method of creating processes on Unix-like operating systems. Fork is used to create a child process (equivalent to a copy of the current process).

  • The new process is a copy of the old one; this copying is what “fork” refers to.

If you’re familiar with Linux, you know that the init process is the ancestor of all other user-space processes.

  • Every such process is created by init or one of its descendants calling fork() (or vfork()).

Here’s an example of fork:


#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t fpid;      // fpid holds the value returned by fork()
    int count = 0;

    // Call fork() to create a child process
    fpid = fork();

    // From here on, two processes execute the code below
    if (fpid < 0) {
        printf("Failed to create process!\n");
    } else if (fpid == 0) {
        printf("I'm the child, forked from the parent.\n");
        count++;
    } else {
        printf("I am the parent process\n");
        count++;
    }
    printf("The result is: %d\n", count);
    return 0;
}

The resulting output is:

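I am the parent process
The result is: 1
I'm the child, forked from the parent.
The result is: 1

The order of the parent’s and the child’s lines may differ between runs. Both print 1 because each process increments its own copy of count.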


Explanation:

  • fork() is called once but returns twice: it returns the child’s PID to the parent and 0 to the child. (A negative return value means the child could not be created.)
  • To repeat: when the current process calls fork(), a child identical to the current process (except for the PID) is created, so the child also executes the code that follows the fork() call.

In other words:

  • When the parent executes the if block, fpid holds the PID of the child process.
  • When the child executes the if block, fpid is 0.

1.2 Now look at the exec() function

We already know from above that fork creates a child process. The child is a copy of the parent.

The exec function loads a new program (executable image) that overwrites the image in the memory space of the current process to perform different tasks.

  • The exec family of functions directly replaces the address space of the current process when executed.

So let me draw a picture to make sense of it:


1.3 What is COW under Linux

fork() produces a child that is exactly the same as the parent (except for the PID).

Traditionally, the parent’s data is copied directly to the child, and the data segments and stacks between the parent and child are independent of each other.

However, in practice a child process often calls exec() right away to run a different program.

  • So with the traditional approach, the data copied when creating the child is wasted work (because once the child calls exec(), the copied data is discarded).

Since most of the time the data copied to the child is never used, the copy-on-write technique was introduced. The principle is simple:

  • fork() creates a child that shares memory with its parent. As long as neither process writes to a page, that page is not copied, so creating the child is very fast. (Nothing is copied; the child simply references the parent’s physical memory.)
  • And if the child calls exec() to load a new executable image immediately after fork() returns, no time or memory is wasted on copying.

Another way of saying it:

After fork() and before exec(), the two processes use the same physical memory. The child’s code segment, data segment, and stack all refer to the parent’s physical pages. In other words, the two processes have separate virtual address spaces that map to the same physical memory.

When either process modifies a segment, physical memory is allocated for a private copy of the modified part.

Without exec(), the kernel ends up giving the child its own physical memory for the data segment and stack as they are written to (from then on each process has its own independent copy), while the code segment continues to share the parent’s physical memory (the code is identical to the parent’s).

With exec(), the child’s code segment also gets its own physical memory, because the two processes now execute different code.

How Copy On Write is implemented:

After fork(), the kernel marks the writable pages shared by parent and child as read-only, and the child’s address space points to the same physical pages as the parent’s. While both processes only read memory, nothing happens. When either process writes to memory, the CPU detects that the page is read-only and raises a page fault, which traps into the kernel’s page-fault handler. There the kernel makes a copy of the page that triggered the fault, so that parent and child each have their own copy.

What are the benefits of Copy On Write?

  • COW avoids the up-front delay of allocating and copying a large amount of memory at once.
  • COW avoids unnecessary allocation. For example, after a fork not every page needs to be copied: the parent’s code segment and read-only data segment cannot change, so they never need copying.

What are the disadvantages of Copy On Write?

  • If both parent and child keep writing after fork(), the result is a large number of page faults, and the deferred copying ends up costing more than it saved.

Linux Copy On Write can be summarized in a few sentences:

  • After fork(), the child shares the parent’s physical memory. When either process writes to memory, the write to the read-only page triggers a page fault and that page is copied (all other pages remain shared).
  • The forked child runs the same program as the parent. If it needs to do something else, it calls exec() to replace the current process image with a new program.


COW in Redis

With the above in mind, COW should now make sense as a technique.

Here is what I took from Redis Design and Implementation:

  • When Redis persists data with BGSAVE or BGREWRITEAOF, it forks a child process that reads the data and writes it to disk.
  • Redis workloads are mostly reads. But if many writes happen while the child is alive, they can cause a large number of page faults, and time is spent copying pages.
  • Rehashing inevitably writes to memory, because entries are moved into the new table. Therefore, while the child exists, Redis raises the load factor threshold so that the hash table is expanded as rarely as possible, which avoids unnecessary memory writes and saves as much memory as possible.


COW for file systems

Here’s what COW means in a file system:

With copy-on-write, data is not modified in place; the modified data is written to a new location instead. As a result, no fsck is needed after the system restarts following a sudden power failure. The advantage is that data integrity is preserved and recovery after a power loss is easy.

  • For example, to modify the contents of data block A, first read A, then write the modified contents to a new block B. If the power fails at this point, the original contents of A are still there!


Summary

Finally, let’s look at the idea behind copy-on-write (from Wikipedia):

Copy-on-write (COW for short) is an optimization strategy in computer programming. The core idea is that if multiple callers request the same resource at the same time (such as data in memory or on disk), they all receive the same pointer to the same shared resource. Only when a caller tries to modify the resource does the system actually make a private copy for that caller, while the resource seen by the other callers remains unchanged. The process is transparent to the other callers. The main advantage of this approach is that if no caller modifies the resource, no private copy is ever created, so multiple callers that only read can share the same resource.

At least from this article we can conclude:

  • Linux greatly reduces the overhead of forking with the Copy On Write technique.
  • File systems use Copy On Write technology to ensure data integrity to a certain extent.

In fact, in Java, there is also a Copy On Write technology.

Stay tuned for the next post

If you have a better way of understanding this, or if you find mistakes in the article, please leave a message in the comment section so we can learn from each other.


Java3y is a public account dedicated to original Java technical articles; you are welcome to follow it.

All of 3y’s original articles:

  • Table of contents (mind map + a large collection of video resources): github.com/ZhongFuChen…