Readers should already have a general understanding of Linux from the previous articles on Linux memory management and process scheduling. This article covers the Linux virtual file system. Without further ado, let’s go!

1. The difference between soft links and hard links

We know that every file has a file name and data, and that the data falls into two parts: user data and metadata. User data lives in the file’s data blocks, which record the actual contents of the file. Metadata holds the file’s additional attributes, such as its size, creation time, and owner. In Linux, the unique identifier of a file is the inode number stored in its metadata, not the file name (the inode is part of a file’s metadata, but it does not contain the file name). The file name exists only for the convenience of human users; the system and programs use the inode number to locate the correct data blocks.

To support file sharing, Linux provides two types of links: hard links and soft links (also known as symbolic links). Besides solving the file-sharing problem, links also help hide file paths, improve permission security, and save storage. If one inode number corresponds to multiple file names, those file names are called hard links. In other words, a hard link is one file with multiple aliases.

Hard links are file names that differ but share the same inode number. Hard links therefore have the following features:

  • All hard links to a file share the same inode and data blocks;
  • A hard link can only be created for an existing file;
  • Hard links cannot be created across file systems;
  • Hard links can be created only for files, not for directories;
  • Deleting one hard link does not affect the other file names with the same inode number.

The inode number is unique only within a single file system. If Linux mounts multiple file systems, the same inode number may appear in more than one of them. Therefore, hard links cannot be created across file systems.

A soft link is different from a hard link. If the user data block of a file holds the path name of another file, that file is a soft link. A soft link is a file in its own right, with its own inode number and user data block; only the content of the data block is special. As a result, creating and using soft links is not subject to most of the restrictions that apply to hard links:

  • A soft link has its own file attributes and permissions;
  • A soft link can be created for a file or directory that does not exist;
  • Soft links can cross file systems;
  • Soft links can be created for files or directories;
  • Creating a soft link does not increase the target’s link count, i_nlink;
  • Deleting a soft link does not affect the target file. However, if the original file is deleted, the soft link becomes a dead (dangling) link.

To recap hard links by contrast: file names and inode numbers are generally one-to-one, with each inode number corresponding to one file name. However, Unix/Linux systems allow multiple file names to point to the same inode number. This means the same content can be accessed through different file names, and a change to the file content is visible through all of them. Deleting one file name, however, does not affect access through the others. This is what is called a “hard link”.
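
The differences between the two kinds of link can be observed from user space. The following is a minimal sketch using Python's standard `os` module; the function name `link_demo` and the file names are made up for illustration, and it assumes a file system that supports both hard and symbolic links.

```python
import os
import tempfile


def link_demo(workdir):
    """Create a file, a hard link and a soft link, then report what the kernel sees."""
    original = os.path.join(workdir, "original.txt")
    hard = os.path.join(workdir, "hard.txt")
    soft = os.path.join(workdir, "soft.txt")

    with open(original, "w") as f:
        f.write("hello")

    os.link(original, hard)     # a second name for the same inode
    os.symlink(original, soft)  # a new inode whose data is a path name

    info = {
        # A hard link shares the inode of the original file.
        "same_inode_hard": os.stat(original).st_ino == os.stat(hard).st_ino,
        # lstat() inspects the soft link itself, which has its own inode.
        "same_inode_soft": os.stat(original).st_ino == os.lstat(soft).st_ino,
        # Only the hard link raised the link count.
        "nlink_after_links": os.stat(original).st_nlink,
    }

    # Removing the original name leaves the data reachable via the hard link,
    # while the soft link now points at a path that no longer exists.
    os.remove(original)
    with open(hard) as f:
        info["hard_content_after_delete"] = f.read()
    info["soft_is_dangling"] = os.path.islink(soft) and not os.path.exists(soft)
    return info


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(link_demo(d))
```
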

2. Linux VFS

Linux has an extremely rich set of file systems, which can be broadly categorized as follows:

  1. Network file systems, such as NFS and CIFS;
  2. Disk file systems, such as ext4 and ext3;
  3. Special file systems, such as procfs, sysfs, ramfs, and tmpfs.

The Linux VFS (Virtual File System, also known as the Virtual Filesystem Switch) is what allows these file systems to coexist in Linux. Acting as a generic file system layer, the VFS abstracts four basic concepts shared by file systems: files, directory entries (dentries), index nodes (inodes), and mount points. Inside the kernel, it presents a uniform interface to user space on one side and to the concrete file systems on the other. The VFS is what lets system calls such as open() and read() work regardless of the underlying file system, and lets user-space programs such as cp copy across file systems. It is the VFS that makes good on the saying that in Linux everything is a file, except processes.

The Linux VFS has four basic objects: superblock objects, inode objects, dentry objects, and file objects. The superblock object represents a mounted file system; the inode object represents a file; the dentry object represents a directory entry. Taking the device file event5 as an example, the path /dev/input/event5 contains four dentry objects: /, dev/, input/, and event5. The file object represents a file opened by a process. To resolve file paths quickly, the Linux VFS also maintains a directory entry cache (dcache).
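
To make the directory entry example concrete, the short sketch below splits a path into the components that would each correspond to one dentry. It only illustrates the decomposition of /dev/input/event5 in user space with `pathlib`; it does not touch the kernel's dcache, and the helper name `path_components` is hypothetical.

```python
from pathlib import PurePosixPath


def path_components(path):
    """Return the path components, one per directory entry on the way to the file."""
    # PurePosixPath only parses the string; it never accesses the file system.
    return list(PurePosixPath(path).parts)


if __name__ == "__main__":
    print(path_components("/dev/input/event5"))
```
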

3. File opening process

The open() system call proceeds as follows:

1. Check whether the file is already in the system-wide open-file table, that is, whether the file has already been opened by another process.

2. If it is, the process creates an entry in its per-process open-file table that points to the file’s entry in the system-wide open-file table.

3. If it is not, search the directory for the file by name. Parts of the directory are usually kept in a cache to speed up this search.

4. Once the file is found, its file control block (FCB) is copied into the system-wide open-file table. Besides storing the FCB, this table also records how many processes have each file open.

5. Finally, an entry is added to the per-process open-file table that points to the file’s entry in the system-wide open-file table.

When a process calls close() on a file:

1. The entry in the process’s per-process open-file table is deleted, and the file’s open count in the system-wide open-file table decreases by 1.

2. If that count reaches 0, the file’s entry is deleted from the system-wide open-file table.
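
The bookkeeping in the open and close steps above can be sketched as a toy model. This is an illustration of the two-table scheme as described here, not the Linux kernel's actual implementation; the class `OpenFileTables`, its fields, and the dictionary standing in for a directory are all hypothetical.

```python
class OpenFileTables:
    """Toy model of a system-wide and a per-process open-file table."""

    def __init__(self):
        self.system_table = {}    # file name -> {"fcb": ..., "open_count": n}
        self.process_tables = {}  # pid -> {slot number: file name}

    def open(self, pid, name, directory):
        entry = self.system_table.get(name)
        if entry is None:
            # Not open anywhere yet: look the file up in the directory
            # (raises KeyError if it does not exist) and copy its FCB.
            entry = {"fcb": dict(directory[name]), "open_count": 0}
            self.system_table[name] = entry
        entry["open_count"] += 1

        # Add a per-process entry that refers to the system-wide entry.
        table = self.process_tables.setdefault(pid, {})
        slot = 0
        while slot in table:
            slot += 1
        table[slot] = name
        return slot

    def close(self, pid, slot):
        name = self.process_tables[pid].pop(slot)
        entry = self.system_table[name]
        entry["open_count"] -= 1
        if entry["open_count"] == 0:
            # Last user is gone: drop the system-wide entry.
            del self.system_table[name]


if __name__ == "__main__":
    directory = {"a.txt": {"size": 10}}
    tables = OpenFileTables()
    s1 = tables.open(1, "a.txt", directory)
    s2 = tables.open(2, "a.txt", directory)
    print(tables.system_table["a.txt"]["open_count"])
    tables.close(1, s1)
    tables.close(2, s2)
    print("a.txt" in tables.system_table)
```
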

4. Understanding the inode

When the operating system reads a disk, it does not read one sector at a time, which would be inefficient. Instead, it reads several consecutive sectors at once, that is, one block at a time. This “block”, made up of multiple sectors, is the smallest unit of file access. The most common block size is 4 KB, that is, eight consecutive 512-byte sectors make up one block.

File data is stored in “blocks”, so we clearly also need a place to store meta-information about the file, such as who created it, when it was created, and how large it is. The area that stores this meta-information is called an inode, short for “index node”.

Inodes contain meta information about files, specifically the following:

* Number of bytes of the file

* User ID of the file owner

* Group ID of the file

* Read, write, and execute permissions on files

* Timestamps: ctime is the last time the inode was changed, mtime is the last time the file content was changed, and atime is the last time the file was accessed

* Link count, which is how many file names point to the inode

* Location of file data blocks

All file information except the file name is stored in the inode

Each inode has a number that the operating system uses to identify different files.
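
Most of the inode fields listed above can be read from user space through the stat() system call. The sketch below uses Python's `os.stat`; the helper name `inode_summary` and the dictionary keys are chosen for illustration only. Note that nothing in the result contains the file name.

```python
import os
import stat
import tempfile


def inode_summary(path):
    """Collect the inode metadata that stat() exposes for a path."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,                         # inode number
        "size_bytes": st.st_size,                   # number of bytes in the file
        "uid": st.st_uid,                           # user ID of the owner
        "gid": st.st_gid,                           # group ID
        "permissions": stat.filemode(st.st_mode),   # e.g. "-rw-r--r--"
        "ctime": st.st_ctime,                       # last inode change
        "mtime": st.st_mtime,                       # last content change
        "atime": st.st_atime,                       # last access
        "link_count": st.st_nlink,                  # file names pointing here
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        p = os.path.join(d, "example.txt")
        with open(p, "w") as f:
            f.write("hello")
        print(inode_summary(p))
```
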

On the surface, the user opens a file by its name. Internally, the system works in three steps: first, it finds the inode number corresponding to the file name; second, it uses the inode number to obtain the inode information; finally, it uses the inode information to find the blocks where the file data resides and reads the data.

A directory is also a file. The structure of a directory file is very simple: it is a list of directory entries (dirents). Each directory entry consists of two parts: the name of a file the directory contains and the inode number corresponding to that file name.
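
The name-plus-inode-number structure of a directory can be seen by listing its entries. The following sketch relies on `os.scandir`, whose entries expose a name and an inode number; the function name `list_dirents` is hypothetical.

```python
import os
import tempfile


def list_dirents(path):
    """Return (file name, inode number) pairs for the entries of a directory."""
    with os.scandir(path) as entries:
        return sorted((entry.name, entry.inode()) for entry in entries)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for name in ("a", "b"):
            open(os.path.join(d, name), "w").close()
        for name, ino in list_dirents(d):
            print(name, ino)
```
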

Data block addressing

Inodes record the location of a file’s data blocks. There are three addressing methods: a direct pointer refers directly to a data block; a single indirect pointer refers to a block that holds pointers to data blocks; a double indirect pointer adds a second level, referring to a block of pointers to single indirect blocks.
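
As a worked example of the three addressing methods, the sketch below maps a logical block index within a file to the method that would reach it. The 4 KB block size comes from the text above; the 12 direct pointers and 4-byte block pointers are assumptions borrowed from the classic ext2-style layout, not something this article specifies.

```python
BLOCK_SIZE = 4096    # bytes per block, as in the text above
POINTER_SIZE = 4     # assumption: each block pointer takes 4 bytes
N_DIRECT = 12        # assumption: 12 direct pointers in the inode (ext2-style)

PTRS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE  # pointers that fit in one block


def addressing_level(block_index):
    """Return which addressing method reaches the given logical block of a file."""
    if block_index < 0:
        raise ValueError("block index must be non-negative")
    if block_index < N_DIRECT:
        return "direct"
    block_index -= N_DIRECT
    if block_index < PTRS_PER_BLOCK:
        return "single indirect"
    block_index -= PTRS_PER_BLOCK
    if block_index < PTRS_PER_BLOCK ** 2:
        return "double indirect"
    raise ValueError("beyond the range covered by double indirect addressing")


def max_file_size():
    """Largest file, in bytes, addressable with the three methods above."""
    return (N_DIRECT + PTRS_PER_BLOCK + PTRS_PER_BLOCK ** 2) * BLOCK_SIZE


if __name__ == "__main__":
    for index in (0, 11, 12, 1035, 1036):
        print(index, addressing_level(index))
    print(max_file_size())
```

Under these assumptions, each extra level of indirection multiplies the reachable range by the number of pointers per block, which is why small files need only direct pointers while large files pay for one or two extra block reads.
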


5. File descriptor

In Linux, everything can be regarded as a file, and files can be divided into regular files, directory files, link files, and device files. A file descriptor is an index that the kernel creates to manage opened files efficiently. It is a non-negative integer (usually a small one) that refers to an open file, and all system calls that perform I/O go through a file descriptor. When a program starts, 0 is standard input, 1 is standard output, and 2 is standard error, so if you open a new file at that point, its file descriptor will be 3. POSIX requires that each time a file (including a socket) is opened, the lowest file descriptor number available in the current process is used.
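
The lowest-available-number rule is easy to observe. The sketch below opens a file twice, closes the first descriptor, and opens the file again; the freed number is handed out again. It assumes a single-threaded process in which nothing else opens or closes descriptors in between, and the function name `lowest_fd_demo` is hypothetical.

```python
import os
import tempfile


def lowest_fd_demo(path):
    """Show that open() reuses the lowest free file descriptor number."""
    fd_a = os.open(path, os.O_RDONLY)
    fd_b = os.open(path, os.O_RDONLY)   # fd_a is taken, so this one is higher
    os.close(fd_a)                      # fd_a's number becomes free again
    fd_c = os.open(path, os.O_RDONLY)   # gets the lowest free number
    os.close(fd_b)
    os.close(fd_c)
    return fd_a, fd_b, fd_c


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        print(lowest_fd_demo(tmp.name))
```
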

File descriptors are an important system resource. In theory, a system could open as many file descriptors as its memory allows, but in practice the kernel imposes limits. The system-level limit on the number of open files is generally about 10% of system memory (measured in KB).
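
On Linux, the system-level limit can be inspected through the proc file system. The following sketch reads /proc/sys/fs/file-max and returns None when that file is unavailable (for example on a non-Linux system); the function name `system_open_file_limit` is hypothetical.

```python
def system_open_file_limit(proc_path="/proc/sys/fs/file-max"):
    """Return the system-wide open-file limit, or None if it cannot be read."""
    try:
        with open(proc_path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # The file is missing, unreadable, or not a plain integer.
        return None


if __name__ == "__main__":
    print(system_open_file_limit())
```
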

6. The relationship between file descriptors and open files

Each file descriptor corresponds to an open file, and different file descriptors may refer to the same file. The same file can be opened by different processes, or several times within one process. The system maintains a separate file descriptor table for each process, and the values in each table start at 0, so you will see the same file descriptor numbers in different processes. The same descriptor number in two processes may therefore refer to the same file or to different files. To understand how this works, you need to look at three data structures maintained by the kernel.

1. Process-level file descriptor table

2. The system-level open file description table

3. The I-Node table of the file system

Each entry in the process-level descriptor table records information about a single file descriptor:

1. A set of flags that control the operation of the file descriptor. (Currently, only one such flag is defined, the close-on-exec flag.)

2. A reference to the open file handle

The kernel maintains a system-level Open File Description table for all open files. Sometimes it is also called an open file table, and each entry in the table is called an open file handle. An open file handle stores all information about an open file, as follows:

1. Current file offset (updated when calling read() and write(), or modified directly with lseek())

2. State flags used to open the file (that is, flags for open())

3. File access mode (such as read-only mode, write-only mode or read-write mode set when open() is called)

4. Settings related to signal-driven I/O

5. A reference to the file’s I-Node object

Each file system, in turn, has an I-Node table for all the files that reside on it. The I-Node of each file includes:

1. File type (for example: regular file, socket or FIFO) and access permissions

2. A pointer to the list of locks held on the file

3. Various attributes of the file, including the file size and timestamps associated with different types of operations

The following example involves two processes, A and B. In process A, file descriptors 1 and 30 both point to the same open file handle (labeled 23). This can result from calling dup(), dup2(), or fcntl().

File descriptor 2 of process A and file descriptor 2 of process B both point to the same open file handle (labeled 73). This can happen after a fork() call (that is, processes A and B are parent and child), or when one process passes an open file descriptor to another through a UNIX domain socket. By contrast, when different processes independently call open() on the same file, they may be allocated the same descriptor number, but each descriptor then refers to its own open file handle.

In addition, descriptor 0 of process A and descriptor 3 of process B point to different open file handles, but these handles point to the same entry in the I-Node table (1976), in other words, to the same file. This happens because each process made its own open() call on the same file. A similar situation occurs when one process opens the same file twice.
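
The difference between sharing an open file handle and merely sharing an I-Node can be checked within a single process. In the sketch below, a descriptor duplicated with dup() follows the file offset of the original, while a descriptor from a second open() call keeps its own offset even though both refer to the same I-Node. The function name `offset_sharing_demo` is hypothetical.

```python
import os
import tempfile


def offset_sharing_demo(path):
    """Compare a dup()'d descriptor with one from an independent open()."""
    fd1 = os.open(path, os.O_RDONLY)
    fd2 = os.dup(fd1)                   # same open file handle as fd1
    fd3 = os.open(path, os.O_RDONLY)    # new open file handle, same I-Node
    try:
        os.lseek(fd1, 5, os.SEEK_SET)   # move the offset through fd1 only
        return {
            # lseek(fd, 0, SEEK_CUR) reports the current offset without moving it.
            "dup_offset": os.lseek(fd2, 0, os.SEEK_CUR),
            "reopen_offset": os.lseek(fd3, 0, os.SEEK_CUR),
            "same_inode": os.fstat(fd1).st_ino == os.fstat(fd3).st_ino,
        }
    finally:
        for fd in (fd1, fd2, fd3):
            os.close(fd)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"0123456789")
        tmp.flush()
        print(offset_sharing_demo(tmp.name))
```
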

7. To summarize

1. Because each process has its own file descriptor table, different processes may use the same file descriptor number, which may point to the same file or to different files.

2. If two different file descriptors point to the same open file handle, they share the same file offset. Thus, if the file offset is modified through one of the file descriptors (by calling read(), write(), or lseek()), the change is also observed from the other descriptor, regardless of whether the two file descriptors belong to different processes or the same process.

3. Open file status flags (for example, O_APPEND, O_NONBLOCK, and O_ASYNC) are read and modified with the F_GETFL and F_SETFL operations of fcntl(), and they are shared in much the same way as the file offset.

4. The file descriptor flag (that is, close-on-exec) is private to the process and the file descriptor. Changing this flag does not affect other file descriptors in the same process or in other processes.
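
The different scopes of the two kinds of flag can be sketched with Python's `fcntl` module on Linux. The example duplicates a descriptor, sets O_APPEND through one of them with F_SETFL, and then gives the two descriptors different close-on-exec settings with F_SETFD. The function name `flag_scope_demo` is hypothetical.

```python
import fcntl
import os
import tempfile


def flag_scope_demo(path):
    """Contrast shared status flags with per-descriptor flags."""
    fd1 = os.open(path, os.O_WRONLY)
    fd2 = os.dup(fd1)  # fd1 and fd2 share one open file handle
    try:
        # Status flags live in the open file handle, so a change made
        # through fd1 is visible through fd2.
        flags = fcntl.fcntl(fd1, fcntl.F_GETFL)
        fcntl.fcntl(fd1, fcntl.F_SETFL, flags | os.O_APPEND)
        append_via_dup = bool(fcntl.fcntl(fd2, fcntl.F_GETFL) & os.O_APPEND)

        # The close-on-exec flag belongs to each descriptor separately.
        fcntl.fcntl(fd1, fcntl.F_SETFD, fcntl.FD_CLOEXEC)
        fcntl.fcntl(fd2, fcntl.F_SETFD, 0)
        return {
            "append_seen_via_dup": append_via_dup,
            "fd1_cloexec": bool(fcntl.fcntl(fd1, fcntl.F_GETFD) & fcntl.FD_CLOEXEC),
            "fd2_cloexec": bool(fcntl.fcntl(fd2, fcntl.F_GETFD) & fcntl.FD_CLOEXEC),
        }
    finally:
        os.close(fd1)
        os.close(fd2)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        print(flag_scope_demo(tmp.name))
```
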