This is the 8th day of my participation in the Gwen Challenge in November. See details: The Last Gwen Challenge in 2021.

A file system is a system that manages storage media.

How to understand storage media

There are many kinds of storage media, such as disks. Data stored on them can be identified and persisted, which is what makes it useful.

  • Physical Disk

  • Logical Disk Form

Ignoring the physical details, you can think of a disk as one large array of 0/1 bits that is divided into sectors.

A file system is a system that manages disks, as shown in the following figure:

  • A sector is 512 bytes
  • Each sector has a number
  • You can see the condition of the disk device with the following command:
# fdisk -l
Disk /dev/vda: 107.4 GB, 107374182400 bytes, 209715200 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0xf6Abafec

   Device Boot      Start         End      Blocks   Id  System
/dev/vda1              63   209712509   104856223+  83  Linux

As you can see, the disk /dev/vda has about 100 GB of storage space (107,374,182,400 bytes) and a total of more than 200 million sectors.
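The figures in the fdisk output can be cross-checked with a little arithmetic. The following is a minimal sketch in Python; the constant names and the helper sector_byte_range are only illustrative, and the numbers are copied from the output above.

```python
SECTOR_SIZE = 512                # bytes per sector, from the fdisk output
TOTAL_BYTES = 107_374_182_400    # disk size in bytes, from the fdisk output

sector_count = TOTAL_BYTES // SECTOR_SIZE
size_gib = TOTAL_BYTES / 2**30   # binary gigabytes (GiB)
size_gb = TOTAL_BYTES / 10**9    # decimal gigabytes, the unit fdisk prints


def sector_byte_range(n, sector_size=SECTOR_SIZE):
    """Return the (first, last) byte offsets covered by sector number n."""
    first = n * sector_size
    return first, first + sector_size - 1


print(sector_count)              # 209715200 sectors
print(size_gib, round(size_gb, 1))
print(sector_byte_range(63))     # byte range of the first sector of /dev/vda1
```

Seen this way, "sector number" is simply an index into the big array, and the byte offset is the index multiplied by the sector size.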

The organization and management capability of the file system

The file system manages files. The logical form of the disk, described above, is one large array.

So how does the file system manage this array?

In other words:

  • Upper-layer function: managing files
  • Lower-layer implementation: managing the array of storage units on the storage medium (block device)

So this is exactly what a file system does: it implements the upper-layer functionality on top of the lower-layer management.

Basic ability

A disk is an external device that exchanges data with main memory, so the file system's unit of I/O is designed to match the unit that memory uses. Since the memory page size is typically 4 KB, the file system block size is also set to 4 KB. To check:

# stat -f /
  File: "/"
    ID: 336e5F556022D7BD Namelen: 255     Type: ext2/ext3
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 25769967   Free: 5032408    Available: 3927408
Inodes: Total: 6553600    Free: 4886167
# blockdev --getbsz /dev/vda1
4096

Let’s say we implement a simple file system of our own, as shown below, with 32 blocks of disk storage:

The purpose of implementing a file system is to store files:

  • Now you want to store the file file1, which takes up 8 blocks on your disk
    • Data blocks store the file content
    • The blocks that store a file's content may not be contiguous
  • But how do you find the file again?
    • You need to design an index to manage it.
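    • An index node, or inode, stores the file's metadata, such as permissions and timestamps. (In ext-style file systems the file name is kept in the directory entry that points to the inode, not in the inode itself.)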
    • The inode also stores the indexes that point to all of the file's data blocks.
    • If an inode is 256 bytes, a 4 KB block can hold 16 inodes.
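The idea of an inode that indexes scattered data blocks can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how ext4 lays out data: the class names Inode and ToyDisk, the in-memory block list, and the "take the lowest free block" policy are all hypothetical.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4096                             # bytes per block, as in the article
INODE_SIZE = 256                              # bytes per inode, as in the article
INODES_PER_BLOCK = BLOCK_SIZE // INODE_SIZE   # 16


@dataclass
class Inode:
    """File metadata plus the index of the blocks holding its content."""
    size: int = 0
    block_numbers: list = field(default_factory=list)


class ToyDisk:
    """A disk modelled as a fixed array of equally sized blocks."""

    def __init__(self, block_count=32):
        self.blocks = [b""] * block_count
        self.free = set(range(block_count))   # hypothetical free-block tracking

    def write_file(self, data):
        """Split data into blocks, store each in a free block, return the inode."""
        inode = Inode(size=len(data))
        for offset in range(0, len(data), BLOCK_SIZE):
            block_no = min(self.free)         # any free block will do
            self.free.remove(block_no)
            self.blocks[block_no] = data[offset:offset + BLOCK_SIZE]
            inode.block_numbers.append(block_no)
        return inode

    def read_file(self, inode):
        """Follow the inode's index to reassemble the file content."""
        return b"".join(self.blocks[n] for n in inode.block_numbers)


disk = ToyDisk()
disk.free -= {1, 3}                           # pretend blocks 1 and 3 are already taken
content = b"x" * (8 * BLOCK_SIZE)             # file1 occupies 8 blocks
file1 = disk.write_file(content)
print(file1.block_numbers)                    # not contiguous: 1 and 3 are skipped
```

Reading the file back only needs the inode: disk.read_file(file1) walks the block numbers in order, so it does not matter where on the disk the blocks ended up.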

More details on the implementation

BITMAP

Inodes and data blocks have now been designed, but that is not enough:

  • You also need to define the ranges occupied by inodes and by data blocks.
  • To find free inodes and data blocks quickly, you need to record their allocation and release.
  • When new data is written, we need to pick a free data block to store it, as well as a free inode.

Use bitmaps to record what is used and what is free: a bit is 0 if the entry is free and 1 if it is in use.

Why use a bitmap? Because very little data has to be read to know which blocks are free, which makes the lookup fast and efficient.
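A bitmap allocator of this kind can be sketched as follows. It is a minimal illustration of the 0 = free, 1 = in use convention described above; the class name Bitmap and its methods are hypothetical, and a real file system would search the bits far more efficiently than this linear scan.

```python
class Bitmap:
    """One bit per inode or data block: 0 means free, 1 means in use."""

    def __init__(self, size):
        self.size = size
        self.bits = bytearray((size + 7) // 8)   # 8 entries per byte

    def is_used(self, index):
        return bool(self.bits[index // 8] & (1 << (index % 8)))

    def set_used(self, index):
        self.bits[index // 8] |= 1 << (index % 8)

    def set_free(self, index):
        self.bits[index // 8] &= ~(1 << (index % 8)) & 0xFF

    def allocate(self):
        """Find the first free entry, mark it used, and return its index."""
        for index in range(self.size):
            if not self.is_used(index):
                self.set_used(index)
                return index
        raise RuntimeError("no free entries left")


data_bitmap = Bitmap(24)          # 24 data blocks, as in the example layout
first = data_bitmap.allocate()
second = data_bitmap.allocate()
data_bitmap.set_free(first)       # release the first block again
third = data_bitmap.allocate()    # the released block is reused
print(first, second, third)
```

Tracking 24 blocks takes only 3 bytes here, which is the point of the design: the allocation state of the whole disk fits in a tiny amount of data.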

In this example:

  • A block can store 4096*8 = 32,768 bits, so a bitmap spanning two blocks can store more than 60,000 bits
  • The bitmap that records inode usage only has to cover 3*16 = 48 inodes, so this is more than enough
  • The bitmap that records data block usage only has to cover 24 data blocks, so this is more than enough

As you can see, bitmaps occupy very little space. For a 100 GB file system with more than 6 million inodes, the inode bitmap needs only about 200 blocks, or about 800 KB.
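The space a bitmap needs follows directly from the block size. Here is a short worked calculation in Python, using the inode and block totals from the stat -f output earlier; the function name bitmap_blocks is only illustrative.

```python
import math

BLOCK_SIZE = 4096                   # bytes per block
BITS_PER_BLOCK = BLOCK_SIZE * 8     # 32768 entries tracked per bitmap block


def bitmap_blocks(entries):
    """Number of blocks needed for a bitmap with one bit per entry."""
    return math.ceil(entries / BITS_PER_BLOCK)


inode_bitmap_blocks = bitmap_blocks(6_553_600)    # "Inodes: Total" from stat -f
data_bitmap_blocks = bitmap_blocks(25_769_967)    # "Blocks: Total" from stat -f

print(inode_bitmap_blocks, inode_bitmap_blocks * BLOCK_SIZE // 1024, "KB")
print(data_bitmap_blocks, data_bitmap_blocks * BLOCK_SIZE // 1024, "KB")
```

Both bitmaps together stay in the low megabytes, a negligible share of a disk of roughly 100 GB.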

Super Block

A superblock records basic metadata information about a file system.

  • File system type
  • Partition information: start and end sector numbers
  • Number of inodes and data blocks
  • Inode block range
  • Data block range, etc.
  • Bitmap range
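The fields listed above can be pictured as a small record. The sketch below is purely illustrative: the field names, the "toyfs" type, and the block layout (bitmaps in blocks 1 to 2, inodes in blocks 3 to 5, data in blocks 8 to 31) are assumptions made for the 32-block example, not the on-disk format of any real file system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuperBlock:
    """Basic metadata describing the whole file system (hypothetical fields)."""
    fs_type: str
    first_sector: int
    last_sector: int
    inode_count: int
    data_block_count: int
    bitmap_blocks: range
    inode_blocks: range
    data_blocks: range

    def region_of(self, block_no):
        """Tell which region of the disk a block number belongs to."""
        if block_no in self.bitmap_blocks:
            return "bitmap"
        if block_no in self.inode_blocks:
            return "inode"
        if block_no in self.data_blocks:
            return "data"
        return "other"


# Assumed layout for the 32-block toy disk (8 sectors of 512 bytes per block).
sb = SuperBlock(
    fs_type="toyfs",
    first_sector=0,
    last_sector=32 * 8 - 1,
    inode_count=3 * 16,
    data_block_count=24,
    bitmap_blocks=range(1, 3),
    inode_blocks=range(3, 6),
    data_blocks=range(8, 32),
)
print(sb.region_of(4), sb.region_of(10))
```

With this record loaded, the file system knows where every other structure lives without scanning the disk.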

Multiple file types

For a better user experience, the file system also supports other file types, such as directories and links.

Files on disk

Run ls -l to view the files:

# ls -li
drwxr-xr-x 2 root root 4096 Nov 23 08:08 dir
-rw-r--r-- 3 root root    6 Nov 21 14:33 file1
-rw-r--r-- 3 root root    6 Nov 21 14:33 file1_hard
lrwxrwxrwx 1 root root    5 Nov 21 14:36 file1_soft -> file1
  • Regular file type
    • The file you just stored, file1, is a regular file
    • This is the most common file type in Linux: plain text files (ASCII), binary files, data files, all kinds of compressed files, and so on.
    • The first character of its ls -l line is -.
  • Directory file type
    • The first character is d; you can enter it with cd

    So, how do directories work?

    • A directory is essentially an inode
    • This inode refers to other inodes (the files or directories under the directory).
  • Soft links
    • For a soft link, the first character is l

    • Creating a soft link creates a new inode that points to an existing file (it stores the target's path)
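The listing above can be reproduced from Python to see the inodes behind each name. This is a small sketch that assumes a Linux-like system where hard links and symbolic links are allowed; it works in a temporary directory and reuses the names from the listing.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    directory = os.path.join(workdir, "dir")
    file1 = os.path.join(workdir, "file1")
    hard = os.path.join(workdir, "file1_hard")
    soft = os.path.join(workdir, "file1_soft")

    os.mkdir(directory)
    with open(file1, "w") as f:
        f.write("hello\n")            # 6 bytes, like file1 in the listing

    os.link(file1, hard)              # hard link: a second name for the same inode
    os.symlink("file1", soft)         # soft link: a new inode storing the path "file1"

    # First character of the mode string, as printed by ls -l.
    type_chars = {
        name: stat.filemode(os.lstat(os.path.join(workdir, name)).st_mode)[0]
        for name in ("dir", "file1", "file1_hard", "file1_soft")
    }

    same_inode = os.stat(file1).st_ino == os.stat(hard).st_ino
    link_count = os.stat(file1).st_nlink
    soft_has_own_inode = os.lstat(soft).st_ino != os.stat(file1).st_ino
    soft_target = os.readlink(soft)

print(type_chars)
print(same_inode, link_count, soft_has_own_inode, soft_target)
```

The hard link shares file1's inode and raises its link count, while the soft link has its own inode whose content is just the target path. That is why file1_soft is 5 bytes long in the listing: it stores the five characters of "file1".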

Other file types

The following types are not important for this article, because they are managed not by file systems but by drivers or the kernel.

  • Character device files: interfaces to serial-style devices such as keyboards and mice. The first character is c.
  • Block device files: interfaces to devices that store data for the system to access, that is, storage media. The first character is b.
  • Socket files: endpoints through which a client and a server read and write data. The first character is s.

More details on the underlying administration

Block device partitioning - fdisk

Partitioning is the division of a storage block device into several independently managed sub-areas based on physical addresses.

# fdisk /dev/vda1
Welcome to fdisk (util-linux 2.23.2).

Changes stay in memory until you decide to write them to disk.
Think twice before using write commands.

Device does not contain a recognized partition table
Creating a new DOS disk label with disk identifier 0xf65964EC.

Command (enter m for help): m
Command action
...
# mkfs -t ext4 /dev/vdb1

So the overall procedure is:

  • Run fdisk on a block device file
  • Create a partition from the block device file with the n command
    • Choose whether it is a primary partition
    • Choose the partition number
    • Choose the first sector
    • Choose the last sector
  • Run mkfs -t ext4 /dev/vdb1 to initialize the partition as an ext4 file system
  • Mount the file system on a directory

So what does mounting mean?

mount

/ and /data are the two mount points.

# df -hT
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/vda1      ext4   99G   80G   16G  85% /
/dev/vdb1      ext4  394G  6.5G  368G   2% /data

How should we understand mount points?

  • Block devices are recognized and initialized by the device driver (in the kernel)
  • A file system mounted at / is mandatory, because without it the root directory is unusable; mount points on other directories are optional
  • /data is the mount point for /dev/vdb1. This means that all files and directories created under /data are stored in blocks on /dev/vdb1
  • When storing data, the mount point of a child directory takes priority over the mount point of its parent directory. For example, /data/file is stored through the /data mount point rather than /, so its blocks actually live on /dev/vdb1.
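The priority rule amounts to "the longest matching mount point wins". The following is a simplified sketch of that lookup in Python; the MOUNTS table mirrors the df output above, the function name find_mount is hypothetical, and the real kernel resolves paths component by component rather than by string prefix.

```python
import posixpath

# Mount table taken from the df -hT output: mount point -> block device.
MOUNTS = {"/": "/dev/vda1", "/data": "/dev/vdb1"}


def find_mount(path, mounts=MOUNTS):
    """Return (mount point, device) for the deepest mount point containing path."""
    path = posixpath.normpath(path)
    best = "/"
    for mount_point in mounts:
        prefix = mount_point.rstrip("/") + "/"
        inside = path == mount_point or path.startswith(prefix)
        if inside and len(mount_point) > len(best):
            best = mount_point
    return best, mounts[best]


print(find_mount("/data/file"))     # stored on /dev/vdb1
print(find_mount("/etc/hosts"))     # falls back to the root file system
print(find_mount("/database/x"))    # "/data" must match a whole path component
```

Note the last case: /database is not under /data, so comparing whole path components matters and a plain prefix check would give the wrong answer.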

Relationship between an operating system and a file system

  • After the operating system boots, it mounts the file systems.
  • The operating system image itself resides on a disk (the boot disk). This disk is special: the system can be booted from it before any file system is mounted.

More advanced capabilities - VFS

File systems differ in their implementation details, and we can choose among them when formatting a partition. As the df output above shows, our system has several mount points and file systems, but whichever one we access, we do it the same way: vim and cat to read, echo to write.

This is because all file systems sit behind a unified abstraction, the VFS (virtual file system), as shown below:

The VFS provides a unified upper-layer interface to read and write files, so that applications can access files on various file systems in a simple and uniform way.
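The idea can be illustrated with a toy model in Python. This is only a sketch of the concept, not the Linux VFS API: the classes FileSystem, DictFS, BlockFS, and VFS are hypothetical, and the two backends stand in for file systems with different internal layouts.

```python
from abc import ABC, abstractmethod


class FileSystem(ABC):
    """The uniform interface every concrete file system must implement."""

    @abstractmethod
    def read(self, path): ...

    @abstractmethod
    def write(self, path, data): ...


class DictFS(FileSystem):
    """Hypothetical backend that keeps each file as one bytes object."""

    def __init__(self):
        self._files = {}

    def read(self, path):
        return self._files[path]

    def write(self, path, data):
        self._files[path] = data


class BlockFS(FileSystem):
    """Hypothetical backend that splits each file into fixed-size blocks."""

    def __init__(self, block_size=4):
        self._block_size = block_size
        self._blocks = {}

    def read(self, path):
        return b"".join(self._blocks[path])

    def write(self, path, data):
        size = self._block_size
        self._blocks[path] = [data[i:i + size] for i in range(0, len(data), size)]


class VFS:
    """Routes each call to whichever file system is mounted for the path."""

    def __init__(self):
        self._mounts = {}

    def mount(self, mount_point, fs):
        self._mounts[mount_point] = fs

    def _resolve(self, path):
        candidates = [m for m in self._mounts
                      if path == m or path.startswith(m.rstrip("/") + "/")]
        mount_point = max(candidates, key=len)     # deepest mount point wins
        return self._mounts[mount_point], path[len(mount_point):].lstrip("/")

    def read(self, path):
        fs, relative = self._resolve(path)
        return fs.read(relative)

    def write(self, path, data):
        fs, relative = self._resolve(path)
        fs.write(relative, data)


vfs = VFS()
vfs.mount("/", DictFS())
vfs.mount("/data", BlockFS())
vfs.write("/etc/motd", b"hi")
vfs.write("/data/file", b"hello world")
print(vfs.read("/etc/motd"), vfs.read("/data/file"))
```

The caller uses the same read and write calls for both paths and never learns that one backend stores whole files while the other stores blocks, which is the role the VFS plays for programs such as vim, cat, and echo.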