Technology is open source and knowledge is shared.

1 File and file system

1.1 Basic Concepts

A file is really an abstraction of a disk, a sequence of information items with an identifier (that is, a filename) that logically has complete meaning.

Information item: a basic unit (single or multiple bytes) that constitutes the contents of a file. Information items have a sequential relationship.

A file system is a piece of software that centrally manages information resources in an operating system. It manages file storage, retrieval, and updating, and provides secure and reliable means for sharing and protecting files. The main functions are:

  • Manages disk space in a unified manner and reclaims disk space
  • Implement file by name access
  • Realize file information sharing, and provide file protection, confidentiality means
  • Provides a unified interface with the I/O system

Table 1-1 File classification

category describe
Regular documents Contains information about the user
Directory files Manage the system files of the file system
Special files Character device files: related to input and output, such as printers, network cards, etc. Block device file: disk

1.2 Logical structure of files and file access

1.2.1 Logical structure of the file

Streaming files: The basic unit of a file is a character. A file is a logical, unstructured collection of characters. Record file: A file consists of several records, which can be read, written, and searched according to the records. Each record has its own internal structure. In addition, there are tree structure, heap structure, sequence structure, index structure and so on.

1.2.2 File Access

This includes sequential access and random access (which requires a read and write location).

2 File storage medium

2.1 Storage Media and Physical Blocks

Storage media include disks (including SSDS), tapes, USB disks, and CD-RoMs. A physical block (block or cluster) is an independent unit for information storage, transmission, and distribution. Storage devices are divided into physical blocks of equal size and numbered uniformly.

2.2 Typical Disk Structure

Each disk contains a disk surface,There are many concentric circles on the disk, called tracksEach track has a number of arcs calledsector. Each disk has a unique read-write head that moves in a radii direction.

Only one head is active at any one time and the input and output streams appear as bit strings.

The physical address on the disk consists of three parts:Magnetic head number (disk number), track number (cylinder number), sector number.

Each sector contains three parts: title, data, and parity information.

2.3 Disk Access

Each disk request consists of three steps (SSD only has the third step) : 1. Seek time: the magnetic head moves to the specified track. 2. Rotation delay time: wait for the specified sector to rotate past under the magnetic head; 3. Data transfer time: actual data transfer between disk and memory.

3 Disk space management

3.1 Related data structures

1 the bitmap

  • Each physical block corresponds to one bit. The allocated physical block is 0. Otherwise, the allocated physical block is 0.
  • When applying for a physical block, you can search for the bit 1 in the bitmap, return the physical block number and change the bit to 0 at the same time. Return will correspond to position 1. 2 Free block table Records all free blocks in one table, that is, the free block table. The start block number and number of blocks are recorded in the table. 3 Free block linked list Links all free blocks into a chain. To prevent too many chains, group chaining can be used.

3.2 Conversion between the Disk ADDRESS and block NUMBER

If the block id is known, the disk address is:

Cylinder number =[Block number /(Number of heads and sectors)] Head number =[(Block number mod(Number of heads and sectors))/ Sector number] Sector number = (Block number mod(Number of heads x Sectors)) mod Number of sectors

Known disk address:

Block number = Cylinder number x (Number of heads and sectors) + Number of sectors of the head number + Sector number

When using a bitmap data structure:

Known font I and bit J: Block number = I * Word length +j Known block number: font =[Block number/word length] Bit = Block number MOD word length

4 File control block and file directory

4.1 File Attributes

File properties, also known as File Control blocks, are data structures set for File management and store all information required for File management. It mainly includes file name, file size, file address, creation time, last modified time, various flags (read only, hidden, system, archive), etc.

4.2 File Directories, Directory files, and directory items

File directories manage the metadata of each file in a unified manner to support the conversion of file names to physical addresses. The management information of all files is organized together to form a file directory. Directory file The directory is stored as a file on the disk. Directory entry A directory entry is the basic unit of a file directory, which can be FCB. A directory is an ordered collection of file control blocks.

4.3 Concepts related to directories

Path name (file name) Absolute path name: starts from the root directory relative path name: starts from the current directory

5 Physical structure of the file

5.1 Continuous Structure

File information is stored in several contiguous physical blocks. Advantages Support sequential access and random access; The minimum number of disk seek times and time is required. Disadvantages: files cannot grow dynamically. Is not conducive to file insertion and deletion; Prone to external debris.

5.2 Link Structure

5.2.1 concept

The information in a file is stored in discrete physical blocks connected by Pointers, with the first physical block pointing to the next. Advantages Improve the utilization of disk space. Facilitate file insertion and deletion; It is beneficial to file dynamic expansion, but the access speed is slow and not suitable for random access. Reliability problems, such as pointer errors; Need more seek times and seek time; Link Pointers also need to take up some space.

5.2.2 A variation of link structure — file allocation table FAT

All physical blocks are placed in a unified table, which is called the file allocation table. The table records the block numbers of all physical blocks and their corresponding entry values. 0 indicates the empty physical blocks, a positive integer indicates the next block number, and -1 indicates the last block of the file.

5.3 Index Structure

5.3.1 concept

The information of a file is stored in several discontinuous physical blocks. The system establishes a special data structure — index table for each file, and stores the block numbers of these physical blocks in the index table. An index table is an array of disk block addresses, where the i-th entry points to the i-th block of the file.

5.3.2 Organization mode of index tables

Linked Mode A disk block stores one index table and multiple index tables are linked multi-level index mode Puts the index table address of a file in another index table Comprehensive mode Direct index mode combines with indirect index mode

5.3.3 Tertiary Index structure in Unix

1. The main index table of each file has 15 index entries (in FCB), each of which is 2 bytes; 2, the first 12 items directly store the physical block number of the file (direct addressing); 3. If the file is larger than 12 blocks, use item 13 to point to a physical block where the first level index table is stored.

Assuming that the sector size is 512 bytes and the physical block is equal to the sector size, the level 1 index table can hold 256 physical block numbers

4. Items 14 and 15 can also be used as secondary and tertiary index tables for larger files. Therefore, the tertiary index structure of Unix can hold 12+256+2562+2563 blocks at most

6 File system implementation

6.1 Related Terms

Partition: Divides the storage space of a physical disk into independent parts. Volume: A logical partition on a disk that consists of one or more physical blocks.

A file volume can be all or part of a disk or across disks (RAID). The same management data is used for file allocation and disk space management in a file volume. A file volume contains file system information, a group of files (user files, directory files), and unallocated space. Block or Cluster: One or more (powers of 2) related sectors, addressable blocks of data.

Format: To set up a file system on a file volume, that is, to set up and initialize administrative data (that is, metadata) for file allocation and disk free space management.

6.2 Disk Contents

Table 6-1 Disk information

partition instructions
Boot sector Contains information needed to boot the operating system from that volume, one per volume, usually the first sector
Volume of information The value includes the number of blocks (clusters), block (cluster) size, number of free blocks (clusters), and Pointers of the volume
Directory file Root and other directory files
User files

6.3 Data Structures required in Memory – Unix is used as an example

7 File system instance – Unix

7.1 File and Directory Retrieval

Accessing a file consists of two steps: 1 Directory search The file name given by the user – > Find the directory entry FCB by the file name

Search by pathname: full pathname: start at root \A\B\C\File1 Relative path: start at current directory \C\File1

2 File addressing The address of any record or character in a file is calculated based on the physical address of the file in the directory entry /FCB.

7.2 Unix File System

1 a

On Unix systems, each file consists of a directory entry, an I node, and several disk blocks. A directory file is made up of directory entries.



2 File search process