This chapter covers:

  • Why use files
  • What a file is
  • Opening and closing files
  • Sequential reads and writes
  • Random reads and writes
  • Text files and binary files
  • Detecting the end of a file read
  • The file buffer



1. Why use files

When we studied structures earlier, we wrote an address book program. While the program runs, we can add and delete entries, but all of that data is stored in memory. When the program exits, the data is gone, and on the next run every entry has to be typed in again. An address book like that is hard to use.

An address book should keep the information it records, and data should disappear only when we choose to delete it. This is the problem of data persistence. Common ways to persist data include storing it in files on disk and storing it in a database.

Using files, we can store data directly on the computer's hard disk, which makes the data persistent.


2. What is a file

A file is what is stored on disk. In programming, however, we generally distinguish two kinds of files by their function: program files and data files.


2.1 Program Files

Program files include source files (extension .c), object files (extension .obj on Windows), and executable programs (extension .exe on Windows).


2.2 Data Files

A data file does not contain the program itself but the data the program reads and writes while it runs, such as a file the program reads input from or a file it writes output to.

This chapter discusses data files. In the previous chapters, all input and output went through the terminal: data was entered from the keyboard and results were shown on the display. Sometimes, however, we want to write information to disk and read it back into memory when it is needed. In that case we are working with files on disk.


2.3 File names

Every file needs a unique file identifier so that users can identify and reference it. The identifier has three parts: file path + file name stem + file suffix. For example: C:\code\test.txt. For convenience, the file identifier is usually just called the file name.


3. Open and close files

3.1 File Pointers

In buffered file systems, the key concept is “file type pointer”, or “file pointer” for short.

Each used file has a corresponding file information area in memory, which is used to store the relevant information of the file (such as file name, file status, file current location, etc.). This information is stored in a structure variable. This structure type is system-declared and named FILE.

For example, the VS2013 compilation environment provides the following file type declaration in the stdio.h header:

struct _iobuf {
	char* _ptr;
	int   _cnt;
	char* _base;
	int   _flag;
	int   _file;
	int   _charbuf;
	int   _bufsiz;
	char* _tmpfname;
};
typedef struct _iobuf FILE; //FILE is the structure type declared by the system

In other words, a structure variable of type FILE is created, and its memory holds the information about one file.

The exact contents of FILE differ from one C compiler to another, but they are broadly similar. Whenever a file is opened, the system automatically creates a variable of type FILE for it and fills in the information; users do not need to care about the details. A FILE structure is normally accessed through a FILE pointer, which is more convenient.

We can create a pointer variable to FILE:

FILE* pf; // File pointer variable

This defines pf as a pointer to data of type FILE. pf can be made to point to the file information area of a file (a structure variable), and the file can then be accessed through the information stored there. In other words, a file pointer variable lets us find the file associated with it.


3.2 File opening and closing

A file should be opened before it is read or written and closed after use. When a program opens a file, it gets back a FILE* pointer that points to the file, which establishes the link between the pointer and the file.

ANSI C specifies that the fopen function opens a file and the fclose function closes it.

// Open a file
FILE* fopen(const char* filename, const char* mode);
// Takes two arguments (the file name and the open mode) and returns a FILE pointer, or NULL on failure

// Close a file
int fclose(FILE* stream);
// Takes the pointer of the file to close; returns 0 on success, EOF on failure

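The opening modes are as follows:

  • "r" (read only): opens an existing text file for reading; fails if the file does not exist
  • "w" (write only): opens a text file for writing; creates the file if it does not exist and discards its contents if it does
  • "a" (append): opens a text file for writing at the end; creates the file if it does not exist
  • "rb", "wb", "ab": the same as above, for binary files
  • "r+": opens an existing text file for reading and writing; fails if the file does not exist
  • "w+": creates a new text file for reading and writing; discards the contents of an existing file
  • "a+": opens a text file for reading and for writing at the end; creates the file if it does not exist
  • "rb+", "wb+", "ab+": the same as above, for binary files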

Sample code:

#define _CRT_SECURE_NO_WARNINGS
/* fopen fclose example */
#include <stdio.h>
int main()
{
    FILE* pFile;
    pFile = fopen("myfile.txt", "w"); // Open for writing only
    if (pFile == NULL)
    {
        perror("fopen"); // Report why the file could not be opened
        return 1;
    }
    // File operations go here
    fclose(pFile);    // Close the file
    pFile = NULL;

    return 0;
}
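To make the difference between the "w" and "a" modes concrete, here is a minimal sketch. The helper names write_text and file_length and the file name modes_demo.txt used in the note below are made up for illustration; they are not part of the standard library.

```c
#include <stdio.h>

/* Hypothetical helper: open `path` with `mode`, write `text`, close.
   Returns 0 on success, -1 on failure. */
int write_text(const char *path, const char *mode, const char *text)
{
    FILE *pf = fopen(path, mode);
    if (pf == NULL)
    {
        perror("fopen");
        return -1;
    }
    int ok = (fputs(text, pf) != EOF);
    if (fclose(pf) == EOF)      /* fclose can fail too, so check it */
        ok = 0;
    return ok ? 0 : -1;
}

/* Hypothetical helper: count the bytes in a file by reading it to the end.
   Returns -1 if the file cannot be opened. */
long file_length(const char *path)
{
    FILE *pf = fopen(path, "rb");
    if (pf == NULL)
        return -1;
    long n = 0;
    while (fgetc(pf) != EOF)
        n++;
    fclose(pf);
    return n;
}
```

Writing "abc" with mode "w" and then "de" with mode "a" to modes_demo.txt leaves a 5-byte file. A further write of "x" with mode "w" leaves 1 byte, because "w" discards the old contents.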


4. Sequential read and write files

Programs run in memory, while files live on the hard disk. Reading data from a file into memory is called input (read); writing data from memory to a file is called output (write).

The functions used for sequential reads and writes are:

  • fgetc / fputc: character input / output
  • fgets / fputs: text line input / output
  • fscanf / fprintf: formatted input / output
  • fread / fwrite: binary input / output


4.1 fputc: character output function

To write to a file one character at a time:

The fputc function writes one character to the specified file. The first argument is the character to write and the second is the file pointer. On success it returns the character written (its ASCII value); on failure it returns EOF.

#define _CRT_SECURE_NO_WARNINGS
/* fputc example */
#include <stdio.h>
int main()
{
    FILE* pf;
    pf = fopen("E:\\test\\test2.23.txt", "w"); // Open for writing only
    // File manipulation
    if (pf == NULL)
    {
        perror("pf");
        return 1;
    }
    // Write to the file
    fputc('b', pf);
    fputc('i', pf);
    fputc('t', pf);
    
    fclose(pf);    // Close the file
    pf = NULL;

    return 0;
}

The characters 'b', 'i', and 't' are written to the file one after another, in order, which is why this is called sequential writing. If the fputc calls are removed, the file ends up empty, because opening a file in "w" mode discards its previous contents:

#define _CRT_SECURE_NO_WARNINGS
/* fputc example with the writes commented out */
#include <stdio.h>
int main()
{
    FILE* pf;
    pf = fopen("E:\\test\\test2.23.txt", "w"); // Open for writing only
    // File manipulation
    if (pf == NULL)
    {
        perror("pf");
        return 1;
    }
    // Write to the file (calls commented out)
    //fputc('b', pf);
    //fputc('i', pf);
    //fputc('t', pf);

    fclose(pf);    // Close the file
    pf = NULL;

    return 0;
}
You can see that the file size is back to 0 KB.


4.2 fgetc: character input function

To read from a file one character at a time:

The fgetc function reads one character from the specified file. If the read succeeds, it returns the character's ASCII value; if the read fails or the end of the file is reached, it returns EOF (a symbolic constant, typically -1). After each call to fgetc, the file position automatically advances by one character.

#define _CRT_SECURE_NO_WARNINGS
/* fgetc example */
#include <stdio.h>
int main()
{
    FILE* pf;
    pf = fopen("E:\\test\\test2.23.txt", "r"); // "r" opens the file for reading only
    if (pf == NULL)
    {
        perror("pf");
        return 1;
    }
    // Read from the file
    int ret = fgetc(pf);   // first read
    printf("%c\n", ret);   // print the character that was read
    ret = fgetc(pf);       // second read
    printf("%c\n", ret);
    ret = fgetc(pf);       // third read
    printf("%c\n", ret);
    // After each read, the file position moves on to the next character

    fclose(pf);    // Close the file
    pf = NULL;

    return 0;
}

fgetc returns EOF when it reaches the end of the file or when a read error occurs.
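Because fgetc signals the end of the data with EOF, a whole file can be read with a simple loop. The following is a minimal sketch; count_chars is a hypothetical helper name chosen for this example.

```c
#include <stdio.h>

/* Hypothetical helper: count the characters remaining in an open stream.
   Returns -1 if the loop stopped because of a read error rather than
   the end of the file. */
long count_chars(FILE *pf)
{
    int ch;            /* int, not char, so that EOF stays distinguishable */
    long count = 0;
    while ((ch = fgetc(pf)) != EOF)
        count++;
    return ferror(pf) ? -1 : count;
}
```

Storing the result of fgetc in an int matters: EOF is a negative int, and squeezing it into a char can make it indistinguishable from a real character or make the loop never end.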


4.3 fputs: text line output function

To write to a file line by line:

The fputs function writes a string to the specified file; the terminating '\0' is not written. After a successful write the file position advances automatically and the function returns a non-negative integer; on error it returns EOF.

Note: fputs does not add a line break, so any '\n' must be included in the string.

Writing lines without the newline character \n:

#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
int main()
{
    FILE* pf;
    pf = fopen("E:\\test\\test2.23.txt", "w"); // Open for writing only
    // Write file -- line by line
    if (pf == NULL)
    {
        perror("pf");// If pf is null, output the cause of the error
        return 1;
    }
    fputs("abcdef", pf); // No '\n' here, so the next string continues on the same line
    fputs("ghijkllmn", pf);

    fclose(pf);
    pf = NULL;

    return 0;
}
Because no '\n' was written, both strings end up on a single line in the file: abcdefghijkllmn

Writing lines with the newline character \n:

#define _CRT_SECURE_NO_WARNINGS
/* fputs example */
#include <stdio.h>
int main()
{
    FILE* pf;
    pf = fopen("E:\\test\\test2.23.txt", "w"); // Open for writing only
    // Write file -- line by line
    if (pf == NULL)
    {
        perror("pf");// If pf is null, output the cause of the error
        return 1;
    }
    fputs("abcdef\n", pf); // The '\n' ends the first line
    fputs("ghijkllmn", pf);

    fclose(pf);
    pf = NULL;

    return 0;
}
This time the file contains two lines: abcdef, followed by ghijkllmn.


4.4 fgets: text line input function

To read from a file line by line:

The fgets function reads a string from the specified file into a character array. Its second argument, num, is the size limit: at most num-1 characters are read, because one position is reserved for the terminating '\0'. Reading also stops after a newline character, which is kept in the array, or at the end of the file.

Example code is as follows:

#define _CRT_SECURE_NO_WARNINGS
/* fgets example */
#include <stdio.h>
int main()
{
    char str[50] = { 0 };
    FILE* pf;
    pf = fopen("E:\\test\\test2.23.txt", "r"); // Open for reading only
    // Read the file
    if (pf == NULL)
    {
        perror("pf");// If pf is null, output the cause of the error
        return 1;
    }
    fgets(str, 5, pf);// Only 4 characters are read
    printf("%s\n", str);

    fgets(str, 3, pf);// Only 2 characters are read
    printf("%s\n", str);

    fclose(pf);
    pf = NULL;

    return 0;
}
To see how fgets works internally, we change the code slightly while keeping the file content the same:

// What is the output?
#define _CRT_SECURE_NO_WARNINGS
/* fgets example */
#include <stdio.h>
int main()
{
    char str[50] = "xxxxxxxxxxxxxxxxxxxxxxx";
    FILE* pf;
    pf = fopen("E:\\test\\test2.23.txt", "r"); // Open for reading only
    // Read the file
    if (pf == NULL)
    {
        perror("pf");// If pf is null, output the cause of the error
        return 1;
    }
    fgets(str, 5, pf);// Only 4 characters are read
    printf("%s\n", str);

    fgets(str, 1, pf);// What is being read?
    printf("%s\n", str);

    fgets(str, 8, pf);
    printf("%s\n", str);

    fclose(pf);
    pf = NULL;

    return 0;
}
Stepping through the program in a debugger after each of the three fgets calls shows what happens:

fgets stores the characters it reads in the character array we provide and appends a '\0' after them. When num is 1, there is room only for the '\0', so nothing appears when the array is printed. When num is 0, nothing is stored in the array at all.
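In practice fgets is usually called in a loop until it returns NULL. The sketch below shows that pattern; read_lines is a hypothetical helper name, and the 64-byte line buffer is an arbitrary choice for the example (a longer line would simply be delivered in several pieces).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read the stream line by line, strip the trailing
   '\n' from each piece, and return how many pieces were read.
   The last piece read is copied into `last` (capacity `cap`, cap >= 1). */
int read_lines(FILE *pf, char *last, size_t cap)
{
    char line[64];
    int lines = 0;
    while (fgets(line, sizeof line, pf) != NULL)
    {
        line[strcspn(line, "\n")] = '\0';   /* drop the newline, if any */
        strncpy(last, line, cap - 1);
        last[cap - 1] = '\0';               /* strncpy may not terminate */
        lines++;
    }
    return lines;
}
```

For a file containing abcdef and ghijkllmn on two lines, the function returns 2 and leaves "ghijkllmn" in last.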


4.5 fprintf: formatted output function

The format argument is the format string (%s, %d, and so on), and the "..." after it stands for a variable argument list (one or more arguments; the count can vary). Compared with printf, fprintf simply takes one extra first argument, FILE* stream.

Example code is as follows:

#define _CRT_SECURE_NO_WARNINGS
/* fprintf example */
#include <stdio.h>
typedef struct Student
{
	char Name[20];   // name
	int Age;         // age
	float High;      // height
}S;

int main()
{
	S s = { "Zhang", 18, 177.5f };
	FILE* pf = fopen("E:\\test\\test2.23.txt", "w"); // Open for writing only
	// Write formatted content to the file
	if (pf == NULL)
	{
		perror("pf");
		return 1;// The function terminates abnormally
	}
	fprintf(pf, "%s %d %f", s.Name, s.Age, s.High);

	fclose(pf);
	pf = NULL;
	return 0;
}
Contents of the file: Zhang 18 177.500000


4.6 fscanf: formatted input function

As with fprintf, format is the format string (%s, %d, and so on), and the "..." after it stands for a variable argument list (one or more arguments; the count can vary). Compared with scanf, fscanf simply takes one extra first argument, FILE* stream.

Example code is as follows:

#define _CRT_SECURE_NO_WARNINGS
/* fscanf example */
#include <stdio.h>
typedef struct Student
{
	char Name[20];   // name
	int Age;         // age
	float High;      // height
}S;

int main()
{
	S s = { 0 };
	FILE* pf = fopen("E:\\test\\test2.23.txt", "r"); // Open for reading only
	// Read formatted content from the file
	if (pf == NULL)
	{
		perror("pf");
		return 1;// The function terminates abnormally
	}

	fscanf(pf, "%s%d%f", s.Name, &(s.Age), &(s.High));
	// Works like scanf, but the input comes from the file instead of the keyboard
	printf("%s %d %f", s.Name, s.Age, s.High);
	// Print what was read

	fclose(pf);
	pf = NULL;
	return 0;
}
Assuming the file still holds what the fprintf example wrote (Zhang 18 177.500000), the program prints: Zhang 18 177.500000
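fscanf returns the number of items it managed to convert, so checking that value is a simple way to notice malformed or missing data. The following is a minimal sketch; read_student is a hypothetical helper name, and the %19s width is an addition that keeps the name from overflowing Name[20].

```c
#include <stdio.h>

typedef struct Student
{
    char Name[20];   /* name */
    int Age;         /* age */
    float High;      /* height */
} S;

/* Hypothetical helper: read one record from the stream.
   Returns 1 only if all three fields were converted, 0 otherwise. */
int read_student(FILE *pf, S *out)
{
    /* %19s reads at most 19 characters, leaving room for '\0' */
    int n = fscanf(pf, "%19s %d %f", out->Name, &out->Age, &out->High);
    return n == 3;
}
```

A second call on a file that holds a single record returns 0, because fscanf hits the end of the file before converting anything.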


4.7 fwrite, fread: binary output and input functions

fwrite writes a block of data to a file. const void* buffer points to the data to write, size_t size is the size of each element in bytes, size_t count is the number of elements to write, and FILE* stream is the file pointer. The return value is the number of elements actually written.

fread reads data from the specified file. The first parameter, ptr, is the address where the data read will be stored. The second, size, is the size of each element in bytes. The third is the number of elements to read. The fourth, stream, is the pointer to the file to read from. The return value is the number of elements actually read.

Example code is as follows:

#define _CRT_SECURE_NO_WARNINGS
/* fwrite example */
#include <stdio.h>
typedef struct Student
{
	char Name[20];   // name
	int Age;         // age
	float High;      // height
}S;

int main()
{
	S s = { "Zhang", 18, 177.5f };
	FILE* pf;
	pf = fopen("E:\\test\\test2.23.txt", "wb"); // Open for writing in binary mode
	if (pf == NULL)
	{
		perror("pf");
		return 1;
	}
	fwrite(&s, sizeof(s), 1, pf);
	

	fclose(pf);
	pf = NULL;
	return 0;
}
Opened in a text editor, the file looks partly unreadable, even like gibberish, because the data was stored in binary form. But if we read it back with fread, do we get the values we wrote?

Example code is as follows:

#define _CRT_SECURE_NO_WARNINGS
/* fread example */
#include <stdio.h>
typedef struct Student
{
	char Name[20];   // name
	int Age;         // age
	float High;      // height
}S;

int main()
{
	S s = { 0 };
	FILE* pf;
	pf = fopen("E:\\test\\test2.23.txt", "rb"); // Open for reading in binary mode
	if (pf == NULL)
	{
		perror("pf");
		return 1;
	}

	fread(&s, sizeof(s), 1, pf);// Read the file
	printf("%s %d %f", s.Name, s.Age, s.High);

	fclose(pf);
	pf = NULL;
	return 0;
}
The program prints Zhang 18 177.500000, the values that were written. Data written with fwrite should therefore normally be read back with fread.
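The return values of fwrite and fread are worth checking, especially when several records are involved. The sketch below is one way to do that; save_students and load_students are hypothetical helper names, and the stream is assumed to have been opened in a binary mode such as "wb" or "rb".

```c
#include <stdio.h>

typedef struct Student
{
    char Name[20];   /* name */
    int Age;         /* age */
    float High;      /* height */
} S;

/* Hypothetical helper: write `count` records.
   Returns 1 if every record was written, 0 otherwise. */
int save_students(FILE *pf, const S *arr, size_t count)
{
    return fwrite(arr, sizeof arr[0], count, pf) == count;
}

/* Hypothetical helper: read up to `max` records.
   Returns the number of complete records actually read. */
size_t load_students(FILE *pf, S *arr, size_t max)
{
    return fread(arr, sizeof arr[0], max, pf);
}
```

If a file holds two records and load_students is asked for five, it returns 2. A return value smaller than the request is how fread reports that the data ran out.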


4.8 Comparing a group of functions

  1. scanf / fscanf / sscanf

  2. printf / fprintf / sprintf

scanf: formatted input from the standard input stream (stdin)

fscanf: formatted input from any input stream (stdin or a file)

sscanf: reads formatted data from a string

For sscanf, the str argument is the string to read from, format is the user-specified format, and the "..." arguments are pointers to the variables that receive the data. Return value: the number of items successfully assigned, or EOF (-1) if the input ends before the first conversion.

printf: formatted output to the standard output stream (stdout)

fprintf: formatted output to any output stream (stdout or a file)

sprintf: writes formatted data into a string

For sprintf, the str argument is the character array that receives the result, format is the user-specified format, and the "..." arguments are the values to format. Return value: the number of characters written (not counting the terminating '\0'), or a negative value on failure.

The sprintf sample code is as follows:

#define _CRT_SECURE_NO_WARNINGS
#include<stdio.h>
typedef struct Student
{
	char Name[15];   // name
	int Age;         // age
	float High;      // height
}S;

int main()
{
	S s = { "Zhang", 18, 177.5f };
	char buf[100] = { 0 };
	sprintf(buf, "%s %d %f", s.Name, s.Age, s.High);
	// sprintf(destination array, format, values): formats the data in s into a string stored in buf
	printf("%s\n", buf);
	// Prints the contents of the buf character array. The contents of s will be printed directly as a string

	return 0;
}
Output result: Zhang 18 177.500000

So how do we turn this string back into structured data?

Example code for sscanf is as follows:

#define _CRT_SECURE_NO_WARNINGS
#include<stdio.h>
typedef struct Student
{
	char Name[15];   // name
	int Age;         // age
	float High;      // height
}S;

int main()
{
	S s = { "Zhang", 18, 177.5f };
	char buf[100] = { 0 };
	S tmp = { 0 };

	// sprintf formats the data in s into a string stored in buf
	sprintf(buf, "%s %d %f", s.Name, s.Age, s.High);
	// Print the contents of buf; the data in s is printed as one string
	printf("%s\n", buf);

	// sscanf reads formatted data back out of the string
	sscanf(buf, "%s %d %f", tmp.Name, &(tmp.Age), &(tmp.High));
	printf("%s %d %f\n", tmp.Name, tmp.Age, tmp.High);
	// Prints the read data

	return 0;
}
Output result: both printf calls print Zhang 18 177.500000.
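sprintf does not know how large the destination array is. Assuming a C99-conforming library, snprintf takes the array size as an extra argument, and its return value shows whether the text fit. The helper names student_to_string and string_to_student below are hypothetical, and the %14s width is an addition that matches Name[15].

```c
#include <stdio.h>

typedef struct Student
{
    char Name[15];   /* name */
    int Age;         /* age */
    float High;      /* height */
} S;

/* Hypothetical helper: format a record into buf (capacity cap).
   Returns 1 if the whole text fit, 0 if it was truncated or failed. */
int student_to_string(const S *s, char *buf, size_t cap)
{
    int n = snprintf(buf, cap, "%s %d %f", s->Name, s->Age, s->High);
    return n >= 0 && (size_t)n < cap;
}

/* Hypothetical helper: parse a record back out of a string.
   Returns 1 only if all three fields were converted. */
int string_to_student(const char *buf, S *out)
{
    return sscanf(buf, "%14s %d %f", out->Name, &out->Age, &out->High) == 3;
}
```

With the record { "Zhang", 18, 177.5f } and a 100-byte buffer, student_to_string produces "Zhang 18 177.500000"; with an 8-byte buffer it returns 0 instead of writing past the end of the array.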


5. Random reads and writes

So far the file position has moved on its own: after each read it automatically advances to the next item. That is limiting, because we cannot read from wherever we want. Is there a less restrictive way? Let's look at random reads and writes.


5.1 fseek

fseek positions the file pointer using a starting position and an offset, moving it to wherever we need.

The first argument is the file pointer. The second argument, offset, is the offset in bytes: a positive value moves toward the end of the file (to the right) and a negative value moves toward the beginning (to the left). The third argument, origin, is the starting position the offset is measured from, and it has three possible values: SEEK_SET is the beginning of the file, SEEK_CUR is the current position of the file pointer, and SEEK_END is the end of the file.

Sample code:

#define _CRT_SECURE_NO_WARNINGS
#include<stdio.h>
int main()
{
	// File content: zhangsan
	FILE* pf;
	pf = fopen("E:\\test\\test2.23.txt", "r");
	if (pf == NULL)
	{
		perror("pf");
		return 1;
	}
	int ch = fgetc(pf);      // first read
	printf("%c\n", ch);      // z

	fseek(pf, 2, SEEK_CUR);
	// Move two bytes to the right of the current position (just past 'z'), landing on 'n'

	ch = fgetc(pf);          // second read
	printf("%c\n", ch);      // n; the file pointer then advances to 'g'

	ch = fgetc(pf);          // third read
	printf("%c\n", ch);      // g; the file pointer then advances to 's'

	fseek(pf, -2, SEEK_END);
	// Move two bytes to the left of the end of the file, landing on 'a'
	ch = fgetc(pf);          // fourth read
	printf("%c\n", ch);      // a; the file pointer then advances to the final 'n'

	fclose(pf);
	pf = NULL;

	return 0;
}
Contents of the file: zhangsan

Output result: z, n, g, and a, one per line.
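A common use of fseek is jumping straight to the n-th fixed-size record in a binary file. The following is a minimal sketch that assumes the file holds nothing but int values written with fwrite; read_int_at is a hypothetical helper name.

```c
#include <stdio.h>

/* Hypothetical helper: read the int stored at position `index`
   (counting from 0) in a binary file of ints.
   Returns 1 on success, 0 if the seek or the read fails. */
int read_int_at(FILE *pf, long index, int *out)
{
    /* Each record is sizeof(int) bytes, measured from the start of the file */
    if (fseek(pf, index * (long)sizeof(int), SEEK_SET) != 0)
        return 0;
    return fread(out, sizeof *out, 1, pf) == 1;
}
```

For a file holding 10, 20, 30, 40, a call with index 2 stores 30 in *out. A call with index 4 returns 0, because there is no fifth record to read.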


5.2 ftell

ftell tells us the current offset of the file pointer relative to the beginning of the file. It takes a file pointer and returns the offset.

Example code is as follows:

#define _CRT_SECURE_NO_WARNINGS
#include<stdio.h>
int main()
{
	// File content: zhangsan
	FILE* pf;
	pf = fopen("E:\\test\\test2.23.txt", "r");
	if (pf == NULL)
	{
		perror("pf");
		return 1;
	}
	int ch = fgetc(pf);      // first read
	printf("%c\n", ch);      // z

	fseek(pf, 2, SEEK_CUR);
	// Move two bytes to the right of the current position (just past 'z')

	long n = ftell(pf);      // offset of the current position from the start of the file
	printf("%ld\n", n);      // 3


	fclose(pf);
	pf = NULL;

	return 0;
}
Output result: z, then 3.
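Combining fseek and ftell gives a common idiom for finding the size of a file. This is a sketch under some assumptions: stream_size is a hypothetical helper name, the stream is opened in binary mode, and the platform supports SEEK_END on it (the C standard does not strictly guarantee this for binary streams, although mainstream systems do).

```c
#include <stdio.h>

/* Hypothetical helper: return the size of an open file in bytes,
   or -1 on failure. The original file position is restored. */
long stream_size(FILE *pf)
{
    long saved = ftell(pf);                 /* remember where we were */
    if (saved < 0 || fseek(pf, 0, SEEK_END) != 0)
        return -1L;
    long size = ftell(pf);                  /* offset of the end = size */
    if (fseek(pf, saved, SEEK_SET) != 0)    /* go back */
        return -1L;
    return size;
}
```

For a file containing zhangsan, the function returns 8 and leaves the file pointer where it was before the call.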


5.3 rewind

rewind moves the file pointer back to the beginning of the file.

Sample code:

#define _CRT_SECURE_NO_WARNINGS
#include<stdio.h>
int main()
{
	// File content: zhangsan
	FILE* pf;
	pf = fopen("E:\\test\\test2.23.txt", "r");
	if (pf == NULL)
	{
		perror("pf");
		return 1;
	}
	int ch = fgetc(pf);      // first read
	printf("%c\n", ch);      // z

	fseek(pf, 2, SEEK_CUR);
	// The file pointer is offset two bytes to the right from z (where the file pointer is currently pointing)

	rewind(pf);                // Return to the starting position
	ch = fgetc(pf);         
	printf("%c", ch);       //z

	fclose(pf);
	pf = NULL;

	return 0;
}
Output result: z, then z again.
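rewind is also handy when the same stream is used for both writing and reading, for example one opened with "w+". A minimal sketch, with first_char_after_write as a hypothetical helper name:

```c
#include <stdio.h>

/* Hypothetical helper: write `text` to a stream opened for update
   (for example with "w+"), go back to the start, and return the first
   character read, or EOF on failure. */
int first_char_after_write(FILE *pf, const char *text)
{
    if (fputs(text, pf) == EOF)
        return EOF;
    rewind(pf);     /* repositioning is required between a write and a read */
    return fgetc(pf);
}
```

Writing zhangsan and calling the function this way returns 'z', and the file pointer ends up at offset 1.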


6. Text files and binary files

Data files are called text files or binary files, depending on how the data is organized.

Data is stored in memory in binary form. If it is written to external storage without conversion, the result is a binary file.

If the data is to be stored on external storage as ASCII, it must be converted before it is written. A file stored as ASCII characters is a text file.

How is a piece of data stored in memory?

Characters are always stored as ASCII, while numeric data can be stored either as ASCII or in binary form. For example, if the integer 10000 is written to disk as ASCII, it occupies 5 bytes (one byte per character), whereas in binary form it occupies only 4 bytes (tested with VS2013).

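The size difference between the two forms can be checked with a short experiment. In the sketch below, write_as_text and write_as_binary are hypothetical helper names; each returns the number of bytes it wrote.

```c
#include <stdio.h>

/* Hypothetical helper: store `value` as decimal text.
   Returns the number of bytes written, or -1 on failure. */
long write_as_text(FILE *pf, int value)
{
    int n = fprintf(pf, "%d", value);
    return n < 0 ? -1L : (long)n;
}

/* Hypothetical helper: store `value` in its in-memory binary form.
   Returns the number of bytes written, or -1 on failure. */
long write_as_binary(FILE *pf, int value)
{
    size_t written = fwrite(&value, sizeof value, 1, pf);
    return written == 1 ? (long)sizeof value : -1L;
}
```

For 10000, write_as_text returns 5 (the characters '1', '0', '0', '0', '0') and write_as_binary returns sizeof(int), which is 4 on most current platforms. On a little-endian machine with a 4-byte int, those four bytes are 10 27 00 00 in hexadecimal.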


7. Detecting the end of a file read

7.1 The commonly misused feof

Remember: while reading a file, the return value of feof must not be used directly to decide whether the end of the file has been reached. feof is meant to be used after a read has already stopped, to determine whether it stopped because of a read failure or because the end of the file was encountered.

fgetc returns EOF when reading ends; on a normal read it returns the ASCII value of the character

fgets returns NULL when reading ends

fread returns the number of complete elements actually read; if that number is smaller than the number requested, the read was the last one

For example, copy file contents:

// Copy the contents of test2.23.txt to create test2.24.txt
#define _CRT_SECURE_NO_WARNINGS
#include<stdio.h>
int main()
{
	FILE* pfread = fopen("E:\\test\\test2.23.txt", "r");
	if (pfread == NULL)
	{
		return 1;
	}
	FILE* pfwrite = fopen("E:\\test\\test2.24.txt", "w");
	if (pfwrite == NULL)
	{
		fclose(pfread);
		pfread = NULL;
		return 1;
	}
	// File opened successfully
	// Read and write files
	int ch = 0;
	while ((ch = fgetc(pfread)) != EOF) { // write each character to the file
		fputc(ch, pfwrite);
	}
	// Close the file
	fclose(pfread);
	pfread = NULL;
	fclose(pfwrite);
	pfwrite = NULL;

	return 0;
}

After the program runs, test2.24.txt contains a copy of test2.23.txt.
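Once a read loop has stopped, feof and ferror tell us why it stopped. The sketch below applies this to the copy loop; copy_stream is a hypothetical helper name.

```c
#include <stdio.h>

/* Hypothetical helper: copy everything from `in` to `out`.
   Returns 0 if the copy ended at the end of the input,
   -1 if it ended because of a read or write error. */
int copy_stream(FILE *in, FILE *out)
{
    int ch;
    while ((ch = fgetc(in)) != EOF)
    {
        if (fputc(ch, out) == EOF)
            return -1;              /* write error */
    }
    /* The loop has ended: now ask why */
    if (ferror(in))
        return -1;                  /* read error */
    return feof(in) ? 0 : -1;       /* normal end of file */
}
```

Note that feof is only consulted after fgetc has already returned EOF, which is the usage the section above describes.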


8. File buffer

The ANSI C standard processes data files through a buffered file system: the system automatically sets up a file buffer in memory for every file the program is using. Data written from memory to disk goes into the buffer first and is sent to disk together once the buffer is full. Data read from disk is first read into the buffer (filling it) and then handed from the buffer to the program's data area (program variables and so on) piece by piece. The size of the buffer is determined by the C compilation system.

An analogy: if students keep interrupting a teacher who is preparing a lesson, one question at a time, the teacher's efficiency drops sharply. If the teacher instead asks each student to save up 10 questions before coming over, the interruptions become far less frequent and efficiency improves. A buffer works for the same reason.

For example:

#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <windows.h>
//VS2022 WIN11 environment test
int main()
{
	FILE* pf = fopen("E:\\test\\test2.23.txt", "w");
	if (pf == NULL)
	{
		perror("pf");
		return 1;
	}
	fputs("abcdef", pf);
	// The data goes into the output buffer first
	printf("Sleeping for 10 seconds: the data has been written, but the file is still empty if you open it now\n");
	Sleep(10000);

	printf("Flushing the buffer\n");
	fflush(pf); // Flushing the buffer writes its contents to the file (disk)

	printf("Sleeping for another 10 seconds: the file now has content if you open it again\n");
	Sleep(10000);
	fclose(pf);
	// Note: fclose also flushes the buffer when closing the file
	pf = NULL;
	return 0;
}

One conclusion can be drawn here:

Because of the buffer, a C program that operates on files needs to flush the buffer or close the file when it finishes its file operations. Otherwise, reads and writes may not behave as expected.
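The effect of fflush can also be observed from inside a program by opening a second handle on the same file. This is a sketch under assumptions: visible_after_flush is a hypothetical helper name, and the call to setvbuf (a standard function that selects the buffering mode) is added here only to make full buffering explicit.

```c
#include <stdio.h>

/* Hypothetical helper: write `text` through a fully buffered stream,
   flush it, then count how many bytes a second handle can read.
   Returns -1 if the file cannot be opened. */
long visible_after_flush(const char *path, const char *text)
{
    FILE *out = fopen(path, "w");
    if (out == NULL)
        return -1;
    setvbuf(out, NULL, _IOFBF, BUFSIZ);   /* full buffering, default size */
    fputs(text, out);                     /* lands in the buffer */
    fflush(out);                          /* buffer contents go to the file */

    long n = -1;
    FILE *in = fopen(path, "r");          /* independent second handle */
    if (in != NULL)
    {
        n = 0;
        while (fgetc(in) != EOF)
            n++;
        fclose(in);
    }
    fclose(out);
    return n;
}
```

With the text abcdef the function returns 6. Without the fflush call, the second handle would typically see an empty file, since the six characters would still be sitting in the buffer.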


That’s the end of this post. If you think it’s helpful, please like it. See you next time!