I haven't updated my C learning blog for a long time. Today's topic is file operations! 😋
1. Why do we need files?
Earlier we implemented an address book that can add and delete contacts. However, the address book lives only in memory, so it is destroyed when the program exits. Its contents do not carry over to the next run, which makes it inconvenient to use.
Files give us data persistence: if we save the data to a file on disk, the previously saved contacts are still there the next time we open the address book.
2. What is a file?
A file is data with a specific format stored on disk.
2.1 File classification
In programming, we generally talk about two kinds of files: program files and data files.
- Program files: source files (.c), object files (.obj on Windows, .o on Linux), and executables (.exe on Windows)
- Data files: the data a program reads and writes while running, such as a file it reads input from or a file it writes output to
This post is about data files.
2.2 File name
A file name consists of three parts: file path + base name + extension
For example: C:\code\test.txt
For convenience, this full file identifier is usually just called the file name.
3. Using files
3.1 File pointer
A key concept in file operations is the file type pointer, called the file pointer for short.
Every file in use has a file information area in memory that stores the file's name, status, current position, and other related information. This information is kept in a structure whose type is declared by the system as FILE.
The contents of the FILE type differ between C compilers, but they are broadly similar.
When a file is opened, the system automatically creates a FILE structure variable and fills in its information.
To work with the file, we access that structure variable through a pointer of type FILE*.
3.2 Opening and closing files
A file must be opened before it is read or written, and closed after use.
This is similar to dynamic memory management, where every malloc is paired with a free.
ANSI C specifies that the fopen function opens a file and the fclose function closes it.
fopen returns a FILE* pointer that points to the opened file's information.
After the file is closed, the file pointer is dangling, so set it to NULL to prevent accidental use.
If fopen fails to open the file, it returns a null pointer.
```c
#include <stdio.h>
#include <errno.h>
#include <string.h>

int main()
{
    // Open the file for reading
    FILE* pf = fopen("test.txt", "r");
    if (pf == NULL)
    {
        printf("%s\n", strerror(errno)); // strerror turns the error code into a message
        return 1;
    }
    // 1. Read the file

    // Close the file
    fclose(pf);
    pf = NULL;
    return 0;
}
```

```c
#include <stdio.h>
#include <errno.h>
#include <string.h>

int main()
{
    // Open the file for writing
    FILE* pf = fopen("test.txt", "w");
    if (pf == NULL)
    {
        printf("%s\n", strerror(errno)); // strerror turns the error code into a message
        return 1;
    }
    // 2. Write the file

    // Close the file
    fclose(pf);
    pf = NULL;
    return 0;
}
```
The strerror function is explained in a separate post on this blog.
3.2.1 File open modes
The mode table shows the different ways a file can be opened.
- Note: the mode is a string, so it is written in double quotes, not single quotes!
Opening with "w" discards the existing content of the file. To add to the end of the existing content, use "a" instead.
3.2.2 Standard input / output streams
- Output: memory → file
- Input: file → memory
A C program opens three streams by default when it starts running
- stdin: standard input stream
- stdout: standard output stream
- stderr: standard error stream
Until now, when doing output we used printf to print data from memory directly to the screen.
Now we can also write data to the standard output stream through a file pointer, which has the same effect as printf.
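3.3 File input / output functions
The fputc function writes a single character to a stream, which can be a file or a standard stream such as stdout.
This section uses the following file functions: fgetc/fputc for characters, fgets/fputs for text lines, fscanf/fprintf for formatted data, and fread/fwrite for binary data.
3.3.1 Character input and output
fputc function: writes a single character to a file
fgetc function: reads a single character from a file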
As you can see, we have printed out all the characters just written in the file
Implementing file copy
Copy the contents of one file to another, one character at a time
```c
#include <stdio.h>
#include <errno.h>
#include <string.h>

int main()
{
    // Copy data.txt to a new file, data2.txt
    FILE* pr = fopen("data.txt", "r");
    if (pr == NULL)
    {
        printf("open for reading: %s\n", strerror(errno));
        return 1;
    }
    FILE* pw = fopen("data2.txt", "w");
    if (pw == NULL)
    {
        printf("open for writing: %s\n", strerror(errno));
        fclose(pr);
        pr = NULL;
        return 1;
    }
    // Copy the file character by character until fgetc reports EOF
    int ch = 0;
    while ((ch = fgetc(pr)) != EOF)
    {
        fputc(ch, pw);
    }
    fclose(pr);
    pr = NULL;
    fclose(pw);
    pw = NULL;
    return 0;
}
```
3.3.2 Text line input and output
fputs function: writes a string to a file
```c
// Write lines to a file
#include <stdio.h>
#include <errno.h>
#include <string.h>

int main()
{
    FILE* pf = fopen("data.txt", "w");
    if (pf == NULL)
    {
        printf("%s\n", strerror(errno));
        return 1;
    }
    fputs("hello world\n", pf);
    fputs("hehe\n", pf);

    fclose(pf);
    pf = NULL;
    return 0;
}
```
Run the code and you can see that the two lines have been written to the data.txt file in the project directory.
fgets function: reads a line from a file, up to a specified length
Its second parameter limits the length: fgets reads at most that many characters minus one, and stops early at a newline.
```c
// Read from a file with fgets
#include <stdio.h>
#include <errno.h>
#include <string.h>

int main()
{
    FILE* pf = fopen("data.txt", "r");
    if (pf == NULL)
    {
        printf("%s\n", strerror(errno));
        return 1;
    }
    char buf[1000] = {0};
    // Each call is limited to 3, including the terminating '\0'
    fgets(buf, 3, pf);
    printf("%s\n", buf);

    fgets(buf, 3, pf);
    printf("%s\n", buf);

    fclose(pf);
    pf = NULL;
    return 0;
}
```
Running the program, you can see that although we passed 3, each call read only 2 characters.
To see why, initialize buf[2] to 1 and step through the program in the debugger.
After the first fgets call, the 1 in buf[2] has been overwritten with \0.
This shows that fgets always appends a terminating \0 after the characters it reads, and that \0 counts toward the limit.
So to read 3 characters, the limit must be set to 4.
3.3.3 Formatted input and output
"Formatted" here refers to data with a specific layout, such as the members of a structure.
fprintf function: writes formatted data to a file
```c
#include <stdio.h>
#include <errno.h>
#include <string.h>

struct Stu
{
    char name[20];
    int age;
    double d;
};

int main()
{
    // No space in the name, so that %s can read it back later
    struct Stu s = { "zhangsan", 20, 95.5 };
    FILE* pf = fopen("data.txt", "w");
    if (pf == NULL)
    {
        printf("%s\n", strerror(errno));
        return 1;
    }
    // Write formatted data
    fprintf(pf, "%s %d %lf", s.name, s.age, s.d);

    fclose(pf);
    pf = NULL;
    return 0;
}
```
fscanf function: reads formatted data from a file and stores it in the corresponding variables, here the members of the structure variable s
3.3.4 Binary input and output
- fread and fwrite can operate on any type of data
- As the name suggests, binary output with fwrite writes the bytes of the data to the file exactly as they are in memory
When using these functions, open the file with "rb" or "wb".
```c
fwrite(s, sizeof(struct Stu), 2, pf);
// s                  : source of the data
// sizeof(struct Stu) : size of each element to write
// 2                  : number of elements to write
// pf                 : file pointer of the target file
```
The following is an example of writing structure variables
```c
#include <stdio.h>
#include <errno.h>
#include <string.h>

struct Stu
{
    char name[20];
    int age;
    double d;
};

// Binary write
int main()
{
    struct Stu s[2] = { {"Zhang San", 20, 95.5}, {"lisi", 16, 66.5} };
    FILE* pf = fopen("data.txt", "wb");
    if (pf == NULL)
    {
        printf("%s\n", strerror(errno));
        return 1;
    }
    // Write both structures to the file in binary form
    fwrite(s, sizeof(struct Stu), 2, pf);

    fclose(pf);
    pf = NULL;
    return 0;
}
```
You can see that part of the written data looks garbled. The content is stored in binary, so a text editor cannot display it correctly.
Binary reading reverses the process: it reads the binary data in the file back in the same layout and stores it in the corresponding variable.
```c
fread(s, sizeof(struct Stu), 2, pf);
// s                  : variable that receives the file contents
// sizeof(struct Stu) : size of each element to read
// 2                  : number of elements to read
// pf                 : file pointer of the file being read
```
3.3.5 sscanf/sprintf functions
These two functions are special in that they work on strings instead of files: sprintf converts formatted data (such as the members of a structure) into a string stored in a character array, and sscanf reads formatted data back out of a string.
See the figure below
3.4 Other file functions
3.4.1 fseek
http://cplusplus.com/reference/cstdio/fseek/?kw=fseek
This function moves the file pointer to a given offset relative to a reference position.
That sounds like a mouthful, so here is an example.
Suppose the file contains the string "abcdef".
Each call to fgetc moves the file pointer forward by one byte. After two calls, the file pointer points to the character c.
If we then want it to point to f, we can move the pointer in any of three ways:
- 5 bytes forward from the start of the file (SEEK_SET)
- 3 bytes forward from the current position (SEEK_CUR)
- 1 byte backward from the end of the file (SEEK_END)
With this function we can put the file pointer wherever we need it, for example to replace a character.
```c
#include <stdio.h>
#include <errno.h>
#include <string.h>

int main()
{
    FILE* pf = fopen("test.txt", "w");
    if (pf == NULL)
    {
        printf("%s\n", strerror(errno));
        return 1;
    }
    // Write the letters a to z
    int ch = 0;
    for (ch = 'a'; ch <= 'z'; ch++)
    {
        fputc(ch, pf);
    }

    // Move the file pointer 2 bytes back from the end, onto 'y'
    fseek(pf, -2, SEEK_END);
    fputc('#', pf); // Replace the character at that position ('y') with #

    fclose(pf);
    pf = NULL;
    return 0;
}
```
3.4.2 ftell
Returns the current offset of the file pointer (relative to the beginning of the file)
3.4.3 rewind
http://cplusplus.com/reference/cstdio/rewind/?kw=rewind
Returns the file pointer to the beginning of the file
```c
fseek(pf, 0, SEEK_SET);
// rewind(pf) moves the file pointer to the same place as this fseek call,
// but rewind is more convenient to write
```
```c
#include <stdio.h>
#include <errno.h>
#include <string.h>

int main()
{
    FILE* pf = fopen("test.txt", "r");
    if (pf == NULL)
    {
        printf("%s\n", strerror(errno));
        return 1;
    }
    // Read two characters
    int ch = fgetc(pf);
    printf("%c\n", ch); // a
    ch = fgetc(pf);
    printf("%c\n", ch); // b

    long ret = ftell(pf); // ftell returns a long
    printf("%ld\n", ret); // 2

    rewind(pf);
    // fseek(pf, 0, SEEK_SET);
    ret = ftell(pf);
    printf("%ld\n", ret); // 0

    fclose(pf);
    pf = NULL;
    return 0;
}
```
4. Text files and binary files
We now know that fread/fwrite perform binary input and output. How does that differ from text?
Depending on how the data is organized, a data file is either a text file or a binary file. Data is stored in memory in binary form. If it is written to external storage without conversion, the result is a binary file.
If the data must be stored on external storage as ASCII, it has to be converted before being written. A file stored as ASCII characters is a text file.
Characters are always stored in ASCII form, while numeric data can be stored in either ASCII form or binary form.
The number 10000 can be stored in either of two ways:
- As ASCII text: the 5 characters 1 0 0 0 0, taking 5 bytes
- In the binary form of the number itself: 4 bytes
In this case, binary form saves space.
Use the following code to write 10000 to the file in binary mode
In VS, we can open test.txt with the binary editor.
You can see that 10000 is stored in the file in binary form.
The order of the bytes depends on endianness (big-endian versus little-endian), which is covered in a separate post.
5. Determining the end of file reading
5.1 The often misused feof
While reading a file, do not use the return value of feof to decide whether the end of the file has been reached.
Instead, use feof after reading has stopped, to determine whether it stopped because the end of the file was reached or because a read failed.
- For a text file, check the return value of the read function to see whether reading has finished
- fgetc returns EOF
- fgets returns NULL
ferror function: checks whether a read or write error has occurred on the file. If so, it returns a nonzero (true) value
http://cplusplus.com/reference/cstdio/ferror/?kw=ferror
- For a binary file, check whether the return value of fread is less than the number of elements requested
- fread returns the number of elements successfully read
- If it is less than the number requested, reading stopped at the end of the file or on an error
6. File buffer
The ANSI C standard uses a "buffered file system" to process data files.
In a buffered file system, the system automatically sets up a "file buffer" in memory for each file the program is using. Data written from memory to disk goes into the in-memory buffer first, and is sent to disk in one go once the buffer is full.
This is a little like git: files to be pushed are first placed in a staging area, and only pushed to the remote repository once you have confirmed they are correct.
Reading works the same way in reverse: data is read from the disk file into the memory buffer until the buffer is full, and then handed from the buffer to the program's data area (program variables and so on) piece by piece. The size of the buffer is determined by the C compilation system.
Because of the buffer, a C program needs to flush the buffer or close the file when it finishes working with a file. Otherwise, reading and writing the file may go wrong.
Code example 1
```c
#include <stdio.h>
#include <windows.h>

int main()
{
    FILE* pf = fopen("test.txt", "w");
    if (pf == NULL)
    {
        return 1;
    }
    fputs("abcdef", pf); // The data goes into the output buffer first
    printf("Sleeping for 10 seconds: the data has been written, but if you open test.txt now, the file is empty\n");
    Sleep(10000);
    printf("Flushing the buffer\n");
    fflush(pf); // Flushing writes the data in the output buffer to the file (disk)
    printf("Sleeping for another 10 seconds: open test.txt again and the file now has content\n");
    Sleep(10000);
    fclose(pf); // Note: fclose also flushes the buffer when it closes the file
    pf = NULL;
    return 0;
}
```
Run the program, which pauses through the Sleep function. During the first pause, the string has not yet been saved to the file.
It is written to the output buffer first, and only reaches test.txt after the buffer is flushed.
Code example 2
```c
#include <stdio.h>
#include <windows.h>

int main()
{
    while (1)
    {
        printf("hehe\n");
        // On Linux, without the '\n' nothing is printed (the buffer is not flushed)
        // In VS, it prints normally with or without the '\n'
        Sleep(1000);
        // On Linux the sleep function takes seconds (in VS, Sleep takes milliseconds)
        // On Linux the function is lowercase sleep; in VS it is Sleep
    }
    return 0;
}
```
Test this code in a Linux environment (Raspberry Pi).
As you can see, once the \n is removed, the program no longer prints hehe.
Compiling produces the warning shown below 👇, but the program still compiles:
implicit declaration of function 'sleep'
A search on CSDN showed that the header file is missing: #include <unistd.h>
After recompiling, the warning is gone (here the \n after hehe has been added back, so the program prints normally).
Epilogue
The file chapter covers a lot of ground. Have you learned it all? 😁
Most of it still takes plenty of hands-on practice before its real use sinks in.
If anything here is wrong, please correct me ruthlessly!