Count the number of characters, words, and lines in a file, including:
- The number of characters and words on each line
- The total number of characters, words, and lines in the file
Note:
- Whitespace characters (spaces and tab indents) are not counted as characters;
- Words are separated by spaces or tabs;
- A word never spans two lines;
- Each line contains at most 1000 characters.
Let's look at the code first:
#include <stdio.h>
#include <string.h>

int *getCharNum(char *filename, int *totalNum);

int main(){
    char filename[30];
    // totalNum[0]: total rows  totalNum[1]: total characters  totalNum[2]: total words
    int totalNum[3] = {0, 0, 0};

    printf("Input file name: ");
    scanf("%29s", filename);  // Read at most 29 characters so filename cannot overflow

    if(getCharNum(filename, totalNum)){
        printf("Total: %d lines, %d words, %d chars\n", totalNum[0], totalNum[2], totalNum[1]);
    }else{
        printf("Error!\n");
    }

    return 0;
}

/**
 * Count the number of characters, words and lines of the file
 *
 * @param  filename  file name
 * @param  totalNum  file statistics
 *
 * @return totalNum on success, otherwise NULL
**/
int *getCharNum(char *filename, int *totalNum){
    FILE *fp;  // Pointer to the file
    char buffer[1003];  // Buffer that stores the contents of each line read
    int bufferLen;  // Length of what is actually stored in the buffer
    int i;  // Index of the current character in the buffer
    char c;  // Character read
    int isLastBlank = 1;  // Was the last character blank? Starts at 1 so leading blanks add no word
    int charNum = 0;  // Number of characters in the current line
    int wordNum = 0;  // Number of words in the current line

    if( (fp=fopen(filename, "rb")) == NULL ){
        perror(filename);
        return NULL;
    }

    printf("line   words  chars\n");
    // Read one line of data at a time and save it to the buffer.
    // Each line can have at most 1000 characters
    while(fgets(buffer, 1003, fp) != NULL){
        bufferLen = strlen(buffer);
        // Traverse the contents of the buffer
        for(i=0; i<bufferLen; i++){
            c = buffer[i];
            if( c==' ' || c=='\t'){  // Space or tab encountered
                !isLastBlank && wordNum++;  // If the last character is not blank, add 1 to the number of words
                isLastBlank = 1;
            }else if(c!='\n'&&c!='\r'){  // Ignore line breaks
                charNum++;  // Neither a line break nor a blank: add 1 to the number of characters
                isLastBlank = 0;
            }
        }

        !isLastBlank && wordNum++;  // If the last character is not blank, add 1 to the number of words
        isLastBlank = 1;  // Reset to 1 at every new line

        // At the end of a line, update the total rows, characters and words
        totalNum[0]++;  // Total number of rows
        totalNum[1] += charNum;  // Total characters
        totalNum[2] += wordNum;  // Total number of words

        printf("%-7d%-7d%d\n", totalNum[0], wordNum, charNum);

        // Reset to zero and count the next line
        charNum = 0;
        wordNum = 0;
    }

    fclose(fp);  // Release the file once counting is finished
    return totalNum;
}
Create a file named demo.txt on drive D: and enter the following:
I am Chinese. I love my country.
China has 960 square kilometers of territory.
China has a population of 1.35 billion.
The capital of China is Beijing.

By gunge

2021-08-12
Run the program, and the output result is:
Input file name: d://demo.txt
line   words  chars
1      7      26
2      7      39
3      7      33
4      6      27
5      0      0
6      2      7
7      0      0
8      1      10
Total: 8 lines, 30 words, 142 chars
The above program reads one line from the file at a time, puts it in the buffer, and then traverses the buffer to count the number of characters and words in the current line.
The fgets() function is used to read a line or a specified number of characters from the file. Its prototype is:
char * fgets(char *buffer, int size, FILE * stream);
Parameter Description:
- buffer is the buffer used to store the data that is read.
- size is the size of the buffer in characters. fgets() reads at most size-1 characters and then appends '\0'. If the line, including its line break, is longer than size-1 characters, reading stops after size-1 characters; otherwise the whole line is read, line break included.
- stream is a file pointer.
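The effect of size can be checked with a small experiment. The following is only a sketch: firstReadLength() is a hypothetical helper (not part of the program above) that writes a string to a temporary file, calls fgets() once, and reports how many characters were stored.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes text to a temporary file, calls fgets() once
   with the given size, and returns the length of the string it stored
   (-1 on failure or if nothing could be read). */
int firstReadLength(const char *text, int size){
    char buffer[64];
    int len = -1;
    FILE *fp;

    // size must fit the local buffer and leave room for at least one character
    if(size < 2 || size > (int)sizeof(buffer)){
        return -1;
    }
    if((fp = tmpfile()) == NULL){
        return -1;
    }

    fputs(text, fp);
    rewind(fp);  // Go back to the start before reading

    if(fgets(buffer, size, fp) != NULL){
        len = (int)strlen(buffer);  // At most size-1, the line break included
    }

    fclose(fp);
    return len;
}
```

For the text "hello\n", a size of 4 stores only 3 characters ("hel"), while a size of 16 stores all 6 characters, including the '\n'.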
Some readers ask why we don't use getc() to read one character at a time from the file, without allocating a buffer.
That works, but you must handle line breaks portably, because platforms end lines in text files differently: Linux and modern macOS use '\n', Windows uses '\r\n', and classic Mac OS used '\r'. Handling all of these cases with getc() is tedious.
To keep things simple, read a whole line with fgets(), and then process each character, ignoring '\n' and '\r'.
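To get a feeling for the extra bookkeeping that getc() needs, here is a rough sketch of a line counter built on it. countLinesGetc() is a hypothetical function, not part of the program above, and it assumes that '\n', '\r\n' and a lone '\r' each end exactly one line.

```c
#include <stdio.h>

/* Hypothetical helper: counts the lines of an open file with getc(),
   treating "\n", "\r\n" and a lone "\r" each as one line terminator. */
int countLinesGetc(FILE *fp){
    int c;
    int lines = 0;
    int lastWasCR = 0;  // Was the previous character '\r'?
    int pending = 0;    // Any characters seen since the last terminator?

    while((c = getc(fp)) != EOF){
        if(c == '\n'){
            if(!lastWasCR){
                lines++;  // A '\n' right after '\r' was already counted at the '\r'
            }
            lastWasCR = 0;
            pending = 0;
        }else if(c == '\r'){
            lines++;
            lastWasCR = 1;
            pending = 0;
        }else{
            lastWasCR = 0;
            pending = 1;
        }
    }

    if(pending){
        lines++;  // Last line without a terminator
    }
    return lines;
}
```

The file should be opened in binary mode ("rb"), as in the program above, so that the C library does not translate line breaks before getc() sees them.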
Note: each line may end with a line break of up to 2 bytes ('\r\n'), and fgets() also appends '\0', so the buffer must hold at least 1003 bytes to accommodate 1000 characters per line. With a smaller buffer, fgets() returns a long line in several pieces, and the program counts each piece as a separate row.
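If lines longer than the buffer cannot be ruled out, one possible guard is to count a row only when the piece returned by fgets() ends with '\n' or the file ends. The sketch below assumes this rule; countRows() is a hypothetical function and is not used by the program above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical variant of the row counter: a piece read by fgets() only
   completes a row when it ends with '\n' or when the file ends. */
int countRows(FILE *fp, char *buffer, int size){
    int rows = 0;
    int inRow = 0;  // Is there an unfinished row?
    size_t len;

    while(fgets(buffer, size, fp) != NULL){
        len = strlen(buffer);
        if(len > 0 && buffer[len-1] == '\n'){
            rows++;     // The line break was reached: the row is complete
            inRow = 0;
        }else{
            inRow = 1;  // Long line cut by the buffer, or last line without a line break
        }
    }

    if(inRow){
        rows++;  // Count the unfinished last row
    }
    return rows;
}
```

With a 4-byte buffer, the text "abcdefgh\nxy" is read in four pieces, but this function still reports 2 rows.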
Look at the fopen() check in getCharNum(). When the file cannot be opened, the function returns NULL instead of abruptly calling exit(). This reports the error to the main function, which can handle it properly or notify the user, improving the user experience of the software.
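The same pattern can be sketched in a reduced form. In the example below, countRowsInFile() and report() are hypothetical names chosen for illustration: the first only reports failure by returning NULL, and the second, playing the role of the caller, decides what to tell the user and which status to return.

```c
#include <stdio.h>

/* Hypothetical helper: stores the number of rows of the file in *rows.
   Returns rows on success, or NULL if the file cannot be opened. */
int *countRowsInFile(const char *filename, int *rows){
    FILE *fp = fopen(filename, "rb");
    int c;
    int last = '\n';  // Last character read; '\n' means "no unfinished row"

    if(fp == NULL){
        return NULL;  // Report the failure to the caller instead of calling exit()
    }

    *rows = 0;
    while((c = getc(fp)) != EOF){
        if(c == '\n'){
            (*rows)++;
        }
        last = c;
    }
    if(last != '\n'){
        (*rows)++;  // Last row without a line break
    }

    fclose(fp);
    return rows;
}

/* Hypothetical caller: decides how to react and returns an exit status. */
int report(const char *filename){
    int rows;

    if(countRowsInFile(filename, &rows) == NULL){
        fprintf(stderr, "Cannot open %s\n", filename);
        return 1;
    }
    printf("%s: %d rows\n", filename, rows);
    return 0;
}
```

Because the helper never terminates the process, the caller remains free to ask for another file name, fall back to a default, or stop with an error status.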