System Call IO

Posted by jeanlee411 on Thu, 03 Mar 2022 18:30:33 +0100

Standard IO relies on system IO for implementation.
The concept of a file descriptor
File IO operations: open, close, read, write, lseek
Difference between File IO and Standard IO
Efficiency issues with IO
File sharing
Atomic Operation
Redirection implementation in program: dup, dup2
Sync: sync, fsync, fdatasync
fcntl();
ioctl();
/dev/fd/ directory

1. The concept of file descriptors

A file descriptor is an integer: the index of a pointer in an array. open() returns the smallest descriptor currently available, and a descriptor is only meaningful inside the process that owns it.
Each process has its own file descriptor array;

Interpretation: Opening a file (inode) with the open() system call creates a FILE-like structure internally, holding information such as the file position pointer and an open (reference) counter. The structure is hidden from the user; a pointer to it is stored in an array, and the index of the array slot holding that pointer is what the user receives as the file descriptor. By default, descriptors 0, 1, and 2 correspond to stdin, stdout, and stderr. The array size can be queried with ulimit -a; the default is 1024.
Special cases:
1. Two processes open the same file, resulting in two structures;
2. The same file is opened twice in one process: open() runs twice, producing two structures and two file descriptors that refer to the same file. Without coordination, operations on the file contents will race; closing one descriptor does not affect the other;
3. Two file descriptors point to the same structure (see dup): closing one descriptor does not release the structure, because the structure has a counter recording how many pointers reference it. The structure is freed only when the counter reaches 0.
**Lookup path:** file descriptor -> array index -> pointer -> structure -> file information
From a programming perspective, buf and cache: buf is understood as a write buffer and cache as a read buffer.
**The difference between blocking and non-blocking IO:** the same set of functions is used; the behavior depends on whether the O_NONBLOCK flag is passed.
Take a printer as an example. With blocking IO, it prints when a file arrives and otherwise waits. With non-blocking IO, it prints when a file arrives and otherwise goes off to do other work;
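Special cases 2 and 3 can be observed through the file offset, which lives in the structure rather than in the descriptor. The following is a minimal sketch, not code from the original notes: `shares_offset` and `fd_sharing_demo` are hypothetical names, and the temporary file path is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: move descriptor a to offset probe, then report
 * whether descriptor b sees the same offset (1), not (0), or error (-1). */
static int shares_offset(int a, int b, off_t probe)
{
    if (lseek(a, probe, SEEK_SET) < 0)
        return -1;
    return lseek(b, 0, SEEK_CUR) == probe;
}

/* Hypothetical demo: reports whether a second open() and a dup() share
 * the file offset with the first descriptor. Returns 0 on success. */
int fd_sharing_demo(int *open_twice_shared, int *dup_shared)
{
    char path[] = "/tmp/fd_demo_XXXXXX";    /* assumed scratch location */
    int fd1 = mkstemp(path);                /* first structure */
    if (fd1 < 0)
        return -1;

    int fd2 = open(path, O_RDONLY);         /* second open(): second structure */
    int fd3 = dup(fd1);                     /* dup(): same structure as fd1 */
    int rc = -1;

    if (fd2 >= 0 && fd3 >= 0) {
        *open_twice_shared = shares_offset(fd1, fd2, 5);
        *dup_shared = shares_offset(fd1, fd3, 7);
        rc = 0;
    }

    if (fd3 >= 0)
        close(fd3);     /* the shared structure survives: fd1 still uses it */
    if (fd2 >= 0)
        close(fd2);
    close(fd1);
    unlink(path);
    return rc;
}
```

Called as `fd_sharing_demo(&twice, &duped)`, it should leave `twice` at 0 (two structures, independent offsets) and `duped` at 1 (one structure, one offset).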
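As a rough illustration of the non-blocking case, the sketch below reads from an empty pipe. The helper name `nonblocking_read_errno` is hypothetical, and setting the flag afterwards with fcntl() (instead of passing O_NONBLOCK to open()) is a choice made here because a pipe is not created with open().

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: read from an empty pipe in non-blocking mode and
 * return the errno that was reported (0 if nothing failed). */
int nonblocking_read_errno(void)
{
    int pfd[2];
    char c;
    int err = 0;

    if (pipe(pfd) < 0)
        return errno;

    /* Same read() call as in the blocking case; only the flag differs. */
    int flags = fcntl(pfd[0], F_GETFL);
    if (flags < 0 || fcntl(pfd[0], F_SETFL, flags | O_NONBLOCK) < 0) {
        err = errno;
    } else if (read(pfd[0], &c, 1) < 0) {
        err = errno;    /* nothing to read yet: fails at once instead of waiting */
    }

    close(pfd[0]);
    close(pfd[1]);
    return err;
}
```

On an empty pipe the expected result is EAGAIN (or EWOULDBLOCK, which has the same value on Linux); without O_NONBLOCK the same read() would simply wait for data.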

2. File IO operations: open, close, read, write, lseek

r -> O_RDONLY
r+ -> O_RDWR
w -> O_WRONLY|O_CREAT|O_TRUNC (write-only; create the file if it doesn't exist, truncate it if it does)
w+ -> O_RDWR|O_TRUNC|O_CREAT (read and write; truncate or create)
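The mapping above can be written down as a small lookup function. This is only a sketch: `mode_to_flags` is a hypothetical name, and it covers just the four modes listed here.

```c
#include <fcntl.h>
#include <string.h>

/* Hypothetical helper: translate the four fopen() modes listed above into
 * open() flags. Returns -1 for any other mode string. */
int mode_to_flags(const char *mode)
{
    if (strcmp(mode, "r") == 0)
        return O_RDONLY;
    if (strcmp(mode, "r+") == 0)
        return O_RDWR;
    if (strcmp(mode, "w") == 0)
        return O_WRONLY | O_CREAT | O_TRUNC;
    if (strcmp(mode, "w+") == 0)
        return O_RDWR | O_CREAT | O_TRUNC;
    return -1;
}
```

For example, `open(path, mode_to_flags("w"), 0600)` would behave roughly like `fopen(path, "w")` without the stdio buffer on top.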

open function:

The open function has two forms (with and without the mode argument) and is implemented with variable arguments.
How can you tell whether a function is overloaded or variadic?
Pass a few extra arguments and look at the diagnostics: a variadic function generally produces at most a warning, while a call that matches no overload is an error;
gcc -Wall
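open() is declared as `int open(const char *pathname, int flags, ...)`, and the mode is only looked at when a file may be created. The sketch below imitates that shape; `optional_mode` is a hypothetical function written for illustration, not how any particular C library implements open().

```c
#include <fcntl.h>
#include <stdarg.h>
#include <sys/types.h>

/* Hypothetical function with the same shape as open(): the third argument
 * is read only when O_CREAT is set; otherwise 0 is returned. */
mode_t optional_mode(const char *path, int flags, ...)
{
    mode_t mode = 0;
    (void)path;     /* unused, kept only to mirror open()'s parameter list */

    if (flags & O_CREAT) {
        va_list ap;
        va_start(ap, flags);
        /* variadic arguments undergo default promotions, so fetch an int */
        mode = (mode_t)va_arg(ap, int);
        va_end(ap);
    }
    return mode;
}
```

Both `optional_mode("f", O_RDONLY)` and `optional_mode("f", O_WRONLY|O_CREAT, 0600)` compile, which is the behavior the "two forms" of open() rely on.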

IO/sys/mycpy.c
    while(1)
    {
        ret = read(sfd, buf, BUFFSIZE);
        printf("%d\n", ret);    //debug: number of bytes read in this round
        if(ret < 0)
        {
            perror("read src_file");
            break;
        }
        if(ret == 0)            //end of file
        {
            break;
        }
        //A single write(dfd, buf, ret) may write fewer bytes than requested,
        //for example when a signal interrupts the blocked system call.
        //Keep writing until everything read in this round has been written.
        pos = 0;
        while(ret > 0)
        {
            wet = write(dfd, buf+pos, ret);
            if(wet < 0)
            {
                perror("write des_file");
                exit(1);
            }
            pos += wet;
            ret -= wet;
        }
    }
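The loop above can also be packaged as two reusable functions. This is a sketch under assumptions: `write_all` and `copy_fd` are hypothetical names, the BUFFSIZE value is assumed because the original does not show it, and retrying on EINTR is an addition that the original loop does not have.

```c
#include <errno.h>
#include <unistd.h>

#define BUFFSIZE 1024   /* assumed value; the original does not show it */

/* Hypothetical helper: write exactly len bytes, retrying short writes and
 * writes interrupted by a signal. Returns 0 on success, -1 on error. */
int write_all(int fd, const char *buf, size_t len)
{
    size_t pos = 0;

    while (pos < len) {
        ssize_t n = write(fd, buf + pos, len - pos);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: try again */
            return -1;
        }
        pos += (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: copy everything from sfd to dfd.
 * Returns the number of bytes copied, or -1 on error. */
long copy_fd(int sfd, int dfd)
{
    char buf[BUFFSIZE];
    long total = 0;

    for (;;) {
        ssize_t n = read(sfd, buf, sizeof buf);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return total;       /* end of file */
        if (write_all(dfd, buf, (size_t)n) < 0)
            return -1;
        total += n;
    }
}
```

A caller would open the source and destination as in mycpy.c and then call `copy_fd(sfd, dfd)`.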

3. Difference between File IO and Standard IO

System call IO: every call switches from user mode to kernel mode once, so it takes effect immediately.
Standard IO: the real operation happens only when the buffer is full, a newline is written, or fflush is called, so several calls are merged into one system call.
Difference: Response speed: system call IO > standard IO
Throughput: system call IO < standard IO
Interview: How can I make a program faster?
Split the question in two: for fast response, use system call IO; for high throughput, use standard IO;
From the user's point of view, faster generally means greater throughput;
Reminder: system call IO and standard IO must not be mixed on the same file;
Reason: changes made to a file through standard IO do not change the file contents right away; they are written to a buffer, and the file only really changes when the buffer is flushed. Therefore, standard IO operations on a file do not correspond one-to-one with system call IO operations.
See IO/sys/ab.c.

IO/sys/ab.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main()
{
    putchar('a');
    write(1, "b", 1);

    putchar('a');
    write(1, "b", 1);

    putchar('a');
    write(1, "b", 1);

    exit(0);
}

The output is bbbaaa
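The same ordering can be reproduced on a regular file, where the effect of flushing is easy to check. The sketch below is an illustration, not part of ab.c: `mixed_io` is a hypothetical helper and the temporary path is an assumption. It uses fdopen() and fileno() to get a stream and a descriptor that refer to the same open file.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: alternate putc('a') and write("b") three times on
 * the same temporary file, optionally calling fflush() after each putc().
 * The resulting file contents are copied into out (NUL-terminated, so
 * outsz must be at least 1). Returns 0 on success, -1 on error. */
int mixed_io(int flush_each, char *out, size_t outsz)
{
    char path[] = "/tmp/ab_demo_XXXXXX";    /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    FILE *fp = fdopen(fd, "w");             /* stdio stream on top of fd */
    if (fp == NULL) {
        close(fd);
        unlink(path);
        return -1;
    }

    for (int i = 0; i < 3; i++) {
        putc('a', fp);                      /* lands in the stdio buffer */
        if (flush_each)
            fflush(fp);                     /* push it to the file right now */
        if (write(fileno(fp), "b", 1) != 1) /* goes straight to the file */
            break;
    }
    fclose(fp);                             /* flushes what is left, closes fd */

    int rc = -1;
    FILE *in = fopen(path, "r");
    if (in != NULL) {
        size_t n = fread(out, 1, outsz - 1, in);
        out[n] = '\0';
        fclose(in);
        rc = 0;
    }
    unlink(path);
    return rc;
}
```

With `flush_each` set to 0 the file should end up as `bbbaaa`, matching ab.c; with `flush_each` set to 1 it should be `ababab`, because each `a` reaches the file before the following write().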

Run strace ./ab to see which system calls the executable ab makes.

Conversion functions between file descriptors and FILE pointers:

int fileno(FILE* stream); //Get the file descriptor behind a file pointer
FILE* fdopen(int fd, const char* mode); //Wrap an open file descriptor in a file pointer

4. Efficiency of IO

Exercise: modify the IO/sys/mycpy.c program, doubling the value of BUFFSIZE each time
1. Observe the time the process takes;
2. Note the BUFFSIZE value at which the performance inflection point occurs;
3. Find out at what point the program fails;
//Calculate program execution time
time ./mycpy <src_file> <des_file>

5. File Sharing

Multiple tasks work together on a file or work together to complete a task
Interview: Write a program to delete line 10 of a file

Idea 1:
//Seek to the start of line 11 and read a chunk; seek to the start of line 10 and write it there;
//Four system calls per iteration
//Finally, the file needs to be truncated
while()
{
lseek11 + read + lseek10 + write
}

Idea 2:
The current process opens the file twice
1 -> open r ->fd1 -> lseek 11
2 -> open r+  ->fd2 -> lseek 10  //r+ requires the file to exist

while()
{
1->fd1-> read
2->fd2-> write 
}

Idea 3:
process1 -> open -> r
process2 -> open -> r+

process1 -> read -> process2 -> write

Ideas 2 and 3 reduce the number of system calls by opening the file twice;
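Idea 2 might look roughly like the sketch below. It is not the author's solution: `line_start` and `delete_line` are hypothetical names, the line start is located by reading one byte at a time for brevity, and the sketch assumes the line being deleted ends with a newline.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: byte offset where line lineno (1-based) starts,
 * or -1 if the file ends before that line begins. */
static off_t line_start(int fd, int lineno)
{
    off_t off = 0;
    char c;
    int line = 1;

    if (lseek(fd, 0, SEEK_SET) < 0)
        return -1;
    while (line < lineno) {
        if (read(fd, &c, 1) != 1)
            return -1;
        off++;
        if (c == '\n')
            line++;
    }
    return off;
}

/* Delete line lineno from the file at path. Returns 0 on success. */
int delete_line(const char *path, int lineno)
{
    char buf[4096];
    ssize_t n = 0;
    int rc = -1;
    int rfd = open(path, O_RDONLY);     /* fd1: reads from the next line on */
    int wfd = open(path, O_RDWR);       /* fd2: writes starting at line lineno */

    if (rfd >= 0 && wfd >= 0) {
        off_t dst = line_start(rfd, lineno);
        off_t src = line_start(rfd, lineno + 1);

        if (dst >= 0 && src >= 0 &&
            lseek(rfd, src, SEEK_SET) >= 0 &&
            lseek(wfd, dst, SEEK_SET) >= 0) {
            rc = 0;
            /* one read + one write per chunk, no lseek inside the loop */
            while ((n = read(rfd, buf, sizeof buf)) > 0) {
                if (write(wfd, buf, (size_t)n) != n) {
                    rc = -1;
                    break;
                }
                dst += n;
            }
            if (n < 0)
                rc = -1;
            /* cut off the leftover tail after the last byte written */
            if (rc == 0 && ftruncate(wfd, dst) < 0)
                rc = -1;
        }
    }
    if (rfd >= 0)
        close(rfd);
    if (wfd >= 0)
        close(wfd);
    return rc;
}
```

Because each descriptor has its own structure and therefore its own offset, the loop needs two system calls per chunk instead of the four in Idea 1. `delete_line(path, 10)` would handle the interview question.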

Supplemental functions:

//Truncate the file at path to length bytes
int truncate(const char* path, off_t length);
//Truncate an open file to length bytes
int ftruncate(int fd, off_t length);

6. Atomic operations

Atomic operation: an indivisible operation
Purpose of atomic operations: resolving races and conflicts
For example, they solve the problem that tmpnam() is not atomic
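tmpnam() only hands back a name, so another task can create a file with that name before the caller gets around to opening it. One common way to make "check that it does not exist, then create it" a single step is open() with O_CREAT|O_EXCL. The sketch below shows that flag combination; it is an illustration chosen here, not something the original notes prescribe, and `create_exclusive` is a hypothetical name.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: create path only if it does not exist yet.
 * The existence check and the creation happen inside one open() call,
 * so no other task can slip in between them.
 * Returns the new descriptor, or -1 with errno set (EEXIST if present). */
int create_exclusive(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
}
```

A second `create_exclusive()` on the same path should fail with EEXIST instead of silently reusing the file.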

7. Implementation of redirection in the program: dup, dup2

#include <unistd.h>
//Copy oldfd to the smallest descriptor currently available;
//the copy and oldfd point to the same file structure
int dup(int oldfd);

//newfd becomes a copy of oldfd; if newfd is in use, it is closed first
//Does nothing if newfd and oldfd have the same value
int dup2(int oldfd, int newfd);
IO/sys/dup.c
//Redirect the output of puts("hello!") from standard output to tmp/out
int flag;
   
   //Idea 1
	/*
   flag =  close(1);
   if(flag < 0)
   {
        perror("close()");
        exit(1);
   }
    
	
    int fd;
    fd = open(FNAME, O_WRONLY|O_CREAT|O_TRUNC, 0600);
    if(fd < 0)    
    {
        perror("open");
        exit(1);
    }
	*/
	
	//Idea 2
	//Defects:
	//1. This assumes descriptor 1 is open. If the process only has descriptor 0 open, open() itself returns 1, and the close(1) that follows closes the file that was just opened;
	//2. Another task may open a file after close(1) and before dup(fd) runs,
	//so descriptor 1 is taken by someone else
	//Cause of the defects: close(1) + dup(fd) is not atomic
	/*
	int fd;
    fd = open(FNAME, O_WRONLY|O_CREAT|O_TRUNC, 0600);
    if(fd < 0)    
    {
        perror("open");
        exit(1);
    }
	close(1);
	dup(fd);
	close(fd);
	*/
	
	//Idea 3
	//This still has a flaw: it is written from the point of view of main,
	//but it should be written as a module: after changing file descriptor 1, restore the original state once puts is done
	//1. Do not leak memory 2. Do not go out of bounds 3. Always think of what you write as a small module, not a main function
	int fd;
    fd = open(FNAME, O_WRONLY|O_CREAT|O_TRUNC, 0600);
    if(fd < 0)    
    {
        perror("open");
        exit(1);
    }
	
	dup2(fd, 1);
	
	if(fd != 1)
		close(fd);
	
	//The original standard output still has to be restored afterwards (not done here)
	 
/***********************************************/
    puts("hello!");
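The "restore afterwards" step that Idea 3 asks for could look like the sketch below. It is one possible way to do it, not the author's code: `puts_to_file` is a hypothetical name, and saving descriptor 1 with dup() before redirecting is the technique assumed here.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical module-style helper: print msg with puts() into the file at
 * path, then put standard output back the way it was.
 * Returns 0 on success, -1 on error. */
int puts_to_file(const char *path, const char *msg)
{
    int fd, saved, rc = 0;

    fflush(stdout);                       /* do not carry old buffered text over */
    saved = dup(STDOUT_FILENO);           /* remember where descriptor 1 points */
    if (saved < 0)
        return -1;

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        close(saved);
        return -1;
    }

    if (dup2(fd, STDOUT_FILENO) < 0)      /* close(1) + dup(fd) in one step */
        rc = -1;
    if (fd != STDOUT_FILENO)
        close(fd);

    if (rc == 0) {
        if (puts(msg) == EOF)
            rc = -1;
        if (fflush(stdout) == EOF)        /* push the text out before restoring */
            rc = -1;
    }

    if (dup2(saved, STDOUT_FILENO) < 0)   /* restore the original descriptor 1 */
        rc = -1;
    close(saved);
    return rc;
}
```

After `puts_to_file(FNAME, "hello!")` returns, later output on stdout should go to its original destination again. The two fflush() calls matter because puts() writes into the stdio buffer, which would otherwise be flushed after descriptor 1 has already been switched back.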

Topics: C Linux