Reading and writing files
read
The built-in function open() opens a file and returns a file object. If the file cannot be opened, OSError is raised.
File content
Hello
Zhang San
Outlaw maniac
Lin Daiyu Fengxue mountain temple
read
If the file is small, calling read() once to read the whole file is the most convenient approach
file = open("222.txt", mode="r", encoding="utf-8")
print(type(file))  # Print the type of the file object
print(file.read())  # Read the entire file at once, so this is not suitable for large files
file.close()  # Always close the file after use; otherwise it keeps holding system resources
result
<class '_io.TextIOWrapper'>
Hello
Zhang San
Outlaw maniac
Lin Daiyu Fengxue mountain temple
What's left after read?
with open("222.txt", mode="r", encoding="utf-8") as f:
    print("for the first time read")
    print(f.read())
    print("The second time read")
    print(f.read())
for the first time read
Hello
Zhang San
Outlaw maniac
Lin Daiyu Fengxue mountain temple
Tang Monk vs Decepticons
The second time read

Process finished with exit code 0
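We found a problem. After the first read(), the second read() returned nothing. The reason is that the first call moved the file position to the end of the file, so there was nothing left to read. We can imagine that there are N cakes in the pot (the content fetched from the file into the buffer). Once the cakes have been moved from the pot to the basin (taken out of the buffer and printed), the pot is empty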
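A minimal sketch of the file position idea, assuming a throwaway temporary file (the name demo.txt is made up for illustration and is not the 222.txt used in this article). seek(0) moves the position back to the start, so a later read() returns the content again:

```python
import os
import tempfile

# Hypothetical demo file in a temporary directory, created only for this sketch
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, mode="w", encoding="utf-8") as f:
    f.write("Hello\nZhang San\n")

with open(path, mode="r", encoding="utf-8") as f:
    first = f.read()    # Reads everything; the position is now at the end
    second = f.read()   # Nothing is left, so this is an empty string
    f.seek(0)           # Move the position back to the start of the file
    third = f.read()    # The whole content is available again

print(repr(first))
print(repr(second))
print(repr(third))
```

In other words, the pot can be refilled by rewinding rather than by reopening the file.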
with
We now have a basic understanding of reading files, but let's pause for a moment. The way we read files above has a disadvantage: we must remember to close the file at the end. In the following two situations the file may not be closed properly, which wastes system resources
1. We simply forget to write close()
2. The file was opened, but an error occurred before close() was reached, so the file was never closed
The first case is easy to catch by reading the code. The second is harder. One solution is try...finally, but it is cumbersome. The with statement was introduced to solve this problem. The following form calls close() for us automatically
with open("222.txt", mode="r", encoding="utf-8") as f:
    print(type(f))
    print(f.read())
<class '_io.TextIOWrapper'>
Hello
Zhang San
Outlaw maniac
Lin Daiyu Fengxue mountain temple
The result is the same as in the first example. From now on we will write it this way instead of calling close() by hand.
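For comparison, here is a rough sketch of the try...finally form next to the with form. The temporary demo file and the variable names are assumptions made for illustration:

```python
import os
import tempfile

# Hypothetical demo file, created only for this sketch
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, mode="w", encoding="utf-8") as f:
    f.write("Hello")

# Manual version: the finally block runs close() even if read() raises
f1 = open(path, mode="r", encoding="utf-8")
try:
    content_manual = f1.read()
finally:
    f1.close()

# with version: close() is called automatically when the block ends
with open(path, mode="r", encoding="utf-8") as f2:
    content_with = f2.read()

print(f1.closed, f2.closed)
```

Both file objects end up closed; the with form just needs fewer lines and is harder to get wrong.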
read(size)
Calling read() reads the whole file into memory at once. If the file is 10 GB, memory will be exhausted. To be safe, you can call read(size) repeatedly; each call reads at most size characters (or size bytes in binary mode)
If the file size cannot be determined, it is safer to call read(size) repeatedly
As for how to use it, to be honest, I have not needed this in practice, because I mostly read configuration files, but here is an answer I copied
def readlines(f, separator):
    '''
    Read a large file piece by piece.
    :param f: File handle
    :param separator: Separator between records
    :return: Generator that yields one record at a time
    '''
    buf = ''
    while True:
        while separator in buf:
            position = buf.index(separator)  # Position of the separator
            yield buf[:position]  # Slice from the start up to the separator
            buf = buf[position + len(separator):]  # Drop the yielded data and keep the rest
        chunk = f.read(4096)  # Read up to 4096 characters at a time
        if not chunk:  # Nothing left to read
            yield buf  # Yield whatever remains in buf
            break
        buf += chunk  # Append the newly read data to buf


with open('text.txt', encoding='utf-8') as f:
    # readlines() can be used in a for loop because it contains yield, which makes it a generator function
    for line in readlines(f, '|||'):
        print(line)
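If no custom separator is needed, a simpler sketch of the read(size) loop looks like this. The helper name read_in_chunks and the demo file are hypothetical, chosen only for this example:

```python
import os
import tempfile


def read_in_chunks(f, size=4096):
    """Yield successive pieces of at most `size` characters from an open text file."""
    while True:
        chunk = f.read(size)
        if not chunk:  # read() returns "" at end of file
            break
        yield chunk


# Hypothetical demo file with 10,000 characters
path = os.path.join(tempfile.mkdtemp(), "big.txt")
original = "abcdefghij" * 1000
with open(path, mode="w", encoding="utf-8") as f:
    f.write(original)

with open(path, mode="r", encoding="utf-8") as f:
    chunks = list(read_in_chunks(f, size=4096))

chunk_sizes = [len(c) for c in chunks]
print(chunk_sizes)
```

Only one chunk is held in memory per iteration when the generator is consumed in a for loop; list() is used here just to make the sizes easy to inspect.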
readline
Call readline() to read one line at a time
with open("222.txt", mode="r", encoding="utf-8") as f:
    print(type(f))
    print(f.readline())
<class '_io.TextIOWrapper'>
Hello
We found that readline() reads only one line per call. How can we read multiple lines?
Before reading the whole file, let's update the file: add a blank line and then one more line of text
Hello
Zhang San
Outlaw maniac
Lin Daiyu Fengxue mountain temple

Tang Monk vs Decepticons
with open("222.txt", mode="r", encoding="utf-8") as f:
    done = 0
    while not done:  # 0 is falsy, so "not done" is True
        line = f.readline()
        if line != "":  # readline() returned something, so we are not at the end yet
            print(line.strip())  # strip() removes leading and trailing whitespace, including the newline
        else:
            done = 1  # readline() returns "" only at end of file, so the loop ends
Hello
Zhang San
Outlaw maniac
Lin Daiyu Fengxue mountain temple

Tang Monk vs Decepticons
Many people may have a question here: will a blank line be mistaken for the end of the file? It will not. Every line in the file ends with a line separator, so readline() returns a "blank line" as "\n", not as an empty string. readline() returns "" only when the end of the file has been reached, which means the loop does not stop until the whole file has actually been read.
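The same loop can be written more compactly, because a file object can be iterated line by line. A small sketch, assuming a temporary demo file with the same kind of content:

```python
import os
import tempfile

# Hypothetical demo file containing a blank line in the middle
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, mode="w", encoding="utf-8") as f:
    f.write("Hello\nZhang San\n\nTang Monk vs Decepticons")

lines = []
with open(path, mode="r", encoding="utf-8") as f:
    for line in f:  # Yields one line at a time and stops at end of file
        lines.append(line.rstrip("\n"))

print(lines)
```

The blank line shows up as an empty string after rstrip("\n"), and the loop still continues to the last line.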
readlines
Call readlines() to read the whole file at once and return it as a list of lines
For small files such as configuration files, readlines() is the most convenient
with open("222.txt", mode="r", encoding="utf-8") as f:
    for i in f.readlines():
        print(i.strip())
    print(type(f.readlines()))  # The file is already exhausted, so this list is empty, but it is still a list
Hello
Zhang San
Outlaw maniac
Lin Daiyu Fengxue mountain temple

Tang Monk vs Decepticons
<class 'list'>
with open("222.txt", mode="r", encoding="utf-8") as f:
    print(f.readlines())
['Hello\n', 'Zhang San\n', 'Outlaw maniac\n', 'Lin Daiyu Fengxue mountain temple\n', '\n', 'Tang Monk vs Decepticons']
We found that each line ends with \n, that the blank line is returned as "\n", and that a blank line is therefore not read as ""
for line in f.readlines():
    print(line.strip())  # Remove the '\n' at the end
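If the trailing newlines are not wanted at all, one alternative is read().splitlines(). A brief sketch comparing the two, using a made-up temporary file:

```python
import os
import tempfile

# Hypothetical demo file, created only for this comparison
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, mode="w", encoding="utf-8") as f:
    f.write("Hello\n\nTang Monk vs Decepticons")

with open(path, mode="r", encoding="utf-8") as f:
    with_newlines = f.readlines()  # Each element keeps its trailing "\n"

with open(path, mode="r", encoding="utf-8") as f:
    without_newlines = f.read().splitlines()  # Line separators are removed

print(with_newlines)
print(without_newlines)
```

Like readlines(), this loads the whole file into memory, so it is only suitable for small files.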
rb and encoding
In Python 3, strings are Unicode text, and UTF-8 is the most common encoding for text files. What if I don't know the encoding when opening a file?
If we leave out the encoding argument, open() falls back to a platform-dependent default (UTF-8 on most macOS and Linux systems), and that will not work if the file uses a different encoding. In that case we also need to change the reading mode. r is text mode, which decodes the file and returns strings. If you do not know the file's encoding, you can omit the encoding argument and use rb mode, which returns the raw bytes exactly as they are stored on disk
Error demonstration
with open("one.jpg", mode="r") as f:
    print(f.readlines())
Error
Traceback (most recent call last):
  File "/Users/zc/PycharmProjects/pythonProject1/test/MyOne.py", line 2, in <module>
    print(f.readlines())
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
Correct way to open it
with open("one.jpg", mode="rb") as f:
    print(f.readlines())
The binary content of a picture is very large, so only part of it is shown
d7\xc4\xa5\x92\xbe]4H\x06\x9d8\xa6Aa\x8d\x15b%\xa3tD\x84\x8fUL\xa1F\xb7\x95\xc7\xf6G\xf4\xf6\xa0\x96T\x0b\xe7\xd5Y\xdbN:Vm\xac\xbd2\xe4`\xa6\x9eS\xea\x93J>\xb0\xaa\xd5\x04\x1d1)\xfa\xf8\xd3\x9b\xff\x00\xaf\xed|\x12+\xfe\x93\xf9\x0c}\xbd*\x82M/W\xe1\xe5\xf6\xf4\xae\x97r\xd6\xe2e\x9a\xb9*\x8a\x1aZ\x91\x03\x95vT}\x12i\x88\xd8\x8e}#\x8f\xf6\xfe\xdcm\xd2{5\x0c~ ~\xdf\xe5\xd2\xcf\x0c:\x96\x93$\xff\x00\x83\xfc\xfd\x0f; . . . . . . .
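To make the difference between bytes and text concrete, here is a small sketch that reads a file in rb mode and decodes it by hand. The file name and its content are assumptions for illustration; the text is written with escape sequences for the two Chinese characters of "Zhang San":

```python
import os
import tempfile

# Hypothetical demo file: "Hello " followed by two Chinese characters, stored as UTF-8
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, mode="wb") as f:
    f.write("Hello \u5f20\u4e09".encode("utf-8"))

with open(path, mode="rb") as f:
    raw = f.read()  # rb mode returns bytes; no decoding takes place

text = raw.decode("utf-8")  # Decoding is a separate, explicit step

print(type(raw), len(raw))
print(type(text), len(text))
```

The byte count is larger than the character count because each of the two Chinese characters takes three bytes in UTF-8.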
How do you know the file encoding format?
chardet!
Installation command
pip3 install chardet
Usage
import chardet

with open("222.txt", mode="rb") as f:  # Read raw bytes; chardet guesses the encoding from them
    result = chardet.detect(f.read())
print(result)
{'encoding': 'utf-8', 'confidence': 0.99, 'language': ''}
What if you encounter non-standard files?
When a file is not encoded cleanly, you may run into UnicodeDecodeError, because some illegally encoded characters may be mixed into the text. For this case, the open() function also accepts an errors parameter, which specifies how to handle encoding errors. The simplest choice is to ignore them
Counterexample: read a UTF-8 encoded file using the gbk encoding
with open("222.txt", mode="r", encoding="gbk") as f:
    print(f.read())
Error
Traceback (most recent call last):
  File "/Users/zc/PycharmProjects/pythonProject1/test/MyOne.py", line 2, in <module>
    print(f.read())
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa4 in position 18: illegal multibyte sequence
What if we ignore the errors?
with open("222.txt", mode="r", encoding="gbk", errors='ignore') as f:
    print(f.read())
result
Youソ For three years Legal check Increase in support area Panel lightфLong dark manuscriptぉPlutonium
The text is garbled because the encoding is wrong, but at least no error is raised and the program is not interrupted
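Besides 'ignore', the errors parameter also accepts 'replace', which marks each undecodable byte with the replacement character U+FFFD instead of dropping it silently. A hedged sketch, using a made-up file and the ascii codec as the deliberately wrong encoding:

```python
import os
import tempfile

# Hypothetical demo file: "h", e with acute accent, "llo", stored as UTF-8
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, mode="wb") as f:
    f.write("h\u00e9llo".encode("utf-8"))

# Default behaviour (errors="strict"): decoding with the wrong codec raises
try:
    with open(path, mode="r", encoding="ascii") as f:
        f.read()
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

with open(path, mode="r", encoding="ascii", errors="ignore") as f:
    ignored = f.read()  # Undecodable bytes are dropped

with open(path, mode="r", encoding="ascii", errors="replace") as f:
    replaced = f.read()  # Each undecodable byte becomes U+FFFD

print(strict_failed)
print(ascii(ignored))
print(ascii(replaced))
```

With 'replace' the damage stays visible in the output, which can make it easier to notice that the wrong encoding was used.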
write
write
with open("222.txt", mode="r") as f:
    print("Before the file is written")
    print(f.read())
with open("222.txt", mode="w") as f:
    f.write("Why do meteorites always fall in craters? So accurate. Who dug this crater")
with open("222.txt", mode="r") as f:
    print("After the file is written")
    print(f.read())
result
Before the file is written
Hello
Zhang San
Outlaw maniac
Lin Daiyu Fengxue mountain temple

Tang Monk vs Decepticons
After the file is written
Why do meteorites always fall in craters? So accurate. Who dug this crater
We found that the file was overwritten. In fact, w does not modify the existing content. If a file with that name already exists, it is emptied before writing; if it does not exist, a new file is created. So use w with care
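If accidental overwriting is a concern, one option that this article does not otherwise cover is mode "x", which creates the file and raises FileExistsError when it already exists. A minimal sketch with a hypothetical file name:

```python
import os
import tempfile

# Hypothetical file name in a fresh temporary directory
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

with open(path, mode="x", encoding="utf-8") as f:  # Succeeds: the file does not exist yet
    f.write("first")

try:
    with open(path, mode="x", encoding="utf-8") as f:  # Fails: the file already exists
        f.write("second")
    overwritten = True
except FileExistsError:
    overwritten = False

with open(path, mode="r", encoding="utf-8") as f:
    content = f.read()

print(overwritten, content)
```

The original content survives because the second open() never gets as far as writing.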
append mode
Ah, it overwrote my file. I don't want that. I just want to add content at the end of the file. For that we need mode="a"
with open("222.txt", mode="r") as f:
    print("Before the file is written")
    print(f.read())
with open("222.txt", mode="a") as f:
    f.write("This is an addition")
with open("222.txt", mode="r") as f:
    print("After the file is written")
    print(f.read())
Before the file is written
Why do meteorites always fall in craters? So accurate. Who dug this crater
After the file is written
Why do meteorites always fall in craters? So accurate. Who dug this craterThis is an addition
We found that the new text is added directly to the end of the file, on the same line as the existing text. If you want it to start on a new line, you only need to change
```
f.write("This is an addition")
```
to
f.write("\nThis is an addition")
writelines
a = ["\n", "I'm grandma Liu\n", "Grandma Liu's Baoyu fell in love with me"]
with open("222.txt", mode="r") as f:
    print("Before the file is written")
    print(f.read())
with open("222.txt", mode="a") as f:
    f.writelines(a)
with open("222.txt", mode="r") as f:
    print("After the file is written")
    print(f.read())
Before the file is written
Why do meteorites always fall in craters? So accurate. Who dug this crater
After the file is written
Why do meteorites always fall in craters? So accurate. Who dug this crater
I'm grandma Liu
Grandma Liu's Baoyu fell in love with me
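Note that writelines() does not add line separators by itself, which is why the list above contains "\n" explicitly. A short sketch of both behaviours, using made-up temporary files:

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
names = ["I'm grandma Liu", "Baoyu"]  # Strings without trailing newlines

# Hypothetical file 1: the items are written back to back
joined_path = os.path.join(tmp_dir, "joined.txt")
with open(joined_path, mode="w", encoding="utf-8") as f:
    f.writelines(names)

# Hypothetical file 2: a generator expression appends "\n" to each item
lines_path = os.path.join(tmp_dir, "lines.txt")
with open(lines_path, mode="w", encoding="utf-8") as f:
    f.writelines(name + "\n" for name in names)

with open(joined_path, mode="r", encoding="utf-8") as f:
    joined = f.read()
with open(lines_path, mode="r", encoding="utf-8") as f:
    separated = f.read()

print(repr(joined))
print(repr(separated))
```

writelines() accepts any iterable of strings, so the generator expression avoids building a second list.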
File path
A file path is either relative or absolute. A relative path is interpreted relative to another location, normally the current working directory, while an absolute path starts from the root of the file system (the drive letter on Windows, or / on macOS and Linux) and leads all the way to the target.
Determine whether the path of a file or folder is an absolute path
import os

print(os.path.isabs("222.txt"))
print(os.path.isabs("/Users/zc/PycharmProjects/pythonProject1/test/222.txt"))
False
True
Get file absolute path
import os

print(os.path.abspath("222.txt"))
/Users/zc/PycharmProjects/pythonProject1/test/222.txt
Get current path
import os

path1 = os.getcwd()  # Current working directory of the process
path2 = os.path.dirname(__file__)  # Directory that contains this script; more commonly used
print(path1)
print(path2)
/Users/zc/PycharmProjects/pythonProject1/test
/Users/zc/PycharmProjects/pythonProject1/test
Determine whether the path exists
import os

print(os.path.exists("/Users/zc/PycharmProjects/pythonProject1/test"))
print(os.path.exists("/Users/zc/PycharmProjects/pythonProject1/tes2"))
True
False
Get the parent path
If the argument is a directory path, os.path.dirname() returns its parent directory
import os

path1 = os.getcwd()
print(path1)
path2 = os.path.dirname(path1)
print(path2)
/Users/zc/PycharmProjects/pythonProject1/test
/Users/zc/PycharmProjects/pythonProject1
If the argument is a file path, the directory that contains the file is returned
import os

path1 = "/Users/zc/PycharmProjects/pythonProject1/test/MyOne.py"
path2 = os.path.dirname(path1)
print(path2)
/Users/zc/PycharmProjects/pythonProject1/test
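os.path.dirname() works purely on the path string and does not check the file system, so it simply cuts at the last separator. A small sketch with shortened, made-up paths in the same style as above:

```python
import os

# Hypothetical paths; none of them needs to exist on disk
file_path = "/Users/zc/test/MyOne.py"
dir_path = "/Users/zc/test"
dir_path_slash = "/Users/zc/test/"

parent_of_file = os.path.dirname(file_path)       # Directory containing the file
parent_of_dir = os.path.dirname(dir_path)         # One level up
parent_of_slash = os.path.dirname(dir_path_slash) # A trailing slash changes where the cut happens
file_name = os.path.basename(file_path)           # The last component
head, tail = os.path.split(file_path)             # Both parts at once

print(parent_of_file)
print(parent_of_dir)
print(parent_of_slash)
print(file_name)
print(head, tail)
```

The trailing-slash case is worth keeping in mind when a directory path comes from user input.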
Joining paths
import os

path1 = os.path.dirname(__file__)
path2 = os.path.join(path1, "112.txt")
path3 = os.path.join(path1, "222.txt")
print(path2)
print(path3)
/Users/zc/PycharmProjects/pythonProject1/test/112.txt
/Users/zc/PycharmProjects/pythonProject1/test/222.txt
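The path functions above can be combined with file writing. The following sketch assumes a temporary directory as the base and a made-up subfolder name, joins a path, and checks for existence before and after creating the file:

```python
import os
import tempfile

base = tempfile.mkdtemp()  # Temporary directory used as a stand-in for the project folder
target = os.path.join(base, "sub", "222.txt")  # "sub" is a hypothetical subfolder

existed_before = os.path.exists(target)

os.makedirs(os.path.dirname(target), exist_ok=True)  # Create the parent folder first
with open(target, mode="w", encoding="utf-8") as f:
    f.write("Hello")

existed_after = os.path.exists(target)

print(os.path.isabs(target))
print(existed_before, existed_after)
print(os.path.basename(target))
```

os.path.join() inserts the separator that is correct for the current operating system, which is why it is preferable to concatenating strings with "/".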