1. Read and write txt or csv files
    import numpy as np

    a = np.arange(20).reshape((4, 5))
    print(a)

    # The same code works for a .txt file: just change the suffix
    filename = 'data/a.csv'  # the data/ directory must already exist

    # Write the file
    np.savetxt(filename, a, fmt='%d', delimiter=',')

    # Read the file
    b = np.loadtxt(filename, dtype=np.int32, delimiter=',')
    print(b)
Disadvantages:
- np.savetxt() can only save one-dimensional and two-dimensional numpy arrays. An array with more dimensions must be reshaped to two dimensions before it can be saved.
- Saving to a file name cannot append: every call to np.savetxt() overwrites the previous content of the file.
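Both limitations can be worked around. The following is a minimal sketch, assuming a temporary directory and made-up example arrays: a 3-D array is flattened to 2-D before saving and reshaped after loading, and extra rows are appended by passing np.savetxt() a file object opened in append mode instead of a file name.

```python
import os
import tempfile

import numpy as np

# A 3-D array that np.savetxt() cannot write directly.
a = np.arange(24).reshape((2, 3, 4))

# Temporary directory (an assumption for this sketch) so no real data is touched.
tmp_dir = tempfile.mkdtemp()
filename = os.path.join(tmp_dir, 'a.csv')

# Collapse all leading axes into rows; the last axis becomes the columns.
np.savetxt(filename, a.reshape(-1, a.shape[-1]), fmt='%d', delimiter=',')

# The original shape is not stored in the file, so it must be supplied again.
restored = np.loadtxt(filename, dtype=np.int64, delimiter=',').reshape(a.shape)
print(restored.shape)

# Appending: hand np.savetxt() a file object opened in 'a' mode.
extra = np.arange(100, 108).reshape((2, 4))
with open(filename, 'a') as f:
    np.savetxt(f, extra, fmt='%d', delimiter=',')

appended = np.loadtxt(filename, dtype=np.int64, delimiter=',')
print(appended.shape)
```

Note that the original shape has to be tracked separately, since a text file stores only rows and columns.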
2. Read and write npy or npz files through numpy
- Read and write npy files
    numpy.save(file, arr, allow_pickle=True, fix_imports=True)

- file: file name or file path
- arr: the array to store
- allow_pickle: Boolean value; allows object arrays to be saved using Python pickles (optional, the default is fine)
- fix_imports: makes it easier for Python 2 to read data saved by Python 3 (optional, the default is fine)
    import numpy as np

    a = np.arange(20).reshape((2, 2, 5))
    print(a)

    filename = 'data/a.npy'  # save path; the data/ directory must already exist

    # Write the file
    np.save(filename, a)

    # Read the file
    b = np.load(filename)
    print(b)
    print(b.shape)
Advantages:
(1) npy files can hold numpy arrays of any dimension, not limited to one and two dimensions
(2) npy preserves the structure of the numpy array, including its shape and dtype
Disadvantages:
(1) Only one numpy array can be saved per file; each save overwrites the previous contents of the file
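The points above can be checked with a short sketch. The temporary path and the two example arrays are assumptions made for illustration; the second np.save() call shows that the file is replaced rather than extended.

```python
import os
import tempfile

import numpy as np

# Temporary location (an assumption for this sketch).
tmp_dir = tempfile.mkdtemp()
filename = os.path.join(tmp_dir, 'a.npy')

# Shape and dtype survive the round trip without being specified on load.
a = np.arange(20, dtype=np.float32).reshape((2, 2, 5))
np.save(filename, a)
first = np.load(filename)
print(first.shape, first.dtype)

# Saving a second array to the same path replaces the first one entirely.
b = np.zeros(3, dtype=np.int8)
np.save(filename, b)
second = np.load(filename)
print(second.shape, second.dtype)
```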
- Read and write npz files
Parameters:

    numpy.savez(file, *args, **kwds)

- file: file name or file path
- *args: arrays to store, passed positionally; several can be given. Since no key is specified for them, numpy names them 'arr_0', 'arr_1', and so on
- **kwds: arrays to store, passed as keyword arguments; each array is saved under its keyword name (optional)
    import numpy as np

    a = np.arange(20).reshape((2, 2, 5))
    b = np.arange(20, 44).reshape((2, 3, 4))
    print('a:\n', a)
    print('b:\n', b)

    filename = 'data/a.npz'  # the data/ directory must already exist

    # Write the file. Arrays passed without a key get the default
    # keys 'arr_0', 'arr_1', and so on.
    np.savez(filename, a, b=b)

    # Read the file
    c = np.load(filename)
    print('keys of NpzFile c:\n', c.keys())
    print("c['arr_0']:\n", c['arr_0'])
    print("c['b']:\n", c['b'])
Better still, instead of letting numpy name the arrays, we can give them meaningful keys ourselves, so there is no guessing about which array is which when the data is loaded.
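    import numpy as np

    # Example data (x and y can be any arrays)
    x = np.arange(10)
    y = np.sin(x)

    # Save the data under the keys 'x' and 'y'
    np.savez('newsave_xy', x=x, y=y)  # the .npz suffix is added automatically

    # Read the saved data
    npzfile = np.load('newsave_xy.npz')

    # Access the arrays by the keys given when saving
    print(npzfile['x'])
    print(npzfile['y'])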
Advantages:
(1) npz files can hold numpy arrays of any dimension;
(2) npz preserves the structure of numpy arrays (shape and dtype);
(3) Multiple numpy arrays can be saved in one file;
(4) Each array can be saved under a key of your choice, which makes reading convenient.
Disadvantages:
(1) All arrays must be written in a single np.savez() call; an array cannot be added to an existing npz file later without rewriting the file.
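One way to live with this limitation is to load the existing arrays, add the new one, and write everything back. This is only a sketch under assumed names (the temporary path and the keys 'a' and 'b' are made up), and it rewrites the whole file, so it suits small archives rather than large ones.

```python
import os
import tempfile

import numpy as np

# Temporary location (an assumption for this sketch).
tmp_dir = tempfile.mkdtemp()
filename = os.path.join(tmp_dir, 'a.npz')

# Initial archive with a single array.
np.savez(filename, a=np.arange(6).reshape((2, 3)))

# "Append" by reading every stored array into a dict first.
with np.load(filename) as npz:
    arrays = {key: npz[key] for key in npz.files}

# Add the new array and rewrite the archive in one np.savez() call.
arrays['b'] = np.ones((2, 2))
np.savez(filename, **arrays)

with np.load(filename) as npz:
    keys = sorted(npz.files)
    b_shape = npz['b'].shape
print(keys, b_shape)
```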
- Read and write hdf5 files through h5py
    import numpy as np
    import h5py

    a = np.arange(20).reshape((2, 2, 5))
    b = np.arange(20).reshape((1, 4, 5))
    print(a)
    print(b)

    filename = 'data/data.h5'  # the data/ directory must already exist

    # Write the file
    with h5py.File(filename, 'w') as h5f:
        h5f.create_dataset('a', data=a)
        h5f.create_dataset('b', data=b)

    # Read the file
    with h5py.File(filename, 'r') as h5f:
        print(type(h5f))
        # Slicing a dataset returns a numpy array
        print(h5f['a'][:])
        print(h5f['b'][:])
Assigning by slice
    import numpy as np
    import h5py

    a = np.arange(20).reshape((2, 2, 5))
    print(a)

    filename = 'data/a.h5'  # the data/ directory must already exist

    # Write the file
    with h5py.File(filename, 'w') as h5f:
        # When the array is too large to handle in one piece, h5f['a'] does not
        # have to be initialized with the data directly: create it empty and
        # fill it slice by slice.
        # maxshape can be omitted if the shape of h5f['a'] will not change later.
        h5f.create_dataset('a', shape=(2, 2, 5), maxshape=(None, 2, 5),
                           dtype=np.int32, compression='gzip')
        for i in range(2):
            # Assign one slice at a time
            h5f['a'][i] = a[i]

    # Read the file
    with h5py.File(filename, 'r') as h5f:
        print(type(h5f))
        print(h5f['a'])
        # Slicing a dataset returns a numpy array
        print(h5f['a'][:])
Advantages:
(1) The dimension of the numpy array is not limited, and its structure and data type are preserved;
(2) Suitable for large numpy arrays; with compression enabled, the file footprint is small;
(3) Each dataset is accessed by key and read back as a numpy array, which makes reading easy and avoids confusion;
(4) The existing contents of a file do not have to be overwritten: opening the file in 'a' mode keeps them and lets you add more data.
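The last advantage can be illustrated with a short sketch. The temporary path, dataset names, and array values are assumptions for this example. The file is reopened in 'a' mode, a second dataset is added, and the first dataset, created with maxshape=(None, 2, 5), is grown along its first axis.

```python
import os
import tempfile

import h5py
import numpy as np

# Temporary location (an assumption for this sketch).
tmp_dir = tempfile.mkdtemp()
filename = os.path.join(tmp_dir, 'data.h5')

# First session: create the file with one resizable dataset.
with h5py.File(filename, 'w') as h5f:
    h5f.create_dataset('a', data=np.arange(20).reshape((2, 2, 5)),
                       maxshape=(None, 2, 5))

# Later session: 'a' mode opens the file without truncating it.
with h5py.File(filename, 'a') as h5f:
    # Add a new dataset next to the existing one.
    h5f.create_dataset('b', data=np.ones((3, 3)))
    # Grow the first axis of 'a' from 2 to 4 and fill the new part.
    dset = h5f['a']
    dset.resize(4, axis=0)
    dset[2:] = np.arange(20, 40).reshape((2, 2, 5))

# Check the result.
with h5py.File(filename, 'r') as h5f:
    keys = sorted(h5f.keys())
    a_shape = h5f['a'].shape
    last_value = int(h5f['a'][3, 1, 4])
print(keys, a_shape, last_value)
```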
3. Summary
- csv and txt can only be used to store one-dimensional or two-dimensional numpy arrays;
- npy stores a single numpy array, and npz stores several numpy arrays in one file. Neither is limited in the number of dimensions, and both preserve the shape and dtype of the arrays. When writing to a file that already exists, its content is overwritten; nothing can be appended.
- When the numpy array is large, hdf5 is usually the better choice, since compressed hdf5 files tend to be relatively small.
- When the numpy array is so large that computing it in one piece is likely to raise MemoryError, compute it in slices and write each computed slice to an hdf5 file, which supports slice indexing for both writing and reading.
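As a minimal sketch of the last point: the result is computed block by block and each block is written straight into an hdf5 dataset, so the full array never has to exist in memory. The function compute_block, the sizes, and the temporary path are hypothetical placeholders for a real computation.

```python
import os
import tempfile

import h5py
import numpy as np

# Temporary location and sizes (assumptions for this sketch).
tmp_dir = tempfile.mkdtemp()
filename = os.path.join(tmp_dir, 'result.h5')
n_rows, n_cols, block_rows = 1000, 8, 250


def compute_block(start, stop):
    """Hypothetical computation producing rows [start, stop) of the result."""
    rows = np.arange(start, stop, dtype=np.float64).reshape(-1, 1)
    return rows * np.ones((1, n_cols))


# Write: only one block is held in memory at a time.
with h5py.File(filename, 'w') as h5f:
    dset = h5f.create_dataset('result', shape=(n_rows, n_cols),
                              dtype=np.float64, compression='gzip')
    for start in range(0, n_rows, block_rows):
        stop = min(start + block_rows, n_rows)
        dset[start:stop] = compute_block(start, stop)

# Read: slicing the dataset loads only the requested rows from disk.
with h5py.File(filename, 'r') as h5f:
    total_shape = h5f['result'].shape
    part = h5f['result'][500:503]
print(total_shape, part[:, 0])
```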