In system testing of B/S architecture, it is sometimes necessary to import Excel files to generate some data records. When the amount of data is small, there is usually no problem. When the amount of data imported is large, the performance of the system is a test.To verify the performance of your system, sometimes you need to import a large amount of data, such as 300,000 data records, but you don't have that much data at hand.Copy and paste one by one, or drag excel down to generate records?This will kill the tester, and the tedious, repetitive work will be very interesting.With python, we can easily generate large amounts of data, free people from monotonous, repetitive work, and do more meaningful things.
For example, we need to generate the following data.
This is sample data of user information, which contains several information such as number, name, gender, hobby, favorite movie, music and so on, among which only different numbers in columns A, I and L need to be checked.Other data is not of much concern in performance testing, so we can use rules to generate some data, where only column A, I, L are different and other information remains unchanged.
Idea: Save the header and sample data in a sample file, read the sample content by xlrd, and write the header and different information by rule by xlsxwriter.
Say nothing more, let's go directly to the code:
1 import xlrd 2 import xlsxwriter 3 #File name and path, preceded by one r Prevent unnecessary escaping. 4 filename=r'D:\001\example.xlsx' 5 data = xlrd.open_workbook(filename) 6 # Get the first sheet page 7 table = data.sheets()[0] 8 # Get row 2 (article 1) data 9 content=table.row(1) 10 print('Article 1 Sample data are:',content) 11 # Pass above print You can see what is read directly, although it is list,But each data is preceded by 12 # text:Word, not directly strong conversion tuple For our use, we customize a conversion method to make it usable list So that the later strong transition tuple 13 def convertRowToTuple(rowNum): 14 data=[] 15 # len(content)Is the number of columns in the row obtained above 16 for i in range(len(content)): 17 data.append(table.cell_value(rowNum,i)) 18 # Read out the data in each cell and add it to the data this list Medium and Strong Tuple Return 19 return tuple(data) 20 # Row 0 (i.e. excel Data is read and tupled for assignment to header 21 header=convertRowToTuple(0) 22 # Line 1 (i.e. excel Sample data is read and tupled for assignment to data1 23 data1=convertRowToTuple(1) 24 # print(header) 25 # print(data1) 26 # File name to write data to 27 filename2=r'D:\001\TestDatas2.xlsx' 28 # If you have a very large amount of data, you can enable constant_memory,This is a sequential write mode, where you get a row of data and write it immediately, without keeping all the data in memory. 29 # If this mode is not enabled, the program will most likely get stuck when there is a large amount of data 30 workbook = xlsxwriter.Workbook(filename2, {'constant_memory': True}) 31 # Create a new sheet page 32 worksheet = workbook.add_worksheet() 33 # startNum Indicates the line from which to start writing, where the number starts at 1, because it is followed by a combination of letters excel Medium, such as A1 Represents line 1 A column 34 startNum=1 35 # Initial values, followed by numbers that increase on top of them 36 startValues=['000001','245353','24289796'] 37 # Columns corresponding to initial values 38 col=['A','I','L'] 39 #Write header to file first 40 worksheet.write_row('A'+str(startNum), header) 41 # Formally start writing data 42 for i in range(300000): 43 # To keep the build process from getting bored, add this print information to view the progress 44 print('Creating #',i+1,'Bar data') 45 # The header occupies the first row, so the first data starts at the second row when i=0 When writing from A2 start 46 # take data1 Write data in turn A2,B2,C2...... 47 worksheet.write_row('A'+str(startNum+i+1), data1) 48 # Previously it was equivalent to copying everything from the first data, but A,I,L The three columns need to be in the lower row, so we'll rewrite each row A I L Values in Cells 49 for m in col: 50 length=len(startValues[col.index(m)]) 51 # Numbers are added with initial values i 52 content=str(int(startValues[col.index(m)])+i) 53 # Because a number like 000001 will lose the previous zero in the calculation, to keep the number of digits, the lost zero will be made up for it 54 if(len(content)<length): 55 content='0'*(length-len(content))+content 56 # Write the calculated value into the corresponding cell 57 worksheet.write(m+str(startNum+i+1), content) 58 # Close after writing workbook,Otherwise, an error will be made 59 workbook.close()
Next, let's make a cup of tea and wait for the data to be generated successfully.
After testing, it only takes about 4 minutes to generate 300,000 pieces of data!