A brief introduction to the module xlrd of Python operating Excel tables
Data processing is easy, simple and powerful for Python. It can process a large amount of data information in just a few lines of code. Here is a module for Python to process Excel tables: xlrd.
1, xlrd introduction and installation
xlrd is a module for Python to process excel table data. It can read Excel data, including:
1. Read Excel rows, columns, row values, column values
2. Read cell value and data type
3. Add, delete and modify table data
4. Export and import worksheets, etc
The installation of xlrd is relatively simple. Just open the cmd command box and enter: pip install xlrd, or open the compiler Jupiter book and enter: pip install xlrd can also be installed in the code bar, and then enter the following code in the code bar to check whether the installation is successful (if you can see the version number, it means the installation is successful):
import xlrd xlrd.__version__ # __ Double underline
2, Introduction to xlrd function
1,open_workbook(path)
As the name suggests, this function is to open a worksheet, that is, to open an Excel table, and path is the path of the Excel file. The operation is as follows:
import xlrd xls = xlrd.open_workbook("file/Programming language ranking.xls")
open_ The Workbook () function returns an object, that is, a pointer to the Excel worksheet "programming language ranking. xls". With this object, we can perform various operations on the table.
2. Get sheet in Excel worksheet
There are three ways to get the sheet table in the open worksheet:
2.1,xls.sheets() [index]: the index starts from 0 and corresponds to sheet1, sheet2, sheet3
2.2,xls. sheet_ by_ Index: the index starts from 0 and corresponds to sheet1, sheet2, sheet3
2.3,xls.sheet_by_name(sheet name): the premise is to know the sheet name in the table
The above return is also an object, that is, a pointer to the obtained sheet table. With this object, you can operate on the contents of the sheet, such as obtaining the number of rows, columns, cell contents, etc. in addition, you can use XLS sheet_ Names() gets the names of all sheets in the worksheet, such as:
# Three ways to get a worksheet sheet1 = xls.sheets()[0] print(sheet1) sheet2 = xls.sheet_by_index(0) print(sheet2) sheet3 = xls.sheet_by_name("2019 year") print(sheet3) # Gets all sheet names of the worksheet sheet_name = xls.sheet_names() print(sheet_name)
Execution results:
3. nrows and ncols
We all know that row and col are rows and columns, so nrows and ncols are the number of rows and columns, that is, we use sheet Nrows and sheet Ncols can obtain the functions and columns of the worksheet we have opened. The operations are as follows:
sheet_names = xls.sheet_names() # Gets the names of all sheet tables for i in range(3): sheet = xls.sheets()[i] # Get table sheetx object rows = sheet.nrows # Gets the number of rows in the worksheet cols = sheet.ncols # Gets the number of columns in the worksheet print('form "{}" Total {} that 's ok,{} column'.format(sheet_names[i],rows,cols)) # Printout
Execution results:
4. Function to get row data
The following functions are used to obtain data for rows:
4.1,sheet.row(n): get the data of row n, where n is the number of rows. Starting from 0, return the list composed of all cell data in this row.
4.2,sheet.row_slice(n): get the data of row n, where n is the number of rows. Starting from 0, return a list composed of all cell data in this row.
4.3,sheet.row_types(rowx = n, start_colx = x, end_colx = y): get the data type of row n, where start_ Colx and end_ Colx (if None, get to the end) is the range of the column, that is, specify the number of rows n, and then specify the range X and y of the column, which can be abbreviated as: sheet.row_types(n,x,y), and return the list composed of the obtained data types.
Note: there are 5 types in total: 0 empty, 1 string, 2 number, 3 date, 4 boolean and 5 error
4.4,sheet.row_values(rowx = n, start_colx = x, end_colx = y): get the data of row n, where start_ Colx and end_ Colx (if it is None, get to the end) is the range of the column, that is, specify the number of rows n, and then specify the range X and y of the column. It can be abbreviated as: sheet.row_values(n,x,y) returns the list composed of the obtained data.
4.5,sheet.row_len(n): get the effective length of line n. Note: column does not have this function.
data = sheet1.row(0) # Get data for line 1 print(data) data = sheet1.row_slice(1) # Get the data of line 2 print(data) data = sheet.row_types(rowx = 0,start_colx = 0,end_colx = 5) # Get the data of row 1 and columns 1-5 print(data) data = sheet.row_values(1,0,3) # Get the data of row 2 and columns 1-3 print(data) num = sheet.row_len(0)# Gets the effective length of line 1 print(num)
Execution results:
5. Function to get row data
The following functions are available for data acquisition of columns:
5.1,sheet.cols(n): get the data of column n, where n is the number of columns. Starting from 0, return a list composed of all cell data in this column.
5.2,sheet.col_slice(n): get the data of column n, where n is the number of columns. Starting from 0, return a list composed of all cell data in this column.
5.3,sheet.col_types(colx = n, start_rowx = x, end_rowx = y): get the data type of column n, where start_ Rowx and end_ Rowx (if it is None, get to the end) is the range of rows, that is, specify the number of columns n, and then specify the range X and y of rows. It can be abbreviated as: sheet.col_types(n,x,y), and return a list composed of obtained data types.
Note: there are 5 types in total: 0 empty, 1 string, 2 number, 3 date, 4 boolean and 5 error
5.4,sheet.col_values(rowx = n, start_rowx = x, end_rowx = y): get the data of column n, where start_ Rowx and end_ Rowx (if it is None, get to the end) is the range of rows, that is, specify the number of columns n, and then specify the range X and y of rows. It can be abbreviated as: sheet.col_values(n,x,y) returns a list composed of obtained data.
data = sheet1.col(0) # Get data for line 1 print("The data in the first column is:",data) data = sheet1.col_slice(1) # Get the data of line 2 print("The data in the second column is:",data) data = sheet.col_types(colx = 0,start_rowx = 0,end_rowx = 5) # Get the data of row 1 and columns 1-5 print("Column 1, column 1-5 The data of the row is:",data) data = sheet.col_values(1,0,3) # Get the data of row 2 and columns 1-3 print("Column 2, column 1-3 The data of the row is:",data)
Execution results:
6. Function to get cell data
The following functions are used to obtain data from cells:
6.1,sheet.cell(rowx = n, colx = m): get the cell data corresponding to row N and column m (cell type data is returned. To get the data itself, use sheet.cell (n, m) value).
6.2,sheet.cell_type(rowx = n, colx = m): get the cell data type corresponding to row N and column M.
6.3,sheet.cell_value(rowx = n, colx = m): get the cell data corresponding to row N and column M.
data = sheet.cell(rowx = 2, colx = 3) # Get the cell data corresponding to row 3 and column 4 print("The cell data corresponding to row 3 and column 4 are:",data) print("The cell data corresponding to row 3 and column 4 are:",data.value) data = sheet.cell_type(2,3) # Get the cell data type corresponding to row 3 and column 4 print("The cell data types corresponding to row 3 and column 4 are:",data) data = sheet.cell_value(2,1) # Get the cell data corresponding to row 3 and column 2 print("The cell data corresponding to row 3 and column 2 are:",data)
Execution results:
7. Overall operation
The following example is to obtain all the data of the worksheet "programming language ranking. xls":
import xlrd # Import module table = xlrd.open_workbook("file/Programming language ranking.xls") # Open local Excel table: programming language ranking xls sheet_names = table.sheet_names() # Gets all table names for the workbook for i in range(len(sheet_names)): # Obtain table data in turn sheet = table.sheets()[i] rows = sheet.nrows for j in range(rows): # Cycle print table data data = sheet.row_values(j,0,None) print("form {} The first {} The data of the row is:{}".format(sheet_names[i],j+1,data)) print("-"*100)
Execution results:
The above is a brief introduction to the module xlrd of Python operating Excel tables. This is relatively simple. Of course, there are other operation modules, such as pandas, which will be introduced later.