At present, our live trading mainly uses daily data for A-shares. Daily data can be downloaded free of charge from platforms such as BaoStock, Tushare and AKShare. We start with BaoStock as the data source. This article records how to download the A-share list from BaoStock.
BaoStock installation
Open PyCharm and enter the following command in the Terminal window at the bottom to complete the installation:
pip install baostock -i https://pypi.tuna.tsinghua.edu.cn/simple/ --trusted-host pypi.tuna.tsinghua.edu.cn
New source file
First, create a new source file: in PyCharm's Project window, right-click the project directory (mine is mnj) and select New -> Python File:
Enter the name of the new file, such as download_data_v1, and keep the default file type, Python file:
Source code
The complete code of download_data_v1.py is as follows:
import baostock as bs
import datetime
import sys

'''
Function:
Return the list of A-share stock codes for the specified date.
If date is None, return the list for the most recent trading day.
If date is not None and is a trading day, return the list for that day.
If date is not None but is not a trading day, print a notice and exit.

Parameters:
date: the date to query

Return value:
A list of A-share stock codes
'''
def get_stock_codes(date=None):
    # Log in to BaoStock
    bs.login()
    stock_df = bs.query_all_stock(date).get_data()
    if 0 == len(stock_df):
        if date is not None:
            print('The selected date is a non-trading day or has no trading data; please set date to a historical trading day')
            sys.exit(0)
        delta = 1
        while 0 == len(stock_df):
            stock_df = bs.query_all_stock(datetime.date.today() - datetime.timedelta(days=delta)).get_data()
            delta += 1
    # Log out
    bs.logout()
    # Filter stock data
    stock_df = stock_df[(stock_df['code'] >= 'sh.600000') & (stock_df['code'] < 'sz.399000')]
    # Return the stock list
    return stock_df['code'].tolist()


if __name__ == '__main__':
    stock_codes = get_stock_codes()
    print(stock_codes)
Code analysis
- Lines 1 ~ 3: import the required packages
- Lines 5 ~ 17: function description
- Lines 18 ~ 35: function definition
- Line 20: log in to BaoStock. You need to log in before every access to BaoStock data
- Line 21: call BaoStock's query_all_stock function, which the official documentation describes as follows:
Securities code query: query_all_stock()
Method description: get the list of all stocks on the specified trading date. The securities codes and stock trading status are obtained through the API and are updated at the same time as the daily K-line data. You can obtain data (including A-shares and indexes) by passing a specific trading day as the parameter. The data range is the same as that of the interface query_history_k_data_plus().
Return type: DataFrame type of pandas.
Update time: updated at the same time as the daily K-line.
Testing shows that the documented return type is not accurate. The following code checks it:
print(type(bs.query_all_stock()))
The print result is:
<class 'baostock.data.resultset.ResultData'>
The function actually returns a data type defined by BaoStock. Calling its get_data function returns the result as a pandas DataFrame:
print(type(bs.query_all_stock().get_data()))
The print result is
<class 'pandas.core.frame.DataFrame'>
Printing the result of get_data gives the following. For more DataFrame print settings, please refer to link:
         code  tradeStatus  code_name
0   sh.000001  1            Shanghai composite index
1   sh.000002  1            Shanghai A Stock index
2   sh.000003  1            Shanghai B Stock index
3   sh.000004  1            Shanghai Industrial Index
4   sh.000005  1            Shanghai commercial index
5   sh.000006  1            Shanghai real estate index
6   sh.000007  1            Shanghai public utility index
7   sh.000008  1            Shanghai Composite Industry Index
8   sh.000009  1            SSE 380
9   sh.000010  1            SSE 180 Index
10  sh.000011  1            Shanghai Stock Exchange Fund Index
11  sh.000012  1            Shanghai treasury bond index
12  sh.000013  1            SSE Corporate Bond Index
13  sh.000015  1            Shanghai Stock Exchange dividend index
14  sh.000016  1            SSE 50 Index
15  sh.000017  1            New Shanghai Composite Index
16  sh.000018  1            SSE 180 financial stock index
17  sh.000019  1            sse corporate governance index
18  sh.000020  1            SSE Medium Enterprise Composite Index
19  sh.000021  1            SSE 180 corporate governance index
20  sh.000022  1            Shanghai corporate bonds
...
You can see that the results include both stock codes and index codes. Codes on the Shanghai Stock Exchange start with "sh." and codes on the Shenzhen Stock Exchange start with "sz.". The naming convention may differ between data sources. For example, in ptrade, Shanghai Stock Exchange codes end with ".SS" (such as 600000.SS) and Shenzhen Stock Exchange codes end with ".SZ" (such as 000001.SZ).
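Converting between the two naming conventions mentioned above can be sketched as follows. This is only a minimal illustration: the helper names baostock_to_ptrade and ptrade_to_baostock are hypothetical and are not part of BaoStock or ptrade, and the mapping covers only the "sh."/".SS" and "sz."/".SZ" pairs described in this article.

```python
# Hypothetical helpers (not part of BaoStock or ptrade) that convert
# between the two code formats described in the article.

# BaoStock prefix -> ptrade suffix, limited to the pairs named in the text.
_PREFIX_TO_SUFFIX = {"sh": "SS", "sz": "SZ"}
_SUFFIX_TO_PREFIX = {suffix: prefix for prefix, suffix in _PREFIX_TO_SUFFIX.items()}


def baostock_to_ptrade(code: str) -> str:
    """Convert a BaoStock-style code such as 'sh.600000' to '600000.SS'."""
    prefix, _, number = code.partition(".")
    if prefix not in _PREFIX_TO_SUFFIX or not number.isdigit():
        raise ValueError(f"unexpected BaoStock code: {code!r}")
    return f"{number}.{_PREFIX_TO_SUFFIX[prefix]}"


def ptrade_to_baostock(code: str) -> str:
    """Convert a ptrade-style code such as '000001.SZ' to 'sz.000001'."""
    number, _, suffix = code.rpartition(".")
    if suffix not in _SUFFIX_TO_PREFIX or not number.isdigit():
        raise ValueError(f"unexpected ptrade code: {code!r}")
    return f"{_SUFFIX_TO_PREFIX[suffix]}.{number}"


if __name__ == "__main__":
    print(baostock_to_ptrade("sh.600000"))  # 600000.SS
    print(ptrade_to_baostock("000001.SZ"))  # sz.000001
```

Unknown prefixes or suffixes raise ValueError instead of being passed through, so a code from a third data source is not silently mislabeled.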
- Line 22: if the query returns no data, check the date parameter
- Lines 23-25: if date was set explicitly, print a notice that it is a non-trading day and exit. For example, suppose the program is run on a non-trading day and the function is called as follows:
stock_codes = get_stock_codes(datetime.datetime.today())
The program then reports a non-trading day and exits. To obtain the data of the most recent trading day, call the function without arguments:
stock_codes = get_stock_codes()
- Lines 26-29: if no date was given, step back one day at a time from today until the most recent trading day is found, and obtain all stock and index codes for that day
- Line 31: log out of BaoStock
- Line 33: keep only the stock codes, that is, codes from sh.600000 (inclusive) up to sz.399000 (exclusive). For more DataFrame filtering methods, please refer to link
- Line 35: return the A-share code list
- Lines 38 ~ 40: example of calling the function
The final output result is:
['sh.600000', 'sh.600004', 'sh.600006', 'sh.600007', 'sh.600008', 'sh.600009', 'sh.600010', 'sh.600011', 'sh.600012', 'sh.600015', 'sh.600016', 'sh.600017', 'sh.600018', 'sh.600019', 'sh.600020', 'sh.600021', 'sh.600022', 'sh.600023', 'sh.600025', 'sh.600026', 'sh.600027', 'sh.600028', 'sh.600029', 'sh.600030', 'sh.600031', 'sh.600032', 'sh.600033', 'sh.600035', 'sh.600036', 'sh.600037', 'sh.600038', 'sh.600039', 'sh.600048', 'sh.600050', 'sh.600051', 'sh.600052', 'sh.600053', 'sh.600054', 'sh.600055', 'sh.600056', 'sh.600057', 'sh.600058', 'sh.600059', 'sh.600060', 'sh.600061', 'sh.600062', 'sh.600063', 'sh.600064', 'sh.600066', 'sh.600067', 'sh.600068', 'sh.600070', 'sh.600071', 'sh.600072', 'sh.600073', 'sh.600075', ...
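The two key steps of get_stock_codes, the backward search for the most recent trading day and the string-range filter, can be tried without a network connection using the sketch below. It rests on several assumptions: latest_codes and filter_a_shares are hypothetical names, the query argument stands in for a call to bs.query_all_stock(day).get_data(), the sample data is made up, and max_lookback is an added safety limit that the original loop does not have.

```python
import datetime
from typing import Callable, List


def latest_codes(
    query: Callable[[datetime.date], List[str]],
    today: datetime.date,
    max_lookback: int = 30,
) -> List[str]:
    """Step back one day at a time from `today` until `query` returns data.

    `query` is a stand-in for the BaoStock call; `max_lookback` is an
    assumption added here so the search cannot loop forever.
    """
    for delta in range(max_lookback + 1):
        codes = query(today - datetime.timedelta(days=delta))
        if codes:
            return codes
    raise RuntimeError(f"no data within {max_lookback} days before {today}")


def filter_a_shares(codes: List[str]) -> List[str]:
    """Keep codes in ['sh.600000', 'sz.399000') by plain string comparison.

    Strings compare character by character, and 'sh.' sorts before 'sz.',
    so the range covers sh codes from 600000 upward and sz codes below 399000.
    """
    return [code for code in codes if "sh.600000" <= code < "sz.399000"]


if __name__ == "__main__":
    # Made-up sample: data exists only for 2022-01-07.
    sample = {
        datetime.date(2022, 1, 7): ["sh.000001", "sh.600000", "sz.000001", "sz.399001"],
    }

    def fake_query(day: datetime.date) -> List[str]:
        return sample.get(day, [])

    codes = latest_codes(fake_query, today=datetime.date(2022, 1, 9))
    print(filter_a_shares(codes))  # ['sh.600000', 'sz.000001']
```

Passing the query function in as a parameter keeps the date logic separate from the data source, so the same search can be pointed at BaoStock or at test data.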
We now have the list of A-share stock codes. The next article covers how to download the daily data of these stocks.
Personal blog: https://coderx.com.cn/