To be honest, and at the risk of being laughed at: when I first started learning web scraping, my dream was to log in to Bilibili (B站) automatically and fetch the data I wanted from it. Now I have finally made that dream come true. I am very grateful to the CSDN bloggers whose posts helped me so much. Thank you~
In this post I summarize how to get past Bilibili's click captcha and complete the full Bilibili login, hoping it helps those who need it. Let's start the analysis and explanation.
1. Libraries needed
import time
import requests
from PIL import Image
from io import BytesIO
from selenium import webdriver
from chaojiying_Python import chaojiying
from selenium.webdriver.common.action_chains import ActionChains
That looks like a lot, but apart from Chaojiying (Super Eagle) they are all commonly used libraries, so this is not a big problem. Let's therefore talk about how to use Chaojiying, including how to import it (that is the line from chaojiying_Python import chaojiying).
1.1 Chaojiying (Super Eagle)
A brief introduction first: Chaojiying, in full the Chaojiying captcha-solving platform (a handy thing, emmm)
Website link: https://www.chaojiying.com/
1. Register first, then log in (I won't teach this part =.)
2. Click "Development documentation"
3. In the development documentation, find the entry for your programming language. I use Python here, so the rest of this post assumes Python
4. Click "click here to download" to download the Chaojiying demo. You will get a compressed package
5. After decompressing it, you get a folder containing the demo files
We only need the .py file in it
6. Open the .py file and scroll to the bottom to find the entry point
The three parameters in the first line are your Chaojiying account, password and software id.
The second line is the path of the image to be read. The 'rb' (read binary) mode needs no explanation
The 1902 in the third line is the captcha type
7. Getting a software id
After logging in to Chaojiying in the browser, you land in the user center by default. Scroll down the page until you see "software id" and click it
The software id list is empty at first. Click the link above the list to generate a software id
The software name can be anything, and the description is optional. The software key is generated automatically. Click submit and you get a software id.
8. Choosing the captcha type
Click the pricing page to see the available captcha types, and pick the one that fits your needs. For this Bilibili login we use 9005
9. Topping up credit
Return to the user center and click the "recharge now" button (then choose a custom amount. No need to put in 100 at once, ha ha)
That covers how to use Chaojiying. It is not a complete guide, but it is enough for our case
2. Simulating the Bilibili login
Website for this test: https://passport.bilibili.com/login
2.1 Preparation before logging in
def __init__(self):
    """
    self.username: Bilibili login account
    self.password: Bilibili login password
    self.chaojiying_user_name: Chaojiying login account
    self.chaojiying_password: Chaojiying login password
    self.chaojiying_ID: Chaojiying software ID
    self.pic_id: ID of the submitted captcha image, used when reporting a wrong answer
    :return: None
    """
    self.url = 'https://passport.bilibili.com/login'
    self.driver = webdriver.Chrome()
    self.driver.maximize_window()
    self.username = 'Bilibili account'
    self.password = 'Bilibili password'
    self.chaojiying_user_name = 'Chaojiying account'
    self.chaojiying_password = 'Chaojiying password'
    self.chaojiying_ID = 'Chaojiying software id'
    self.pic_id = ''
    self.chaojiying = chaojiying.Chaojiying_Client(self.chaojiying_user_name, self.chaojiying_password, self.chaojiying_ID)
Before implementing the login itself, we prepare everything we need and put it in the initialization method.
The line self.chaojiying = chaojiying.Chaojiying_Client(self.chaojiying_user_name, self.chaojiying_password, self.chaojiying_ID) creates the client by calling the class in the Chaojiying .py file. As for importing Chaojiying: put the extracted Chaojiying folder in the same directory as your code, and from chaojiying_Python import chaojiying will work.
2.2. Open the website
def open(self):
    """
    Enter the Bilibili account and password
    :return: None
    """
    self.driver.get(self.url)
    self.driver.find_element_by_id('login-username').send_keys(self.username)
    self.driver.find_element_by_id('login-passwd').send_keys(self.password)
2.3. Get login button
def get_touclick_button(self):
    """
    Get the button that triggers the verification
    :return: Login button
    """
    # Get the button that triggers the verification
    button = self.driver.find_element_by_xpath('//*[@id="geetest-wrap"]/div/div[5]/a[1]')
    return button
2.4. Get the click captcha and solve it
def pick_code(self):
    """
    Get the click captcha image and solve it
    :return: None
    """
    time.sleep(3)
    # Get the <img> element of the click captcha
    pick_img_label = self.driver.find_element_by_css_selector('img.geetest_item_img')
    # Get the link to the captcha image
    src = pick_img_label.get_attribute('src')
    response = requests.get(src).content
    # Load the downloaded bytes into an in-memory file
    f = BytesIO()
    f.write(response)
    img = Image.open(f)
    # Get the size ratio between the rendered element and the original image
    scale = [pick_img_label.size['width'] / img.size[0],
             pick_img_label.size['height'] / img.size[1]]
    # Call Chaojiying
    cjy = chaojiying.Chaojiying_Client(self.chaojiying_user_name, self.chaojiying_password, self.chaojiying_ID)
    result = cjy.PostPic(response, 9005)
    # Remember the image id so that detect() can report a wrong answer
    self.pic_id = result['pic_id']
    # Parse and click only if recognition succeeded
    if result['err_no'] == 0:
        position = result['pic_str'].split('|')
        position = [[int(j) for j in i.split(',')] for i in position]
        for item in position:
            ActionChains(self.driver).move_to_element_with_offset(pick_img_label, item[0] * scale[0], item[1] * scale[1]).click().perform()
            time.sleep(1)
    time.sleep(2)
    # Locate the confirm button
    btn = self.driver.find_element_by_css_selector('div.geetest_commit_tip')
    time.sleep(1)
    btn.click()
As the amount of code suggests, this is the hardest part of the whole login. I ran into a few problems here myself, so let me share them.
2.4.1. Get the size ratio between the image and its browser element
scale = [pick_img_label.size['width'] / img.size[0], pick_img_label.size['height'] / img.size[1]]
This step is essential. Chaojiying returns coordinates in the pixels of the downloaded image, but the browser may render that image at a different size. Without the ratio between the two, the later clicks land in the wrong places.
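To make the scaling concrete, here is a minimal sketch without any browser. The function name to_browser_offset and the two sizes are made up for illustration; in the real code the sizes come from img.size and pick_img_label.size.

```python
def to_browser_offset(point, natural_size, rendered_size):
    """Map a point given in image pixels to an offset inside the rendered element."""
    scale_x = rendered_size[0] / natural_size[0]
    scale_y = rendered_size[1] / natural_size[1]
    return point[0] * scale_x, point[1] * scale_y


# Hypothetical sizes: the downloaded image is 344x384 pixels,
# but the browser renders the <img> element at 258x288.
natural = (344, 384)
rendered = (258, 288)

# A point recognized at (145, 59) in the image ...
print(to_browser_offset((145, 59), natural, rendered))
# → (108.75, 44.25), the offset to click inside the element
```

One thing to check on your own setup: as far as I know, recent Selenium 4 releases measure the offset given to move_to_element_with_offset from the center of the element rather than its top-left corner. If that applies to your version, subtract half the rendered width and height from the result first.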
2.4.2. Call Chaojiying (Super Eagle)
cjy = chaojiying.Chaojiying_Client(self.chaojiying_user_name, self.chaojiying_password, self.chaojiying_ID)
result = cjy.PostPic(response, 9005)
The value of result after running:
{'err_no': 0, 'err_str': 'ok', 'pic_id': '1162317367529700100', 'pic_str': '145,59|245,255|158,271|241,150', 'MD5': '188f2726a81e33871e1e57b57be76b71ef'}
Keep this result in mind and read on
2.4.3. Parsing pic_str
pic_str (string) is the recognized result. It holds the coordinates as x,y pairs separated by |, so we split it and use a nested loop to turn it into the form we need
The final result is: [[145, 59], [245, 255], [158, 271], [241, 150]] (the coordinates differ for every image, so your position will not match mine, which is normal)
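The parsing step can be tried on its own, without a browser or a Chaojiying account. The sketch below feeds in the sample string from above; the function name parse_pic_str is only an illustrative name, not part of the Chaojiying client.

```python
def parse_pic_str(pic_str):
    """Turn 'x1,y1|x2,y2' into [[x1, y1], [x2, y2]]."""
    points = []
    for pair in pic_str.split('|'):
        x, y = pair.split(',')
        # int() tolerates surrounding whitespace, so '145, 59' also works
        points.append([int(x), int(y)])
    return points


print(parse_pic_str('145,59|245,255|158,271|241,150'))
# → [[145, 59], [245, 255], [158, 271], [241, 150]]
```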
2.4.4. err_no
err_no (numeric) is the return code. 0 means the recognition succeeded. So we add a check and only click when err_no is 0
if result['err_no'] == 0:
    for item in position:
        ActionChains(self.driver).move_to_element_with_offset(pick_img_label, item[0] * scale[0], item[1] * scale[1]).click().perform()
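One detail is worth a small sketch: if pic_str is parsed before err_no is checked, a failed recognition with an empty pic_str makes int('') raise a ValueError. The helper below checks first and parses second. The name points_or_none and the failing response are hypothetical; I have not verified which error codes or fields Chaojiying actually returns on failure.

```python
def points_or_none(result):
    """Return the parsed click points, or None if recognition failed."""
    # Check the return code first; only a 0 means pic_str is usable
    if result.get('err_no') != 0 or not result.get('pic_str'):
        return None
    return [[int(v) for v in pair.split(',')]
            for pair in result['pic_str'].split('|')]


ok = {'err_no': 0, 'pic_str': '145,59|245,255'}
# Hypothetical failure response, for illustration only
failed = {'err_no': -1, 'err_str': 'some error', 'pic_str': ''}

print(points_or_none(ok))      # → [[145, 59], [245, 255]]
print(points_or_none(failed))  # → None
```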
2.5. Handling login failure
A captcha-solving platform does not succeed 100% of the time. So we need to detect failure and solve the captcha again when the recognition was wrong. We can tell whether the login succeeded by checking whether the URL has changed. The code is as follows:
def detect(self):
    """
    Handle a failed login
    :return: None
    """
    current = self.driver.current_url
    if (current == 'https://passport.bilibili.com/account/security#/home'
            or current == 'https://www.bilibili.com/'):
        print('Success!!')
    else:
        # Report the wrong answer to Chaojiying, then try again
        self.chaojiying.ReportError(self.pic_id)
        self.pick_code()
        self.detect()
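Because detect() calls itself until the URL changes, it never stops if the login keeps failing. A possible alternative, sketched here with stand-in callables so it runs without a browser, is a loop with a fixed number of attempts. The names retry_login, is_logged_in and max_attempts are my own and not part of the original code.

```python
import time

SUCCESS_URLS = (
    'https://passport.bilibili.com/account/security#/home',
    'https://www.bilibili.com/',
)


def is_logged_in(current_url):
    """The login counts as successful once the browser is on one of these URLs."""
    return current_url in SUCCESS_URLS


def retry_login(get_url, solve_captcha, report_error, max_attempts=3, delay=0):
    """Check, report and retry at most max_attempts times; return True on success."""
    for _ in range(max_attempts):
        if is_logged_in(get_url()):
            return True
        report_error()    # tell the platform the last answer was wrong
        solve_captcha()   # fetch and solve a fresh captcha
        if delay:
            time.sleep(delay)
    return is_logged_in(get_url())


# Stand-ins for the browser: the login succeeds on the third check.
urls = iter(['https://passport.bilibili.com/login',
             'https://passport.bilibili.com/login',
             'https://www.bilibili.com/'])
calls = []
ok = retry_login(lambda: next(urls),
                 lambda: calls.append('solve'),
                 lambda: calls.append('report'))
print(ok, calls)
# → True ['report', 'solve', 'report', 'solve']
```

Inside the class, the three callables would presumably be lambda: self.driver.current_url, self.pick_code and lambda: self.chaojiying.ReportError(self.pic_id), with a delay of a couple of seconds so the page has time to redirect.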
That completes the simulated login to Bilibili with the click captcha. Thank you again for your help. The complete code follows. If anything in it is wrong, I hope you will correct me, thank you~~
import time
import requests
from PIL import Image
from io import BytesIO
from selenium import webdriver
from chaojiying_Python import chaojiying
from selenium.webdriver.common.action_chains import ActionChains


class BilibiliSpider(object):

    def __init__(self):
        """
        self.username: Bilibili login account
        self.password: Bilibili login password
        self.chaojiying_user_name: Chaojiying login account
        self.chaojiying_password: Chaojiying login password
        self.chaojiying_ID: Chaojiying software ID
        self.pic_id: ID of the submitted captcha image, used when reporting a wrong answer
        :return: None
        """
        self.url = 'https://passport.bilibili.com/login'
        self.driver = webdriver.Chrome()
        self.driver.maximize_window()
        # Fill in your own credentials here
        self.username = 'Bilibili account'
        self.password = 'Bilibili password'
        self.chaojiying_user_name = 'Chaojiying account'
        self.chaojiying_password = 'Chaojiying password'
        self.chaojiying_ID = 'Chaojiying software id'
        self.pic_id = ''
        self.chaojiying = chaojiying.Chaojiying_Client(self.chaojiying_user_name, self.chaojiying_password, self.chaojiying_ID)

    def open(self):
        """
        Enter the Bilibili account and password
        :return: None
        """
        self.driver.get(self.url)
        self.driver.find_element_by_id('login-username').send_keys(self.username)
        self.driver.find_element_by_id('login-passwd').send_keys(self.password)

    def get_touclick_button(self):
        """
        Get the button that triggers the verification
        :return: Login button
        """
        # Get the button that triggers the verification
        button = self.driver.find_element_by_xpath('//*[@id="geetest-wrap"]/div/div[5]/a[1]')
        return button

    def pick_code(self):
        """
        Get the click captcha image and solve it
        :return: None
        """
        time.sleep(3)
        # Get the <img> element of the click captcha
        pick_img_label = self.driver.find_element_by_css_selector('img.geetest_item_img')
        # Get the link to the captcha image
        src = pick_img_label.get_attribute('src')
        response = requests.get(src).content
        # Load the downloaded bytes into an in-memory file
        f = BytesIO()
        f.write(response)
        img = Image.open(f)
        # Get the size ratio between the rendered element and the original image
        scale = [pick_img_label.size['width'] / img.size[0],
                 pick_img_label.size['height'] / img.size[1]]
        # Call Chaojiying
        cjy = chaojiying.Chaojiying_Client(self.chaojiying_user_name, self.chaojiying_password, self.chaojiying_ID)
        result = cjy.PostPic(response, 9005)
        # Remember the image id so that detect() can report a wrong answer
        self.pic_id = result['pic_id']
        # Parse and click only if recognition succeeded
        if result['err_no'] == 0:
            position = result['pic_str'].split('|')
            position = [[int(j) for j in i.split(',')] for i in position]
            for item in position:
                ActionChains(self.driver).move_to_element_with_offset(pick_img_label, item[0] * scale[0], item[1] * scale[1]).click().perform()
                time.sleep(1)
        time.sleep(2)
        # Locate the confirm button
        btn = self.driver.find_element_by_css_selector('div.geetest_commit_tip')
        time.sleep(1)
        btn.click()

    def detect(self):
        """
        Handle a failed login
        :return: None
        """
        current = self.driver.current_url
        if (current == 'https://passport.bilibili.com/account/security#/home'
                or current == 'https://www.bilibili.com/'):
            print('Success!!')
        else:
            # Report the wrong answer to Chaojiying, then try again
            self.chaojiying.ReportError(self.pic_id)
            self.pick_code()
            self.detect()

    def main(self):
        """
        Main function
        :return: None
        """
        self.open()
        time.sleep(2)
        button = self.get_touclick_button()
        button.click()
        self.pick_code()
        time.sleep(2)
        # self.detect()


if __name__ == '__main__':
    bilibili = BilibiliSpider()
    bilibili.main()