Python | crawler | Selenium automated test | Bilibili click CAPTCHA login

Posted by erwt on Mon, 13 Dec 2021 10:58:04 +0100

To be honest, and at the risk of being laughed at: when I first started learning to write crawlers, my dream was to log in to Bilibili automatically and fetch the data I wanted from it. Now I have finally made that dream come true. I am very grateful to the CSDN bloggers whose posts helped me so much. Thank you~
This time I have written up how I got past Bilibili's click CAPTCHA and completed the whole Bilibili login, hoping it helps anyone who needs it. Let's start the analysis and explanation.

1. Libraries needed

import time
import requests
from PIL import Image
from io import BytesIO
from selenium import webdriver
from chaojiying_Python import chaojiying
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains

Although the list is a bit long, all of these are commonly used libraries except Chaojiying, so that should not be a problem. Let's talk about how to use Chaojiying, including how to import it (that is, the line from chaojiying_Python import chaojiying).

1.1 Chaojiying

A brief introduction first: Chaojiying ("Super Eagle"), in full the Chaojiying CAPTCHA-recognition platform, is a paid service that solves CAPTCHAs for you (it's a good thing, emmm)
Website link: https://www.chaojiying.com/

1. Register first and then log in (I won't walk through this part =.)

2. Click "Development documentation"

3. In the development documentation, find the entry for your programming language. I use Python here.

4. Click "click here to download" to download the Chaojiying sample package. You will get a compressed archive.

5. After decompressing it, you get a folder like this


Let's just run the py file

6. Open the .py file and scroll to the bottom, where you will find the program entry point

The three arguments on the first line are your Chaojiying account, password and software ID.
The second line holds the path of the image to read. The rb mode (read binary) needs no further explanation.
The 1902 on the third line is the CAPTCHA type.

7. Getting the software ID

After logging in to Chaojiying on the website, you land in the user center by default. Scroll down until you see "Software ID" and click it.

The software ID list is empty at first. Click "Generate a software ID" at the top.

Fill in any software name; the description is optional. The software key is generated automatically. Click submit to get a software ID.

8. Getting the CAPTCHA type

Click "Price system" to see the available CAPTCHA types, and choose the one that fits your needs. For this Bilibili login case we use 9005.

9. Topping up credit

Return to the user center and click "Recharge now" (then choose a custom amount; there is no need to put in 100 at once, haha)

That covers how to use Chaojiying. It is not exhaustive, but it is enough for our case.

2. Simulated login to Bilibili

Website for this test: https://passport.bilibili.com/login

2.1 Preparation before login

    def __init__(self):
        """
        self.username: Bilibili login account
        self.password: Bilibili login password
        self.chaojiying_user_name: Chaojiying login account
        self.chaojiying_password: Chaojiying login password
        self.chaojiying_ID: Chaojiying software ID
        self.pic_id: ID of the last picture sent to Chaojiying (needed to report errors)
        :return: None
        """
        self.url = 'https://passport.bilibili.com/login'
        self.driver = webdriver.Chrome()
        self.driver.maximize_window()
        self.username = 'Bilibili account'
        self.password = 'Bilibili password'
        self.chaojiying_user_name = 'Chaojiying account'
        self.chaojiying_password = 'Chaojiying password'
        self.chaojiying_ID = 'Chaojiying software ID'
        self.pic_id = ''
        self.chaojiying = chaojiying.Chaojiying_Client(self.chaojiying_user_name, self.chaojiying_password,
                                                       self.chaojiying_ID)

Before implementing the login itself, we prepare everything we need in the initialization method.
The line self.chaojiying = chaojiying.Chaojiying_Client(self.chaojiying_user_name, self.chaojiying_password, self.chaojiying_ID) creates the client by calling the class defined in the Chaojiying .py file. As for importing Chaojiying: put the extracted Chaojiying folder in the same directory as your code, and then from chaojiying_Python import chaojiying will work.
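Writing account details straight into `__init__` makes it easy to publish them by accident. One possible alternative, sketched here under my own assumptions, reads them from environment variables. The variable names (`BILI_USERNAME` and so on) and the helper `load_credentials` are made up for this illustration; they are not part of Selenium or of the Chaojiying client.

```python
import os

# Hypothetical variable names chosen for this sketch; use whatever suits your setup.
REQUIRED_VARS = (
    'BILI_USERNAME',
    'BILI_PASSWORD',
    'CHAOJIYING_USERNAME',
    'CHAOJIYING_PASSWORD',
    'CHAOJIYING_SOFT_ID',
)


def load_credentials(env=None):
    """Return a dict with the five login settings, or raise if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError('Missing settings: ' + ', '.join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Inside `__init__` you could then write `creds = load_credentials()` and assign, for example, `self.username = creds['BILI_USERNAME']`.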

2.2. Open the website

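    def open(self):
        """
        Open the login page and type in the Bilibili account and password
        :return: None
        """
        self.driver.get(self.url)
        # By: from selenium.webdriver.common.by import By
        # (the find_element_by_* helpers no longer exist in recent Selenium 4 releases)
        self.driver.find_element(By.ID, 'login-username').send_keys(self.username)
        self.driver.find_element(By.ID, 'login-passwd').send_keys(self.password)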

2.3. Get the login button

    def get_touclick_button(self):
        """
        Get the login button, which triggers the CAPTCHA when clicked
        :return: Login button
        """
        # By: from selenium.webdriver.common.by import By
        button = self.driver.find_element(By.XPATH, '//*[@id="geetest-wrap"]/div/div[5]/a[1]')
        return button

2.4. Get the click CAPTCHA and solve it

    def pick_code(self):
        """
        Get the click CAPTCHA picture, have Chaojiying solve it and click the answers
        :return: None
        """
        time.sleep(3)
        # By: from selenium.webdriver.common.by import By
        # Locate the <img> element of the CAPTCHA picture
        pick_img_label = self.driver.find_element(By.CSS_SELECTOR, 'img.geetest_item_img')
        # Download the picture from its src link
        src = pick_img_label.get_attribute('src')
        response = requests.get(src).content
        # Open the downloaded bytes as an image to read its real size
        img = Image.open(BytesIO(response))
        # Ratio between the size shown in the browser and the real picture size
        scale = [pick_img_label.size['width'] / img.size[0],
                 pick_img_label.size['height'] / img.size[1]]
        # Call Chaojiying
        cjy = chaojiying.Chaojiying_Client(self.chaojiying_user_name, self.chaojiying_password, self.chaojiying_ID)
        result = cjy.PostPic(response, 9005)
        # Remember the picture ID so that detect() can report a wrong result
        self.pic_id = result.get('pic_id', '')
        # Parse the coordinates and click them only when recognition succeeded
        if result['err_no'] == 0:
            position = result['pic_str'].split('|')
            position = [[int(j) for j in i.split(',')] for i in position]
            for item in position:
                ActionChains(self.driver).move_to_element_with_offset(pick_img_label, item[0] * scale[0],
                                                                      item[1] * scale[1]).click().perform()
                time.sleep(1)
            time.sleep(2)
        # Locate the confirm button and click it
        btn = self.driver.find_element(By.CSS_SELECTOR, 'div.geetest_commit_tip')
        time.sleep(1)
        btn.click()

Judging from the amount of code, this is clearly the hardest part of the whole login. I ran into a few problems here myself, so let me share them.

2.4.1. Get the size ratio of the picture to the browser's label

scale = [pick_img_label.size['width'] / img.size[0],
         pick_img_label.size['height'] / img.size[1]]

This step is essential. The coordinates Chaojiying returns refer to the downloaded picture at its real size, while the clicks happen on the picture as it is displayed in the browser. Without the ratio between the two sizes, the later clicks land in the wrong places.
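To make the arithmetic concrete, here is a small sketch of the same scaling as a standalone function. The name `scale_point` and the sizes in the example are my own, chosen only for illustration.

```python
def scale_point(point, natural_size, rendered_size):
    """Map (x, y) in the downloaded picture to (x, y) inside the on-page element."""
    x, y = point
    natural_w, natural_h = natural_size
    rendered_w, rendered_h = rendered_size
    return (x * rendered_w / natural_w, y * rendered_h / natural_h)


# Made-up sizes: a 344x384 picture file displayed in a 258x288 box.
print(scale_point((172, 192), (344, 384), (258, 288)))
# (129.0, 144.0)
```

The center of the file, (172, 192), maps to the center of the displayed box, (129.0, 144.0), which is the behavior we want.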

2.4.2. Call Chaojiying

cjy = chaojiying.Chaojiying_Client(self.chaojiying_user_name, self.chaojiying_password, self.chaojiying_ID)
result = cjy.PostPic(response, 9005)

The result of the call looks like this: {'err_no': 0, 'err_str': 'OK', 'pic_id': '1162317367529700100', 'pic_str': '145,59|245,255|158,271|241,150', 'md5': '188f2726a81e33871e1e57b57be76b71ef'}. Keep this result in mind and read on.

2.4.3. Parse the result with pic_str

pic_str (string) is the recognized result. We parse the coordinates into the form we need with split and a nested loop.
The final result is: [[145, 59], [245, 255], [158, 271], [241, 150]] (of course every picture has different coordinates, so your position values will differ from mine, which is normal)
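The parsing step can be sketched as a standalone function. `parse_pic_str` is a name I made up for this illustration; it does the same split as the code above, and additionally returns an empty list for an empty string instead of raising an error.

```python
def parse_pic_str(pic_str):
    """Turn 'x1,y1|x2,y2|...' into [[x1, y1], [x2, y2], ...]."""
    points = []
    for pair in pic_str.split('|'):
        if not pair.strip():
            continue  # an empty pic_str yields no points instead of a ValueError
        x, y = pair.split(',')
        points.append([int(x), int(y)])
    return points


print(parse_pic_str('145,59|245,255|158,271|241,150'))
# [[145, 59], [245, 255], [158, 271], [241, 150]]
print(parse_pic_str(''))
# []
```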

2.4.4. err_no

err_no (numeric) is the return code; 0 means the request succeeded. We add a check and only click when it is 0.

if result['err_no'] == 0:
    for item in position:
        ActionChains(self.driver).move_to_element_with_offset(pick_img_label, item[0] * scale[0],
                                                              item[1] * scale[1]).click().perform()
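One caveat about `move_to_element_with_offset`: the point of the element from which the offsets are measured has differed between Selenium releases. The code above assumes the top-left corner. As far as I know, newer Selenium 4 releases measure from the center of the element instead, so please check the documentation of the version you have installed. If yours measures from the center, a conversion like the following sketch could be applied to each point first. `to_center_offset` is a hypothetical helper of mine, not a Selenium function.

```python
def to_center_offset(x, y, width, height):
    """Convert an offset from the element's top-left corner into one from its center."""
    return (x - width / 2, y - height / 2)


# Made-up example: a point 129 px right and 144 px down inside a 258x288 element.
print(to_center_offset(129, 144, 258, 288))
# (0.0, 0.0)
```

Here `width` and `height` would be `pick_img_label.size['width']` and `pick_img_label.size['height']`, and `x`, `y` the already scaled coordinates.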
                                                                      

2.5. Handling of login failure

A recognition platform is never 100% accurate, so we need to check the outcome and try again whenever the CAPTCHA was not recognized correctly. We can tell whether the login succeeded by checking whether the URL has changed. The code is as follows:

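    def detect(self):
        """
        Check whether login succeeded; if not, report the wrong result and try again
        :return: None
        """
        current = self.driver.current_url
        # After a successful login the browser leaves the login page for one of these URLs
        if (
                current == 'https://passport.bilibili.com/account/security#/home' or
                current == 'https://www.bilibili.com/'):
            print('Success!!')
        else:
            # Report the misrecognized picture to Chaojiying, then solve a new CAPTCHA
            self.chaojiying.ReportError(self.pic_id)
            self.pick_code()
            self.detect()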
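The method above calls itself again after every failure, so it has no upper limit on the number of attempts, and each attempt costs Chaojiying credit. A bounded variant could look like the sketch below. `retry_until` and the fake functions are my own illustration and run without a browser; they are not part of the spider.

```python
def retry_until(check, attempt, max_attempts=3):
    """Call attempt() until check() is true; give up after max_attempts tries.

    Returns True on success and False if every attempt failed.
    """
    for _ in range(max_attempts):
        if check():
            return True
        attempt()
    return check()


# Stand-in demo without a browser: the "login" succeeds after two attempts.
state = {'tries': 0}


def fake_attempt():
    state['tries'] += 1


def fake_check():
    return state['tries'] >= 2


print(retry_until(fake_check, fake_attempt))
# True
```

In the spider, `check` could compare `self.driver.current_url` with the two success URLs, and `attempt` could call `ReportError` followed by `pick_code`.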

That completes the simulated click CAPTCHA login to Bilibili. Thanks again for all the help. The complete code follows; if you find anything wrong in it, please point it out. Thank you~~

import time
import requests
from PIL import Image
from io import BytesIO
from selenium import webdriver
from chaojiying_Python import chaojiying
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains


class BilibiliSpider:

    def __init__(self):
        """
        self.username: Bilibili login account
        self.password: Bilibili login password
        self.chaojiying_user_name: Chaojiying login account
        self.chaojiying_password: Chaojiying login password
        self.chaojiying_ID: Chaojiying software ID
        self.pic_id: ID of the last picture sent to Chaojiying (needed to report errors)
        :return: None
        """
        self.url = 'https://passport.bilibili.com/login'
        self.driver = webdriver.Chrome()
        self.driver.maximize_window()
        # Fill in your own account details; never publish real credentials
        self.username = 'Bilibili account'
        self.password = 'Bilibili password'
        self.chaojiying_user_name = 'Chaojiying account'
        self.chaojiying_password = 'Chaojiying password'
        self.chaojiying_ID = 'Chaojiying software ID'
        self.pic_id = ''
        self.chaojiying = chaojiying.Chaojiying_Client(self.chaojiying_user_name, self.chaojiying_password,
                                                       self.chaojiying_ID)

    def open(self):
        """
        Open the login page and type in the Bilibili account and password
        :return: None
        """
        self.driver.get(self.url)
        self.driver.find_element(By.ID, 'login-username').send_keys(self.username)
        self.driver.find_element(By.ID, 'login-passwd').send_keys(self.password)

    def get_touclick_button(self):
        """
        Get the login button, which triggers the CAPTCHA when clicked
        :return: Login button
        """
        button = self.driver.find_element(By.XPATH, '//*[@id="geetest-wrap"]/div/div[5]/a[1]')
        return button

    def pick_code(self):
        """
        Get the click CAPTCHA picture, have Chaojiying solve it and click the answers
        :return: None
        """
        time.sleep(3)
        # Locate the <img> element of the CAPTCHA picture
        pick_img_label = self.driver.find_element(By.CSS_SELECTOR, 'img.geetest_item_img')
        # Download the picture from its src link
        src = pick_img_label.get_attribute('src')
        response = requests.get(src).content
        # Open the downloaded bytes as an image to read its real size
        img = Image.open(BytesIO(response))
        # Ratio between the size shown in the browser and the real picture size
        scale = [pick_img_label.size['width'] / img.size[0],
                 pick_img_label.size['height'] / img.size[1]]
        # Call Chaojiying
        cjy = chaojiying.Chaojiying_Client(self.chaojiying_user_name, self.chaojiying_password, self.chaojiying_ID)
        result = cjy.PostPic(response, 9005)
        # Remember the picture ID so that detect() can report a wrong result
        self.pic_id = result.get('pic_id', '')
        # Parse the coordinates and click them only when recognition succeeded
        if result['err_no'] == 0:
            position = result['pic_str'].split('|')
            position = [[int(j) for j in i.split(',')] for i in position]
            for item in position:
                ActionChains(self.driver).move_to_element_with_offset(pick_img_label, item[0] * scale[0],
                                                                      item[1] * scale[1]).click().perform()
                time.sleep(1)
            time.sleep(2)
        # Locate the confirm button and click it
        btn = self.driver.find_element(By.CSS_SELECTOR, 'div.geetest_commit_tip')
        time.sleep(1)
        btn.click()

    def detect(self):
        """
        Check whether login succeeded; if not, report the wrong result and try again
        :return: None
        """
        current = self.driver.current_url
        # After a successful login the browser leaves the login page for one of these URLs
        if (
                current == 'https://passport.bilibili.com/account/security#/home' or
                current == 'https://www.bilibili.com/'):
            print('Success!!')
        else:
            # Report the misrecognized picture to Chaojiying, then solve a new CAPTCHA
            self.chaojiying.ReportError(self.pic_id)
            self.pick_code()
            self.detect()

    def main(self):
        """
        Main function
        :return: None
        """
        self.open()
        time.sleep(2)
        button = self.get_touclick_button()
        button.click()
        self.pick_code()
        time.sleep(2)
        # Uncomment the next line to check the result and retry on failure
        # self.detect()


if __name__ == '__main__':
    bilibili = BilibiliSpider()
    bilibili.main()
    

Topics: Python Selenium crawler