Cracking the 12306 CAPTCHA with a crawler to log in automatically

Posted by mr_tron on Tue, 23 Jun 2020 04:18:34 +0200

1, Preparation

Before crawling, let's see what the 12306 CAPTCHA looks like.

Seeing this CAPTCHA, are you a little flustered? Can this thing really be cracked???
Answer: of course.

Let's get to know a CAPTCHA recognition platform: Chaojiying (Super Eagle).


If you haven't registered yet, you can sign up for an account. The platform is easy to use for personal testing, and it isn't expensive.

After registering successfully, go to the user center.

Friendly tip: you get 1000 points the first time you bind WeChat. After all, I love a freebie, hehe.

Open the software ID page and generate a software ID. After that, the software ID is all we need.

Then click Development Documents, choose Python, and download.


After downloading, you get the following.

2, Full code

At this point the basic preparation is done. Enough talk, on to the code.

Open the chaojiying.py file and copy its contents into the Python file you are going to write.

First I'll give you the complete code, and then we'll analyze it step by step.

import requests
from hashlib import md5

class Chaojiying_Client(object):

    def __init__(self, username, password, soft_id):
        self.username = username
        password = password.encode('utf8')

        self.password = md5(password).hexdigest()
        self.soft_id = soft_id
        self.base_params = {
            'user': self.username,
            'pass2': self.password,
            'softid': self.soft_id,
        }
        self.headers = {
            'Connection': 'Keep-Alive',
            'User-Agent': 'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0)',
        }

    def PostPic(self, im, codetype):
        """
        im: image bytes
        codetype: CAPTCHA type, see http://www.chaojiying.com/price.html
        """
        params = {
            'codetype': codetype,
        }
        params.update(self.base_params)
        files = {'userfile': ('ccc.jpg', im)}
        r = requests.post('http://upload.chaojiying.net/Upload/Processing.php', data=params, files=files,
                          headers=self.headers)
        return r.json()

    def ReportError(self, im_id):
        """
        im_id: ID of the image whose answer was wrong
        """
        params = {
            'id': im_id,
        }
        params.update(self.base_params)
        r = requests.post('http://upload.chaojiying.net/Upload/ReportError.php', data=params, headers=self.headers)
        return r.json()

=========================================================================================
# chaojiying = Chaojiying_Client('ppx666', '07244058664', '906006')  # User center >> Software ID: generate one and use it in place of 96001
# im = open('a.jpg', 'rb').read()  # Path of the local image file, in place of a.jpg; on Windows the path sometimes needs //
# print(chaojiying.PostPic(im, 9004)['pic_str'])  # 9004 is the CAPTCHA type; see the official website

from selenium import webdriver
from PIL import Image
import time
from selenium.webdriver import ActionChains


bro = webdriver.Chrome()
# Maximize the browser window
bro.maximize_window()

bro.get("https://kyfw.12306.cn/otn/resources/login.html")
time.sleep(2)
bro.find_element_by_class_name("login-hd-account").click()


# save_screenshot takes a screenshot of the current page and saves it
bro.save_screenshot('aa.png')

# Work out the top-left and bottom-right coordinates of the CAPTCHA image
code_img_ele = bro.find_element_by_id("J-loginImg")
location = code_img_ele.location  # x, y coordinates of the top-left corner of the CAPTCHA image
size = code_img_ele.size  # Width and height of the CAPTCHA image element
# Top-left and bottom-right coordinates
rangle = (
    int(location['x']),int(location['y']),int(location['x'] + size['width']),int(location['y'] + size['height'])
)

i = Image.open("./aa.png")
code_img_name = './code.png'
# Crop the CAPTCHA out of the screenshot
frame = i.crop(rangle)
frame.save(code_img_name)

chaojiying = Chaojiying_Client('Super Eagle account', 'Super Eagle password', 'The software ID that we prepared earlier')  # User center >> Software ID: generate one and use it in place of 96001
im = open('code.png', 'rb').read()  # Path of the local image file; on Windows the path sometimes needs //
print(chaojiying.PostPic(im, 9004)['pic_str'])  # 9004 is the CAPTCHA type; see the official website

========================================================================================

result = chaojiying.PostPic(im, 9004)['pic_str']
all_list = []  # Store coordinates to be clicked
if '|' in result:
    list_1 = result.split("|")
    count_1 = len(list_1)
    for i in range(count_1):
        xy_list = []
        x = int(list_1[i].split(",")[0])
        y = int(list_1[i].split(",")[1])
        xy_list.append(x)
        xy_list.append(y)
        all_list.append(xy_list)
else:
    x = int(result.split(",")[0])
    y = int(result.split(",")[1])
    xy_list = []
    xy_list.append(x)
    xy_list.append(y)
    all_list.append(xy_list)
print(all_list)
# Walk through the list and use an action chain to click the x, y position given by each element
for l in all_list:
    x = l[0]
    print(x)
    y = l[1]
    print(y)
    ActionChains(bro).move_to_element_with_offset(code_img_ele,x,y).click().perform()   # move_to_element_with_offset moves to a position offset from an element (measured here from its top-left corner)
    time.sleep(0.5)

========================================================================================

bro.find_element_by_id("J-userName").send_keys("Fill in your 12306 account")
time.sleep(1)
bro.find_element_by_id("J-password").send_keys("Fill in your 12306 password")
time.sleep(1)
bro.find_element_by_id("J-login").click()
time.sleep(1)

3, Code analysis

OK, OK, let's start the analysis!!! The code is divided into four parts, which I separated with "====".

1, Part I
The first part is the Python file you downloaded. Don't ask me about it, I just don't know (∀ˇˇ). After all, I haven't studied it. Anyone interested can dig into it.
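For anyone who does want to dig in, here is a minimal sketch of what the downloaded client seems to do before it posts the image, based only on the code above: it hashes the password with MD5 and sends the digest together with the account, software ID and CAPTCHA type. The helper names build_params and pic_str_or_raise are my own, not part of chaojiying.py, and the err_no / err_str fields are an assumption about the response format that you should check against the platform's documentation.

```python
from hashlib import md5


def build_params(username, password, soft_id, codetype):
    """Rebuild the form fields that Chaojiying_Client.PostPic sends (hypothetical helper)."""
    return {
        'user': username,
        # Only the MD5 hex digest of the password is sent, not the plain text.
        'pass2': md5(password.encode('utf8')).hexdigest(),
        'softid': soft_id,
        'codetype': codetype,
    }


def pic_str_or_raise(response):
    """Return 'pic_str' from a decoded response, or raise if an error was reported.

    Assumption: the response carries 'err_no' (0 on success) and 'err_str'.
    """
    if response.get('err_no', 0) != 0:
        raise RuntimeError(
            f"Chaojiying error {response.get('err_no')}: {response.get('err_str')}"
        )
    return response['pic_str']
```

Checking the error field before reading pic_str should give a readable message instead of a confusing failure later, for example when the account has run out of points.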

2, Part II

# chaojiying = Chaojiying_Client('ppx666', '07244058664', '906006')  # User center >> Software ID: generate one and use it in place of 96001
# im = open('a.jpg', 'rb').read()  # Path of the local image file, in place of a.jpg; on Windows the path sometimes needs //
# print(chaojiying.PostPic(im, 9004)['pic_str'])  # 9004 is the CAPTCHA type; see the official website

from selenium import webdriver
from PIL import Image
import time
from selenium.webdriver import ActionChains


bro = webdriver.Chrome()
# Maximize the browser window
bro.maximize_window()

bro.get("https://kyfw.12306.cn/otn/resources/login.html")
time.sleep(2)
bro.find_element_by_class_name("login-hd-account").click()


# save_screenshot takes a screenshot of the current page and saves it
bro.save_screenshot('aa.png')

# Work out the top-left and bottom-right coordinates of the CAPTCHA image
code_img_ele = bro.find_element_by_id("J-loginImg")
location = code_img_ele.location  # x, y coordinates of the top-left corner of the CAPTCHA image
size = code_img_ele.size  # Width and height of the CAPTCHA image element
# Top-left and bottom-right coordinates
rangle = (
    int(location['x']),int(location['y']),int(location['x'] + size['width']),int(location['y'] + size['height'])
)

i = Image.open("./aa.png")
code_img_name = './code.png'
# Crop the CAPTCHA out of the screenshot
frame = i.crop(rangle)
frame.save(code_img_name)

chaojiying = Chaojiying_Client('Super Eagle account', 'Super Eagle password', 'The software ID that we prepared earlier')  # User center >> Software ID: generate one and use it in place of 96001
im = open('code.png', 'rb').read()  # Path of the local image file; on Windows the path sometimes needs //
print(chaojiying.PostPic(im, 9004)['pic_str'])  # 9004 is the CAPTCHA type; see the official website

I won't go over the imports at the top.
As for selenium, I'm sure you've learned it. If you haven't, it doesn't matter: here is the documentation for you, the Selenium Chinese documentation.

bro = webdriver.Chrome()  # Drive the Chrome browser

bro.maximize_window()  # Open the window maximized; I found that without this it doesn't open maximized for me, I don't know about you


bro.get("https://kyfw.12306.cn/otn/resources/login.html")  # Use get to open the 12306 login page
time.sleep(2)  # Sleep for two seconds; don't go too fast
bro.find_element_by_class_name("login-hd-account").click()  # The login page opens on QR-code login by default, so find the account-login tab by its class name and click it

bro.save_screenshot('aa.png')  # save_screenshot takes a screenshot of the current page and saves it

code_img_ele = bro.find_element_by_id("J-loginImg")  # Find the CAPTCHA image by its id
location = code_img_ele.location  # x, y coordinates of the top-left corner of the CAPTCHA image (the location property returns the image's position in the page as a dictionary)
size = code_img_ele.size  # Width and height of the CAPTCHA image element (size returns the image's width and height)

rangle = (
    int(location['x']),int(location['y']),int(location['x'] + size['width']),int(location['y'] + size['height'])
)  # Top-left and bottom-right coordinates


i = Image.open("./aa.png")  # Open the screenshot
code_img_name = './code.png'
# Crop
frame = i.crop(rangle)  # Crop using the coordinates above
frame.save(code_img_name)  # Save

chaojiying = Chaojiying_Client('Super Eagle account', 'Super Eagle password', 'The software ID that we prepared earlier')  # User center >> Software ID: generate one and use it in place of 96001
im = open('code.png', 'rb').read()  # Path of the local image file; on Windows the path sometimes needs //
print(chaojiying.PostPic(im, 9004)['pic_str'])  # 9004 is the CAPTCHA type

A 12306 CAPTCHA generally has at most four pictures to click, so type 9004 is used.
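The crop rectangle in this part is just (left, top, right, bottom) computed from the element's location and size. Here is a small sketch of that arithmetic as a function, so it can be checked without a browser. The name crop_box and the scale parameter are my own additions: on high-DPI screens the screenshot may have more pixels than the page coordinates suggest, so I'm assuming you may need to multiply by the device pixel ratio; on an ordinary screen leave it at 1.

```python
def crop_box(location, size, scale=1.0):
    """Compute the (left, top, right, bottom) box for Image.crop.

    location: dict with 'x' and 'y', as returned by element.location
    size: dict with 'width' and 'height', as returned by element.size
    scale: assumed screenshot-pixels-per-page-pixel ratio (1.0 on most screens)
    """
    left = int(location['x'] * scale)
    top = int(location['y'] * scale)
    right = int((location['x'] + size['width']) * scale)
    bottom = int((location['y'] + size['height']) * scale)
    return (left, top, right, bottom)
```

With the variables from the code above, i.crop(crop_box(location, size)) should give the same result as i.crop(rangle). If the saved code.png shows the wrong part of the page, the scale is the first thing I would try changing.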

OK, the second part is finished.

3, Part III

result = chaojiying.PostPic(im, 9004)['pic_str']  # Get the coordinates returned by Super Eagle
all_list = []  # Store coordinates to be clicked

# The coordinate format is x,y|x,y (x and y are numbers)

if '|' in result:  # Multiple coordinates are separated by "|"; a single coordinate has no "|"
    list_1 = result.split("|")  # Split on "|"
    count_1 = len(list_1)  # Number of coordinates after splitting
    for i in range(count_1):
        # The rest should be easy to follow, so I won't analyze it: it just takes each pair apart according to the coordinate format
        xy_list = []
        x = int(list_1[i].split(",")[0])
        y = int(list_1[i].split(",")[1])
        xy_list.append(x)
        xy_list.append(y)
        all_list.append(xy_list)
else:
    x = int(result.split(",")[0])
    y = int(result.split(",")[1])
    xy_list = []
    xy_list.append(x)
    xy_list.append(y)
    all_list.append(xy_list)
print(all_list)
# Walk through the list and use an action chain to click the x, y position given by each element
for l in all_list:
    x = l[0]
    print(x)
    y = l[1]
    print(y)
    ActionChains(bro).move_to_element_with_offset(code_img_ele,x,y).click().perform()   # move_to_element_with_offset moves to a position offset from an element (measured here from its top-left corner)
    time.sleep(0.5)
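The parsing in this part can be written more compactly, because split("|") on a string without "|" simply returns a one-element list, so the two branches do the same thing. Below is a sketch with hypothetical helper names, parse_points and to_center_offset. The second helper rests on an assumption you should verify for your installed version: as far as I know, newer Selenium 4 releases measure the offset of move_to_element_with_offset from the center of the element instead of its top-left corner, in which case the returned coordinates need shifting.

```python
def parse_points(pic_str):
    """Turn 'x,y|x,y' (or a single 'x,y') into a list of (x, y) integer tuples."""
    if not pic_str:
        return []  # nothing recognised, nothing to click
    points = []
    for pair in pic_str.split('|'):
        x, y = pair.split(',')
        points.append((int(x), int(y)))
    return points


def to_center_offset(point, size):
    """Convert a top-left based (x, y) into an offset from the element's center.

    Only needed if your Selenium version measures offsets from the center
    (an assumption to check); size is the dict returned by element.size.
    """
    x, y = point
    return (x - size['width'] // 2, y - size['height'] // 2)
```

For example, parse_points(result) would stand in for the whole if/else block above, and each tuple can be unpacked directly in the click loop.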

4, Part IV

This part needs little explanation: find the account and password input boxes by id, fill them in, and finally click the login button.
Pause a little in between; don't go too fast.

bro.find_element_by_id("J-userName").send_keys("fill in your 12306 account")
time.sleep(1)
bro.find_element_by_id("J-password").send_keys("fill in your 12306 password")
time.sleep(1)
bro.find_element_by_id("J-login").click()
time.sleep(1)
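The fixed time.sleep calls work, but they either wait too long or not long enough when the page is slow. As an alternative, here is a minimal polling helper using only the standard library. The name wait_until and its parameters are my own invention, and this is a sketch of the idea, not something the original script uses.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.5):
    """Call predicate repeatedly until it returns something truthy, then return it.

    Raises TimeoutError if nothing truthy comes back within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)
```

With the script above, something like wait_until(lambda: bro.find_elements_by_id("J-loginImg")) could replace the two-second sleep after bro.get. Selenium also ships its own WebDriverWait in selenium.webdriver.support.ui for the same purpose, which is probably the better choice in a real project.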

That's the end. If you think this was good, give me a like! Thank you so much. (If there are any mistakes, please point them out in the comment area, thank you!)

Topics: Selenium Python PHP JSON