Automation scripting with pyautogui

Posted by jwilh on Sun, 20 Feb 2022 00:48:42 +0100

Previously, I wrote an automatic quiz-answering script for encrypted video playback (written in Easy Language with the Damo plug-in).

There was also an automatic voice responder for merchants (also Easy Language + the Damo plug-in).

There was also an automatic DingTalk check-in for Android, written with autojs.

And there was one for collecting Taobao meow coins; no screenshot for that one here.

Even some web page scripts, such as Tampermonkey userscripts and Chrome extensions, can be regarded as script development.

This kind of code is usually called RPA (robotic process automation). Since I started working with network protocols directly, I have hardly touched automation scripts (running headless against the protocol is really appealing and very efficient, but it requires some reverse-engineering ability). Still, for some things that need automating, scripts are the only option.

Usage

pyautogui is a Python library that wraps desktop GUI operations (on Windows in this article). The main functions of this kind of library are finding windows, reading the mouse position, controlling mouse movement and clicks, and sending keyboard input; chained together with some flow control, they achieve the desired result. The library provides an API for each of these. There is already an article, "PyAutoGUI super full introduction | automatic control based on python | work automation", so I won't introduce the API here.

Example

Let's write a simple example that opens the WeChat window, automatically finds a particular contact by avatar, and sends that contact a message. Along the way, I'll explain the steps involved in writing an automation script.

Step 1: find the window

Before writing an automation script, first pin down the scope, so that you avoid searching unnecessary areas and improve efficiency. The scope in this example is the whole WeChat window. A window inspection tool (here I use Jingyi Programming Assistant) gives you the window title and window class name, which are used to locate the window (that is, its window handle).

You can get the window with the following code:

import pyautogui


def findWindow():
    # getWindowsWithTitle returns every window whose title contains 'WeChat'
    windows = pyautogui.getWindowsWithTitle('WeChat')
    if len(windows) == 0:
        raise Exception("WeChat window not found")
    return windows[0]


wxWindow = findWindow()
wxWindow.activate()  # Activate the window to bring it to the front
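
One caveat with the lookup above: as far as I know, getWindowsWithTitle does a case-insensitive substring match, so a browser tab or folder whose title merely contains "WeChat" would also be returned. A minimal sketch of a stricter filter follows. pick_window is a hypothetical helper, not part of pyautogui; it only assumes that each window object has a .title attribute, as pyautogui's window objects do.

```python
from types import SimpleNamespace


def pick_window(windows, title):
    """Return the first window whose title equals `title` exactly.

    `windows` is any iterable of objects with a `.title` attribute,
    e.g. the list returned by pyautogui.getWindowsWithTitle(title).
    """
    for window in windows:
        if window.title == title:
            return window
    raise LookupError(f"No window titled exactly {title!r}")


# Stand-ins for real window objects, so the sketch runs without a desktop.
candidates = [
    SimpleNamespace(title="WeChat - Google Search"),
    SimpleNamespace(title="WeChat"),
]
print(pick_window(candidates, "WeChat").title)  # WeChat
```

In the script, this could be called as pick_window(pyautogui.getWindowsWithTitle('WeChat'), 'WeChat').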

Step 2: find the image and click

To find the right contact, you need some distinguishing feature of that contact, such as the avatar or nickname. Here, the avatar is used as a demonstration.

Since the avatar is the feature, save a screenshot of it in advance, then use the API to find the coordinates of that image on screen:

def clickAvatar():
    try:
        location = pyautogui.locateOnScreen('avatar.png')
    except pyautogui.ImageNotFoundException:
        location = None  # newer pyautogui versions raise instead of returning None
    if location is None:
        print('Avatar not found')
        return
    print(location)  # e.g. Box(left=293, top=402, width=40, height=40)
    pyautogui.click(location)  # clicks the center of the box
    # pyautogui.click('avatar.png')  # Alternative: locate the image and click it in one call

Note that the avatar screenshot should be cropped tightly (small and precise), because the match is exact: every pixel has to match, at the same resolution.
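
To see why a loosely cropped or rescaled screenshot fails, here is a toy illustration of exact template matching on small grids of numbers standing in for pixels. find_exact is a hypothetical helper written for this explanation; it is not how pyautogui performs its search internally, only a model of the "every pixel must match" behaviour.

```python
def find_exact(screen, template):
    """Return (row, col) of the first exact match of `template` in `screen`, else None."""
    screen_h, screen_w = len(screen), len(screen[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    for row in range(screen_h - tmpl_h + 1):
        for col in range(screen_w - tmpl_w + 1):
            # Compare the template row by row against the screen slice.
            if all(screen[row + i][col:col + tmpl_w] == template[i]
                   for i in range(tmpl_h)):
                return (row, col)
    return None


screen = [
    [0, 0, 0, 0],
    [0, 7, 8, 0],
    [0, 9, 6, 0],
]
avatar = [[7, 8], [9, 6]]
almost = [[7, 8], [9, 5]]  # a single "pixel" differs

print(find_exact(screen, avatar))  # (1, 1)
print(find_exact(screen, almost))  # None
```

A single differing value is enough for the search to come back empty, which is why a screenshot taken at a different scale or with extra background around the avatar will not be found.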

Step 3: input content

After the two steps above, the chat with the contact is open. Now you need to enter the content into the chat box, then find the send button and click it, as in step 2.

import pyautogui
import pyperclip


def paste(content):
    pyperclip.copy(content)        # put the text on the clipboard
    pyautogui.hotkey('ctrl', 'v')  # paste it into the focused input box


content = 'Hello'
paste(content)

The content we want to enter contains Chinese, and Chinese generally cannot be typed directly with simulated key presses, so a workaround is needed: copy the text to the system clipboard, then use the Ctrl + V key combination to paste it into the window. The code is shown above (it requires importing pyperclip).
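
The choice between typing and pasting can also be made per message. The sketch below is an illustration rather than part of the original script: needs_clipboard and send_text are hypothetical helpers, and treating every non-ASCII string as "must paste" is a simplifying assumption. The pyautogui and pyperclip imports are deferred into send_text so that the pure check can be tried without a desktop session.

```python
def needs_clipboard(text):
    """True if `text` has characters that simulated key presses may not produce."""
    return not text.isascii()


def send_text(text):
    """Type ASCII text directly; paste anything else through the clipboard."""
    import pyautogui  # imported lazily: needs a GUI session
    import pyperclip

    if needs_clipboard(text):
        pyperclip.copy(text)
        pyautogui.hotkey('ctrl', 'v')
    else:
        pyautogui.write(text)


print(needs_clipboard('Hello'))  # False
print(needs_clipboard('你好'))    # True
```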

Then, as in step 2, find the send button and click it:

def clickSend():
    pyautogui.click('send.png')

Complete code

import pyperclip
import pyautogui

# Seconds to pause after each pyautogui call. It only takes effect after
# pyautogui actions; for any other wait, use time.sleep().
pyautogui.PAUSE = 1

def findWindow():
    windows = pyautogui.getWindowsWithTitle('WeChat')
    if len(windows) == 0:
        raise Exception("Wechat window not found")
    return windows[0]


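def clickAvatar():
    try:
        location = pyautogui.locateOnScreen('avatar.png')
    except pyautogui.ImageNotFoundException:
        location = None  # newer pyautogui versions raise instead of returning None
    if location is None:
        print('Avatar not found')
        return
    print(location)  # e.g. Box(left=293, top=402, width=40, height=40)
    pyautogui.click(location)  # clicks the center of the box
    # pyautogui.click('avatar.png')  # Alternative: locate the image and click it in one call


def clickSend():
    pyautogui.click('send.png')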


def paste(content):
    pyperclip.copy(content)
    pyautogui.hotkey('ctrl', 'v')


if __name__ == '__main__':
    wxWindow = findWindow()
    wxWindow.activate()  # Activate the window to bring it to the front
    clickAvatar()

    content = 'Hello'
    paste(content)

    clickSend()

Impressions

That said, there is still plenty to improve. For example, when several WeChat windows are open, it is better to use win32gui for the window operations. Also, the image search here scans the full screen, even though we already know the image can only be inside the WeChat window; restricting the search to that region makes it faster.
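
The region restriction mentioned above could look like the following sketch. window_region and locate_in_window are hypothetical helpers; they assume the window object exposes left, top, width and height (as the objects returned by getWindowsWithTitle do). The rectangle is clipped to the screen, since a maximized window can extend slightly beyond the screen edges, and the result is passed as the region argument of locateOnScreen.

```python
from types import SimpleNamespace


def window_region(window, screen_width, screen_height):
    """Return (left, top, width, height) of `window`, clipped to the screen."""
    left = max(window.left, 0)
    top = max(window.top, 0)
    right = min(window.left + window.width, screen_width)
    bottom = min(window.top + window.height, screen_height)
    if right <= left or bottom <= top:
        raise ValueError("Window is entirely off-screen")
    return (left, top, right - left, bottom - top)


def locate_in_window(image, window):
    """Search for `image` only inside `window` instead of the full screen."""
    import pyautogui  # imported lazily: needs a GUI session

    screen_width, screen_height = pyautogui.size()
    region = window_region(window, screen_width, screen_height)
    return pyautogui.locateOnScreen(image, region=region)


# Stand-in for a maximized window on a 1920x1080 screen.
fake = SimpleNamespace(left=-8, top=-8, width=1936, height=1056)
print(window_region(fake, 1920, 1080))  # (0, 0, 1920, 1048)
```

With the script's own names, the call would be locate_in_window('avatar.png', wxWindow); depending on the pyautogui version, a miss either returns None or raises ImageNotFoundException.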

The above is just a simple example. Real automation has much more to consider. For example, the automatic video quiz-answering script I wrote back then had to check at a regular interval (every second) whether the answer window had popped up, then read the question and look up the answer in an existing question bank. The example above, by contrast, looks pointless and actually is pointless. It could be strengthened, though: for instance, detect whether the WeChat icon is flashing (someone has sent a message), then check the other party's message for keywords and reply accordingly. That would give you a simple customer-service chatbot (for platforms that do not support automatic replies, automation scripts are very useful). The specific use cases need to be considered individually; the example in this article is only meant as a quick look.
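
The monitor-then-reply loop described above can be sketched as follows. Everything here is hypothetical scaffolding: wait_until polls a caller-supplied check function once per interval, and pick_reply does a plain substring match against a keyword table. How the check detects a new message (for example with pyautogui.locateOnScreen on an icon image) is left to the caller.

```python
import time


def wait_until(check, timeout=10.0, interval=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call `check()` every `interval` seconds until it returns a truthy value.

    Returns that value, or None if `timeout` seconds pass first.
    """
    deadline = clock() + timeout
    while True:
        result = check()
        if result:
            return result
        if clock() >= deadline:
            return None
        sleep(interval)


def pick_reply(message, rules, default=None):
    """Return the reply for the first keyword found in `message`."""
    for keyword, reply in rules.items():
        if keyword in message:
            return reply
    return default


rules = {
    'price': 'Our price list is in the pinned post.',
    'hours': 'We are open 9:00-18:00.',
}
print(pick_reply('What are your hours?', rules))  # We are open 9:00-18:00.

# A fake check that succeeds on the third poll; sleep is stubbed so this runs instantly.
attempts = iter([None, None, 'new message'])
print(wait_until(lambda: next(attempts), sleep=lambda seconds: None))  # new message
```

Passing sleep and clock as parameters is only there to make the loop easy to try out without waiting; in a real script the defaults would be used.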

Overall, though, I have to say the experience is much better than Easy Language. If I were asked to write window automation again, I would choose Python without hesitation.