[2021] digital huarongdao

Posted by AV on Sat, 01 Jan 2022 07:36:49 +0100

Digital Huarong Road

Digital Huarong Road (a sliding number puzzle) asks you to rearrange the numbered tiles on the board into ascending order, left to right and top to bottom, in as few moves and as little time as possible. Put simply, it is a spatial sorting problem with restricted moves.

For a long time I have wanted to train a model that can play digital huarongdao on its own! I tried reinforcement learning before, but the results were not good (see "Non convergent reinforcement learning project -- Digital huarongdao"), so this time let's see how well a classifier trained on an exhaustively generated dataset does!

Let's see the effect first

data set

The generated dataset covers the 3 * 3 digital huarongdao board, with the number 9 standing for the movable (empty) cell. Each record has the following format:

deep:0
1 2 3 
4 5 6 
7 8 9 
action:-842150451

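deep is the depth (the layer number, i.e. how many moves the board is away from the solved state). The three lines in the middle are the board rows, with numbers separated by spaces. action is the move label, encoded as action_dict={0:down,1:up,2:left,3:right}. The deep:0 record is the solved board itself, so it has no meaningful action; the value -842150451 (0xCDCDCDCD) is most likely uninitialized memory from an MSVC debug build, and the loading code further down skips this record.
The dataset is generated exhaustively by a program written in C and contains 181440 (= 9! / 2) boards. Only half of the 9! permutations appear because the other half can never be solved, so they are unreachable from the solved board. The generator source is CreateDate.cpp, which can be run on Windows.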
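
The exhaustive generation can be sketched in Python as a breadth-first search from the solved board. This is not the author's CreateDate.cpp, only an illustration under two assumptions: each action id moves the empty cell (9) in the named direction, as the Airtest script later in this post does, and the label stored for a board is the move that brings it one step closer to the solved state. The names apply_action and enumerate_states are made up for this sketch.

```python
from collections import deque

SIZE = 3
BLANK = SIZE * SIZE                      # the post uses 9 as the movable (empty) cell
GOAL = tuple(range(1, BLANK + 1))

# Displacement (d_row, d_col) of the empty cell for each action id,
# following action_dict = {0: down, 1: up, 2: left, 3: right}.
MOVES = {0: (1, 0), 1: (-1, 0), 2: (0, -1), 3: (0, 1)}
OPPOSITE = {0: 1, 1: 0, 2: 3, 3: 2}


def apply_action(board, action):
    """Return the board after moving the empty cell, or None if it would leave the grid."""
    idx = board.index(BLANK)
    row, col = divmod(idx, SIZE)
    d_row, d_col = MOVES[action]
    new_row, new_col = row + d_row, col + d_col
    if not (0 <= new_row < SIZE and 0 <= new_col < SIZE):
        return None
    target = new_row * SIZE + new_col
    cells = list(board)
    cells[idx], cells[target] = cells[target], cells[idx]
    return tuple(cells)


def enumerate_states():
    """Breadth-first search from the solved board.

    Returns {board: (depth, action)}; action is the move leading one step
    back toward the solved board (None for the solved board itself).
    """
    table = {GOAL: (0, None)}
    queue = deque([GOAL])
    while queue:
        board = queue.popleft()
        depth = table[board][0]
        for action in MOVES:
            child = apply_action(board, action)
            if child is not None and child not in table:
                # Undoing the move that produced the child leads back to its parent.
                table[child] = (depth + 1, OPPOSITE[action])
                queue.append(child)
    return table


table = enumerate_states()
print(len(table))  # number of reachable boards, including the solved one
```

The printed count should agree with the 9! / 2 figure quoted above. Because every board is reached first along a shortest path, the stored depth matches the deep field of the dataset format.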

Why not generate a 4 * 4 dataset?

In fact, I am generating one. It has been running for three days and is stuck at layer 20, which alone contains more than 1.6 million distinct boards. If the final total is 16! / 2, I need to generate 10461394944000 boards, that is, more than 10 trillion permutations. I don't know how long that will take to run. Progress so far:

deep=1, len=2 totallen=2
deep=2, len=4 totallen=6
deep=3, len=10 totallen=16
deep=4, len=24 totallen=40
deep=5, len=54 totallen=94
deep=6, len=107 totallen=201
deep=7, len=212 totallen=413
deep=8, len=446 totallen=859
deep=9, len=946 totallen=1805
deep=10, len=1948 totallen=3753
deep=11, len=3938 totallen=7691
deep=12, len=7808 totallen=15499
deep=13, len=15544 totallen=31043
deep=14, len=30821 totallen=61864
deep=15, len=60842 totallen=122706
deep=16, len=119000 totallen=241706
deep=17, len=231844 totallen=473550
deep=18, len=447342 totallen=920892
deep=19, len=859744 totallen=1780636
deep=20, len=1637383 totallen=3418019
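
The arithmetic behind these figures can be checked in a few lines. The sketch below only recomputes the totals and the layer-to-layer growth from the numbers posted above; reachable_states is a helper name chosen for this example, and the code makes no prediction about how deep the 4 * 4 search eventually goes.

```python
import math


def reachable_states(n):
    """Half of all permutations of an n x n board are reachable from the solved one."""
    return math.factorial(n * n) // 2


print(reachable_states(3))  # 3 * 3 total
print(reachable_states(4))  # 4 * 4 total

# Layer sizes copied from the log above (deep = 16 .. 20).
layer_sizes = [119000, 231844, 447342, 859744, 1637383]
growth = [later / earlier for earlier, later in zip(layer_sizes, layer_sizes[1:])]
print([round(g, 2) for g in growth])

# Fraction of the 4 * 4 state space covered after layer 20 (totallen = 3418019).
covered = 3418019 / reachable_states(4)
print(f"{covered:.2e}")
```

Each layer in this range is a bit less than twice the size of the previous one, and the boards generated so far are a tiny fraction of the full 4 * 4 state space.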

Random forest was used to classify Huarong Road data

Classification accuracy on the training data should reach 1.0; setting n_estimators to 80 or above is enough.

with open("data/data118590/file.txt","r",encoding="utf-8") as fp:
    lines=fp.readlines()

train_data=[]
train_label=[]
i=5  # skip the first record (deep:0, the solved board), which has no valid action
while i < len(lines):
    if lines[i].startswith("deep"):
        # depth header: not used as a feature
        i+=1
    elif lines[i].startswith("action"):
        train_label.append(int(lines[i].strip().split(":")[-1]))
        i+=1
    else:
        # three board rows: join them into one flat list of 9 numbers
        data=" ".join(lines[i:i+3])
        train_data.append([int(d) for d in data.split()])
        i+=3
print(len(train_data))
print(len(train_label))
181439
181439
import joblib
from sklearn.ensemble import RandomForestClassifier

# Training accuracy: 0.999988977015576 with n_estimators=70, 1.0 with 80, so 80 is used
rf_clf = RandomForestClassifier(n_estimators=80,random_state=0)
rf_clf.fit(train_data,train_label)
score_train = rf_clf.score(train_data,train_label)
print(score_train)
joblib.dump(rf_clf, 'rf_clf.joblib')
1.0
['rf_clf.joblib']
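
Before touching a phone, the saved model can be tried offline by repeatedly predicting a move and applying it to a board held in memory. The sketch below is an assumption-laden illustration rather than part of the original project: step and rollout are hypothetical helpers, the policy argument is any callable that maps a board to an action id, and action ids are assumed to move the empty cell (9) as in action_dict.

```python
SIZE = 3
BLANK = 9
GOAL = tuple(range(1, 10))
# Displacement (d_row, d_col) of the empty cell: 0 down, 1 up, 2 left, 3 right.
MOVES = {0: (1, 0), 1: (-1, 0), 2: (0, -1), 3: (0, 1)}


def step(board, action):
    """Move the empty cell; return None if the move would leave the grid."""
    idx = board.index(BLANK)
    row, col = divmod(idx, SIZE)
    d_row, d_col = MOVES[action]
    new_row, new_col = row + d_row, col + d_col
    if not (0 <= new_row < SIZE and 0 <= new_col < SIZE):
        return None
    target = new_row * SIZE + new_col
    cells = list(board)
    cells[idx], cells[target] = cells[target], cells[idx]
    return tuple(cells)


def rollout(board, policy, max_steps=50):
    """Follow the policy until the board is solved.

    Returns the list of actions taken, or None if the policy proposes an
    illegal move, revisits a board, or exceeds max_steps.
    """
    board = tuple(board)
    seen = {board}
    actions = []
    while board != GOAL:
        if len(actions) >= max_steps:
            return None
        action = int(policy(board))
        nxt = step(board, action)
        if nxt is None or nxt in seen:
            return None
        actions.append(action)
        seen.add(nxt)
        board = nxt
    return actions


# Stand-in policy for demonstration only; with the model saved above one could use
#   rf_clf = joblib.load("rf_clf.joblib")
#   policy = lambda b: rf_clf.predict([list(b)])[0]
always_right = lambda b: 3
print(rollout((1, 2, 3, 4, 5, 6, 9, 7, 8), always_right))
```

A rollout that returns None points at a board where the classifier's prediction is not usable, which is cheaper to discover here than on the device.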

Verify it in the QQ applet

This script runs under NetEase's Airtest (Windows environment). The OCR recognition is not very reliable, but as long as the first board it reads contains no spurious 9s (cells the OCR fails to recognize are filled with 9), the rest runs smoothly, because the script then tracks the board itself. I was too lazy to do better, ha ha; ideally you would train a digit classification network so that the classifier always receives an accurate board. Airtest itself is actually quite easy to use!!
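
Since a misread board is the main failure mode, a cheap sanity check before calling the classifier may help. The sketch below is not part of the original script: is_valid_board is a hypothetical helper that checks that the nine cells are exactly the numbers 1 to 9 and that the board is solvable. It relies on the standard parity rule for boards of odd width (a board is solvable when the number of inversions among the tiles, ignoring the empty cell, is even), so it applies to the 3 * 3 case only.

```python
def is_valid_board(cells):
    """Check a flat 3 x 3 board (9 = empty cell) for completeness and solvability."""
    cells = [int(c) for c in cells]
    # Every number 1..9 must appear exactly once; an OCR miss shows up as a duplicate 9.
    if sorted(cells) != list(range(1, 10)):
        return False
    # Count inversions among the tiles, ignoring the empty cell.
    tiles = [c for c in cells if c != 9]
    inversions = sum(
        1
        for i in range(len(tiles))
        for j in range(i + 1, len(tiles))
        if tiles[i] > tiles[j]
    )
    # On an odd-width board, moves never change the inversion parity.
    return inversions % 2 == 0


print(is_valid_board([1, 2, 3, 4, 5, 6, 7, 9, 8]))  # one move from solved
print(is_valid_board([1, 2, 3, 4, 5, 6, 8, 7, 9]))  # tiles 7 and 8 swapped
```

In the script it could be called as is_valid_board(new_maps.flatten().tolist()) before a freshly recognized board is accepted.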

# -*- encoding=utf8 -*-
__author__ = "Dancing gun god"

from airtest.core.api import *
import os
import cv2
import numpy as np
import paddlehub as hub
import joblib
from airtest.cli.parser import cli_setup

if not cli_setup():
    auto_setup(__file__, logdir=True, devices=["Your equipment?cap_method=MINICAP&&ori_method=MINICAPORI&&touch_method=MAXTOUCH",])
    
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

print("start...")
w,h=device().get_current_resolution()#Get phone resolution
print(w,h)

ocr = hub.Module(name="chinese_ocr_db_crnn_mobile")
auto_setup(__file__)

map_size = (3,3)
interval=150
groundnum=map_size[0]*map_size[1]
boundary = [(30,550),(1100,1620)]
digit = [ str(i+1) for i in range(groundnum)]

def check_result(maps,map_size,digit):
    for i in range(map_size[0]):
        for j in range(map_size[1]):
            if maps[i,j]!=int(digit[i*map_size[1]+j]):  # row-major index: row * number of columns + column
                return False
    return True

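def check_map(maps):
    # True only if every number 1..groundnum appears on the board
    tamp = np.zeros(groundnum, dtype = bool)  # the np.bool alias was removed in NumPy 1.24
    for i in maps.flatten():
        tamp[i - 1] = True
    return tamp.all()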

def CreateMap(results, digit):
    maps=np.ones(map_size,dtype=int)*groundnum
    datas=results[0]['data']
    for data in datas:
        if data['text'] in digit:
            point = [(data['text_box_position'][0][0]+data['text_box_position'][2][0])//2,
                     (data['text_box_position'][0][1]+data['text_box_position'][2][1])//2]
            maps[(point[1]-boundary[0][1])//((boundary[1][1]-boundary[0][1])//map_size[1])][(point[0]-boundary[0][0])//((boundary[1][0]-boundary[0][0])//map_size[0])]=int(data['text'])
    return maps

# Invert the image colors (photographic negative)
def imcomplement(img):
    table = np.array([255-i for i in np.arange(0, 256)]).astype("uint8")    
    return cv2.LUT(img, table) #Lookup table function using OpenCV

def swap(maps,point1,point2):
    temp=maps[point1[1]][point1[0]]
    maps[point1[1]][point1[0]]=maps[point2[1]][point2[0]]
    maps[point2[1]][point2[0]]=temp
    return maps
restart = True
while True:
    snapshot(filename='pic.jpg')

    np_images =[imcomplement(cv2.imread('log/pic.jpg'))] 
    results = ocr.recognize_text(
                        images=np_images,         # Image data; each ndarray has shape [H, W, C] in BGR format
                        use_gpu=True,            # Whether to use the GPU; if so, set the CUDA_VISIBLE_DEVICES environment variable first
                        output_dir='ocr_result',  # Directory where result images are saved; defaults to ocr_result
                        visualization=True,       # Whether to save the recognition result as an image file
                        box_thresh=0.5,           # Confidence threshold for text box detection
                        text_thresh=0.5)          # Confidence threshold for Chinese text recognition
    #print(results)
    new_maps=CreateMap(results,digit)
    if check_map(new_maps)  or restart:
        maps = new_maps
        restart = False
    print(maps)
    rf_clf = joblib.load("rf_clf.joblib")
    action = rf_clf.predict([maps.flatten()])[0]
    print(action)
    def findgound(maps,map_size):
        ground=[0,0]
        for i in range(map_size[0]):
            for j in range(map_size[1]):
                if maps[i,j]==groundnum:
                    ground[0]=j
                    ground[1]=i
                    print(ground)
                    return ground
    ground = findgound(maps,map_size)
    
    def up(maps):
        touch((boundary[0][0]+interval+(boundary[1][0]-boundary[0][0])//map_size[0]*ground[0],boundary[0][1]+interval+(boundary[1][1]-boundary[0][1])//map_size[1]*(ground[1]-1)))
        return swap(maps, ground,(ground[0],ground[1]-1))
    def down(maps):
        touch((boundary[0][0]+interval+(boundary[1][0]-boundary[0][0])//map_size[0]*ground[0],boundary[0][1]+interval+(boundary[1][1]-boundary[0][1])//map_size[1]*(ground[1]+1)))
        return swap(maps, ground,(ground[0],ground[1]+1))
    def left(maps):
        touch((boundary[0][0]+interval+(boundary[1][0]-boundary[0][0])//map_size[0]*(ground[0]-1),boundary[0][1]+interval+(boundary[1][1]-boundary[0][1])//map_size[1]*ground[1]))
        return swap(maps, ground,(ground[0]-1,ground[1]))
    def right(maps):
        touch((boundary[0][0]+interval+(boundary[1][0]-boundary[0][0])//map_size[0]*(ground[0]+1),boundary[0][1]+interval+(boundary[1][1]-boundary[0][1])//map_size[1]*ground[1]))
        return swap(maps, ground,(ground[0]+1,ground[1]))
    action_dict={0:down,1:up,2:left,3:right}
    maps = action_dict[action](maps)
    if check_result(maps, map_size,digit):
        break
    sleep(2.0)
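
The long index and touch expressions in the script all perform the same two conversions: screen pixel to board cell, and board cell to a point to tap. A possible refactoring, using the script's own boundary, interval and map_size values, is sketched below. The helper names cell_of and touch_point are made up here, and the clamping in cell_of is an addition that the original code does not have.

```python
MAP_SIZE = (3, 3)
BOUNDARY = [(30, 550), (1100, 1620)]   # top-left and bottom-right corners of the board
INTERVAL = 150                         # offset from a cell's corner to the point that is tapped

CELL_W = (BOUNDARY[1][0] - BOUNDARY[0][0]) // MAP_SIZE[0]
CELL_H = (BOUNDARY[1][1] - BOUNDARY[0][1]) // MAP_SIZE[1]


def cell_of(x, y):
    """Return the (col, row) of the cell containing screen pixel (x, y)."""
    col = (x - BOUNDARY[0][0]) // CELL_W
    row = (y - BOUNDARY[0][1]) // CELL_H
    # Integer division leaves a thin strip at the far edges that would map to
    # index 3; clamping (not in the original script) keeps the result in range.
    col = min(max(col, 0), MAP_SIZE[0] - 1)
    row = min(max(row, 0), MAP_SIZE[1] - 1)
    return col, row


def touch_point(col, row):
    """Return the screen point to tap for cell (col, row)."""
    x = BOUNDARY[0][0] + INTERVAL + CELL_W * col
    y = BOUNDARY[0][1] + INTERVAL + CELL_H * row
    return x, y


print(touch_point(1, 2))
print(cell_of(*touch_point(1, 2)))
```

With these helpers, up, down, left and right reduce to touch(touch_point(col, row)) for the neighboring cell followed by the swap.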

Topics: AI Deep Learning paddlepaddle