Extracting red text from an image with Auto.js (image binarization + Baidu Intelligent Cloud OCR API call)

Posted by davard on Sat, 18 Dec 2021 15:29:49 +0100

1, Effect

screenshot

2, Process

1. Preparation - screenshot - extract main color

First crop a sample by hand, for example the three red characters ("Party branch") in the upper left of the screenshot, and extract its main color with the online tool https://palettegenerator.com/. For this red text the main color is #C71718.

Taking the screenshot itself in Auto.js is straightforward, so I won't go into detail here. If you have any questions, leave a comment.

//Request screenshot
if(!requestScreenCapture()){
    toast("Failed to request screenshot");
    exit();
}
//Save to memory card directory
captureScreen("/sdcard/1.png");

2. Image binarization

With the main color in hand, binarize the picture with the Auto.js function images.interval(img, "#C71718", 70).

This binarizes the picture: colors outside the range color - interval ~ color + interval become 0, and colors inside the range become 255. The addition and subtraction are applied to each channel separately.

For example, in images.interval(img, "#888888", 16) the value of each channel is 0x88, and the range after adding and subtracting 16 is [0x78, 0x98]. This call therefore turns colors in #787878~#989898 into #FFFFFF and everything outside that range into #000000.
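To make the per-channel arithmetic concrete, here is a small plain-JavaScript sketch that computes the accepted range for a color and an interval and tests a pixel against it. The helper names (hexToRgb, channelRanges, inInterval) are hypothetical and not part of Auto.js, and clamping to 0..255 with inclusive bounds is my assumption about how the range behaves for 8-bit channels.

```javascript
// Sketch only: mirrors the "color - interval ~ color + interval" rule
// described above. None of these helpers exist in Auto.js.

// Parse "#RRGGBB" into [r, g, b].
function hexToRgb(hex) {
    var m = /^#?([0-9a-f]{6})$/i.exec(hex);
    if (!m) {
        throw new Error("Expected a color like #RRGGBB, got: " + hex);
    }
    var n = parseInt(m[1], 16);
    return [(n >> 16) & 0xff, (n >> 8) & 0xff, n & 0xff];
}

// For each channel, return [low, high], clamped to the 8-bit range
// (assumption: values outside 0..255 cannot occur in a pixel anyway).
function channelRanges(hex, interval) {
    return hexToRgb(hex).map(function (v) {
        return [Math.max(0, v - interval), Math.min(255, v + interval)];
    });
}

// True if every channel of pixelHex lies inside the range (bounds inclusive).
function inInterval(pixelHex, centerHex, interval) {
    var p = hexToRgb(pixelHex);
    return channelRanges(centerHex, interval).every(function (r, i) {
        return p[i] >= r[0] && p[i] <= r[1];
    });
}
```

For #C71718 with an interval of 70 this gives a red channel of 129 to 255 and green and blue channels of roughly 0 to 94, which is why black and white pixels both fall outside the range while the red text stays inside it.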

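var img = images.read("/sdcard/Pictures/QQ/c.jpg");

// 70 worked for me; values between 60 and 90 are worth trying
var bin = images.interval(img, "#C71718", 70);
images.save(bin, "./Neighborhood yyds.jpg", "jpg", 100);

// Recycle both the source picture and the binarized picture
img.recycle();
bin.recycle();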

3. Character recognition

I covered Baidu Intelligent Cloud character recognition from Python in an earlier article.
Here we call the Baidu OCR character recognition API from Auto.js.

Official documents: https://cloud.baidu.com/doc/OCR/s/Ek3h7xypm

1. Register an account (covered in the previous article)

2. Obtain the API Key and Secret Key

3. Call the API

https://cloud.baidu.com/doc/OCR/s/Ck3h7y2ia

Request method: POST

Return format: JSON

Request restriction: the image must be base64-encoded and then urlencoded before it is sent
Supported image formats: PNG, JPG, JPEG, BMP, TIFF, PNM, WebP
Size limits: after base64 encoding and urlencoding the image must not exceed 4M; the shortest side must be at least 15px and the longest side at most 4096px

var img64 = images.toBase64(img, "png", 100); //The picture is base64 encoded
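Before sending the picture it can be worth checking it against the size limits listed above. The following is a minimal sketch in plain JavaScript; the function names are hypothetical, and reading "4M" as 4 * 1024 * 1024 bytes is my assumption, not something the documentation excerpt states.

```javascript
// Sketch only: pre-flight checks for the documented limits.
// Assumption: "4M" means 4 * 1024 * 1024 bytes.
var MAX_ENCODED_BYTES = 4 * 1024 * 1024;
var MIN_SIDE_PX = 15;
var MAX_SIDE_PX = 4096;

// Size of the base64 string after urlencoding. The encoded string is
// pure ASCII, so its length equals its size in bytes.
function encodedSize(base64) {
    return encodeURIComponent(base64).length;
}

// True if the encoded image stays within the size limit.
function fitsSizeLimit(base64) {
    return encodedSize(base64) <= MAX_ENCODED_BYTES;
}

// True if the shortest and longest sides are within the pixel limits.
function sidesOk(width, height) {
    return Math.min(width, height) >= MIN_SIDE_PX &&
        Math.max(width, height) <= MAX_SIDE_PX;
}
```

Note that base64 characters such as +, / and = each grow to three characters when urlencoded, so the encoded size is somewhat larger than the base64 string itself.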

Call method 1: request URL data format

Send a POST request to the API service address; the URL must carry the parameter access_token.

Note: the access_token is valid for 30 days and must be refreshed every 30 days.

See "Obtaining the Access Token" in the documentation.
Send a request (POST is recommended) to the authorization service address https://aip.baidubce.com/oauth/2.0/token with the following parameters in the URL:
grant_type: required, fixed to client_credentials;
client_id: required, the API Key of your application;
client_secret: required, the Secret Key of your application;

    var API_Key="Enter your own API_Key";
    var Secret_Key="Enter your own Secret_Key";
    //Send request to authorized service address
    var getTokenUrl="https://aip.baidubce.com/oauth/2.0/token";
    var token_Res = http.post(getTokenUrl, {
        grant_type: "client_credentials",
        client_id: API_Key,
        client_secret: Secret_Key,
    });
    var access_token=token_Res.body.json().access_token;

The JSON returned by the server contains the following fields:
access_token: the Access Token to use;
expires_in: the validity period of the Access Token, in seconds (30 days);
The other fields can be ignored for now.
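Since the token stays valid for 30 days, there is no need to request a new one on every run. Below is a hedged sketch of the bookkeeping in plain JavaScript; cacheEntry, needsRefresh and the one-hour safety margin are all hypothetical choices of mine, and only access_token and expires_in come from the response described above.

```javascript
// Sketch only: decide whether a cached access_token is still usable.
// The one-hour margin is an arbitrary safety buffer, not a Baidu rule.
var REFRESH_MARGIN_MS = 60 * 60 * 1000;

// Build a cache entry from the token response and the time it was obtained.
// expires_in is in seconds, so it is converted to milliseconds.
function cacheEntry(tokenJson, obtainedAtMs) {
    return {
        access_token: tokenJson.access_token,
        expiresAtMs: obtainedAtMs + tokenJson.expires_in * 1000
    };
}

// True if there is no usable cached token or it is about to expire.
function needsRefresh(cached, nowMs) {
    if (!cached || !cached.access_token) {
        return true;
    }
    return nowMs >= cached.expiresAtMs - REFRESH_MARGIN_MS;
}
```

In an Auto.js script the entry could be persisted between runs, for example with the storages module, and the token request repeated only when needsRefresh(entry, Date.now()) returns true.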

Take general character recognition (high precision version) as an example
https://cloud.baidu.com/doc/OCR/s/1k3h7y3db

    // The documentation asks for access_token in the URL
    var ocrUrl = "https://aip.baidubce.com/rest/2.0/ocr/v1/accurate_basic?access_token=" + access_token;
    var ocr_Res = http.post(ocrUrl, {
        image: img64	// Base64 picture; the API accepts url or pdf_file as alternatives
    }, {
        // Request options are the third argument of http.post, not part of the form data
        headers: {
            "Content-Type": "application/x-www-form-urlencoded"
        }
    });
    var json = ocr_Res.body.json();

Required: the Content-Type header, access_token, and image (or url / pdf_file instead of image); see the documentation for the other parameters.

4. Complete code

//Baidu Intelligent Cloud character recognition OCR interface call
function Baidu_ocr(img){
    log("Call Baidu ocr:");
    /**** Encode pictures******/
    var img64 = images.toBase64(img, "png", 100);

    /**** Get access_token ******/
    var API_Key="Enter your own API_Key";
    var Secret_Key="Enter your own Secret_Key";
    //Send request to authorized service address
    var getTokenUrl="https://aip.baidubce.com/oauth/2.0/token";
    var token_Res = http.post(getTokenUrl, {
        grant_type: "client_credentials",
        client_id: API_Key,
        client_secret: Secret_Key,
    });
    var access_token=token_Res.body.json().access_token;

    /**** Call ocr******/
    //High precision character recognition
    // The documentation asks for access_token in the URL
    var ocrUrl = "https://aip.baidubce.com/rest/2.0/ocr/v1/accurate_basic?access_token=" + access_token;
    var ocr_Res = http.post(ocrUrl, {
        image: img64	// Base64 picture; the API accepts url or pdf_file as alternatives
    }, {
        // Request options are the third argument of http.post, not part of the form data
        headers: {
            "Content-Type": "application/x-www-form-urlencoded"
        }
    });
    var json = ocr_Res.body.json();

    /**** Return results******/
    return json;
}
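The function above returns the raw JSON, so the caller still has to deal with failures. Here is a minimal sketch of turning that JSON into a string; extractWords is a hypothetical helper, and it assumes that a failed call returns error_code and error_msg fields instead of words_result, as described in Baidu's error code documentation.

```javascript
// Sketch only: convert the OCR response into one string.
// Assumption: failures carry error_code / error_msg instead of words_result.
function extractWords(res) {
    if (!res) {
        throw new Error("Empty OCR response");
    }
    if (res.error_code !== undefined) {
        throw new Error("OCR failed: " + res.error_code + " " + res.error_msg);
    }
    // Each entry of words_result holds one recognized line in its words field.
    var items = res.words_result || [];
    return items.map(function (item) {
        return item.words;
    }).join("");
}
```

A caller could wrap extractWords(Baidu_ocr(img)) in try/catch and log the message, which makes an expired token or a quota problem visible instead of silently producing an empty result.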

3, My trial-and-error process

1. Binarization

var img = images.read("/sdcard/Pictures/QQ/c.jpg");	//Read a picture

var c = [];	// Non-white color values found in the picture
var cc = 0;	// Number of color values recorded
// Read the color value at each point (x, y) of the top-left 250x20 region
for (var x = 0; x < 250; x++) {
    for (var y = 0; y < 20; y++) {
        var color = images.pixel(img, x, y);
        // Log and record the color if it is not white
        if (colors.toString(color) != "#ffffffff") {
            log(x, y, colors.toString(color));

            c[cc] = colors.toString(color);
            cc = cc + 1;
        }
    }
}

// Grayscale
var img1 = images.grayscale(img);
images.save(img1, "./Grayscale.jpg", "jpg", 100);
// Change all values greater than 19 to 255 and the rest to 0
// (a cutoff around 20 worked best: at 25 the black text shows up, at 15 the red text is unclear)
var img2 = images.threshold(img, 19, 255, "BINARY");
images.save(img2, "./threshold.jpg", "jpg", 100);
// Turn the colors inside the range white
var img3 = images.inRange(img, "#C11719", "#FCBDBB");	// other values noted while testing: fffffafa, ffffdddc
images.save(img3, "./Range.jpg", "jpg", 100);


/* Note: C11719 / 100, the black text shows up
var img4 = images.interval(img, "#C11719", 95);
images.save(img4, "./4.jpg", "jpg", 100); */

// Best result: #C71718 with interval 70 (60-90 is usable)
var img4 = images.interval(img, "#C71718", 70);
images.save(img4, "./Neighborhood yyds.jpg", "jpg", 100);

/* Note: C11719 / 90, the red text is not clear
var img4 = images.interval(img, "#C11719", 55);
images.save(img4, "./6.jpg", "jpg", 100); */

img.recycle();	// Recycle the picture; images from captureScreen() do not need this

To be honest, I don't know much about RGB, hexadecimal color values, or gray values. I arrived at the settings above by trial and error, so they are not universal; treat them as a reference only.
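The pixel loop at the start of this section only collects the non-white colors into the array c. As a rough way to turn that array into a starting color without an online tool, the colors can be tallied and the most frequent one picked. This is a sketch in plain JavaScript with a hypothetical helper name, not what the palette website does.

```javascript
// Sketch only: return the most frequent color string in an array such as c.
// Comparison is case-insensitive; on a tie the color that reached the
// highest count first wins. Returns null for an empty array.
function dominantColor(colorStrings) {
    var counts = {};
    var best = null;
    colorStrings.forEach(function (s) {
        var key = s.toLowerCase();
        counts[key] = (counts[key] || 0) + 1;
        if (best === null || counts[key] > counts[best]) {
            best = key;
        }
    });
    return best;
}
```

Anti-aliased text produces many slightly different shades along the character edges, so an exact-match tally like this is crude; it gives a center color to start from, and the interval still has to be tuned by hand as described above.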

2. Character recognition

Follow the documentation step by step:

1. Encode the picture
2. Get the access_token
3. Get the recognition result

Although there is a fair amount of code, it is easy to follow. The request parameters and the returned fields are clearly described in the documentation.

4, Complete code

var img = images.read("/sdcard/Pictures/QQ/c.jpg");	// Replace with your own screenshot
log("Read picture");

var bin = images.interval(img, "#C71718", 70);	// 70 worked best; 60-90 is usable
images.save(bin, "./Neighborhood yyds.jpg", "jpg", 100);
log("Image binarization");

var res = Baidu_ocr(bin);	// Call Baidu OCR on the binarized picture

// Recycle the pictures only after OCR has used them
// (images from captureScreen() do not need this)
img.recycle();
bin.recycle();

/* Concatenate the recognition results */
var words = "";
for (var i = 0; i < res.words_result_num; i++) {
    words = words + res.words_result[i].words;
}
log("Recognition result: " + words);




//Baidu Intelligent Cloud character recognition OCR interface call
function Baidu_ocr(img){
    log("Call Baidu ocr:");
    /**** Encode pictures******/
    var img64 = images.toBase64(img, "jpg", 100);//Convert screenshots

    /**** Get access_token ******/
    var API_Key="Enter your own API_Key";
    var Secret_Key="Enter your own Secret_Key";
    //Send request to authorized service address
    var getTokenUrl="https://aip.baidubce.com/oauth/2.0/token";
    var token_Res = http.post(getTokenUrl, {
        grant_type: "client_credentials",
        client_id: API_Key,
        client_secret: Secret_Key,
    });
    var access_token=token_Res.body.json().access_token;

    /**** Call ocr******/
    //High precision character recognition
    // The documentation asks for access_token in the URL
    var ocrUrl = "https://aip.baidubce.com/rest/2.0/ocr/v1/accurate_basic?access_token=" + access_token;
    var ocr_Res = http.post(ocrUrl, {
        image: img64	// Base64 picture; the API accepts url or pdf_file as alternatives
    }, {
        // Request options are the third argument of http.post, not part of the form data
        headers: {
            "Content-Type": "application/x-www-form-urlencoded"
        }
    });
    var json = ocr_Res.body.json();

    /**** Return results******/
    return json;
}

Topics: autojs
