aircv is a small open source project released by NetEase, and it is probably the most cited project for simple image matching. The last article described how to use aircv's find_template; however, it is not a mature project, and there are quite a few small pitfalls that need fixing. Today, let's walk through its code logic.
Core functions: find_template and find_all_template
find_template returns the single best matching result (whose position is not necessarily the topmost one), while find_all_template returns all results whose confidence exceeds the given threshold.
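To make the contrast concrete, here is a hypothetical mock of the two return conventions described above (my own sketch, not aircv's actual code; the real functions operate on images, not prepared match lists):

```python
# Hypothetical mock of the two return conventions (not aircv code):
# find_all_template returns match dicts above the threshold, best first;
# find_template returns the first of them, or None.

def find_all_template_mock(matches, threshold=0.5, maxcnt=0):
    hits = sorted((m for m in matches if m['confidence'] >= threshold),
                  key=lambda m: m['confidence'], reverse=True)
    return hits[:maxcnt] if maxcnt else hits

def find_template_mock(matches, threshold=0.5):
    result = find_all_template_mock(matches, threshold, 1)
    return result[0] if result else None

matches = [
    {'result': (10, 20), 'confidence': 0.9},
    {'result': (50, 60), 'confidence': 0.6},
    {'result': (70, 80), 'confidence': 0.3},   # below threshold, dropped
]
print(len(find_all_template_mock(matches)))    # 2
print(find_template_mock(matches)['result'])   # (10, 20)
```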
For example, searching for a small template image in a page screenshot.
The results are shown in the figure below:
Let's dig into the code. find_template is written as follows:
```python
def find_template(im_source, im_search, threshold=0.5, rgb=False, bgremove=False):
    '''
    @return find location
    if not found; return None
    '''
    result = find_all_template(im_source, im_search, threshold, 1, rgb, bgremove)
    return result[0] if result else None
```
Good heavens! It just calls find_all_template and takes the first of the returned values...
So find_all_template is the real core. Setting aside the irrelevant code, let's look at the most critical part:
```python
def find_all_template(im_source, im_search, threshold=0.5, maxcnt=0, rgb=False, bgremove=False):
    # Matching algorithm: aircv hard-codes TM_CCOEFF_NORMED, which does
    # perform best in most tests
    method = cv2.TM_CCOEFF_NORMED
    # Get the matching matrix
    res = cv2.matchTemplate(im_source, im_search, method)
    w, h = im_search.shape[1], im_search.shape[0]
    result = []
    while True:
        # Find the max/min of the match matrix
        min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
        top_left = max_loc
        if max_val < threshold:
            break
        # Calculate the center point
        middle_point = (top_left[0] + w / 2, top_left[1] + h / 2)
        # Calculate the four corner points and store them in the result set
        result.append(dict(
            result=middle_point,
            rectangle=(top_left,
                       (top_left[0], top_left[1] + h),
                       (top_left[0] + w, top_left[1]),
                       (top_left[0] + w, top_left[1] + h)),
            confidence=max_val
        ))
        # Fill out the best matching area and continue to find the next one
        cv2.floodFill(res, None, max_loc, (-1000,),
                      max_val - threshold + 0.1, 1,
                      flags=cv2.FLOODFILL_FIXED_RANGE)
    return result
```
Here, cv2.matchTemplate is an OpenCV function. It returns a matrix: conceptually, the small image slides over the large image one pixel at a time, starting from the top-left corner, and a match score is computed at each position, forming the result matrix.
The result matrix has size (W - w + 1) x (H - h + 1), where W and H are the width and height of the large image, and w and h are the width and height of the small image.
The point with the maximum value in the matrix is the position where placing the small image's top-left corner gives the highest matching degree; that yields the first match.
Finally, cv2.floodFill overwrites the region around the current maximum of the result matrix with another value, so that the next maximum can be found and overlapping matches are avoided (otherwise, the next maximum might fall inside the region just found).
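The sliding-window behaviour described above can be sketched in plain numpy. Note this toy uses a negative sum-of-squared-differences score rather than OpenCV's TM_CCOEFF_NORMED; it is only meant to illustrate the result-matrix shape and where the maximum lands:

```python
import numpy as np

def naive_match(im_source, im_search):
    # Slide the small image over the large one, one pixel at a time,
    # scoring each top-left position (here with negative sum of squared
    # differences) -- a toy stand-in for cv2.matchTemplate
    H, W = im_source.shape
    h, w = im_search.shape
    res = np.empty((H - h + 1, W - w + 1))
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            patch = im_source[y:y + h, x:x + w]
            res[y, x] = -np.sum((patch - im_search) ** 2)
    return res

source = np.zeros((8, 10))
source[2:4, 3:5] = 1.0      # plant the 2x2 template with top-left at (3, 2)
templ = np.ones((2, 2))

res = naive_match(source, templ)
print(res.shape)            # (7, 9), i.e. (H - h + 1, W - w + 1)
y, x = np.unravel_index(np.argmax(res), res.shape)
print((x, y))               # (3, 2): the top-left corner of the best match
```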
A few small problems
- Grayscale images and images with transparent channels are not supported
In fact, OpenCV's matchTemplate originally works on grayscale images, but in most cases we are searching color images, so aircv's wrapper splits the image into its three BGR channels, calls matchTemplate on each, and then merges the results.
However, this wrapper is no longer compatible with plain grayscale input, which raises an error, and a source image with an alpha (transparency) channel errors out as well. I submitted a PR for this, but it has not been processed; it seems nobody maintains the project anymore.
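The general shape of such a fix is to normalize any input to 3-channel BGR before the channel split. The following is my own sketch of that idea, not the code from the PR:

```python
import numpy as np

def to_bgr(img):
    # Normalize an input image to 3-channel BGR so that a wrapper which
    # splits B/G/R channels does not crash on grayscale or BGRA input.
    # A sketch only -- the actual PR may differ.
    if img.ndim == 2:              # grayscale: replicate the single channel
        return np.stack([img] * 3, axis=-1)
    if img.shape[2] == 4:          # BGRA: drop the alpha channel
        return img[:, :, :3]
    return img                     # already 3-channel, pass through

gray = np.zeros((4, 4), dtype=np.uint8)
bgra = np.zeros((4, 4, 4), dtype=np.uint8)
print(to_bgr(gray).shape)  # (4, 4, 3)
print(to_bgr(bgra).shape)  # (4, 4, 3)
```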
- The flood fill looks fancier than necessary; a numpy slice assignment could achieve the same effect. Let's write a small piece of code to verify this.
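A minimal sketch of the slice-based suppression, on a hand-made score matrix (the window size and the -1 sentinel are my assumptions, not aircv code):

```python
import numpy as np

# Simulated match-score matrix (in aircv this comes from cv2.matchTemplate)
res = np.array([
    [0.1, 0.2, 0.1, 0.1],
    [0.2, 0.9, 0.3, 0.1],
    [0.1, 0.3, 0.2, 0.1],
])
h, w = 2, 2   # pretend template size: suppress a window of roughly h x w

# Find the best match, then knock out its neighbourhood with a plain
# slice assignment instead of cv2.floodFill
y, x = np.unravel_index(np.argmax(res), res.shape)
best = res[y, x]
res[max(y - h + 1, 0):y + h, max(x - w + 1, 0):x + w] = -1.0

print(best)       # 0.9
print(res.max())  # 0.1 -- the best remaining score outside the window
```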
- Cannot handle scaling of pictures
The template image and the source image are not necessarily at the same scale, so you need to resize one of them and try matching at several scales.
- The image matching effect is not ideal
For pixel-identical images, find_template works very well; but when fuzzy matching is needed, images that a human eye would immediately recognize as the same sometimes fail to match, and false positives occur occasionally. This could be improved from two directions:
- Change the algorithm: feature-point detection algorithms such as SIFT can handle matching images with different sizes and angles; commercial software such as Halcon uses shape-based matching, which performs even better.
- Add information: sometimes, besides the template image itself, we know more, such as the likely location and range of the target, or the style of the area around it; this extra information can improve recognition accuracy and reduce both misses and false positives.
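The multi-scale retry mentioned in the scaling bullet can be sketched end to end in numpy. Everything here is a toy stand-in: nearest-neighbour resize in place of cv2.resize, and a negative-SSD score in place of matchTemplate + minMaxLoc:

```python
import numpy as np

def resize_nn(img, scale):
    # Nearest-neighbour resize: a stand-in for cv2.resize in this sketch
    H, W = img.shape
    nh = max(int(round(H * scale)), 1)
    nw = max(int(round(W * scale)), 1)
    ys = np.minimum((np.arange(nh) / scale).astype(int), H - 1)
    xs = np.minimum((np.arange(nw) / scale).astype(int), W - 1)
    return img[np.ix_(ys, xs)]

def best_score(source, templ):
    # Best (least-negative) sum-of-squared-differences over all top-left
    # positions: a toy stand-in for matchTemplate + minMaxLoc
    H, W = source.shape
    h, w = templ.shape
    if h > H or w > W:
        return -np.inf
    return max(
        -np.sum((source[y:y + h, x:x + w] - templ) ** 2)
        for y in range(H - h + 1)
        for x in range(W - w + 1)
    )

# The object appears in the screenshot at half the size it was recorded at
source = np.zeros((10, 10))
source[2:4, 2:4] = np.array([[1.0, 2.0], [3.0, 4.0]])
templ = resize_nn(source[2:4, 2:4], 2.0)   # template recorded at 2x size

# Try the template at several scales and keep the best-scoring one
scores = {s: best_score(source, resize_nn(templ, s)) for s in (0.5, 1.0, 2.0)}
best_scale = max(scores, key=scores.get)
print(best_scale)   # 0.5: shrinking the template back yields an exact match
```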