[C++ + OpenCV + Python] License plate extraction, segmentation and recognition

Posted by ricta on Tue, 08 Mar 2022 05:50:40 +0100

If you want the complete project, the GitHub link is at the end of the article.

You can see that the final recognized license plate number is Min G99999.

In fact, the winter before last I happened to think of doing a small project in C++, and implemented license plate extraction and segmentation with C++ and OpenCV, following a few blog posts. The results were not very good. With the C++ approach I only got as far as license plate extraction and character segmentation; the last step is recognizing the segmented characters. From what many experienced people had written, a CNN works best for that, so at that time I had segmentation working but could not do the recognition step.

The above shows the result of running the C++ version and the log of that small project at the time. Only license plate extraction and segmentation were achieved.

Then last year I studied deep learning with PyTorch as the framework and trained a number of CNN models, which made me think of building a CNN model with Python + OpenCV + PyTorch to solve this problem. At present, running the model on data I collected myself gives an accuracy of about 90%, and the accuracy on license plate characters I generated myself is also acceptable. (If you need the license plate character dataset for training, give me a like and send me an email.)
My approach is divided into five steps:
1. Extract the license plate from the picture
2. Separate the characters of the license plate one by one
3. Resize each character image to the required size
4. Train a license plate character classification model with a CNN
5. Run the trained model on the extracted license plate characters to get the license plate number
Step 1: extract the license plate
Here we mainly use a series of dilation and erosion operations in OpenCV to find the rectangle of the license plate, and then crop the license plate according to that rectangle.
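
To make the idea concrete, here is a minimal NumPy-only sketch of why dilating with a wide, flat kernel and then eroding merges the separate character edges of a plate into one solid blob. The helper names `dilate_h` and `erode_h` and the toy edge row are illustrative assumptions, not part of the project; the real code uses `cv2.dilate` and `cv2.erode`.

```python
import numpy as np


def dilate_h(binary, k):
    """Horizontal binary dilation: a pixel becomes 1 if any pixel in a
    window of width k centred on it is 1 (outside the image counts as 0)."""
    pad = k // 2
    padded = np.pad(binary, ((0, 0), (pad, pad)), constant_values=0)
    out = np.zeros_like(binary)
    for shift in range(k):
        out |= padded[:, shift:shift + binary.shape[1]]
    return out


def erode_h(binary, k):
    """Horizontal binary erosion: a pixel stays 1 only if every pixel in
    the window is 1 (outside the image counts as 1, so borders are not eaten)."""
    pad = k // 2
    padded = np.pad(binary, ((0, 0), (pad, pad)), constant_values=1)
    out = np.ones_like(binary)
    for shift in range(k):
        out &= padded[:, shift:shift + binary.shape[1]]
    return out


# Toy edge row: four thin vertical strokes, 4 pixels apart (made-up data)
edges = np.zeros((1, 30), dtype=np.uint8)
edges[0, [5, 9, 13, 17]] = 1

closed = erode_h(dilate_h(edges, 5), 5)
print(edges[0].tolist())
print(closed[0].tolist())   # the four strokes are now one run from column 5 to 17
```

The bounding rectangle of such a merged blob is what the contour search can then pick up as a plate candidate.
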
Step 2: split the extracted license plate into individual characters
Here we first binarize the license plate, and then segment the characters according to the projection of the pixels onto the x-axis. (The methods I wrote for the first and second steps work fairly well, but I think they should be optimized: the less noise in the extracted character images, the higher the CNN's recognition rate, so there is no need to follow my method here. Mainly... this code is so bad that I can't bear to look back at it myself.)
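
As a rough illustration of projection-based segmentation, the sketch below counts the white pixels in each column and treats every run of non-empty columns as one character. The function name `split_columns` and the toy image are assumptions made for this example; the minimum run width of 3 mirrors the threshold used in the code that follows.

```python
import numpy as np


def split_columns(binary, min_width=3):
    """Return (start, end) column ranges (end exclusive) for runs of
    non-empty columns that are at least min_width wide."""
    column_counts = (binary > 0).sum(axis=0)    # vertical projection onto the x-axis
    segments = []
    start = None
    for x, count in enumerate(column_counts):
        if count > 0 and start is None:
            start = x                            # a run of character pixels begins
        elif count == 0 and start is not None:
            if x - start >= min_width:           # runs that are too narrow are treated as noise
                segments.append((start, x))
            start = None
    # Handle a run that touches the right edge of the image
    if start is not None and len(column_counts) - start >= min_width:
        segments.append((start, len(column_counts)))
    return segments


# Toy 4x12 "plate": two characters (columns 1-3 and 6-9) and one noise column (11)
plate = np.zeros((4, 12), dtype=np.uint8)
plate[:, 1:4] = 255
plate[1:3, 6:10] = 255
plate[0, 11] = 255

print(split_columns(plate))   # prints [(1, 4), (6, 10)]
```
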

import cv2


def readjpg():

    img = cv2.imread(plate_path)
    # cv2.imshow('test', img)

    n = 1    # Scale factor; 1 keeps the original size
    img_height = img.shape[0]    # shape is (rows, cols, channels), so index 0 is the height
    img_width = img.shape[1]

    img_resize_width = round(n*img_width)
    img_resize_height = round(n*img_height)

    print(f'width:{img_width}, height:{img_height}')
    print(f'round_width:{img_resize_width}, round_height:{img_resize_height}')

    new_img_1 = cv2.resize(img, (img_resize_width, img_resize_height))    # dsize is (width, height)
    # cv2.imshow('img2', new_img_1)
    # cv2.imshow('img', img)

    # Convert the image from one color space to another (OpenCV's default channel order is often called RGB, but it is actually BGR, with the bytes reversed)
    mark = cv2.cvtColor(new_img_1, cv2.COLOR_BGR2GRAY)
    # cv2.imshow('mark', mark)

    # Apply Gaussian blur first
    mark = cv2.GaussianBlur(mark, (3, 3), sigmaX=3)
    # cv2.imshow('guss', mark)

    # Edge detection
    mark = cv2.Canny(mark, 200, 300, apertureSize=3)    # (lower threshold, upper threshold)
    # cv2.imshow('candy', mark)

    # Dilation and erosion
    kernel_X = cv2.getStructuringElement(cv2.MORPH_RECT, (20, 1))           # Wide, flat rectangular structuring element
    mark = cv2.dilate(mark, kernel_X, anchor=(-1, -1), iterations=2)        # Dilation
    mark = cv2.erode(mark, kernel_X, anchor=(-1, -1), iterations=4)         # Erosion

    kernel_Y = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 15))           # Tall, narrow rectangular structuring element
    mark = cv2.dilate(mark, kernel_X, anchor=(-1, -1), iterations=2)        # Dilation
    mark = cv2.erode(mark, kernel_Y, anchor=(-1, -1), iterations=1)         # Erosion

    mark = cv2.dilate(mark, kernel_Y, anchor=(-1, -1), iterations=2)
    mark = cv2.medianBlur(mark, 15)
    mark = cv2.medianBlur(mark, 15)

    # cv2.imshow('erode', mark)

    conyours, h = cv2.findContours(mark, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # print(len(conyours))
    find_palat_flag = False
    for index in range(len(conyours)):
        area = cv2.contourArea(conyours[index])
        print(area)
        if area > MIN_PALAT_AREA:
            rect = cv2.boundingRect(conyours[index])
            # print(rect)
            print(rect[0], rect[1], rect[2], rect[3])
            wid_div_height = rect[2]/rect[3]
            print(f'wid_div_height:{wid_div_height}')
            if wid_div_height > 3 and wid_div_height< 8:
                find_palat_flag = True
                print(rect)
                img_x = int(rect[0])
                img_y = int(rect[1])
                img_width = int(rect[2])
                img_height = int(rect[3])
                print(f'x:{img_x}, y:{img_y}, width:{img_width}, height:{img_height}')

                # NumPy slicing is img[row_start:row_end, col_start:col_end]; the origin is the top-left corner
                x_start = max(img_x - 10, 0)    # Extend the crop 10 pixels to the left without going past the image edge
                plate_img = new_img_1[img_y:img_y + img_height, x_start:img_x + img_width]
                # plate_img = cv2.cvtColor(plate_img, cv2.COLOR_BGR2HSV)
                plate_img = cv2.cvtColor(plate_img, cv2.COLOR_BGR2GRAY) # Convert to grayscale image
                # plate_img = cv2.Canny(plate_img, 450, 120, 3)           # edge detection 
                # Perform closed operation
                # kernel = np.ones((3, 3), np.uint8)
                # plate_img = cv2.morphologyEx(plate_img, cv2.MORPH_CLOSE, kernel)
                # cv2.imshow('palat2', plate_img)
                _, plate_img = cv2.threshold(plate_img, 140, 255, cv2.THRESH_BINARY)    # Binarization

                # Dilation and erosion
                kernel_X = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))  # 3x3 rectangular structuring element
                plate_img = cv2.dilate(plate_img, kernel_X, anchor=(-1, -1), iterations=1)  # Dilation
                plate_img = cv2.erode(plate_img, kernel_X, anchor=(-1, -1), iterations=1)  # Erosion

                cv2.imshow('palat3', plate_img) # Show the extracted license plate
                cv2.imwrite('palat.jpg', plate_img)

                # Split license plate
                # Vertical projection
                plate_height, plate_width = plate_img.shape[:2]    # Use the actual size of the cropped plate

                pix_list = []
                for i in range(plate_width):
                    num_pix = 0
                    for j in range(plate_height):
                        if plate_img[j][i] > 0:
                            num_pix += 1
                        # print(f'plate_img[{j}][{i}]:{plate_img[j][i]}')

                    num_pix = num_pix - 2
                    if num_pix <= 0:
                        num_pix = 0
                    print(f'num_pix:{num_pix}')

                    pix_list.append(num_pix)

                next_pix_len = 0
                index_start_list = []
                index_end_list = []
                flag_1 = True
                sum_len = 0
                sum_len_list = []
                print(f'pix_list_len:{len(pix_list)}')
                for i in range(len(pix_list)):
                    if pix_list[i] > 0:
                        sum_len += pix_list[i]
                        next_pix_len += 1
                        if flag_1:
                            index_start = i
                            index_start_list.append(index_start)
                            flag_1 = False
                    else:
                        if next_pix_len >=3:
                            sum_len_list.append(sum_len)
                            # print(f'sum_len = {sum_len}')
                            sum_len = 0
                            print(f'i:{i} next_pix_len:{next_pix_len}')
                            flag_1 = True
                            index_end_list.append(next_pix_len + index_start)
                        next_pix_len = 0
                    # print(f'index_start = {index_start}')
                # print(index_start_list)
                print(index_end_list)
                print(sum_len_list)
                print(f'len(index_end_list) = {len(index_end_list)}')
                # If more than 7 segments were found, keep the 7 with the largest pixel sums
                # (the smallest ones are most likely noise), preserving left-to-right order
                if len(sum_len_list) > 7:
                    by_size = sorted(range(len(sum_len_list)), key=lambda k: sum_len_list[k])
                    keep = sorted(by_size[-7:])
                    index_start_list = [index_start_list[k] for k in keep]
                    index_end_list = [index_end_list[k] for k in keep]
                for index_i in range(len(index_end_list)):
                    print(f'[{index_start_list[index_i]}~{index_end_list[index_i]}]')
                    # cv2.imwrite(f'{index_i}.jpg', plate_img[0:plate_height, index_start_list[index_i]:index_end_list[index_i]+2])
                    singnum_img = plate_img[0:plate_height, index_start_list[index_i]:index_end_list[index_i]+2]
                    singnum_img_width = singnum_img.shape[1]
                    singnum_img_height = singnum_img.shape[0]
                    # print(f'singnum_img width:{singnum_img_width} singnum_img height:{singnum_img_height}')
                    y_top = 0
                    y_down = 0
                    y_pix_up_flag = True
                    y_pix_down_flag = True
                    for index_num_img_y in range(singnum_img_height):
                        for index_num_img_x in range(singnum_img_width):
                            if singnum_img[index_num_img_y][index_num_img_x] > 0:
                                y_pix_down_flag = False
                                if y_pix_up_flag:
                                    y_top = index_num_img_y
                                    y_pix_up_flag = False
                            else:
                                if not y_pix_down_flag:
                                    y_down = index_num_img_y
                                    y_pix_down_flag = True
                    print(f'y_top:{y_top}  y_down:{y_down}')
                    singnum_img = singnum_img[y_top:y_down+1, 0:singnum_img_width]
                    singnum_img_width = singnum_img.shape[1]
                    singnum_img_height = singnum_img.shape[0]
                    print(f'singnum_img width:{singnum_img_width} singnum_img height:{singnum_img_height}')
                    cv2.imwrite(f'{root_path}\\single_num\\{index_i}.jpg',singnum_img)

                # (img_x, img_y) is the coordinate of the upper left corner (img_x+img_width, img_height+img_y) is the coordinate of the lower right corner, and the two diagonal points determine a rectangle
                # cv2.rectangle(new_img_1, (img_x, img_y), (img_x+img_width, img_height+img_y),  (0, 0, 255), 2)
                cv2.rectangle(new_img_1, rect, (0, 0, 255), 2)                              # Frame the recognized license plate in the image
                cv2.imshow('palat', new_img_1)

    if not find_palat_flag:
        print("Can't find palat!!!!")

    cv2.waitKey(0)
    return 0

This function covers both the first step (extracting the license plate) and the second step (segmenting it); it is poorly written.
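
The plate-candidate test buried inside `readjpg` (minimum contour area, then a width-to-height ratio between 3 and 8) can be pulled out into a small helper, which makes it easier to test on its own. This is only a sketch: the name `is_plate_candidate`, the `min_area` value of 2000 and the sample rectangles are made up for illustration, since the article does not show the value of `MIN_PALAT_AREA`.

```python
def is_plate_candidate(rect, area, min_area, min_ratio=3.0, max_ratio=8.0):
    """rect is (x, y, width, height), the layout returned by cv2.boundingRect."""
    _, _, width, height = rect
    if area <= min_area or height == 0:
        return False
    return min_ratio < width / height < max_ratio


# (bounding rect, contour area) pairs; the values are made up for illustration
candidates = [
    ((120, 300, 220, 60), 11000.0),   # ratio about 3.7: plate-like
    ((10, 10, 50, 50), 2400.0),       # square: rejected
    ((0, 400, 900, 40), 30000.0),     # ratio 22.5: too elongated
]

plates = [rect for rect, area in candidates
          if is_plate_candidate(rect, area, min_area=2000)]
print(plates)   # prints [(120, 300, 220, 60)]
```
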

Step 3: resize the character images to the required size
The main purpose here is to give all pictures the same size before they are fed into the trained model. Specifically, each picture is padded to a 4:5 aspect ratio and then resized to 32x40, which ensures that the extracted characters are not stretched or deformed by the resize.
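
The padding arithmetic can be checked without any image library. The sketch below assumes that 32 is the target width and 40 the target height (characters are taller than they are wide); the function name `padding_for_ratio` is hypothetical.

```python
def padding_for_ratio(h, w, target_w=32, target_h=40):
    """Return (top, bottom, left, right) padding so that the padded image
    has the ratio target_w:target_h, as closely as whole pixels allow."""
    top = bottom = left = right = 0
    if w * target_h < h * target_w:
        # Too narrow for the target ratio: pad left and right
        dw = round(h * target_w / target_h) - w
        left = dw // 2
        right = dw - left
    elif w * target_h > h * target_w:
        # Too flat for the target ratio: pad top and bottom
        dh = round(w * target_h / target_w) - h
        top = dh // 2
        bottom = dh - top
    return top, bottom, left, right


print(padding_for_ratio(50, 20))   # narrow character: prints (0, 0, 10, 10)
print(padding_for_ratio(30, 40))   # flat crop: prints (10, 10, 0, 0)
print(padding_for_ratio(40, 32))   # already 4:5: prints (0, 0, 0, 0)
```

In the first case a 20x50 crop becomes 40x50, which is exactly 4:5, so the later resize to 32x40 scales both axes by the same factor.
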

import os


def resize_image(image, height = IMAGE_HEIGHT, width = IMAGE_WIDTH):
    top, bottom, left, right = 0, 0, 0, 0

    h, w = image.shape[:2]    # Works for both grayscale and color images

    # Pad the short side so the image has the same width:height ratio as the target size
    if w * height < h * width:
        # Too narrow: pad left and right
        dw = round(h * width / height) - w
        left = dw // 2
        right = dw - left
    elif w * height > h * width:
        # Too short: pad top and bottom
        dh = round(w * height / width) - h
        top = dh // 2
        bottom = dh - top

    BLACK = [0, 0, 0]
    # Fill the added borders (left/right or top/bottom) with black
    constant = cv2.copyMakeBorder(image, top, bottom, left, right, cv2.BORDER_CONSTANT, value=BLACK)

    # cv2.resize takes the target size as (width, height)
    return cv2.resize(constant, (width, height))

def readpath(path_name):
    for dir_item in os.listdir(path_name):
        full_path = os.path.abspath(os.path.join(path_name, dir_item))  # Full path of the file or folder
        if os.path.isdir(full_path):    # If it is a folder, recurse into it
            readpath(full_path)
        else:
            if dir_item.endswith('.jpg'):
                image = cv2.imread(full_path)
                image = resize_image(image, height=IMAGE_HEIGHT, width=IMAGE_WIDTH)

                images.append(image)
                # print('full_path:', full_path)
                # print('dir_item:', dir_item)
                labels.append(dir_item)
    return images, labels

def load_dataset(path_name):
    images, labels = readpath(path_name)

    resizedata_path = RESIZE_IMG_PATH
    # resizedata_path = 'D:\\DeapLearn Project\\Face_Recognition\\moreface\\7219face\\test\\resizeface\\'
    for i in range(len(images)):
        if not os.path.exists(resizedata_path):
            os.mkdir(resizedata_path)
        img_name = '%s//%s' % (resizedata_path, labels[i])
        cv2.imwrite(img_name, images[i])

Step 4: train the license plate character classification model with a CNN
I found a dataset of license plate characters containing the digits 0-9, the 26 uppercase English letters A-Z and 6 provincial abbreviations:
Each class folder contains a number of binarized character pictures, and the model is trained on this dataset.

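import os

import torch
import torch.nn.functional as F
from PIL import Image
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms

# Dataset class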
class MyDataSet(Dataset):
    def __init__(self, data_path:str, transform=None):  # Incoming training sample path
        super(MyDataSet, self).__init__()
        self.data_path = data_path
        if transform is None:
            self.transform = transforms.Compose(
                [
                    transforms.Resize(size=(32, 40)), # Originally 32x40, there is no need to modify the size
                    transforms.ToTensor(),
                    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
                    # transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
                ]
            )
        else:
            self.transform = transform
        self.path_list = os.listdir(data_path)

    def __getitem__(self, idx:int):
        img_path = self.path_list[idx]
        label = int(img_path.split('.')[1])
        label = torch.as_tensor(label, dtype=torch.int64)
        img_path = os.path.join(self.data_path, img_path)
        img = Image.open(img_path).convert('RGB')    # Normalize and conv1 both expect 3 channels
        img = self.transform(img)
        return img, label

    def __len__(self)->int:
        return len(self.path_list)


train_ds = MyDataSet(train_path)
test_data = MyDataSet(test_path)
# for i, item in enumerate(tqdm(train_ds)):
#     print(item)
#     break

# Data loading
new_train_loader = DataLoader(train_ds, batch_size=32, shuffle=True, pin_memory=True, num_workers=0)
new_test_loader = DataLoader(test_data, batch_size=32, shuffle=False, pin_memory=True, num_workers=0)

# for i, item in enumerate(new_train_loader):
#     print(item[0].shape)
#     break
#
# img_PIL_Tensor = train_ds[1][0]
# new_img_PIL = transforms.ToPILImage()(img_PIL_Tensor).convert('RGB')
# plt.imshow(new_img_PIL)
# plt.show()


# Define the network
class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = torch.nn.Conv2d(3, 64, kernel_size=3, padding=1)
        self.conv2 = torch.nn.Conv2d(64, 64, kernel_size=3, padding=1)
        self.conv3 = torch.nn.Conv2d(64, 128, kernel_size=3, padding=1)
        self.conv4 = torch.nn.Conv2d(128, 128, kernel_size=3, padding=1)
        self.conv5 = torch.nn.Conv2d(128, 256, kernel_size=3, padding=1)
        self.conv6 = torch.nn.Conv2d(256, 256, kernel_size=3, padding=1)
        self.maxpooling = torch.nn.MaxPool2d(2)
        self.avgpool = torch.nn.AvgPool2d(2)
        self.globalavgpool = torch.nn.AvgPool2d((8, 10))
        self.bn1 = torch.nn.BatchNorm2d(64)
        self.bn2 = torch.nn.BatchNorm2d(128)
        self.bn3 = torch.nn.BatchNorm2d(256)
        self.dropout50 = torch.nn.Dropout(0.5)
        self.dropout10 = torch.nn.Dropout(0.1)

        self.fc1 = torch.nn.Linear(256, 40)

    def forward(self, x):
        batch_size = x.size(0)
        x = self.bn1(F.relu(self.conv1(x)))
        x = self.bn1(F.relu(self.conv2(x)))
        x = self.maxpooling(x)
        x = self.dropout10(x)
        x = self.bn2(F.relu(self.conv3(x)))
        x = self.bn2(F.relu(self.conv4(x)))
        x = self.maxpooling(x)
        x = self.dropout10(x)
        x = self.bn3(F.relu(self.conv5(x)))
        x = self.bn3(F.relu(self.conv6(x)))
        x = self.globalavgpool(x)
        x = self.dropout50(x)

        x = x.view(batch_size, -1)

        x = self.fc1(x)
        return x
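
It can help to trace the tensor shape through `Net` by hand to see why `AvgPool2d((8, 10))` and `Linear(256, 40)` fit together. The sketch below does that arithmetic in plain Python, assuming the 32x40 (height x width) input produced by `transforms.Resize(size=(32, 40))`; the function name `trace_shapes` is made up for this illustration.

```python
def trace_shapes(height=32, width=40):
    """Follow the (channels, height, width) of one sample through the layers of Net."""
    h, w = height, width
    shapes = [("input", (3, h, w))]
    # 3x3 convolutions with padding=1 keep height and width unchanged
    shapes.append(("conv1/conv2", (64, h, w)))
    h, w = h // 2, w // 2                      # MaxPool2d(2) halves both sides
    shapes.append(("maxpool", (64, h, w)))
    shapes.append(("conv3/conv4", (128, h, w)))
    h, w = h // 2, w // 2
    shapes.append(("maxpool", (128, h, w)))
    shapes.append(("conv5/conv6", (256, h, w)))
    h, w = h // 8, w // 10                     # AvgPool2d((8, 10)) over an 8x10 map
    shapes.append(("globalavgpool", (256, h, w)))
    shapes.append(("flatten", (256 * h * w,)))
    return shapes


for name, shape in trace_shapes():
    print(f"{name:14s} {shape}")
```

With a 32x40 input the feature map is 8x10 before the last pooling layer, so the pooled output is 256x1x1 and flattens to the 256 features that `fc1` expects. A different input size would need a different pooling kernel.
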

Step 5: run the trained model on the extracted license plate characters to get the license plate number
Here we directly load the model trained in the previous step, and then feed the resized license plate character images into the model to get the predicted license plate number.
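
The decoding step itself is just an argmax per character followed by a table lookup. Here is a minimal sketch in plain Python; `CHAR_LIST` is a hypothetical stand-in containing only digits and uppercase letters, because the project's real `SINGLE_CHAR_LIST` (including the provincial abbreviations) is not shown in this article, and the scores are fabricated one-hot vectors rather than real model output.

```python
import string

# Hypothetical class list: digits followed by uppercase letters
CHAR_LIST = list(string.digits + string.ascii_uppercase)


def decode_plate(scores_per_char, char_list=CHAR_LIST):
    """scores_per_char: one list of class scores per segmented character, left to right."""
    chars = []
    for scores in scores_per_char:
        best = max(range(len(scores)), key=scores.__getitem__)   # argmax over the classes
        chars.append(char_list[best])
    return "".join(chars)


def one_hot(ch):
    """Build fake scores in which the class of ch has the highest score."""
    scores = [0.0] * len(CHAR_LIST)
    scores[CHAR_LIST.index(ch)] = 1.0
    return scores


print(decode_plate([one_hot(c) for c in "G99999"]))   # prints G99999
```
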

def test():
    with torch.no_grad():
        for inputs, _ in new_test_loader:
            inputs = inputs.to(device)
            outputs = model(inputs)
            # print(outputs.shape)
            _, prediction = torch.max(outputs, dim=1)   # Index of the highest score for each character
            print('-'*40)
            # print(prediction)
            # Each batch is assumed to hold the 7 characters of one plate, in order
            plate = ''.join(SINGLE_CHAR_LIST[idx] for idx in prediction[:7].tolist())
            print(f'Predicted license plate number:{plate}')

As you can see, the predicted license plate number is Min G99999.
It matches the license plate in the input picture.
That is the whole process of license plate recognition. I have only posted the code for some key steps here. If you need the whole project, you can get it from GitHub.
Reference articles:
A C++ implementation of license plate recognition, by the blogger "singing er"
"PyTorch Deep Learning Practice" course by Teacher Liu on Bilibili

Project GitHub: GitHub link (a star would be appreciated, thank you)
Send me an email if you want the dataset: 1009088103@qq.com
