PaddleDetection algorithm analysis (12)

Posted by Ty44ler on Mon, 27 Dec 2021 08:53:00 +0100

2021SC@SDUSC

Next, the competition champion model in PaddleDetection is analyzed.

CascadeCA RCNN is the best single model from Baidu's visual technology department in the Google AI Open Images 2019 Object Detection competition. This single model helped the team take second place among more than 500 participating teams. Open Images Dataset V5 (OIDV5) contains 500 categories, 1.73 million training images, and more than 14 million annotated bounding boxes, making it the largest known public dataset for object detection. The dataset address is: Open Images V6. The team's technical report for the competition: https://arxiv.org/pdf/1911.07171.pdf

1. Model principle

A two-stage model first predicts candidate boxes (proposals) for the target objects. Whether a candidate box counts as a positive sample, and whether it is retained, is generally decided by computing its intersection over union (IoU) with the ground-truth box. The IoU threshold is commonly set to 0.5, but this setting tends to let many invalid, low-quality boxes through, as shown in the left figure below. With the threshold set to 0.7, the detections are cleaner, as shown in the right figure.

However, what problems does a threshold of 0.7 bring? It inevitably misses some target boxes, especially small targets. At the same time, because there are fewer positive samples, the model easily overfits.
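To make the trade-off concrete, here is a minimal sketch that computes IoU for axis-aligned boxes and counts how many candidate boxes remain positive at each threshold. The box coordinates and the helper names `iou` and `count_positives` are made up for illustration; they are not taken from PaddleDetection or mmdetection.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so that disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_positives(proposals, gt_box, threshold):
    """Number of proposals whose IoU with the ground truth reaches the threshold."""
    return sum(1 for p in proposals if iou(p, gt_box) >= threshold)


# Hypothetical ground-truth box and candidate boxes.
gt_box = (0, 0, 100, 100)
proposals = [
    (0, 0, 100, 100),    # exact match, IoU = 1.0
    (10, 10, 110, 110),  # slightly shifted, IoU about 0.68
    (0, 0, 100, 60),     # covers part of the object, IoU = 0.6
    (50, 50, 150, 150),  # weak overlap, IoU about 0.14
]

for thr in (0.5, 0.7):
    print(thr, count_positives(proposals, gt_box, thr))
# prints "0.5 3" and then "0.7 1"
```

In this toy data, raising the threshold from 0.5 to 0.7 cuts the positives from three to one, which is the shortage of positive samples described above.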

The focus of Cascade RCNN is to solve this IoU threshold problem. It does so with a cascaded detection scheme, as shown in subfigure (d) below:

Figure (d) clearly shows the cascade structure. Compared with figure (b), each stage uses a different IoU threshold; the usual settings are 0.5, 0.6, and 0.7.
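The idea of rising thresholds can be sketched with a toy cascade. In the sketch below, `refine` is a hypothetical stand-in for a stage's learned box regressor: it simply moves the box halfway toward the ground truth, which a real network does not do exactly. The only point is that if each stage outputs a better-localized box, the next stage can apply a stricter threshold without losing the sample.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def refine(box, gt_box, step=0.5):
    """Toy regressor (assumption): move each coordinate part of the way to the ground truth."""
    return tuple(b + step * (g - b) for b, g in zip(box, gt_box))


def run_cascade(proposal, gt_box, thresholds=(0.5, 0.6, 0.7)):
    """Return (threshold, IoU at stage input, is_positive) for every stage."""
    history = []
    box = proposal
    for thr in thresholds:
        overlap = iou(box, gt_box)
        history.append((thr, overlap, overlap >= thr))
        # The refined box becomes the input of the next stage.
        box = refine(box, gt_box)
    return history


# Hypothetical proposal whose initial IoU is about 0.57.
for thr, overlap, positive in run_cascade((15, 15, 115, 115), (0, 0, 100, 100)):
    print(f"threshold={thr} iou={overlap:.3f} positive={positive}")
```

With these made-up numbers the proposal would be rejected by a single detector trained at 0.7, yet it stays positive through all three stages because its IoU grows after each refinement.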

In this way, the candidate boxes are refined stage by stage. For example, the configuration of Cascade RCNN in mmdetection is as follows:

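    # Excerpt from the training config: one dict per cascade stage,
    # with IoU thresholds 0.5, 0.6 and 0.7.
    rcnn=[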
        dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.5,
                neg_iou_thr=0.5,
                min_pos_iou=0.5,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False),
        dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.6,
                neg_iou_thr=0.6,
                min_pos_iou=0.6,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False),
        dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.7,
                min_pos_iou=0.7,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False)
    ],
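As a rough reading aid for the three stage entries, the following sketch mimics how the thresholds and the sampling budget act on a set of proposals. It is a simplification written for this post, not mmdetection's code: the functions `assign_labels` and `sample_budget` are hypothetical, the `min_pos_iou` fallback match and `add_gt_as_proposals` are left out, and the IoU values are invented.

```python
def assign_labels(max_ious, pos_iou_thr, neg_iou_thr):
    """Label each proposal from its best IoU with any ground-truth box.

    Returns 1 for positive, 0 for negative and -1 for ignored
    (only possible when neg_iou_thr is lower than pos_iou_thr).
    """
    labels = []
    for value in max_ious:
        if value >= pos_iou_thr:
            labels.append(1)
        elif value < neg_iou_thr:
            labels.append(0)
        else:
            labels.append(-1)
    return labels


def sample_budget(num_pos, num_neg, num=512, pos_fraction=0.25):
    """How many positives and negatives fit in a budget of `num` samples.

    Positives are capped at num * pos_fraction; negatives fill what is left.
    No extra cap on the negative/positive ratio is applied here, which is
    how neg_pos_ub=-1 is read in this sketch.
    """
    max_pos = int(num * pos_fraction)
    kept_pos = min(num_pos, max_pos)
    kept_neg = min(num_neg, num - kept_pos)
    return kept_pos, kept_neg


# Invented best-IoU values for six proposals.
max_ious = [0.95, 0.72, 0.65, 0.55, 0.30, 0.10]

for thr in (0.5, 0.6, 0.7):
    labels = assign_labels(max_ious, pos_iou_thr=thr, neg_iou_thr=thr)
    print(thr, labels.count(1), labels.count(0))
# prints "0.5 4 2", "0.6 3 3", "0.7 2 4"

print(sample_budget(num_pos=300, num_neg=2000))  # prints "(128, 384)"
print(sample_budget(num_pos=40, num_neg=2000))   # prints "(40, 472)"
```

With num=512 and pos_fraction=0.25, at most 128 positives are kept per image, and when fewer positives exist the remaining slots go to negatives.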


2. Experiment

In mmdetection, the model can be tested with a few lines of code, as follows:

 

The detection result is as follows. Compared with fast_rcnn, however, the result of this pre-trained model is not good.


Topics: Algorithm