PointPillars point cloud detection in OpenPCDet: inference code explained

Posted by RobM on Thu, 24 Feb 2022 08:59:10 +0100

The PointPillars paper and its training code were analyzed in detail in the previous article: PointPillars paper analysis and OpenPCDet code analysis (Nnnathan's blog on CSDN).

This post analyzes the inference code of the PointPillars model in OpenPCDet in detail and tests the inference results.

Readers can download OpenPCDet and follow along with this article.

Given the limits of my knowledge, there will inevitably be shortcomings in this analysis; corrections and discussion are welcome. If you have suggestions or opinions, please leave a message in the comments. Thank you!

The PointPillars paper:

https://arxiv.org/pdf/1812.05784.pdf

Reference code:

https://github.com/open-mmlab/OpenPCDet

A version of the code with detailed comments is available in my GitHub repository:

GitHub - nathansong/openpcddet-annotated: code analysis of the OpenPCDet model

1, PointPillars network structure and data preprocessing

The network structure was covered in detail in the training article, so it is skipped here; we pick up directly from the detection head code and its inference implementation.

Before inference, the raw point cloud must be preprocessed: points outside the specified range are removed, and VoxelGeneratorWrapper is used to generate pillars (voxels) from the point cloud. See the training post for details; a sketch of the range filter follows.
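
For concreteness, the range filter amounts to masking points against the configured point_cloud_range. Below is a minimal sketch along the lines of OpenPCDet's common_utils.mask_points_by_range (the exact bounds checked are an assumption here, and the input file name is illustrative):

import numpy as np

def mask_points_by_range(points, limit_range):
    # Keep points whose x and y fall inside the configured range;
    # z is handled later by the voxel/pillar generator (assumption)
    mask = ((points[:, 0] >= limit_range[0]) & (points[:, 0] <= limit_range[3])
            & (points[:, 1] >= limit_range[1]) & (points[:, 1] <= limit_range[4]))
    return points[mask]

# KITTI PointPillars range from the config: (0, -39.68, -3, 69.12, 39.68, 1)
points = np.fromfile('000000.bin', dtype=np.float32).reshape(-1, 4)  # illustrative file
points = mask_points_by_range(points, [0, -39.68, -3, 69.12, 39.68, 1])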

2, Network inference results

PointPillars ultimately produces three outputs: for each anchor on the feature map, 7 regression parameters, one class prediction, and one direction classification. The 7 regression parameters are (x, y, z, w, l, h, θ): x, y and z are offsets from the anchor center to the target center (x and y normalized by the anchor's BEV diagonal), w, l and h are log-scale adjustment factors relative to the anchor's width, length and height, and θ is the residual rotation angle. The direction classifier predicts the orientation of the box (front or back); the relationship between the two is defined with a 45-degree offset from the x-axis of the lidar coordinate system (for the reason, see this issue: https://github.com/open-mmlab/OpenPCDet/issues/80).
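
For reference, the residual encoding that these outputs correspond to, using g for ground truth, a for anchor, and d^a = sqrt((dx^a)^2 + (dy^a)^2) for the anchor's BEV diagonal; the decode_torch function shown later inverts exactly these formulas:

∆x = (x^g − x^a) / d^a    ∆y = (y^g − y^a) / d^a    ∆z = (z^g − z^a) / dz^a
∆w = log(w^g / w^a)       ∆l = log(l^g / l^a)       ∆h = log(h^g / h^a)
∆θ = θ^g − θ^a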

The following figure is from FCOS3D: https://arxiv.org/pdf/2104.10956.pdf


The code is in pcdet/models/dense_heads/anchor_head_single.py:

import numpy as np
import torch.nn as nn
 
from .anchor_head_template import AnchorHeadTemplate
 
 
class AnchorHeadSingle(AnchorHeadTemplate):
    """
    Args:
        model_cfg: AnchorHeadSingle Configuration of
        input_channels: 384 Number of input channels
        num_class: 3
        class_names: ['Car','Pedestrian','Cyclist']
        grid_size: (432, 496, 1)
        point_cloud_range: (0, -39.68, -3, 69.12, 39.68, 1)
        predict_boxes_when_training: False
    """
 
    def __init__(self, model_cfg, input_channels, num_class, class_names, grid_size, point_cloud_range,
                 predict_boxes_when_training=True, **kwargs):
        super().__init__(
            model_cfg=model_cfg, num_class=num_class, class_names=class_names, grid_size=grid_size,
            point_cloud_range=point_cloud_range,
            predict_boxes_when_training=predict_boxes_when_training
        )
        # Each location has anchors of 3 scales, each scale with two directions (0 and 90 degrees): num_anchors_per_location = [2, 2, 2]
        self.num_anchors_per_location = sum(self.num_anchors_per_location)  # sum([2, 2, 2])
        # Conv2d(384, 18, kernel_size=(1,1), stride=(1,1)): 6 anchors x 3 classes
        self.conv_cls = nn.Conv2d(
            input_channels, self.num_anchors_per_location * self.num_class,
            kernel_size=1
        )
        # Conv2d(384, 42, kernel_size=(1,1), stride=(1,1)): 6 anchors x 7 box parameters
        self.conv_box = nn.Conv2d(
            input_channels, self.num_anchors_per_location * self.box_coder.code_size,
            kernel_size=1
        )
        # If a direction classifier is used, add a direction convolution layer: Conv2d(384, 12, kernel_size=(1,1), stride=(1,1))
        if self.model_cfg.get('USE_DIRECTION_CLASSIFIER', None) is not None:
            self.conv_dir_cls = nn.Conv2d(
                input_channels,
                self.num_anchors_per_location * self.model_cfg.NUM_DIR_BINS,
                kernel_size=1
            )
        else:
            self.conv_dir_cls = None
        self.init_weights()
 
    # Initialization parameters
    def init_weights(self):
        pi = 0.01
        # Initialize the classification convolution bias (pi = 0.01 prior, as in focal loss)
        nn.init.constant_(self.conv_cls.bias, -np.log((1 - pi) / pi))
        # Initialize the box regression convolution weight
        nn.init.normal_(self.conv_box.weight, mean=0, std=0.001)
 
    def forward(self, data_dict):
        # Extract the feature map produced by the 2D backbone
        # spatial_features_2d: (batch_size, 384, 248, 216)
        spatial_features_2d = data_dict['spatial_features_2d']
        # Class prediction for the 6 anchors at each location --> (batch_size, 18, 248, 216)
        cls_preds = self.conv_cls(spatial_features_2d)
        # Box parameter prediction for the 6 anchors at each location --> (batch_size, 42, 248, 216); each anchor predicts 7 parameters (x, y, z, w, l, h, θ)
        box_preds = self.conv_box(spatial_features_2d)
        # Move the class dimension last: [N, C, H, W] --> (batch_size, 248, 216, 18)
        cls_preds = cls_preds.permute(0, 2, 3, 1).contiguous()
        # Move the box parameters last: [N, C, H, W] --> (batch_size, 248, 216, 42)
        box_preds = box_preds.permute(0, 2, 3, 1).contiguous()
        # Store the class and box predictions in the forward-pass dictionary
        self.forward_ret_dict['cls_preds'] = cls_preds
        self.forward_ret_dict['box_preds'] = box_preds
        # Direction classification prediction
        if self.conv_dir_cls is not None:
            # Each anchor predicts one of two directions --> (batch_size, 12, 248, 216)
            dir_cls_preds = self.conv_dir_cls(spatial_features_2d)
            # Move the direction predictions last: [N, C, H, W] --> (batch_size, 248, 216, 12)
            dir_cls_preds = dir_cls_preds.permute(0, 2, 3, 1).contiguous()
            # Put the direction prediction result into the forward propagation dictionary
            self.forward_ret_dict['dir_cls_preds'] = dir_cls_preds
        else:
            dir_cls_preds = None
 
        """
        If it is in training mode, each a priori box needs to be assigned GT To calculate loss
        """
        if self.training:
            # targets_dict = {
            #     'box_cls_labels': cls_labels, # (4,211200)
            #     'box_reg_targets': bbox_targets, # (4,211200, 7)
            #     'reg_weights': reg_weights # (4,211200)
            # }
            targets_dict = self.assign_targets(
                gt_boxes=data_dict['gt_boxes']  # (4,39,8)
            )
            # Put the GT allocation result into the forward propagation dictionary
            self.forward_ret_dict.update(targets_dict)
 
        # Outside training mode (or when predict_boxes_when_training is set), generate the box predictions directly
        if not self.training or self.predict_boxes_when_training:
            # Decode and generate the final result according to the prediction result
            batch_cls_preds, batch_box_preds = self.generate_predicted_boxes(
                batch_size=data_dict['batch_size'],
                cls_preds=cls_preds, box_preds=box_preds, dir_cls_preds=dir_cls_preds
            )
            data_dict['batch_cls_preds'] = batch_cls_preds  # (1, 321408, 3) 248*216*6 = 321408
            data_dict['batch_box_preds'] = batch_box_preds  # (1, 321408, 7)
            data_dict['cls_preds_normalized'] = False
 
        return data_dict

After the head, three prediction tensors are obtained:

Category prediction of each anchor: (batch_size, 248, 216, 18)

Prediction of 7 regression parameters of each anchor: (batch_size, 248, 216, 42)

Direction classification of each anchor: (batch_size, 248, 216, 12)

The 18 channels can be read as 6 anchors, each predicting 3 class scores;

the 42 channels as 6 anchors, each predicting 7 regression parameters;

the 12 channels as 6 anchors, each predicting 2 direction classes.


Note: batch_size defaults to 1 during inference.
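
A quick standalone check of the shape arithmetic above (this is not OpenPCDet code; the 384 input channels and the 248x216 feature map are taken from the KITTI PointPillars config):

import torch
import torch.nn as nn

num_anchors, num_class, code_size, num_dir_bins = 6, 3, 7, 2
spatial_features_2d = torch.randn(1, 384, 248, 216)
conv_cls = nn.Conv2d(384, num_anchors * num_class, kernel_size=1)
conv_box = nn.Conv2d(384, num_anchors * code_size, kernel_size=1)
conv_dir = nn.Conv2d(384, num_anchors * num_dir_bins, kernel_size=1)
print(conv_cls(spatial_features_2d).shape)  # torch.Size([1, 18, 248, 216])
print(conv_box(spatial_features_2d).shape)  # torch.Size([1, 42, 248, 216])
print(conv_dir(spatial_features_2d).shape)  # torch.Size([1, 12, 248, 216])
print(248 * 216 * num_anchors)              # 321408 anchors in total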

Next, generate the prediction results:

Code: pcdet/models/dense_heads/anchor_head_template.py

    def generate_predicted_boxes(self, batch_size, cls_preds, box_preds, dir_cls_preds=None):
        """
        Args:
            batch_size:
            cls_preds: (N, H, W, C1)
            box_preds: (N, H, W, C2)
            dir_cls_preds: (N, H, W, C3)

        Returns:
            batch_cls_preds: (B, num_boxes, num_classes)
            batch_box_preds: (B, num_boxes, 7+C)

        """
        if isinstance(self.anchors, list):
            # Whether to use multi-head prediction; False by default
            if self.use_multihead:
                anchors = torch.cat([anchor.permute(3, 4, 0, 1, 2, 5).contiguous().view(-1, anchor.shape[-1])
                                     for anchor in self.anchors], dim=0)
            else:
                """
                Each category anchor Generation of:
                [(Z, Y, X, anchor scale, This scale anchor direction, 7 Regression parameters)
                (Z, Y, X, anchor scale, This scale anchor direction, 7 Regression parameters)
                (Z, Y, X, anchor scale, This scale anchor direction, 7 Regression parameters)]
                Splice in the penultimate dimension
                anchors dimension (Z, Y, X, 3 individual anchor scale, Two directions per scale, 7)
                            (1, 248, 216, 3, 2, 7)
                """
                anchors = torch.cat(self.anchors, dim=-3)
        else:
            anchors = self.anchors
        # Total number of anchors: Z * Y * X * num_anchor_scales * num_rotations
        num_anchors = anchors.view(-1, anchors.shape[-1]).shape[0]
        # (batch_size, Z*Y*X*num_of_anchor_scale*anchor_rot, 7)
        batch_anchors = anchors.view(1, -1, anchors.shape[-1]).repeat(batch_size, 1, 1)

        # Flatten the spatial dimensions of the predictions
        # (batch_size, Z*Y*X*num_of_anchor_scale*anchor_rot, 3)
        batch_cls_preds = cls_preds.view(batch_size, num_anchors, -1).float() \
            if not isinstance(cls_preds,
                              list) else cls_preds
        # (batch_size, Z*Y*X*num_of_anchor_scale*anchor_rot, 7)
        batch_box_preds = box_preds.view(batch_size, num_anchors, -1) if not isinstance(box_preds, list) \
            else torch.cat(box_preds, dim=1).view(batch_size, num_anchors, -1)
        # Decode the seven predicted box parameters
        batch_box_preds = self.box_coder.decode_torch(batch_box_preds, batch_anchors)
        # Direction prediction of each anchor
        if dir_cls_preds is not None:
            # dir_offset = 0.78539 (pi / 4), the direction offset
            dir_offset = self.model_cfg.DIR_OFFSET
            dir_limit_offset = self.model_cfg.DIR_LIMIT_OFFSET  # 0
            # Flatten the direction predictions: (batch_size, Z*Y*X*num_of_anchor_scale*anchor_rot, 2)
            dir_cls_preds = dir_cls_preds.view(batch_size, num_anchors, -1) if not isinstance(dir_cls_preds, list) \
                else torch.cat(dir_cls_preds, dim=1).view(batch_size, num_anchors, -1)  # (1, 321408, 2)
            # (batch_size, Z*Y*X*num_of_anchor_scale*anchor_rot)
            # Take the predicted direction class of every anchor: front or back
            dir_labels = torch.max(dir_cls_preds, dim=-1)[1]
            # period = pi (2 * pi / NUM_DIR_BINS, with NUM_DIR_BINS = 2)
            period = (2 * np.pi / self.model_cfg.NUM_DIR_BINS)
            # Limit the angle to [0, pi). OpenPCDet uses the unified normative coordinates: x forward, y left, z up
            # As in training, subtract dir_offset (45 degrees) before limiting to get dir_rot
            dir_rot = common_utils.limit_period(
                batch_box_preds[..., 6] - dir_offset, dir_limit_offset, period
            )
            """
            Rotate the angle back to the lidar coordinate system, so you need to add back the 45 degrees you subtracted before,
            If dir_labels If it is 1, it indicates that the direction is 180 degrees, so it is necessary to add 180 degrees to the predicted angle information,
            Otherwise, the prediction angle is the obtained angle
            """
            batch_box_preds[..., 6] = dir_rot + dir_offset + period * dir_labels.to(batch_box_preds.dtype)
        
        # This branch is not used by PointPillars
        if isinstance(self.box_coder, box_coder_utils.PreviousResidualDecoder):
            batch_box_preds[..., 6] = common_utils.limit_period(
                -(batch_box_preds[..., 6] + np.pi / 2), offset=0.5, period=np.pi * 2
            )

        return batch_cls_preds, batch_box_preds
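
The limit_period call above wraps an angle into a window of width period starting at -offset * period. Its implementation in pcdet/utils/common_utils.py is essentially one line (reproduced here from the OpenPCDet source as I recall it; treat it as a sketch):

import numpy as np
import torch

def limit_period(val, offset=0.5, period=np.pi):
    # Wraps val into [-offset * period, (1 - offset) * period);
    # with offset=0 and period=pi, as used above, the result lies in [0, pi)
    return val - torch.floor(val / period + offset) * period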

The box decoding operation in the above code is the inverse of the encoding:

Code: pcdet/utils/box_coder_utils.py

    def decode_torch(self, box_encodings, anchors):
        """
        Args:
            box_encodings: (B, N, 7 + C) or (N, 7 + C) [x, y, z, dx, dy, dz, heading or *[cos, sin], ...]
            anchors: (B, N, 7 + C) or (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]

        Returns:

        """
        # Note: the second argument of torch.split(tensor, split_size, dim) is the size of each chunk after splitting, not the number of chunks! Extra parameters are collected with *cas
        xa, ya, za, dxa, dya, dza, ra, *cas = torch.split(anchors, 1, dim=-1)
        # encode_angle_by_sincos is False for PointPillars
        if not self.encode_angle_by_sincos:
            xt, yt, zt, dxt, dyt, dzt, rt, *cts = torch.split(box_encodings, 1, dim=-1)
        else:
            xt, yt, zt, dxt, dyt, dzt, cost, sint, *cts = torch.split(box_encodings, 1, dim=-1)
        # Calculate the length of anchor diagonal
        diagonal = torch.sqrt(dxa ** 2 + dya ** 2)  # (B, N, 1)-->(1, 321408, 1)
        # Encoding of anchor and GT used in the loss, inverted here: g denotes GT, a denotes anchor
        # ∆x = (x^g − x^a) / diagonal --> x^g = ∆x * diagonal + x^a
        # Same for y; z is normalized by dz^a instead of the diagonal
        xg = xt * diagonal + xa
        yg = yt * diagonal + ya
        zg = zt * dza + za
        # ∆l = log(l^g / l^a) --> l^g = exp(∆l) * l^a
        # The same below
        dxg = torch.exp(dxt) * dxa
        dyg = torch.exp(dyt) * dya
        dzg = torch.exp(dzt) * dza

        # If the angle is encoded as cos/sin, decode accordingly; encode_angle_by_sincos is False for PointPillars
        if self.encode_angle_by_sincos:
            rg_cos = cost + torch.cos(ra)
            rg_sin = sint + torch.sin(ra)
            rg = torch.atan2(rg_sin, rg_cos)
        else:
            # rt = rg − ra, so rg = rt + ra (inverse of the angle encoding)
            rg = rt + ra
        # PointPillars has no extra terms here
        cgs = [t + a for t, a in zip(cts, cas)]
        return torch.cat([xg, yg, zg, dxg, dyg, dzg, rg, *cgs], dim=-1)
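
A toy numeric check of the decode formulas (all values are made up for illustration):

import torch

anchor = torch.tensor([[10.0, 2.0, -1.0, 3.9, 1.6, 1.56, 0.0]])  # x, y, z, dx, dy, dz, ry
pred = torch.tensor([[0.5, -0.2, 0.1, 0.05, 0.0, -0.1, 0.3]])    # encoded residuals

diagonal = torch.sqrt(anchor[:, 3] ** 2 + anchor[:, 4] ** 2)  # ~4.215
xg = pred[:, 0] * diagonal + anchor[:, 0]                     # ~12.11
zg = pred[:, 2] * anchor[:, 5] + anchor[:, 2]                 # ~-0.844
dxg = torch.exp(pred[:, 3]) * anchor[:, 3]                    # ~4.10
rg = pred[:, 6] + anchor[:, 6]                                # 0.3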

3, Post-processing of inference results

Here, class-agnostic NMS is applied to all predictions to obtain the final results.

Code: pcdet/models/detectors/detector3d_template.py

    def post_processing(self, batch_dict):
        """
        Args:
            batch_dict:
                batch_size:
                batch_cls_preds: (B, num_boxes, num_classes | 1) or (N1+N2+..., num_classes | 1)
                                or [(B, num_boxes, num_class1), (B, num_boxes, num_class2) ...]
                multihead_label_mapping: [(num_class1), (num_class2), ...]
                batch_box_preds: (B, num_boxes, 7+C) or (N1+N2+..., 7+C)
                cls_preds_normalized: indicate whether batch_cls_preds is normalized
                batch_index: optional (N1+N2+...)
                has_class_labels: True/False
                roi_labels: (B, num_rois)  1 .. num_classes
                batch_pred_labels: (B, num_boxes, 1)
        Returns:

        """
        # post_process_cfg: post-processing parameters, including the NMS type, thresholds, device, maximum number of results kept after NMS, output confidence, etc.
        post_process_cfg = self.model_cfg.POST_PROCESSING
        # batch_size defaults to 1 during inference
        batch_size = batch_dict['batch_size']
        # Keep the dictionary for calculating recall
        recall_dict = {}
        # The prediction results are stored here
        pred_dicts = []
        # Frame by frame processing
        for index in range(batch_size):
            if batch_dict.get('batch_index', None) is not None:
                assert batch_dict['batch_box_preds'].shape.__len__() == 2
                batch_mask = (batch_dict['batch_index'] == index)
            else:
                assert batch_dict['batch_box_preds'].shape.__len__() == 3
                # Which frame is currently being processed
                batch_mask = index
            # box_preds shape: (num_anchors, 7)
            box_preds = batch_dict['batch_box_preds'][batch_mask]
            # After copying, it is used for recall calculation
            src_box_preds = box_preds

            if not isinstance(batch_dict['batch_cls_preds'], list):
                # (num_anchors, 3)
                cls_preds = batch_dict['batch_cls_preds'][batch_mask]
                # ditto
                src_cls_preds = cls_preds
                assert cls_preds.shape[1] in [1, self.num_class]

                if not batch_dict['cls_preds_normalized']:
                    # The loss uses BCE, so sigmoid is applied here to obtain class probabilities
                    cls_preds = torch.sigmoid(cls_preds)
            else:
                cls_preds = [x[batch_mask] for x in batch_dict['batch_cls_preds']]
                src_cls_preds = cls_preds
                if not batch_dict['cls_preds_normalized']:
                    cls_preds = [torch.sigmoid(x) for x in cls_preds]

            # Whether to use multi-class NMS; False here
            if post_process_cfg.NMS_CONFIG.MULTI_CLASSES_NMS:
                if not isinstance(cls_preds, list):
                    cls_preds = [cls_preds]
                    multihead_label_mapping = [torch.arange(1, self.num_class, device=cls_preds[0].device)]
                else:
                    multihead_label_mapping = batch_dict['multihead_label_mapping']

                cur_start_idx = 0
                pred_scores, pred_labels, pred_boxes = [], [], []
                for cur_cls_preds, cur_label_mapping in zip(cls_preds, multihead_label_mapping):
                    assert cur_cls_preds.shape[1] == len(cur_label_mapping)
                    cur_box_preds = box_preds[cur_start_idx: cur_start_idx + cur_cls_preds.shape[0]]
                    cur_pred_scores, cur_pred_labels, cur_pred_boxes = model_nms_utils.multi_classes_nms(
                        cls_scores=cur_cls_preds, box_preds=cur_box_preds,
                        nms_config=post_process_cfg.NMS_CONFIG,
                        score_thresh=post_process_cfg.SCORE_THRESH
                    )
                    cur_pred_labels = cur_label_mapping[cur_pred_labels]
                    pred_scores.append(cur_pred_scores)
                    pred_labels.append(cur_pred_labels)
                    pred_boxes.append(cur_pred_boxes)
                    cur_start_idx += cur_cls_preds.shape[0]

                final_scores = torch.cat(pred_scores, dim=0)
                final_labels = torch.cat(pred_labels, dim=0)
                final_boxes = torch.cat(pred_boxes, dim=0)
            else:
                # Get the maximum probability of category prediction and the corresponding index value
                cls_preds, label_preds = torch.max(cls_preds, dim=-1)
                if batch_dict.get('has_class_labels', False):
                    label_key = 'roi_labels' if 'roi_labels' in batch_dict else 'batch_pred_labels'
                    label_preds = batch_dict[label_key][index]
                else:  
                    # Shift class predictions by 1 (index 0 is reserved for background)
                    label_preds = label_preds + 1
                    
                # Class-agnostic NMS
                # selected: indices of the anchors that are kept
                # selected_scores: confidence scores of the kept anchors
                selected, selected_scores = model_nms_utils.class_agnostic_nms(
                    # Category prediction probability and anchor regression parameters of each anchor
                    box_scores=cls_preds, box_preds=box_preds, 
                    nms_config=post_process_cfg.NMS_CONFIG,
                    score_thresh=post_process_cfg.SCORE_THRESH
                )
                # OUTPUT_RAW_SCORE is False here
                if post_process_cfg.OUTPUT_RAW_SCORE:  
                    max_cls_preds, _ = torch.max(src_cls_preds, dim=-1)
                    selected_scores = max_cls_preds[selected]

                # Get the score of the final category prediction
                final_scores = selected_scores
                # Get the final category prediction result according to the selected
                final_labels = label_preds[selected]
                # The final box regression result is obtained according to the selected
                final_boxes = box_preds[selected]  
                
            # If there are no GT labels in batch_dict, recall is not calculated
            recall_dict = self.generate_recall_record(
                box_preds=final_boxes if 'rois' not in batch_dict else src_box_preds,
                recall_dict=recall_dict, batch_index=index, data_dict=batch_dict,
                thresh_list=post_process_cfg.RECALL_THRESH_LIST
            )
            # Build the dictionary of final predictions
            record_dict = {
                'pred_boxes': final_boxes,
                'pred_scores': final_scores,
                'pred_labels': final_labels
            }
            pred_dicts.append(record_dict)

        return pred_dicts, recall_dict
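
Each element of pred_dicts corresponds to one frame. A typical way to consume the output (keys and shapes follow the record_dict above):

pred = pred_dicts[0]
boxes = pred['pred_boxes']    # (M, 7): x, y, z, dx, dy, dz, heading
scores = pred['pred_scores']  # (M,): confidence after sigmoid
labels = pred['pred_labels']  # (M,): 1-based class indices, e.g. 1 = Car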

Class-agnostic NMS operation

Code: pcdet/models/model_utils/model_nms_utils.py

def class_agnostic_nms(box_scores, box_preds, nms_config, score_thresh=None):
    # 1. First filter out most low-confidence boxes with the score threshold to speed up the subsequent NMS
    src_box_scores = box_scores
    if score_thresh is not None:
        # Mask of anchors whose class probability exceeds score_thresh
        scores_mask = (box_scores >= score_thresh)
        # Keep the class scores of anchors above score_thresh
        box_scores = box_scores[scores_mask]
        # Keep the 7 box regression parameters of anchors above score_thresh
        box_preds = box_preds[scores_mask]

    # Initialize the empty list to store the anchor retained after nms
    selected = []
    # NMS is performed only if some anchor scores exceed score_thresh; otherwise an empty selection is returned
    if box_scores.shape[0] > 0:
        # Keep only the top-K anchor scores for the NMS operation,
        # where K = min(nms_config.NMS_PRE_MAXSIZE, box_scores.shape[0])
        box_scores_nms, indices = torch.topk(box_scores, k=min(nms_config.NMS_PRE_MAXSIZE, box_scores.shape[0]))

        # box_scores_nms holds the top-K scores sorted in descending order;
        # reorder the box predictions to match the top-K selection
        boxes_for_nms = box_preds[indices]
        # Call the NMS function named by nms_config.NMS_TYPE in iou3d_nms_utils (e.g. nms_gpu);
        # it returns the indices of the kept boxes; selected_scores is None
        keep_idx, selected_scores = getattr(iou3d_nms_utils, nms_config.NMS_TYPE)(
            boxes_for_nms[:, 0:7], box_scores_nms, nms_config.NMS_THRESH, **nms_config
        )
        selected = indices[keep_idx[:nms_config.NMS_POST_MAXSIZE]]

    if score_thresh is not None:
        # If a score threshold was applied, map the selection back: scores_mask gives the indices of box_scores within src_box_scores
        original_idxs = scores_mask.nonzero().view(-1)
        # selected currently indexes the filtered box_scores; after this mapping,
        # selected indexes the original src_box_scores
        selected = original_idxs[selected]

    return selected, src_box_scores[selected]

nms_gpu

Code: pcdet/ops/iou3d_nms/iou3d_nms_utils.py

def nms_gpu(boxes, scores, thresh, pre_maxsize=None, **kwargs):
    """
    :param boxes: 7 box regression parameters of the filtered anchors (N, 7) [x, y, z, dx, dy, dz, heading]
    :param scores: class scores of the filtered anchors, in one-to-one correspondence with boxes (N)
    :param thresh: IoU threshold for NMS
    :return:
    """
    assert boxes.shape[1] == 7
    # Sort scores in descending order and take the corresponding indices
    # (since the incoming scores are already sorted, the order is [0, 1, 2, 3, ...])
    order = scores.sort(0, descending=True)[1]
    # If a maximum number of boxes before NMS is set (e.g. 4096), keep only the first pre_maxsize indices
    if pre_maxsize is not None:
        order = order[:pre_maxsize]

    # Reorder the boxes (already sorted, so nothing changes here)
    boxes = boxes[order].contiguous()
    # Allocate a LongTensor of length boxes.size(0) to hold the kept indices
    keep = torch.LongTensor(boxes.size(0))
    # Call cuda function for acceleration
    # keep: the subscript of the record retention target box
    # num_out: returns the number of reserved
    num_out = iou3d_nms_cuda.nms_gpu(boxes, keep, thresh)
    # Only the first num_out entries of keep are valid, because keep was allocated with the full (up to 4096) length
    return order[keep[:num_out].cuda()].contiguous(), None
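
The CUDA kernel computes rotated BEV IoU between boxes, but the control flow it implements is ordinary greedy NMS. For illustration only, here is a simplified axis-aligned 2D version (this is not the OpenPCDet implementation; the real kernel handles rotated boxes):

import torch

def naive_nms(boxes, scores, thresh):
    # boxes: (N, 4) as [x1, y1, x2, y2]; scores: (N,)
    order = scores.sort(0, descending=True)[1]
    keep = []
    while order.numel() > 0:
        i = order[0]
        keep.append(i.item())
        if order.numel() == 1:
            break
        rest = boxes[order[1:]]
        # Intersection of the kept box with the remaining boxes
        lt = torch.max(boxes[i, :2], rest[:, :2])
        rb = torch.min(boxes[i, 2:], rest[:, 2:])
        wh = (rb - lt).clamp(min=0)
        inter = wh[:, 0] * wh[:, 1]
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (rest[:, 2] - rest[:, 0]) * (rest[:, 3] - rest[:, 1])
        iou = inter / (area_i + area_r - inter)
        # Drop boxes that overlap the kept box above the threshold
        order = order[1:][iou <= thresh]
    return torch.tensor(keep, dtype=torch.long)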

4, Visualization

Running the visualization code:

1. Weight file: https://drive.google.com/file/d/1wMxWTpU1qUoY3DsCH31WJmvJxcjFXKlm/view

2. KITTI dataset: The KITTI Vision Benchmark Suite: http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d


The code is in tools/demo.py:

import argparse
import glob
from pathlib import Path

try:
    import open3d
    from visual_utils import open3d_vis_utils as V
    OPEN3D_FLAG = True
except ImportError:
    import mayavi.mlab as mlab
    from visual_utils import visualize_utils as V
    OPEN3D_FLAG = False

import numpy as np
import torch

from pcdet.config import cfg, cfg_from_yaml_file
from pcdet.datasets import DatasetTemplate
from pcdet.models import build_network, load_data_to_gpu
from pcdet.utils import common_utils


class DemoDataset(DatasetTemplate):
    def __init__(self, dataset_cfg, class_names, training=True, root_path=None, logger=None, ext='.bin'):
        """
        Args:
            root_path:
            dataset_cfg:
            class_names:
            training:
            logger:
        """
        super().__init__(
            dataset_cfg=dataset_cfg, class_names=class_names, training=training, root_path=root_path, logger=logger
        )
        self.root_path = root_path
        self.ext = ext
        data_file_list = glob.glob(str(root_path / f'*{self.ext}')) if self.root_path.is_dir() else [self.root_path]

        data_file_list.sort()
        self.sample_file_list = data_file_list

    def __len__(self):
        return len(self.sample_file_list)

    def __getitem__(self, index):
        if self.ext == '.bin':
            points = np.fromfile(self.sample_file_list[index], dtype=np.float32).reshape(-1, 4)
        elif self.ext == '.npy':
            points = np.load(self.sample_file_list[index])
        else:
            raise NotImplementedError

        input_dict = {
            'points': points,
            'frame_id': index,
        }

        data_dict = self.prepare_data(data_dict=input_dict)
        return data_dict


def parse_config():
    parser = argparse.ArgumentParser(description='arg parser')
    parser.add_argument('--cfg_file', type=str, default='cfgs/kitti_models/pointpillar.yaml',
                        help='specify the config for demo')
    parser.add_argument('--data_path', type=str, default='/home/nathan/OpenPCDet/data/kitti/training/velodyne',
                        help='specify the point cloud data file or directory')
    parser.add_argument('--ckpt', type=str,
                        default="/home/nathan/OpenPCDet/output/kitti_models/pointpillar/default/ckpt/checkpoint_epoch_79.pth", help='specify the pretrained model')
    parser.add_argument('--ext', type=str, default='.bin', help='specify the extension of your point cloud data file')

    args = parser.parse_args()

    cfg_from_yaml_file(args.cfg_file, cfg)

    return args, cfg


def main():
    args, cfg = parse_config()
    logger = common_utils.create_logger()
    logger.info('-----------------Quick Demo of OpenPCDet-------------------------')
    demo_dataset = DemoDataset(
        dataset_cfg=cfg.DATA_CONFIG, class_names=cfg.CLASS_NAMES, training=False,
        root_path=Path(args.data_path), ext=args.ext, logger=logger
    )
    logger.info(f'Total number of samples: \t{len(demo_dataset)}')

    model = build_network(model_cfg=cfg.MODEL, num_class=len(cfg.CLASS_NAMES), dataset=demo_dataset)
    model.load_params_from_file(filename=args.ckpt, logger=logger, to_cpu=True)
    model.cuda()
    model.eval()
    with torch.no_grad():
        for idx, data_dict in enumerate(demo_dataset):
            logger.info(f'Visualized sample index: \t{idx + 1}')
            data_dict = demo_dataset.collate_batch([data_dict])
            load_data_to_gpu(data_dict)
            pred_dicts, _ = model.forward(data_dict)

            V.draw_scenes(
                points=data_dict['points'][:, 1:], ref_boxes=pred_dicts[0]['pred_boxes'],
                ref_scores=pred_dicts[0]['pred_scores'], ref_labels=pred_dicts[0]['pred_labels']
            )

            if not OPEN3D_FLAG:
                mlab.show(stop=True)

    logger.info('Demo done.')


if __name__ == '__main__':
    main()
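
With the argparse defaults above, the demo can be launched from the tools directory, overriding the paths as needed (the paths here are placeholders):

python demo.py --cfg_file cfgs/kitti_models/pointpillar.yaml --ckpt /path/to/checkpoint.pth --data_path /path/to/velodyne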

Result 1:

[Image: color camera 2]

[Image: point cloud detection results]

Result 2:

[Image: color camera 2]

[Image: point cloud detection results]

References:

1. https://github.com/open-mmlab/OpenPCDet/
2. https://github.com/jjw-DL/OpenPCDet-Noted
3. Coordinate conversion with the KITTI dataset - Zhihu
4. [3D object detection] PointPillars paper and code analysis - Zhihu
5. [3D object detection] SECOND algorithm analysis - Zhihu
6. https://arxiv.org/abs/1812.05784
7. SECOND: Sparsely Embedded Convolutional Detection - Sensors (open access)
8. https://arxiv.org/abs/1711.06396
9. https://arxiv.org/abs/1612.00593
10. [3D computer vision] From PointNet to PointNet++: theory and PyTorch code - CSDN blog
11. [3D computer vision] Reading the PyTorch code of PointNet++ - CSDN blog
12. The KITTI Vision Benchmark Suite
13. KITTI dataset parameters - cuichuanchen3307's blog, CSDN
14. https://arxiv.org/pdf/2104.10956.pdf

Topics: AI Pytorch Deep Learning Autonomous vehicles