PointGroup point cloud instance segmentation

Posted by sgoldenb on Mon, 31 Jan 2022 12:38:08 +0100

brief introduction

Classification and segmentation are two typical tasks in computer vision, and segmentation can be further divided into semantic segmentation and instance segmentation. The difference is that semantic segmentation assigns every point of the input to one of N categories, so the output is organized by those N categories, whereas in instance segmentation each output mask contains exactly one object instance.
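
As a toy illustration (my own example, not from the paper), the two kinds of output can be pictured as per-point label arrays over the same cloud:

import numpy as np

# 6 points: two chairs standing on a floor
semantic_labels = np.array([0, 0, 0, 0, 0, 1])    # semantic segmentation: 0 = chair, 1 = floor
instance_ids    = np.array([1, 1, 1, 2, 2, -1])   # instance segmentation: each chair gets its own id, -1 = no instance

chair_1 = instance_ids == 1    # each instance mask covers exactly one object
chair_2 = instance_ids == 2
print(chair_1.sum(), chair_2.sum())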

This post uses the PointGroup model to perform instance segmentation on point clouds.

environment

  • System: Ubuntu 16.04
  • Python version: Python 3.7
  • PyTorch version: 1.1
  • Dataset: scannetv2
  • CUDA: 10.1
  • GPU: RTX 2080TI * 8

Note that the PyTorch version must be 1.1; other versions will fail because of API changes.
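
A quick sanity check of the environment before going further (my own snippet, not part of the repository):

import torch

print(torch.__version__)           # should print 1.1.0
print(torch.cuda.is_available())   # should be True if the GPU driver and CUDA are set up correctly
print(torch.version.cuda)          # the CUDA version this PyTorch build was compiled against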

Project files

git clone https://github.com/llijiang/PointGroup.git --recursive

The trailing --recursive is essential: the project depends on several third-party git repositories, and this flag downloads them automatically.

The project directory structure is as follows:

PointGroup/
├─config/
├─data/
├─dataset/
├─doc/
├─lib/
├─model/
├─util/

Environment preparation

It is recommended to create a new anaconda virtual environment, as follows:

conda create -n pointgroup python==3.7
source activate pointgroup
pip install -r requirements.txt

The following packages may fail to install because of network problems; it is recommended to use a VPN and retry. It took me several days of retries to succeed.

You must install these packages with exactly the three commands below; versions built through cmake do not work. I originally compiled the C++ version of Google sparsehash by hand because the package would not install, but that did not help.

conda install -c bioconda google-sparsehash 
conda install libboost
conda install -c daleydeng gcc-5

spconv

Next, find the path of boost/ on your system and set it in the include_directories(${INCLUDE_PATH}) line of lib/spconv/CMakeLists.txt.
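
If you are not sure where the conda-installed boost headers ended up, a minimal sketch like this (my own helper, assuming conda install libboost places the headers under the environment's include/ directory) prints the path to put into CMakeLists.txt:

import os
import sys

# sys.prefix points at the active conda environment
include_dir = os.path.join(sys.prefix, 'include')
print(include_dir)
print('boost headers found:', os.path.isdir(os.path.join(include_dir, 'boost')))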

Now you can compile spconv:

cd lib/spconv
python setup.py bdist_wheel
cd dist
pip install ${whl}

Here ${whl} is the .whl file generated by the successful build; it can be installed directly with pip.
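
A quick check that the wheel installed correctly (a minimal sketch that only verifies the module imports):

import torch    # spconv is built against the PyTorch installed above
import spconv

print('spconv imported successfully')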

pointgroup_ops

The compilation process is as follows:

cd lib/pointgroup_ops
python setup.py develop

If the build complains that a header file cannot be found, locate that header inside the new anaconda pointgroup virtual environment (as long as all the previous conda installs succeeded, the headers are somewhere in that environment; it just takes a moment to find them), then add its directory with the following commands:

python setup.py build_ext --include-dirs=${INCLUDE_PATH}
python setup.py develop

Here ${INCLUDE_PATH} is the directory containing the missing header file.
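
To locate a missing header inside the conda environment, a small throwaway script like the following can help (the header name is only an example; substitute whatever the compiler reports as missing):

import os
import sys

header = 'sparseconfig.h'    # example name; replace with the header reported as missing
for root, dirs, files in os.walk(sys.prefix):
    if header in files:
        print(root)          # pass this directory to --include-dirs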

data set

Download script

This post uses the ScanNet v2 dataset. Because of licensing restrictions, go to this link and follow the README: copy the contents of the ScanNet Terms of Use PDF into a document, export it back to a Terms of Use PDF, and send it as an attachment to scannet@googlegroups.com. Within about a week the dataset authors will reply with the download-scannet.py script and a tutorial on running it.

The download-scannet.py script you receive can be run directly in a Python 2 environment. Under Python 3 you need to make a small change to its urllib code; the change is simple: replace the urllib calls with their urllib.request equivalents.
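
For example, where the Python 2 script calls urllib.urlretrieve(url, out_file), the Python 3 version uses urllib.request instead (a sketch of the kind of change, with a placeholder URL):

import urllib.request

url = 'http://example.com/scene0000_00_vh_clean_2.ply'    # placeholder URL for illustration
out_file = 'scene0000_00_vh_clean_2.ply'
urllib.request.urlretrieve(url, out_file)                 # Python 2 wrote urllib.urlretrieve(url, out_file)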

Download dataset

The full ScanNet v2 dataset is about 1.3 TB, so downloading all of it is clearly impractical. Fortunately, the data this model needs is only about 20 GB. Only files with the following four suffixes have to be downloaded:

  • _vh_clean_2.ply
  • _vh_clean_2.labels.ply
  • _vh_clean_2.0.010000.segs.json
  • .aggregation.json

Therefore, you need to run the following four commands:

python3 download-scannet.py -o scannet/ --type  _vh_clean_2.ply
python3 download-scannet.py -o scannet/ --type  _vh_clean_2.labels.ply
python3 download-scannet.py -o scannet/ --type  _vh_clean_2.0.010000.segs.json
python3 download-scannet.py -o scannet/ --type  .aggregation.json

For each of these four download tasks, after all files with the specified suffix have been downloaded, the script also tries to fetch a corresponding .zip file. For reasons I do not know, these .zip files cannot be downloaded through the script (an error is reported). However, the .zip files are not used anywhere in this project, so if the terminal reports a failed .zip download at the end, it does not matter.

After downloading the dataset, go to https://github.com/ScanNet/ScanNet/tree/master/Tasks/Benchmark and download the three files scannetv2_train.txt, scannetv2_val.txt and scannetv2_test.txt. They define how the whole dataset is divided into training, validation and test sets.

After downloading, the directory structure of the dataset is as follows (only the directory layout is shown, not every file):

├── scannet
│   ├── scans
│   │   ├── [scene_id]									# There should be the following four files under each folder
│   │   │   ├── [scene_id].aggregation.json
│   │   │   ├── [scene_id]_vh_clean_2.0.010000.segs.json
│   │   │   ├── [scene_id]_vh_clean_2.labels.ply
│   │   │   ├── [scene_id]_vh_clean_2.ply
│   ├── scans_test
│   │   ├── [scene_id]									# There should be one file below each folder
│   │   │   ├── [scene_id]_vh_clean_2.ply
│   ├── scannetv2_test.txt
│   ├── scannetv2_train.txt
│   ├── scannetv2_val.txt
│   ├── scannetv2-labels.combined.tsv

Partition dataset

Here I wrote a script named scan_split.py that verifies the integrity of the downloaded dataset and then splits it into train/val/test. It is as follows:

import os

dir_scans = 'scannet/scans/'		
dir_scans_test = 'scannet/scans_test/'
dir_train = 'dataset/scannetv2/train/'
dir_val   = 'dataset/scannetv2/val/'
dir_test  = 'dataset/scannetv2/test/'

for dir_ in [dir_test, dir_val, dir_train]:
    if not os.path.exists(dir_):
        os.makedirs(dir_)


def read_from_txt(path_txt):
    # read one scan id per line, skipping blank lines
    with open(path_txt, 'r') as file:
        return [line.rstrip('\n') for line in file if line.strip()]


def check_files():
    """
    Run this function first, Ensure that all required files have been downloaded to the specified path
    """
    print('[INFO] begin checking files')
    for scan_name in os.listdir(dir_scans):
    
        print('[INFO] checking scan {} ...'.format(dir_scans + scan_name))
        check1 = os.path.exists(dir_scans + scan_name + '/{}_vh_clean_2.ply'.format(scan_name))
        check2 = os.path.exists(dir_scans + scan_name + '/{}_vh_clean_2.labels.ply'.format(scan_name))
        check3 = os.path.exists(dir_scans + scan_name + '/{}_vh_clean_2.0.010000.segs.json'.format(scan_name))
        check4 = os.path.exists(dir_scans + scan_name + '/{}.aggregation.json'.format(scan_name))

        assert check1, '[ERROR] scan {} has no `_vh_clean_2.ply` file'.format(dir_scans + scan_name)
        assert check2, '[ERROR] scan {} has no `_vh_clean_2.labels.ply` file'.format(dir_scans + scan_name)
        assert check3, '[ERROR] scan {} has no `_vh_clean_2.0.010000.segs.json` file'.format(dir_scans + scan_name)
        assert check4, '[ERROR] scan {} has no `.aggregation.json` file'.format(dir_scans + scan_name)
        print('[INFO] checking done')

    for scan_name in os.listdir(dir_scans_test):
    
        print('[INFO] checking scan {} ...'.format(dir_scans_test + scan_name))
        check1 = os.path.exists(dir_scans_test + scan_name + '/{}_vh_clean_2.ply'.format(scan_name))

        assert check1, '[ERROR] scan {} has no `_vh_clean_2.ply` file'.format(dir_scans_test + scan_name)
        print('[INFO] checking done')

    print('[INFO] checking files complete, all files exist')


def copy_files_to_dir(dir_to, path_files):
    for file in path_files:
        cmd = 'cp {} {}'.format(file, dir_to)
        os.system(cmd)
        print('[INFO] ', cmd)


def split_train_test_val():
    """
    Run this function again, The data set will be downloaded, Press train, val, test divide, Then copy to the specified path
    """
    print('[INFO] begin split train, val, test')
    for scan_name in os.listdir(dir_scans):
        path_files = []
        if scan_name in train:
            print('[INFO] scan {} in train.txt'.format(scan_name))
            path_files.append(dir_scans + scan_name + '/{}_vh_clean_2.ply'.format(scan_name))
            path_files.append(dir_scans + scan_name + '/{}_vh_clean_2.labels.ply'.format(scan_name))
            path_files.append(dir_scans + scan_name + '/{}_vh_clean_2.0.010000.segs.json'.format(scan_name))
            path_files.append(dir_scans + scan_name + '/{}.aggregation.json'.format(scan_name))
            copy_files_to_dir(dir_train, path_files)
        elif scan_name in val:
            print('[INFO] scan {} in val.txt'.format(scan_name))
            path_files.append(dir_scans + scan_name + '/{}_vh_clean_2.ply'.format(scan_name))
            path_files.append(dir_scans + scan_name + '/{}_vh_clean_2.labels.ply'.format(scan_name))
            path_files.append(dir_scans + scan_name + '/{}_vh_clean_2.0.010000.segs.json'.format(scan_name))
            path_files.append(dir_scans + scan_name + '/{}.aggregation.json'.format(scan_name))
            copy_files_to_dir(dir_val, path_files)
    
    for scan_name in os.listdir(dir_scans_test):
        print('[INFO] scan {} in test.txt'.format(scan_name))
        path_files = [dir_scans_test + scan_name + '/{}_vh_clean_2.ply'.format(scan_name)]
        copy_files_to_dir(dir_test, path_files)
    

if __name__ == '__main__':
    check_files() 
    print('='*50, '\n')
    train = read_from_txt('scannet/scannetv2_train.txt')
    val = read_from_txt('scannet/scannetv2_val.txt')
    test = read_from_txt('scannet/scannetv2_test.txt')
    split_train_test_val()

The script can be run directly; if it finishes without any errors, the dataset has been prepared as required.
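
After the script finishes, a quick count (my own check, not part of the repository) confirms that each split received the expected number of scans:

import os

splits = [('train', 'scannet/scannetv2_train.txt'),
          ('val',   'scannet/scannetv2_val.txt'),
          ('test',  'scannet/scannetv2_test.txt')]
for split, txt in splits:
    expected = sum(1 for line in open(txt) if line.strip())
    found = len([f for f in os.listdir('dataset/scannetv2/' + split)
                 if f.endswith('_vh_clean_2.ply')])
    print('{}: {} scans listed, {} ply files found'.format(split, expected, found))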

When ready, the directory structure of PointGroup/dataset is as follows:

PointGroup
├── dataset
│   ├── scannetv2
│   │   ├── train
│   │   │   ├── [scene_id]_vh_clean_2.ply & [scene_id]_vh_clean_2.labels.ply & [scene_id]_vh_clean_2.0.010000.segs.json & [scene_id].aggregation.json
│   │   ├── val
│   │   │   ├── [scene_id]_vh_clean_2.ply & [scene_id]_vh_clean_2.labels.ply & [scene_id]_vh_clean_2.0.010000.segs.json & [scene_id].aggregation.json
│   │   ├── test
│   │   │   ├── [scene_id]_vh_clean_2.ply 
│   │   ├── scannetv2-labels.combined.tsv

Finally, use the following commands to generate the files required for training:

cd dataset/scannetv2
python prepare_data_inst.py --data_split train
python prepare_data_inst.py --data_split val
python prepare_data_inst.py --data_split test

train

CUDA_VISIBLE_DEVICES=0 python train.py --config config/pointgroup_run1_scannet.yaml 

The training parameters are in config/pointgroup_run1_scannet.yaml and can be changed freely. After training, the weights and logs are saved under the PointGroup/exp/ directory. You can inspect them with TensorBoard:

tensorboard --logdir=./exp --port=6666

Testing & Visualization

The author provides pretrained weights (link here); you can use them directly for testing:

CUDA_VISIBLE_DEVICES=0 python test.py --config config/pointgroup_default_scannet.yaml --pretrain $PATH_TO_PRETRAIN_MODEL$

If you need to save the semantic labels, the per-point offsets and the instance segmentation results, change the last few lines of config/pointgroup_default_scannet.yaml:

TEST:
  split: val
  test_epoch: 384
  test_workers: 16
  test_seed: 567

  TEST_NMS_THRESH: 0.3
  TEST_SCORE_THRESH: 0.09
  TEST_NPOINT_THRESH: 100

  eval: True
  save_semantic: True
  save_pt_offsets: True
  save_instance: True

visualization

The visualization method in README.md is too complex, and mayavi threw all kinds of errors during installation and use, so I plan to use my own method for visualization (if the author's method works smoothly for you, that is fine; maybe my case is the exception).

However, the problem now is that all the mask files in my test results are 0. The confidence scores and class predictions look fine, but the instance-segmented point clouds themselves are missing.
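
To double-check, I read one of the predicted masks back in like this (assuming each mask is a plain text file with one value per point; the file name below is hypothetical):

import numpy as np

mask_path = 'predicted_masks/scene0011_00_000.txt'    # hypothetical path, adjust to your own output
mask = np.loadtxt(mask_path)
print(mask.shape, int(mask.sum()))                    # a sum of 0 means the mask selects no points at all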

So I will leave this as an open issue here and fill it in once it is solved; not being able to visualize the point cloud results is a serious problem.

Topics: Python