Brief introduction
Classification and segmentation are two typical tasks in computer vision, and segmentation can be further divided into semantic segmentation and instance segmentation. The difference is that semantic segmentation assigns every point in the input to one of N categories, so the output is also organized into N categories, while instance segmentation produces one output per object, each containing only a single target.
This article uses the PointGroup model to perform instance segmentation on point clouds.
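To make the difference concrete, here is a toy Python sketch (this is not PointGroup's input or output format, just an illustration of the two kinds of labels):

import numpy as np

# Five points that all belong to the class "chair" (class id 2 here is arbitrary):
# semantic segmentation gives one class id per point.
semantic_labels = np.array([2, 2, 2, 2, 2])

# The same five points actually form two different chairs,
# so instance segmentation gives one binary mask per object.
instance_masks = np.array([
    [1, 1, 0, 0, 0],   # instance 0: the first chair
    [0, 0, 1, 1, 1],   # instance 1: the second chair
])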
Environment
- System: Ubuntu 16.04
- Python version: 3.7
- PyTorch version: 1.1
- Dataset: ScanNetV2
- CUDA: 10.1
- GPU: RTX 2080 Ti * 8
Among them, the PyTorch version must be 1.1; otherwise errors will occur due to API changes in other versions.
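Once PyTorch is installed (see the environment setup below), a quick sanity check of the version:

import torch
# The project assumes PyTorch 1.1; other versions may fail because of API changes.
assert torch.__version__.startswith('1.1'), torch.__version__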
Project files
git clone https://github.com/llijiang/PointGroup.git --recursive
The trailing --recursive is essential: the project depends on several third-party git repositories (submodules), which this flag downloads automatically.
The project directory structure is as follows:
PointGroup/
├─config/
├─data/
├─dataset/
├─doc/
├─lib/
├─model/
├─util/
Environment setup
It is recommended to create a new anaconda virtual environment, as follows:
conda create -n pointgroup python==3.7
source activate pointgroup
pip install -r requirements.txt
The following packages may fail to install because of network problems; it is recommended to use a VPN and retry (it took me several days to get them all installed).
You must install them with the three conda commands below; versions installed through cmake do not work. At first, because google-sparsehash could not be installed this way, I tried building the C++ version instead, but that did not work either.
conda install -c bioconda google-sparsehash
conda install libboost
conda install -c daleydeng gcc-5
spconv
Next, find the path that contains boost/ on your system and fill it into the include_directories($INCLUDE_PATH$) line of lib/spconv/CMakeLists.txt.
Now you can compile spconv:
cd lib/spconv
python setup.py bdist_wheel
cd dist
pip install ${whl}
Here ${whl} is the .whl file generated after a successful compilation, which can be installed directly with pip.
pointgroup_ops
The compilation process is as follows:
cd lib/pointgroup_ops
python setup.py develop
If you are prompted that a header file cannot be found, locate that header file inside the new pointgroup anaconda virtual environment (as long as all the previous conda installs succeeded, the header files are somewhere in this environment; it just takes a moment to find them), and then pass its path with the following commands:
python setup.py build_ext --include-dirs=${INCLUDE_PATH}
python setup.py develop
Here ${INCLUDE_PATH} is the directory that contains the missing header file.
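If you do not want to search by hand, a small Python sketch like the one below walks the conda environment and prints every directory that contains the missing header. The environment path and header name are only examples; substitute whatever the compiler complains about.

import os

def find_header(root, name):
    """Print every directory under `root` that contains a file called `name`."""
    for dirpath, _, filenames in os.walk(root):
        if name in filenames:
            print(dirpath)

# Example: the compiler cannot find sparsehash's dense_hash_map header
find_header(os.path.expanduser('~/anaconda3/envs/pointgroup/include'), 'dense_hash_map')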
Dataset
Download script
This article uses the ScanNetV2 dataset. Because of copyright restrictions, please go to this link and, following the README there, copy the contents of the ScanNet Terms of Use PDF into a Word document, fill it in, convert it back to PDF, and send it as an attachment to scannet@googlegroups.com. Within about a week the authors will reply with the download-scannet.py script and a tutorial on how to run it.
The download-scannet.py script you receive can be run directly under Python 2. Under Python 3 you need to make a small change to the urllib code; the change is very simple: replace urllib with urllib.request everywhere.
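For example (the exact calls inside download-scannet.py may differ; this only shows the general pattern of the change):

# Python 2 style, roughly as used in the original script:
#     import urllib
#     urllib.urlretrieve(url, out_file)
# Python 3 equivalent:
import urllib.request

url = 'http://example.com/scene.ply'   # placeholder URL
out_file = 'scene.ply'                  # placeholder output path
urllib.request.urlretrieve(url, out_file)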
Download dataset
The entire ScanNetV2 dataset is about 1.3 TB, so downloading all of it is obviously impractical. Fortunately, the data required by this model is only about 20 GB. Only files with the following four suffixes need to be downloaded:
- _vh_clean_2.ply
- _vh_clean_2.labels.ply
- _vh_clean_2.0.010000.segs.json
- .aggregation.json
Therefore, you need to run the following four commands:
python3 download-scannet.py -o scannet/ --type _vh_clean_2.ply
python3 download-scannet.py -o scannet/ --type _vh_clean_2.labels.ply
python3 download-scannet.py -o scannet/ --type _vh_clean_2.0.010000.segs.json
python3 download-scannet.py -o scannet/ --type .aggregation.json
For each of these four download tasks, after all files with the specified suffix have been downloaded, the script also tries to download a corresponding .zip file. For some reason the .zip file cannot be downloaded through the script (an error is reported), but these .zip files are not used by this project at all, so if the terminal reports a failed .zip download after everything else has finished, it does not matter.
After downloading the dataset, you also need to download scannetv2_train.txt, scannetv2_val.txt and scannetv2_test.txt from https://github.com/ScanNet/ScanNet/tree/master/Tasks/Benchmark . These three files define the split of the whole dataset into training, validation and test sets.
After downloading, the directory structure of the dataset is as follows (only the directory structure is given, not all files are given):
├── scannet
│   ├── scans
│   │   ├── [scene_id]          # each folder should contain the following four files
│   │   │   ├── [scene_id].aggregation.json
│   │   │   ├── [scene_id]_vh_clean_2.0.010000.segs.json
│   │   │   ├── [scene_id]_vh_clean_2.labels.ply
│   │   │   ├── [scene_id]_vh_clean_2.ply
│   ├── scans_test
│   │   ├── [scene_id]          # each folder should contain one file
│   │   │   ├── [scene_id]_vh_clean_2.ply
│   ├── scannetv2_test.txt
│   ├── scannetv2_train.txt
│   ├── scannetv2_val.txt
│   ├── scannetv2-labels.combined.tsv
Split the dataset
Here I wrote a script called scan_split.py to check the integrity of the dataset and split it, as follows:
import os

dir_scans = 'scannet/scans/'
dir_scans_test = 'scannet/scans_test/'
dir_train = 'dataset/scannetv2/train/'
dir_val = 'dataset/scannetv2/val/'
dir_test = 'dataset/scannetv2/test/'

for dir_ in [dir_test, dir_val, dir_train]:
    if not os.path.exists(dir_):
        os.makedirs(dir_)


def read_from_txt(path_txt):
    """Read one scene id per line from a split file."""
    file = open(path_txt, 'r')
    data = []
    for line in file:
        data.append(line.split('\n')[0])
    return data


def check_files():
    """
    Run this function first to make sure that all required files
    have been downloaded to the expected paths.
    """
    print('[INFO] begin checking files')
    for scan_name in os.listdir(dir_scans):
        print('[INFO] checking scan {} ...'.format(dir_scans + scan_name))
        check1 = os.path.exists(dir_scans + scan_name + '/{}_vh_clean_2.ply'.format(scan_name))
        check2 = os.path.exists(dir_scans + scan_name + '/{}_vh_clean_2.labels.ply'.format(scan_name))
        check3 = os.path.exists(dir_scans + scan_name + '/{}_vh_clean_2.0.010000.segs.json'.format(scan_name))
        check4 = os.path.exists(dir_scans + scan_name + '/{}.aggregation.json'.format(scan_name))
        assert check1, '[ERROR] scan {} has no `_vh_clean_2.ply` file'.format(dir_scans + scan_name)
        assert check2, '[ERROR] scan {} has no `_vh_clean_2.labels.ply` file'.format(dir_scans + scan_name)
        assert check3, '[ERROR] scan {} has no `_vh_clean_2.0.010000.segs.json` file'.format(dir_scans + scan_name)
        assert check4, '[ERROR] scan {} has no `.aggregation.json` file'.format(dir_scans + scan_name)
    print('[INFO] checking done')
    for scan_name in os.listdir(dir_scans_test):
        print('[INFO] checking scan {} ...'.format(dir_scans_test + scan_name))
        check1 = os.path.exists(dir_scans_test + scan_name + '/{}_vh_clean_2.ply'.format(scan_name))
        assert check1, '[ERROR] scan {} has no `_vh_clean_2.ply` file'.format(dir_scans_test + scan_name)
    print('[INFO] checking done')
    print('[INFO] checking files complete, all files exist')


def copy_files_to_dir(dir_to, path_files):
    """Copy every file in path_files into dir_to."""
    for file in path_files:
        cmd = 'cp {} {}'.format(file, dir_to)
        os.system(cmd)
        print('[INFO] ', cmd)


def split_train_test_val():
    """
    Run this function next: the downloaded scans are split into
    train, val and test and copied to the corresponding directories.
    """
    print('[INFO] begin split train, val, test')
    for scan_name in os.listdir(dir_scans):
        path_files = []
        if scan_name in train:
            print('[INFO] scan {} in train.txt'.format(scan_name))
            path_files.append(dir_scans + scan_name + '/{}_vh_clean_2.ply'.format(scan_name))
            path_files.append(dir_scans + scan_name + '/{}_vh_clean_2.labels.ply'.format(scan_name))
            path_files.append(dir_scans + scan_name + '/{}_vh_clean_2.0.010000.segs.json'.format(scan_name))
            path_files.append(dir_scans + scan_name + '/{}.aggregation.json'.format(scan_name))
            copy_files_to_dir(dir_train, path_files)
        elif scan_name in val:
            print('[INFO] scan {} in val.txt'.format(scan_name))
            path_files.append(dir_scans + scan_name + '/{}_vh_clean_2.ply'.format(scan_name))
            path_files.append(dir_scans + scan_name + '/{}_vh_clean_2.labels.ply'.format(scan_name))
            path_files.append(dir_scans + scan_name + '/{}_vh_clean_2.0.010000.segs.json'.format(scan_name))
            path_files.append(dir_scans + scan_name + '/{}.aggregation.json'.format(scan_name))
            copy_files_to_dir(dir_val, path_files)
    for scan_name in os.listdir(dir_scans_test):
        print('[INFO] scan {} in test.txt'.format(scan_name))
        path_files = [dir_scans_test + scan_name + '/{}_vh_clean_2.ply'.format(scan_name)]
        copy_files_to_dir(dir_test, path_files)


if __name__ == '__main__':
    check_files()
    print('=' * 50, '\n')
    train = read_from_txt('scannet/scannetv2_train.txt')
    val = read_from_txt('scannet/scannetv2_val.txt')
    test = read_from_txt('scannet/scannetv2_test.txt')
    split_train_test_val()
The script can be run directly; as long as it finishes without errors, the dataset has been prepared as required.
When ready, the directory structure of PointGroup/dataset is as follows:
PointGroup
├── dataset
│   ├── scannetv2
│   │   ├── train
│   │   │   ├── [scene_id]_vh_clean_2.ply & [scene_id]_vh_clean_2.labels.ply & [scene_id]_vh_clean_2.0.010000.segs.json & [scene_id].aggregation.json
│   │   ├── val
│   │   │   ├── [scene_id]_vh_clean_2.ply & [scene_id]_vh_clean_2.labels.ply & [scene_id]_vh_clean_2.0.010000.segs.json & [scene_id].aggregation.json
│   │   ├── test
│   │   │   ├── [scene_id]_vh_clean_2.ply
│   │   ├── scannetv2-labels.combined.tsv
Finally, use the following commands to generate the files required for training:
cd dataset/scannetv2
python prepare_data_inst.py --data_split train
python prepare_data_inst.py --data_split val
python prepare_data_inst.py --data_split test
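If you want to sanity-check what was generated, a minimal sketch like the one below loads one of the produced files and prints its structure. The filename is hypothetical and the exact contents depend on prepare_data_inst.py, so the code only inspects whatever it finds:

import torch

# Hypothetical output filename; use any file actually produced by prepare_data_inst.py.
data = torch.load('dataset/scannetv2/train/scene0000_00_inst_nostuff.pth')

if isinstance(data, (tuple, list)):
    for i, item in enumerate(data):
        print(i, type(item), getattr(item, 'shape', None))
else:
    print(type(data))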
Training
CUDA_VISIBLE_DEVICES=0 python train.py --config config/pointgroup_run1_scannet.yaml
The specific parameters are in the config/pointgroup_run1_scannet.yaml file and can be changed freely. After training, the weights and logs are saved under the PointGroup/exp/ directory. You can view them with TensorBoard:
tensorboard --logdir=./exp --port=6666
Testing & Visualization
The author provides pretrained weights (link here). You can use these weights directly for testing:
CUDA_VISIBLE_DEVICES=0 python test.py --config config/pointgroup_default_scannet.yaml --pretrain $PATH_TO_PRETRAIN_MODEL$
If you need to save each point's semantic prediction, offset, and instance segmentation result, change the last few lines of config/pointgroup_default_scannet.yaml as follows:
TEST:
  split: val
  test_epoch: 384
  test_workers: 16
  test_seed: 567
  TEST_NMS_THRESH: 0.3
  TEST_SCORE_THRESH: 0.09
  TEST_NPOINT_THRESH: 100
  eval: True
  save_semantic: True
  save_pt_offsets: True
  save_instance: True
Visualization
The visualization method in README.md is too complex, and mayavi throws various errors during installation and use, so I plan to use my own method for visualization instead (if you can follow the author's method smoothly, there is no problem with it; maybe I am just an exception).
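One simple possibility (not necessarily the method I will settle on) is to color the original point cloud by the saved semantic predictions using open3d. This is only a minimal sketch: it assumes the predictions were dumped as a per-point label array (e.g. a .npy file) whose order matches the points in the original _vh_clean_2.ply, and all file names here are hypothetical.

import numpy as np
import open3d as o3d

# Hypothetical paths: the original scan and a per-point semantic prediction array.
pcd = o3d.io.read_point_cloud('scene0000_00_vh_clean_2.ply')
labels = np.load('scene0000_00_semantic_pred.npy').astype(int)
labels[labels < 0] = 0  # map any ignore labels to class 0 just for display

# One random color per class id, then paint each point with its class color.
rng = np.random.default_rng(0)
palette = rng.random((labels.max() + 1, 3))
pcd.colors = o3d.utility.Vector3dVector(palette[labels])

o3d.visualization.draw_geometries([pcd])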
However, the problem right now is that all the mask files in the test results are 0: the confidence and classification information are there, but the point clouds of the segmented instances are missing.
So I will leave this as an open issue for now and fill it in once it is solved. Not being able to visualize the segmented point cloud is a big problem.