Plan: MK 2nd place in the regular season

Posted by Nirvana on Mon, 31 Jan 2022 03:46:11 +0100

PaddlePaddle regular season: remote sensing image parcel segmentation - 8th place solution in November

1. Introduction to the competition

1.1 Competition page: Regular season: remote sensing image parcel segmentation

1.2 Introduction to the task

This competition is adapted from the preliminary round of the 2020 CCF BDCI remote sensing image parcel segmentation competition. Remote sensing image parcel segmentation aims to analyze the pixel-level content of remote sensing images and to extract and classify the categories of interest. It has high practical value in fields such as urban and rural planning and flood control and disaster relief, and has attracted wide attention in industry. Existing methods for remote sensing image parcel segmentation are limited to specific scenes and specific data sources, and their accuracy falls short of requirements; in practice, the task therefore still relies heavily on manual processing, which consumes a great deal of human, material, and financial resources. The purpose of this competition is to use artificial intelligence to fully mine heterogeneous remote sensing data from multiple sources and scenes, create efficient and practical algorithms, improve the ability to analyze and extract information from remote sensing images, and measure the effect of parcel segmentation models on multiple categories (such as buildings, roads, and woodland).

1.3 Data description

This competition provides desensitized remote sensing image data from multiple regions, on which each contestant builds their own parcel segmentation model.

Training data set
A sample image and its annotation are shown in the figure below:

Training dataset file name: train_and_label.zip

It contains two sub-archives: the training set images and the training set labels, detailed as follows:

Training set (original images) file name: img_train

It contains 66,653 JPG images with a resolution of 2 m/pixel and a size of 256×256. Each image is named like t000123.jpg.

Training set (label images) file name: lab_train

It contains 66,653 PNG images with a resolution of 2 m/pixel and a size of 256×256. Each image is named like t000123.png.

Note: all label PNGs cover 4 categories, with pixel values 0, 1, 2, and 3. In addition, pixel value 255 marks unlabeled areas, meaning that the category of the corresponding region is uncertain; these regions are not considered in the evaluation.
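
As a quick sanity check, the pixel values of a sample of labels can be counted to confirm the 0-3 classes and the 255 ignore value described above; a minimal sketch, assuming lab_train has been decompressed into the working directory:

import os
import numpy as np
from PIL import Image

# Sketch: accumulate pixel-value counts over the first 100 label images;
# the printed keys should fall within {0, 1, 2, 3, 255}.
lab_dir = 'lab_train'
counts = {}
for name in sorted(os.listdir(lab_dir))[:100]:
    lab = np.asarray(Image.open(os.path.join(lab_dir, name)))
    for v, c in zip(*np.unique(lab, return_counts=True)):
        counts[int(v)] = counts.get(int(v), 0) + int(c)
print(counts)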

Test data set
Test dataset file name: img_test.zip, as follows:

It contains 4,609 JPG images with a resolution of 2 m/pixel and a size of 256×256, with file names like 123.jpg.

2. Introduction to the approach

Note that this version is mainly about learning to use PaddleSeg and getting familiar with the AI Studio platform, so the model was not specially modified or analyzed according to the characteristics of the dataset.

To get started quickly, I used the official PaddleSeg-based baseline from the discussion area.

First, follow the tutorial to prepare the environment and data.

Environment installation

!git clone https://gitee.com/paddlepaddle/PaddleSeg.git

# Install required dependencies
!pip install -r PaddleSeg/requirements.txt

Decompress the dataset

The archives are stored under the data path. The decompressed files are deleted every time the notebook is closed, so the archives need to be decompressed again each session.

# Modify "data80164" according to your pathname
!unzip -q data/data80164/train_and_label.zip
!unzip -q data/data80164/img_test.zip

Data processing

The training data is split into train and val, and the file names are written to txt files; the data-loading configuration later reads from these txt files. They can be written once and reused, which also makes it easier to compare whether model changes bring improvements.

import os
import numpy as np

datas = []
image_base = 'img_train'   # Training set image directory
annos_base = 'lab_train'   # Training set label directory

# Collect image file names (without extension)
ids_ = [v.split('.')[0] for v in os.listdir(image_base)]

# Pair each training image path with its label path
for id_ in ids_:
    img_pt0 = os.path.join(image_base, '{}.jpg'.format(id_))
    img_pt1 = os.path.join(annos_base, '{}.png'.format(id_))
    datas.append((img_pt0.replace('/home/aistudio', ''), img_pt1.replace('/home/aistudio', '')))
    if not (os.path.exists(img_pt0) and os.path.exists(img_pt1)):
        raise FileNotFoundError('path invalid: {} {}'.format(img_pt0, img_pt1))

# Print the number of samples and a few examples
print('total:', len(datas))
print(datas[0][0])
print(datas[0][1])
print(datas[10][:])

# The label names matter little here; the competition evaluates on class ids 0, 1, 2, 3
labels = ['Architecture', 'Cultivated land', 'woodland',  'other']

# Write labels to label file
with open('labels.txt', 'w') as f:
    for v in labels:
        f.write(v+'\n')

# Randomly shuffle the data
np.random.seed(5)
np.random.shuffle(datas)

# Split ratio: 0.05 means 5% validation set and 95% training set
split_num = int(0.05*len(datas))

# Split into training and validation sets
train_data = datas[:-split_num]
valid_data = datas[-split_num:]

# Write training set list
with open('train_list.txt', 'w') as f:
    for img, lbl in train_data:
        f.write(img + ' ' + lbl + '\n')

# Write validation set list
with open('valid_list.txt', 'w') as f:
    for img, lbl in valid_data:
        f.write(img + ' ' + lbl + '\n')

# Print training and validation set sizes
print('train:', len(train_data))
print('valid:', len(valid_data))

Model training

Two configuration files need to be modified.

For a quick experiment, I replaced the backbone of DeepLabV3+ with MobileNetV2 and used a pretrained model. Considering that the images used this time are 256×256 pixels, the dilation rates of the ASPP module were reduced from [12, 24, 36] to [1, 3, 6], and the new structure is named DSPP.
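
The DeepLabV3P_DSPP code itself is not shown in this post. As a minimal sketch of the idea: recent PaddleSeg versions expose aspp_ratios and aspp_out_channels as arguments of DeepLabV3P, so the new type could be registered as a thin wrapper like the following (argument names are assumptions and may differ between versions):

from paddleseg.cvlibs import manager
from paddleseg.models import DeepLabV3P

# Sketch (not the author's code): map the dspp_* keys used in the yml below
# onto DeepLabV3P's existing aspp_* arguments, and register the class so that
# "type: DeepLabV3P_DSPP" can be resolved from the configuration file.
@manager.MODELS.add_component
class DeepLabV3P_DSPP(DeepLabV3P):
    def __init__(self, dspp_ratios=(1, 3, 6), dspp_out_channels=256, **kwargs):
        super().__init__(
            aspp_ratios=tuple(dspp_ratios),
            aspp_out_channels=dspp_out_channels,
            **kwargs)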

To make full use of GPU memory, the batch size is maximized; 240 is basically the limit. The number of training iterations is increased accordingly.

deeplabv3p_mobilenetv2_g.yml

_base_: '../_base_/cityscapes.yml'

batch_size: 240
iters: 60000

model:
  type: DeepLabV3P_DSPP
  backbone:
    type: MobileNetV2
    channel_ratio: 1.0
    min_channel: 16
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/mobilenetv2.tar.gz
    # output_stride: 16
    # multi_grid: [1, 1, 2]
  num_classes: 4
  backbone_indices: [0, 3]
  dspp_ratios: [1, 3, 6]
  # mid_channels: 256
  dspp_out_channels: 256
  align_corners: False
  pretrained: null

pred_dataset:
  num_classes: 4
  transforms:
    - type: Normalize

The second configuration file is the base file; the main changes are the txt list paths and the number of classes.

I briefly tried the effects of different transforms; removing the color and contrast augmentation worked better. I also found a point I hadn't noticed in November: the default values for random scaling and cropping are too large. The images themselves are only 256×256 but were scaled and cropped to 1024×512 (the Cityscapes default); no wonder the earlier training was so slow, and GPU memory overflowed as soon as the batch size grew a little. After changing the crop to 256×256, the batch size could be increased tenfold and the results improved noticeably.

batch_size: 2
iters: 80000

train_dataset:
  type: Dataset
  dataset_root: /home/aistudio
  train_path: /home/aistudio/train_list.txt
  num_classes: 4
  transforms:
    - type: ResizeStepScaling
      min_scale_factor: 0.5
      max_scale_factor: 2.0
      scale_step_size: 0.25
    - type: RandomPaddingCrop
      crop_size: [256, 256]
    - type: RandomHorizontalFlip
    # - type: RandomDistort
    #   brightness_range: 0.4
    #   contrast_range: 0.4
    #   saturation_range: 0.4
    - type: Normalize
  mode: train

val_dataset:
  type: Dataset
  dataset_root: /home/aistudio
  val_path: /home/aistudio/valid_list.txt
  num_classes: 4
  transforms:
    - type: Normalize
  mode: val


optimizer:
  type: sgd
  momentum: 0.9
  weight_decay: 4.0e-5

lr_scheduler:
  type: PolynomialDecay
  learning_rate: 0.01
  end_lr: 0
  power: 0.9

loss:
  types:
    - type: CrossEntropyLoss
  coef: [1]

Then you can start training.

!python PaddleSeg/train.py \
        --config PaddleSeg/configs/deeplabv3p/deeplabv3p_mobilenetv2.yml \
        --use_vdl \
        --do_eval \
        --save_interval 1000 \
        --save_dir output \
        --num_workers 4 \
        --learning_rate 0.1
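
Since --use_vdl is enabled, the loss and mIoU curves can be monitored while training runs; a minimal sketch, assuming VisualDL is installed and the logs are written to the save_dir (output):

!visualdl --logdir output --port 8040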

After training, use the saved best_model. To make submitting results easier, the prediction file is slightly modified to add an output format that meets the submission requirements.

# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os
import math

import cv2
import numpy as np
import paddle

from paddleseg import utils
from paddleseg.core import infer
from paddleseg.utils import logger, progbar


def mkdir(path):
    sub_dir = os.path.dirname(path)
    if not os.path.exists(sub_dir):
        os.makedirs(sub_dir)


def partition_list(arr, m):
    """split the list 'arr' into m pieces"""
    n = int(math.ceil(len(arr) / float(m)))
    return [arr[i:i + n] for i in range(0, len(arr), n)]


def predict(model,
            model_path,
            transforms,
            image_list,
            image_dir=None,
            save_dir='output',
            aug_pred=False,
            scales=1.0,
            flip_horizontal=True,
            flip_vertical=False,
            is_slide=False,
            stride=None,
            crop_size=None):
    """
    predict and visualize the image_list.

    Args:
        model (nn.Layer): Used to predict for input image.
        model_path (str): The path of pretrained model.
        transforms (transform.Compose): Preprocess for input image.
        image_list (list): A list of image path to be predicted.
        image_dir (str, optional): The root directory of the images predicted. Default: None.
        save_dir (str, optional): The directory to save the visualized results. Default: 'output'.
        aug_pred (bool, optional): Whether to use multi-scale and flip augment for prediction. Default: False.
        scales (list|float, optional): Scales for augment. It is valid when `aug_pred` is True. Default: 1.0.
        flip_horizontal (bool, optional): Whether to use flip horizontally augment. It is valid when `aug_pred` is True. Default: True.
        flip_vertical (bool, optional): Whether to use flip vertically augment. It is valid when `aug_pred` is True. Default: False.
        is_slide (bool, optional): Whether to predict by sliding window. Default: False.
        stride (tuple|list, optional): The stride of sliding window, the first is width and the second is height.
            It should be provided when `is_slide` is True.
        crop_size (tuple|list, optional):  The crop size of sliding window, the first is width and the second is height.
            It should be provided when `is_slide` is True.

    """
    utils.utils.load_entire_model(model, model_path)
    model.eval()
    nranks = paddle.distributed.get_world_size()
    local_rank = paddle.distributed.get_rank()
    if nranks > 1:
        img_lists = partition_list(image_list, nranks)
    else:
        img_lists = [image_list]

    added_saved_dir = os.path.join(save_dir, 'added_prediction')
    pred_saved_dir = os.path.join(save_dir, 'pseudo_color_prediction')
    org_saved_dir = os.path.join(save_dir, 'result')

    logger.info("Start to predict...")
    progbar_pred = progbar.Progbar(target=len(img_lists[0]), verbose=1)
    with paddle.no_grad():
        for i, im_path in enumerate(img_lists[local_rank]):
            im = cv2.imread(im_path)
            ori_shape = im.shape[:2]
            im, _ = transforms(im)
            im = im[np.newaxis, ...]
            im = paddle.to_tensor(im)

            if aug_pred:
                pred = infer.aug_inference(
                    model,
                    im,
                    ori_shape=ori_shape,
                    transforms=transforms.transforms,
                    scales=scales,
                    flip_horizontal=flip_horizontal,
                    flip_vertical=flip_vertical,
                    is_slide=is_slide,
                    stride=stride,
                    crop_size=crop_size)
            else:
                pred = infer.inference(
                    model,
                    im,
                    ori_shape=ori_shape,
                    transforms=transforms.transforms,
                    is_slide=is_slide,
                    stride=stride,
                    crop_size=crop_size)
            pred = paddle.squeeze(pred)
            pred = pred.numpy().astype('uint8')

            # get the saved name
            if image_dir is not None:
                im_file = im_path.replace(image_dir, '')
            else:
                im_file = os.path.basename(im_path)
            if im_file[0] == '/' or im_file[0] == '\\':
                im_file = im_file[1:]

            # save added image
            added_image = utils.visualize.visualize(im_path, pred, weight=0.6)
            added_image_path = os.path.join(added_saved_dir, im_file)
            mkdir(added_image_path)
            cv2.imwrite(added_image_path, added_image)

            # save pseudo color prediction
            pred_mask = utils.visualize.get_pseudo_color_map(pred)
            pred_saved_path = os.path.join(
                pred_saved_dir,
                os.path.splitext(im_file)[0] + ".png")
            mkdir(pred_saved_path)
            pred_mask.save(pred_saved_path)

            # Output for submission is added here
            org_saved_path = os.path.join(
                org_saved_dir, 
                os.path.splitext(im_file)[0] + ".png"
            )
            mkdir(org_saved_path)
            cv2.imwrite(org_saved_path, pred)

            # pred_im = utils.visualize(im_path, pred, weight=0.0)
            # pred_saved_path = os.path.join(pred_saved_dir, im_file)
            # mkdir(pred_saved_path)
            # cv2.imwrite(pred_saved_path, pred_im)

            progbar_pred.update(i + 1)

!python PaddleSeg/predict.py \
        --config PaddleSeg/configs/deeplabv3p/deeplabv3p_mobilenetv2.yml \
        --model_path output/best_model/model.pdparams \
        --image_path data/img_testA \
        --aug_pred
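
After prediction, the raw label maps written to output/result can be packaged for upload; a minimal sketch, assuming the submission expects a flat zip of the prediction PNGs (check the competition page for the exact archive name and layout):

!cd output/result && zip -q -r ../../result.zip .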

3. Experience sharing

In December I continued getting familiar with the code, mainly looking at the image input preprocessing. Two points stand out from running through the whole pipeline:

  1. Finding an appropriate combination of preprocessing transforms can significantly improve accuracy.
  2. Use as large a batch size as possible and make full use of GPU memory, improving efficiency and accuracy at the same time.

For those using PaddleSeg for the first time, following the official example makes it easy to get through the whole process.

A simple summary of the process is:

  • Download PaddleSeg

  • Configure the environment (it seems this step can be skipped)

  • Decompress the data and build the train and val list txt files

  • Modify the corresponding paths and number of classes in the configuration files

  • Training and prediction

  • Submit the results
    

Models in PaddleSeg are configured through yml files, and those who haven't used them before still need some time to get familiar. The models have a modular design, so it is convenient to swap different modules through the configuration files, and you can also write new modules yourself. In general, simple changes are easy: I tried modifying the multi-scale ASPP module, and it could be embedded into the existing model without much trouble.

Topics: Python AI paddlepaddle