A model's generalization ability can be improved through data augmentation during training, and its accuracy can be improved further through test-time augmentation (TTA), at the cost of some inference speed. TTA is common in competitions. Image augmentation methods include affine transformations (translation, scaling, rotation, mirroring, etc.) and color transformations (contrast, brightness, saturation, hue, noise). In TTA, the test data is augmented, the model is run on each augmented copy, and the results are combined by soft voting (accumulating the predicted probabilities) or hard voting (accumulating the argmax classification results), which can improve model performance to some extent. A TTA toolkit for the Paddle framework is recommended here: GitHub - AgentMaker/PaTTA: A test times augmentation toolkit based on paddle2.0.
It supports classification, semantic segmentation, and keypoint models. Install it with: pip install patta
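The difference between soft and hard voting is easiest to see on a small example. The sketch below is plain Python and does not use PaTTA; the helper names `soft_vote` and `hard_vote` and the probability values are made up for illustration.

```python
from collections import Counter

def soft_vote(prob_lists):
    """Sum class probabilities over all augmented views, then take the argmax."""
    num_classes = len(prob_lists[0])
    totals = [sum(p[c] for p in prob_lists) for c in range(num_classes)]
    return max(range(num_classes), key=lambda c: totals[c])

def hard_vote(prob_lists):
    """Take the argmax of each view, then return the most frequent class."""
    votes = [max(range(len(p)), key=lambda c: p[c]) for p in prob_lists]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical probabilities for one image under three augmentations.
views = [
    [0.40, 0.60],  # view 1 slightly prefers class 1
    [0.45, 0.55],  # view 2 slightly prefers class 1
    [0.90, 0.10],  # view 3 strongly prefers class 0
]
print(soft_vote(views))  # probability sums are 1.75 vs 1.25
print(hard_vote(views))  # per-view votes are [1, 1, 0]
```

Here the two schemes disagree: soft voting picks class 0 because one confident view outweighs two uncertain ones, while hard voting picks class 1 by majority.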
1. Supported augmentation methods
The atomic augmentation methods supported by PaTTA are shown in Table 1.
Transform | Parameters | Values |
---|---|---|
HorizontalFlip | - | - |
VerticalFlip | - | - |
HorizontalShift | shifts | List[float] |
VerticalShift | shifts | List[float] |
Rotate90 | angles | List[0, 90, 180, 270] |
Scale | scales; interpolation | List[float]; "nearest"/"linear" |
Resize | sizes; original_size; interpolation | List[Tuple[int, int]]; Tuple[int, int]; "nearest"/"linear" |
Add | values | List[float] |
Multiply | factors | List[float] |
FiveCrops | crop_height; crop_width | int; int |
AdjustContrast | factors | List[float] |
AdjustBrightness | factors | List[float] |
AverageBlur | kernel_sizes | List[Union[Tuple[int, int], int]] |
GaussianBlur | kernel_sizes; sigma | List[Union[Tuple[int, int], int]]; Optional[Union[Tuple[float, float], float]] |
Sharpen | kernel_sizes | List[int] |
The built-in combined transforms (aliases) are listed below:
- flip_transform (horizontal + vertical flips)
- hflip_transform (horizontal flip)
- d4_transform (flips + rotation 0, 90, 180, 270)
- multiscale_transform (scale transform, take scales as input parameter)
- five_crop_transform (corner crops + center crop)
- ten_crop_transform (five crops + five crops on horizontal flip)
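To make "corner crops + center crop" concrete, here is a minimal sketch of five-crop extraction on a nested list. It does not call PaTTA; the helper `five_crops` is hypothetical, and the order of the crops is an arbitrary choice of this sketch.

```python
def five_crops(image, crop_h, crop_w):
    """Return the four corner crops and the center crop of a 2D list."""
    h, w = len(image), len(image[0])
    top_c, left_c = (h - crop_h) // 2, (w - crop_w) // 2
    origins = [
        (0, 0),                    # top-left
        (0, w - crop_w),           # top-right
        (h - crop_h, 0),           # bottom-left
        (h - crop_h, w - crop_w),  # bottom-right
        (top_c, left_c),           # center
    ]
    return [
        [row[left:left + crop_w] for row in image[top:top + crop_h]]
        for top, left in origins
    ]

# A 4x4 toy image whose pixel value encodes its position (row * 4 + col).
image = [[r * 4 + c for c in range(4)] for r in range(4)]
crops = five_crops(image, 2, 2)
print(crops[0])  # top-left crop
print(crops[4])  # center crop
```

A ten-crop variant would apply the same five crops again to the horizontally flipped image.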
The built-in transforms may not cover every need; more complex augmentation pipelines can be built by composing transforms yourself:
```python
# defined 2 * 2 * 3 * 3 = 36 augmentations!
transforms = tta.Compose(
    [
        tta.HorizontalFlip(),
        tta.Rotate90(angles=[0, 180]),
        tta.Scale(scales=[1, 2, 4]),
        tta.Multiply(factors=[0.9, 1, 1.1]),
    ]
)

tta_model = tta.SegmentationTTAWrapper(model, transforms)
```
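The count of 36 is the size of the Cartesian product of each transform's parameter values. The sketch below reproduces that arithmetic with the standard library only; treating HorizontalFlip as an on/off choice is an assumption of this sketch, and `param_grid` is an illustrative name, not a PaTTA object.

```python
from itertools import product

# Parameter values of each transform in the Compose example.
# Assumption: HorizontalFlip contributes two options (not applied / applied).
param_grid = {
    "HorizontalFlip": [False, True],
    "Rotate90": [0, 180],
    "Scale": [1, 2, 4],
    "Multiply": [0.9, 1, 1.1],
}

# Every augmentation is one choice per transform.
combinations = list(product(*param_grid.values()))
print(len(combinations))  # 2 * 2 * 3 * 3
print(combinations[0])    # first combination: no flip, angle 0, scale 1, factor 0.9
```

Since the model runs once per combination, inference cost grows multiplicatively with every transform added to the composition.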
2. Result fusion methods
The built-in result fusion (merge) modes are listed below:
- mean
- gmean (geometric mean, i.e. the n-th root of the product of the n predictions)
- sum
- max
- min
- tsharpen (temperature sharpen with t=0.5)
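The modes above can be sketched on a list of per-view scores as follows. This is a plain Python illustration, not PaTTA's implementation; in particular, it assumes that tsharpen raises each prediction to the power t before averaging, and the function name `merge` is hypothetical.

```python
import math

def merge(preds, mode="mean", t=0.5):
    """Fuse a list of per-view scores for one class into a single score."""
    n = len(preds)
    if mode == "mean":
        return sum(preds) / n
    if mode == "gmean":
        return math.prod(preds) ** (1.0 / n)
    if mode == "sum":
        return sum(preds)
    if mode == "max":
        return max(preds)
    if mode == "min":
        return min(preds)
    if mode == "tsharpen":
        # Assumption of this sketch: apply the power t, then average.
        return sum(p ** t for p in preds) / n
    raise ValueError(f"unknown merge mode: {mode}")

scores = [0.25, 1.0]  # hypothetical scores from two augmented views
for mode in ["mean", "gmean", "sum", "max", "min", "tsharpen"]:
    print(mode, merge(scores, mode))
```

Note that gmean is pulled toward the smaller score more strongly than mean, so a single low-confidence view has more influence on the fused result.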
3. Basic usage
Model loading: from the author's reading of the source code, PaTTA only supports loading models saved through paddle.jit (that is, models exported with paddle.jit.save).
```python
import paddle
import patta as tta

model = tta.load_model(path='output/model')

# For reference, this is how tta.load_model is implemented in PaTTA:
def load_model(path='output/model'):
    model = paddle.jit.load(path=path)
    return model
```
Wrapping the model in a TTA wrapper applies the augmentations automatically at test time.
Semantic segmentation model
```python
tta_model = tta.SegmentationTTAWrapper(model, tta.aliases.d4_transform(), merge_mode='mean')
```
Classification model
```python
tta_model = tta.ClassificationTTAWrapper(model, tta.aliases.five_crop_transform())
```
Keypoint model
```python
tta_model = tta.KeypointsTTAWrapper(model, tta.aliases.flip_transform(), scaled=True)
```
4. Usage for multi-input, multi-output models
For multi-input, multi-output models, the wrappers cannot apply TTA automatically. Instead, loop over the transforms manually: augment the data before passing it to the model, de-augment the model outputs, accumulate the results, and finally fuse them.
```python
import paddle

# Example of processing ONE batch of images with TTA.
# Here `image`/`mask` are 4D tensors (B, C, H, W); `label` is a 2D tensor (B, N).
# `model` and `transforms` are assumed to be defined beforehand.
def test_one(image, another_input_data):
    labels, masks = [], []
    for transformer in transforms:  # custom transforms or e.g. tta.aliases.d4_transform()
        # augment image
        augmented_image = transformer.augment_image(image)

        # pass to model
        model_output = model(augmented_image, another_input_data)

        # reverse augmentation for mask and label
        deaug_mask = transformer.deaugment_mask(model_output['mask'])
        deaug_label = transformer.deaugment_label(model_output['label'])

        # save results
        labels.append(deaug_label)
        masks.append(deaug_mask)

    # reduce results as you want, e.g. mean/max/min (mean shown here)
    label = paddle.stack(labels).mean(axis=0)
    mask = paddle.stack(masks).mean(axis=0)
    return label, mask
```
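The same augment, predict, de-augment, fuse loop can be tried without Paddle on a toy example. Everything in this sketch is hypothetical: `toy_model` stands in for a real network, and the transforms are plain functions rather than PaTTA transformer objects.

```python
def hflip(image):
    """Mirror a 2D list left to right; applying it twice restores the input."""
    return [row[::-1] for row in image]

def identity(image):
    return image

def toy_model(image):
    """Stand-in for a segmentation model that is not flip-equivariant:
    it adds a bias that grows with the column index."""
    return [[pixel + 0.1 * col for col, pixel in enumerate(row)] for row in image]

# Each entry pairs an augmentation with the function that undoes it on the mask.
transforms = [(identity, identity), (hflip, hflip)]

def predict_with_tta(image):
    masks = []
    for augment, deaugment in transforms:
        augmented = augment(image)          # augment the input
        mask = toy_model(augmented)         # run the model
        masks.append(deaugment(mask))       # map the output back to the original layout
    n = len(masks)
    # Mean fusion over the de-augmented masks.
    return [
        [sum(m[r][c] for m in masks) / n for c in range(len(image[0]))]
        for r in range(len(image))
    ]

image = [[1.0, 2.0, 3.0]]
print(predict_with_tta(image))
```

In this toy case the column-dependent bias of the model averages out to a uniform offset after fusion, which illustrates why de-augmenting before merging matters: the outputs must be aligned with the original image before they are combined.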