The use of paddlepaddle 9 MC Dropout

Posted by gevensen on Wed, 09 Feb 2022 20:33:20 +0100

MC Dropout (Monte Carlo dropout) can improve a model's performance at test time without changing the network structure or requiring additional training. The idea is to keep dropout active during testing and run the forward pass several times. Because dropout masks a different random set of neurons on each pass, every pass gives a slightly different output. Averaging these outputs can improve accuracy to some extent, at the cost of slower inference, since each prediction needs several forward passes.
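As a rough illustration of the idea, here is a minimal sketch in plain Python with no paddlepaddle involved. The one-layer "model", the weights and the helper names (dropout, toy_forward, mc_dropout_predict) are all hypothetical and exist only to show the averaging over several stochastic passes.

```python
import random

def dropout(values, p, rng):
    """Zero each value with probability p and rescale the survivors by 1 / (1 - p)."""
    keep = 1.0 - p
    return [v / keep if rng.random() >= p else 0.0 for v in values]

def toy_forward(x, weights, p, rng):
    """Hypothetical one-layer model: dropout on the inputs, then one weighted sum per class."""
    dropped = dropout(x, p, rng)
    return [sum(w * v for w, v in zip(row, dropped)) for row in weights]

def mc_dropout_predict(x, weights, p=0.5, times=10, seed=0):
    """Average `times` stochastic forward passes and return (mean scores, predicted class)."""
    rng = random.Random(seed)
    n_classes = len(weights)
    totals = [0.0] * n_classes
    for _ in range(times):
        out = toy_forward(x, weights, p, rng)   # a different dropout mask on every pass
        totals = [t + o for t, o in zip(totals, out)]
    mean = [t / times for t in totals]
    return mean, max(range(n_classes), key=lambda c: mean[c])

x = [1.0, 2.0, 3.0, 4.0]
weights = [[0.1, 0.1, 0.1, 0.1],   # class 0
           [0.5, 0.5, 0.5, 0.5]]   # class 1
mean, label = mc_dropout_predict(x, weights, p=0.5, times=200)
print(mean, label)
```

The cost grows linearly with times, which is the slowdown mentioned above.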

In paddlepaddle, dropout cannot be switched on separately during the testing phase. The root cause is that the parameter list of the dropout layer has no argument for choosing between train and eval behavior. The parameter list is as follows: paddle.nn.Dropout(p=0.5, axis=None, mode="upscale_in_train", name=None)

  • p (float): the probability of setting the input node to 0, that is, the discard probability. Default: 0.5.

  • axis (int|list): the axis along which Dropout is applied to the input Tensor. Default: None.

  • mode (str): there are two ways of discarding units, 'upscale_in_train' and 'downscale_in_infer'. Default: 'upscale_in_train'. The calculation methods are as follows:

    1. upscale_in_train: scale up the output during training.

      • train: out = input * mask / ( 1.0 - p )

      • inference: out = input

    2. downscale_in_infer: scale down the output during prediction.

      • train: out = input * mask

      • inference: out = input * (1.0 - p)

  • name (str, optional): the name of the operation. Default: None. For more information, see "Name" in the paddlepaddle documentation.
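The two mode formulas above can be checked numerically. The following is a small sketch in plain Python, not paddlepaddle's implementation; dropout_forward and mean_train_output are hypothetical helper names, and p is assumed to be smaller than 1. It shows that, in both modes, the average training output matches the inference output.

```python
import random

def dropout_forward(x, p, mode, training, rng):
    """Apply the train / inference formulas listed above to a list of floats."""
    if mode not in ("upscale_in_train", "downscale_in_infer"):
        raise ValueError(f"unknown mode: {mode}")
    if training:
        # mask is 1 with probability (1 - p) and 0 with probability p
        mask = [1.0 if rng.random() >= p else 0.0 for _ in x]
        scale = 1.0 / (1.0 - p) if mode == "upscale_in_train" else 1.0
        return [v * m * scale for v, m in zip(x, mask)]
    scale = 1.0 if mode == "upscale_in_train" else 1.0 - p
    return [v * scale for v in x]

def mean_train_output(x, p, mode, trials=20000, seed=0):
    """Average many training-mode outputs to estimate their expected value."""
    rng = random.Random(seed)
    totals = [0.0] * len(x)
    for _ in range(trials):
        out = dropout_forward(x, p, mode, training=True, rng=rng)
        totals = [t + o for t, o in zip(totals, out)]
    return [t / trials for t in totals]

x = [2.0, 2.0]
for mode in ("upscale_in_train", "downscale_in_infer"):
    infer = dropout_forward(x, 0.5, mode, training=False, rng=random.Random(0))
    print(mode, "train mean:", mean_train_output(x, 0.5, mode), "inference:", infer)
```

Either way the two phases agree in expectation; the modes differ only in where the factor (1.0 - p) is applied.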

Therefore, dropout can only be activated by calling model.train(), which switches every dropout layer in the model to its training behavior. However, after calling model.train(), the author found that the GPU memory used during forward propagation is not released: no matter how small batch_size is, enough test data will eventually exhaust the video memory. MC_Dropout is therefore implemented by borrowing the way video memory is released during model training.

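import numpy as np
import paddle
#paddle.set_flags({'FLAGS_eager_delete_tensor_gb': 1.0})
paddle.set_flags({'FLAGS_fast_eager_deletion_mode': True})  # Fast garbage collection strategy; had no effect in practice

def MC_Dropout(model, data, times=10):  # Monte Carlo Dropout
    # The optimizer exists only so that back propagation can release video memory.
    # A learning rate of 0 means the model parameters are never updated.
    optim = paddle.optimizer.Adam(learning_rate=0.0, parameters=model.parameters())
    loss_fn = paddle.nn.CrossEntropyLoss(soft_label=True)  # Use soft labels
    model.train()  # Keep dropout active (assumed: this call is missing from the listing)
    result = []
    for i in range(times):
        out = model(data)           # Stochastic forward pass (assumed: missing from the listing)
        result.append(out.numpy())  # Keep a NumPy copy of the output (assumed)
        # Borrow back propagation to free memory
        loss = loss_fn(out, out)
        loss.backward()             # assumed: missing from the listing
        optim.step()                # Changes nothing because learning_rate is 0 (assumed)
        optim.clear_grad()          # assumed: missing from the listing
    result = np.array(result)  # shape is t,b,c; t: times, b: batch_size, c: class_probability
    result = np.transpose(result, (1, 2, 0))  # shape is b,c,t
    result = result.sum(axis=-1)  # shape is b,c
    result = result.argmax(axis=-1)  # shape is b
    return result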
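# If you want to load a built-in dataset, replacing custom_dataset with train_dataset is enough
# The call is garbled in the original listing; paddle.io.DataLoader(custom_dataset, ...) is assumed
train_loader = paddle.io.DataLoader(custom_dataset, batch_size=BATCH_SIZE, shuffle=False)
print('=============train model=============')
for batch_id, data in enumerate(train_loader()):
    x_data = data[0]
    names = data[1]
    predictions = MC_Dropout(model, x_data, times=10)  # Assumed usage; the loop body is cut off in the original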

Description of the video memory emptying method: paddle's optimizer and loss function are used to perform a fake back propagation after every forward pass. The loss is computed from the output against itself and the learning rate is 0, so the optimizer does not actually update the parameters of the model.
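Apart from the memory workaround, the last lines of MC_Dropout reduce the stacked outputs to one label per sample. The shape changes can be followed with a small NumPy sketch; the three arrays below are made-up outputs for t=3 passes, b=2 samples and c=3 classes, not results from a real model.

```python
import numpy as np

# Hypothetical outputs of 3 stochastic passes, each of shape (b, c) = (2, 3).
passes = [
    np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]),
    np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]),
    np.array([[0.7, 0.2, 0.1], [0.2, 0.2, 0.6]]),
]

result = np.array(passes)                  # shape (t, b, c) = (3, 2, 3)
result = np.transpose(result, (1, 2, 0))   # shape (b, c, t) = (2, 3, 3)
summed = result.sum(axis=-1)               # shape (b, c): scores summed over the passes
labels = summed.argmax(axis=-1)            # shape (b,): one class index per sample

# Summing and averaging differ only by the constant factor t, so the argmax is the same.
mean_labels = np.array(passes).mean(axis=0).argmax(axis=-1)
print(labels, mean_labels)
```

The transpose is not strictly needed: summing over axis 0 of the (t, b, c) array gives the same (b, c) scores.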

Implementation of the data loader ImageClsTestDataset: pass in the image directory and the files in it are listed automatically, so no txt file list needs to be generated.

import paddle
from paddle.io import Dataset
from paddle.vision import transforms
from PIL import Image
import numpy as np
import os

class ImageClsTestDataset(Dataset):
    def __init__(self, input_shape, root):
        super(ImageClsTestDataset, self).__init__()
        self.root = root
        self.lists = os.listdir(root)  # Assumed: every file in root is an image
        # ToTensor converts a PIL Image or numpy ndarray with shape (H x W x C) to (C x H x W) and normalizes it.
        # To keep the shape unchanged, set the parameter data_format to 'HWC'.
        # In paddle models, the data is in CHW format.
        # Only the Normalize line survives in the original listing; Compose, Resize and ToTensor are assumed.
        self.preprocess_image = transforms.Compose([
            transforms.Resize(input_shape),
            transforms.ToTensor(),
            transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225), data_format='CHW')
        ])
        self.length = len(self.lists)

    def __getitem__(self, index):
        name = self.root + '/' + self.lists[index]
        image = Image.open(name)  # Assumed: the image is loaded with PIL
        np_img = np.array(image)
        if len(np_img.shape) == 2:  # Prevent grayscale images in data
            np_img = np.stack([np_img] * 3, axis=-1)  # Assumed: repeat the single channel three times
        image = self.preprocess_image(np_img[:, :, :3])  # np_img[:,:,:3] drops the alpha channel of RGBA data
        return image, name

    def __len__(self):
        return self.length
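The channel handling in __getitem__ (guarding against grayscale and RGBA images) can be tried in isolation. This is a minimal NumPy sketch; to_three_channels is a hypothetical helper name, and repeating the gray channel three times is an assumption about how a grayscale image should be expanded.

```python
import numpy as np

def to_three_channels(np_img):
    """Return an H x W x 3 array from a grayscale (H x W), RGB or RGBA input."""
    if np_img.ndim == 2:
        # Grayscale: repeat the single channel to get three identical channels.
        np_img = np.stack([np_img] * 3, axis=-1)
    # RGBA: keep only the first three channels; RGB passes through unchanged.
    return np_img[:, :, :3]

gray = np.zeros((4, 5), dtype=np.uint8)
rgb = np.zeros((4, 5, 3), dtype=np.uint8)
rgba = np.zeros((4, 5, 4), dtype=np.uint8)
for img in (gray, rgb, rgba):
    print(img.shape, "->", to_three_channels(img).shape)
```

A two-channel image (grayscale with alpha) would still come out with two channels, so such files would need separate handling.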
