Introduction: We built the AlexNet network using the Paddle framework and tested its classification performance on Cifar10 using the Supreme (GPU) version of AI Studio. With basic training, the classification accuracy on the test set does not exceed 60%, which is still some distance from the 80% reported in some articles.
Key words: Cifar10, AlexNet
§01 AlexNet
1.1 Background introduction
The fourth assignment of the artificial neural network course is given in Requirements for the fourth assignment of artificial neural network in 2021. For the Cifar10 data set (see The fourth assignment of artificial neural network in 2021 - question 3 Cifar10), we tried training with BP and LeNet structures: the accuracy on the test set could not exceed 30%, and it saturated quickly.
Simply modifying the network structure, adjusting the learning rate, and adding Dropout layers had no impact on the results.
The reference blog Implementation and comparison of deep learning recognition CIFAR10: pytorch training LeNet, AlexNet and VGG19 (II) introduces AlexNet; following it, the network is built and tested on the Paddle platform.
1.2 Original code
Based on the structure of AlexNet and the picture dimensions of The CIFAR-10 dataset (32 × 32 × 3), the original article fine-tunes the AlexNet network structure.

Network structure of AlexNet:

▲ Figure 1.2.1 Network structure of AlexNet

For CIFAR-10 the pictures are 32 × 32, much smaller than the 227 × 227 input of the original AlexNet, so the network structure and parameters need to be fine-tuned (a size check follows the list below):
- Convolution layer 1: kernel size 7 × 7, stride 2, padding 2
- The last max pooling layer is deleted
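To sanity-check these adjustments, the standard output-size formula ⌊(n − k + 2p) / s⌋ + 1 can be evaluated layer by layer. The snippet below is a minimal sketch of ours; the helper `conv_out` is hypothetical and not part of the original article:

```python
# Minimal sketch: trace the feature-map size through the fine-tuned AlexNet.
# conv_out is a hypothetical helper, not from the original article.
def conv_out(n, k, s, p):
    """Output size of a conv/pool layer: floor((n - k + 2p) / s) + 1."""
    return (n - k + 2 * p) // s + 1

n = 32                      # CIFAR-10 input size
n = conv_out(n, 7, 2, 2)    # conv1 7x7, stride 2, pad 2     -> 15
n = conv_out(n, 3, 2, 0)    # max pool 3x3, stride 2         -> 7
n = conv_out(n, 5, 1, 2)    # conv2 5x5, stride 1, pad 2     -> 7
n = conv_out(n, 3, 2, 0)    # max pool 3x3, stride 2         -> 3
n = conv_out(n, 3, 1, 1)    # conv3/4/5 3x3, stride 1, pad 1 -> 3
print(n, 256 * n * n)       # 3 2304, matching the 256*3*3 input of the first Linear layer
```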
1.2.1 Network code
The network definition code is as follows:
```python
import torch.nn as nn

class AlexNet(nn.Module):
    def __init__(self):
        super(AlexNet, self).__init__()

        self.cnn = nn.Sequential(
            # Convolution layer 1: 3 input channels, 96 kernels, kernel size 7*7, stride 2, padding 2
            # After this layer the image size becomes (32-7+2*2)/2 + 1 = 15, i.e. 15*15
            # After 3*3 max pooling with stride 2 it becomes (15-3)/2 + 1 = 7, i.e. 7*7
            nn.Conv2d(3, 96, 7, 2, 2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(3, 2, 0),

            # Convolution layer 2: 96 input channels, 256 kernels, kernel size 5*5, stride 1, padding 2
            # After this layer the image stays (7-5+2*2)/1 + 1 = 7, i.e. 7*7
            # After 3*3 max pooling with stride 2 it becomes (7-3)/2 + 1 = 3, i.e. 3*3
            nn.Conv2d(96, 256, 5, 1, 2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(3, 2, 0),

            # Convolution layer 3: 256 input channels, 384 kernels, kernel size 3*3, stride 1, padding 1
            # After this layer the image stays (3-3+2*1)/1 + 1 = 3, i.e. 3*3
            nn.Conv2d(256, 384, 3, 1, 1),
            nn.ReLU(inplace=True),

            # Convolution layer 4: 384 input channels, 384 kernels, kernel size 3*3, stride 1, padding 1
            nn.Conv2d(384, 384, 3, 1, 1),
            nn.ReLU(inplace=True),

            # Convolution layer 5: 384 input channels, 256 kernels, kernel size 3*3, stride 1, padding 1
            nn.Conv2d(384, 256, 3, 1, 1),
            nn.ReLU(inplace=True)
        )

        self.fc = nn.Sequential(
            # 256 features, each 3*3
            nn.Linear(256*3*3, 1024),
            nn.ReLU(),
            nn.Linear(1024, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.cnn(x)

        # x.size()[0]: batch size
        x = x.view(x.size()[0], -1)
        x = self.fc(x)

        return x
```
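As a quick smoke test (our addition, not part of the original blog), the network can be fed a random CIFAR-10-shaped batch to confirm that it produces 10 class scores per image:

```python
import torch

# Hypothetical smoke test, assuming the AlexNet class defined above.
model = AlexNet()
x = torch.randn(4, 3, 32, 32)   # a batch of 4 fake CIFAR-10 images
print(model(x).shape)           # expected: torch.Size([4, 10])
```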
1.3 Paddle model implementation
Build AlexNet using the neural network layers provided by Paddle.
1.3.1 Build the AlexNet network
(1) Network code
```python
import paddle

class alexnet(paddle.nn.Layer):
    def __init__(self):
        super(alexnet, self).__init__()
        self.conv1 = paddle.nn.Conv2D(in_channels=3, out_channels=96, kernel_size=7, stride=2, padding=2)
        self.conv2 = paddle.nn.Conv2D(in_channels=96, out_channels=256, kernel_size=5, stride=1, padding=2)
        self.conv3 = paddle.nn.Conv2D(in_channels=256, out_channels=384, kernel_size=3, stride=1, padding=1)
        self.conv4 = paddle.nn.Conv2D(in_channels=384, out_channels=384, kernel_size=3, stride=1, padding=1)
        self.conv5 = paddle.nn.Conv2D(in_channels=384, out_channels=256, kernel_size=3, stride=1, padding=1)
        self.mp1 = paddle.nn.MaxPool2D(kernel_size=3, stride=2)
        self.mp2 = paddle.nn.MaxPool2D(kernel_size=3, stride=2)
        self.L1 = paddle.nn.Linear(in_features=256*3*3, out_features=1024)
        self.L2 = paddle.nn.Linear(in_features=1024, out_features=512)
        self.L3 = paddle.nn.Linear(in_features=512, out_features=10)

    def forward(self, x):
        x = self.conv1(x)
        x = paddle.nn.functional.relu(x)
        x = self.mp1(x)
        x = self.conv2(x)
        x = paddle.nn.functional.relu(x)
        x = self.mp2(x)
        x = self.conv3(x)
        x = paddle.nn.functional.relu(x)
        x = self.conv4(x)
        x = paddle.nn.functional.relu(x)
        x = self.conv5(x)
        x = paddle.nn.functional.relu(x)
        x = paddle.flatten(x, start_axis=1, stop_axis=-1)
        x = self.L1(x)
        x = paddle.nn.functional.relu(x)
        x = self.L2(x)
        x = paddle.nn.functional.relu(x)
        x = self.L3(x)
        return x
```
(2) Network structure
Apply paddle.summary to check whether the network structure is correct.
```python
model = alexnet()
paddle.summary(model, (100, 3, 32, 32))
```
```
---------------------------------------------------------------------------
 Layer (type)       Input Shape          Output Shape         Param #
===========================================================================
   Conv2D-16     [[100, 3, 32, 32]]   [100, 96, 15, 15]        14,208
  MaxPool2D-7   [[100, 96, 15, 15]]    [100, 96, 7, 7]            0
   Conv2D-17     [[100, 96, 7, 7]]    [100, 256, 7, 7]        614,656
  MaxPool2D-8    [[100, 256, 7, 7]]   [100, 256, 3, 3]            0
   Conv2D-18     [[100, 256, 3, 3]]   [100, 384, 3, 3]        885,120
   Conv2D-19     [[100, 384, 3, 3]]   [100, 384, 3, 3]       1,327,488
   Conv2D-20     [[100, 384, 3, 3]]   [100, 256, 3, 3]        884,992
   Linear-10       [[100, 2304]]         [100, 1024]         2,360,320
   Linear-11       [[100, 1024]]         [100, 512]           524,800
   Linear-12        [[100, 512]]          [100, 10]            5,130
===========================================================================
Total params: 6,616,714
Trainable params: 6,616,714
Non-trainable params: 0
---------------------------------------------------------------------------
Input size (MB): 1.17
Forward/backward pass size (MB): 39.61
Params size (MB): 25.24
Estimated Total Size (MB): 66.02
---------------------------------------------------------------------------

{'total_params': 6616714, 'trainable_params': 6616714}
```
During network design, structural errors often occur between the convolution layers and the fully connected layers: after flattening, the data dimensions are not aligned. When defining the network, first leave out the fully connected layers that follow the flatten, confirm from the paddle.summary output that the convolution output is 256 × 3 × 3, and only then attach the fully connected layers. If an error occurs, each layer can be verified individually, as in the sketch below.
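A minimal sketch of that layer-by-layer verification (our addition, assuming the alexnet class defined above) runs a dummy image through the convolution stack and prints the shape after each stage:

```python
import paddle

# Hypothetical debugging sketch: trace the shape through the convolution stack
# before attaching the fully connected layers. ReLU is omitted since it does
# not change shapes.
net = alexnet()
x = paddle.randn([1, 3, 32, 32])   # one fake CIFAR-10 image

for name in ['conv1', 'mp1', 'conv2', 'mp2', 'conv3', 'conv4', 'conv5']:
    x = getattr(net, name)(x)
    print(name, x.shape)           # conv5 should print [1, 256, 3, 3]

print('flattened:', paddle.flatten(x, start_axis=1).shape)  # [1, 2304] = 256*3*3
```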
1.4 Training AlexNet on Cifar10
1.4.1 Load data
```python
import sys, os, math, time
import matplotlib.pyplot as plt
from numpy import *

import paddle
from paddle.vision.transforms import Normalize
normalize = Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5], data_format='HWC')

from paddle.vision.datasets import Cifar10
cifar10_train = Cifar10(mode='train', transform=normalize)
cifar10_test = Cifar10(mode='test', transform=normalize)

train_dataset = [cifar10_train.data[id][0].reshape(3,32,32) for id in range(len(cifar10_train.data))]
train_labels = [cifar10_train.data[id][1] for id in range(len(cifar10_train.data))]

class Dataset(paddle.io.Dataset):
    def __init__(self, num_samples):
        super(Dataset, self).__init__()
        self.num_samples = num_samples

    def __getitem__(self, index):
        data = train_dataset[index]
        label = train_labels[index]
        return paddle.to_tensor(data, dtype='float32'), paddle.to_tensor(label, dtype='int64')

    def __len__(self):
        return self.num_samples

_dataset = Dataset(len(cifar10_train.data))
train_loader = paddle.io.DataLoader(_dataset, batch_size=100, shuffle=True)
```
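Before training, it is worth pulling one batch from the loader to confirm the shapes the network expects; this sanity check is our addition, not part of the original article:

```python
# Hypothetical sanity check: confirm one batch has the expected shapes.
for images, labels in train_loader():
    print(images.shape)   # expected: [100, 3, 32, 32]
    print(labels.shape)   # expected: [100], one int64 label per image
    break
```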
1.4.2 Build the network
```python
class alexnet(paddle.nn.Layer):
    def __init__(self):
        super(alexnet, self).__init__()
        self.conv1 = paddle.nn.Conv2D(in_channels=3, out_channels=96, kernel_size=7, stride=2, padding=2)
        self.conv2 = paddle.nn.Conv2D(in_channels=96, out_channels=256, kernel_size=5, stride=1, padding=2)
        self.conv3 = paddle.nn.Conv2D(in_channels=256, out_channels=384, kernel_size=3, stride=1, padding=1)
        self.conv4 = paddle.nn.Conv2D(in_channels=384, out_channels=384, kernel_size=3, stride=1, padding=1)
        self.conv5 = paddle.nn.Conv2D(in_channels=384, out_channels=256, kernel_size=3, stride=1, padding=1)
        self.mp1 = paddle.nn.MaxPool2D(kernel_size=3, stride=2)
        self.mp2 = paddle.nn.MaxPool2D(kernel_size=3, stride=2)
        self.L1 = paddle.nn.Linear(in_features=256*3*3, out_features=1024)
        self.L2 = paddle.nn.Linear(in_features=1024, out_features=512)
        self.L3 = paddle.nn.Linear(in_features=512, out_features=10)

    def forward(self, x):
        x = self.conv1(x)
        x = paddle.nn.functional.relu(x)
        x = self.mp1(x)
        x = self.conv2(x)
        x = paddle.nn.functional.relu(x)
        x = self.mp2(x)
        x = self.conv3(x)
        x = paddle.nn.functional.relu(x)
        x = self.conv4(x)
        x = paddle.nn.functional.relu(x)
        x = self.conv5(x)
        x = paddle.nn.functional.relu(x)
        x = paddle.flatten(x, start_axis=1, stop_axis=-1)
        x = self.L1(x)
        x = paddle.nn.functional.relu(x)
        x = self.L2(x)
        x = paddle.nn.functional.relu(x)
        x = self.L3(x)
        return x

model = alexnet()
```
1.4.3 Train the network
```python
test_dataset = [cifar10_test.data[id][0].reshape(3,32,32) for id in range(len(cifar10_test.data))]
test_label = [cifar10_test.data[id][1] for id in range(len(cifar10_test.data))]
test_input = paddle.to_tensor(test_dataset, dtype='float32')
test_l = paddle.to_tensor(array(test_label)[:, newaxis])

optimizer = paddle.optimizer.Adam(learning_rate=0.001, parameters=model.parameters())

def train(model):
    model.train()
    epochs = 2
    accdim = []
    lossdim = []
    testaccdim = []

    for epoch in range(epochs):
        for batch, data in enumerate(train_loader()):
            out = model(data[0])
            loss = paddle.nn.functional.cross_entropy(out, data[1])
            acc = paddle.metric.accuracy(out, data[1])
            loss.backward()
            optimizer.step()
            optimizer.clear_grad()

            accdim.append(acc.numpy())
            lossdim.append(loss.numpy())

            # Evaluate on the test set after every batch
            predict = model(test_input)
            testacc = paddle.metric.accuracy(predict, test_l)
            testaccdim.append(testacc.numpy())

            if batch % 10 == 0 and batch > 0:
                print('Epoch:{}, Batch:{}, Loss:{}, Accuracy:{}, Test:{}'.format(
                    epoch, batch, loss.numpy(), acc.numpy(), testacc.numpy()))

    plt.figure(figsize=(10, 6))
    plt.plot(accdim, label='Accuracy')
    plt.plot(testaccdim, label='Test')
    plt.xlabel('Step')
    plt.ylabel('Acc')
    plt.grid(True)
    plt.legend(loc='upper left')
    plt.tight_layout()

train(model)
```
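Note that the loop above runs the entire 10,000-image test set through the network after every training batch, which is slow and memory-hungry. A chunked evaluation (our sketch, not from the original article; the function test_accuracy is hypothetical) trades that single large forward pass for several smaller ones:

```python
# Hypothetical sketch: evaluate test accuracy in chunks of 1000 images
# instead of one 10,000-image forward pass.
def test_accuracy(model, chunk=1000):
    accs = []
    for i in range(0, test_input.shape[0], chunk):
        pred = model(test_input[i:i + chunk])
        accs.append(paddle.metric.accuracy(pred, test_l[i:i + chunk]).numpy())
    return float(mean(accs))   # mean comes from "from numpy import *" above
```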
1.4.4 Training results
Training parameters: BatchSize: 100, LearningRate: 0.001
If the BatchSize is too small, the training speed slows down.
▲ Figure 1.4.2 Changes in training accuracy and test accuracy

LearningRate: 0.0005

▲ Figure 1.4.3 Changes in training accuracy and test accuracy
※ General conclusion ※
We built the AlexNet network with the Paddle framework and tested its classification performance on Cifar10 with the Supreme (GPU) version of AI Studio. With basic training, the classification accuracy on the test set does not exceed 60%, which is still some distance from the 80% reported in some articles.
■ Links to relevant literature:
- Requirements for the fourth assignment of artificial neural network in 2021
- The fourth assignment of artificial neural network in 2021 - question 3 Cifar10
- Implementation and comparison of deep learning recognition CIFAR10: pytorch training LeNet, AlexNet and VGG19 (II)
- AlexNet
- The CIFAR-10 dataset