Deep learning based on Keras -- the construction and training of LeNet

Posted by tracivia on Sat, 19 Feb 2022 16:48:25 +0100

Deep learning based on Keras (II) -- construction and training of LeNet

LeNet is a highly effective convolutional neural network for handwritten character recognition. Although the network is small, it contains the basic building blocks of deep learning: convolution layers, pooling layers and fully connected layers, and it is the foundation of many other deep learning models.
For an introduction to LeNet, please refer to the link: LeNet
While studying LeNet, let's also learn how to use Keras to build a simple DCNN.

Introduction to Keras functions

1. Convolution

For an understanding of convolution, please refer to the link: Convolution layer
This section mainly introduces how Keras implements convolution and the parameters that deserve attention.

keras.layers.Conv2D(filters, kernel_size, strides=(1, 1), padding='valid', data_format=None, dilation_rate=(1, 1), activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)

2D convolution layer (e.g. spatial convolution over images).

This layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs. If use_bias is True, a bias vector is created and added to the outputs. Finally, if activation is not None, it is applied to the outputs as well.

When using this layer as the first layer of a model, you need to provide the input_shape argument (an integer tuple, not including the sample axis); for example, input_shape=(128, 128, 3) for 128x128 RGB images with data_format="channels_last".
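As a minimal sketch (not from the original post), here is a Conv2D used as the first layer of a model, with input_shape given for 128x128 RGB images:

from keras.models import Sequential
from keras.layers import Conv2D

# A single 3x3 convolution with 32 filters as the first layer;
# note that input_shape omits the batch axis
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), padding='same', activation='relu',
                 input_shape=(128, 128, 3)))
print(model.output_shape)  # (None, 128, 128, 32): 'same' padding and stride 1 preserve rows/cols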

Parameters

filters: integer, the dimensionality of the output space (i.e. the number of output filters in the convolution).

kernel_size: an integer, or a tuple/list of 2 integers, specifying the width and height of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions.

strides: an integer, or a tuple/list of 2 integers, specifying the strides of the convolution along the width and height. Can be a single integer to specify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible with specifying any dilation_rate value != 1.

padding: "valid" or "same" (case sensitive).

data_format: string, one of channels_last (default) or channels_first, the ordering of the dimensions in the inputs.
channels_last corresponds to inputs with shape (batch, height, width, channels), while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, it will be channels_last.

dilation_rate: an integer, or a tuple/list of 2 integers, specifying the dilation rate for dilated convolution. Can be a single integer to specify the same value for all spatial dimensions. Currently, specifying any dilation_rate value != 1 is incompatible with specifying any strides value != 1.

activation: the activation function to use (see activations for details). If you don't specify anything, no activation is applied (i.e. "linear" activation: a(x) = x).

use_bias: Boolean, whether the layer uses a bias vector.

kernel_initializer: initializer for the kernel weights matrix (see initializers for details).

bias_initializer: initializer for the bias vector (see initializers for details).

kernel_regularizer: regularizer function applied to the kernel weights matrix (see regularizer for details).

bias_regularizer: regularizer function applied to the bias vector (see regularizer for details).

activity_regularizer: regularizer function applied to the output of the layer (its "activation") (see regularizer for details).

kernel_constraint: constraint function applied to the kernel weights matrix (see constraints for details).

bias_constraint: constraint function applied to the bias vector (see constraints for details).
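To illustrate how several of these arguments combine, here is an illustrative sketch (the specific argument values are arbitrary, not from the original post):

from keras import regularizers
from keras.layers import Conv2D

# 16 dilated 3x3 filters with He initialization and L2 weight decay;
# strides stays at (1, 1) because strides != 1 and dilation_rate != 1
# are incompatible
conv = Conv2D(16, kernel_size=(3, 3), strides=(1, 1), padding='same',
              dilation_rate=(2, 2), activation='relu', use_bias=True,
              kernel_initializer='he_normal',
              kernel_regularizer=regularizers.l2(1e-4))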

Input shape

If data_format='channels_first': 4D tensor with shape (samples, channels, rows, cols).
If data_format='channels_last': 4D tensor with shape (samples, rows, cols, channels).

Output shape

If data_format='channels_first': 4D tensor with shape (samples, filters, new_rows, new_cols).
If data_format='channels_last': 4D tensor with shape (samples, new_rows, new_cols, filters).
The rows and cols values may have changed due to padding.
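A quick way to see the effect of padding on new_rows and new_cols (an illustrative check, not part of the original post):

from keras.models import Sequential
from keras.layers import Conv2D

# 'valid' padding shrinks a 28x28 input to 24x24 with a 5x5 kernel (28 - 5 + 1);
# 'same' padding preserves the spatial dimensions
m_valid = Sequential([Conv2D(8, 5, padding='valid', input_shape=(28, 28, 1))])
m_same = Sequential([Conv2D(8, 5, padding='same', input_shape=(28, 28, 1))])
print(m_valid.output_shape)  # (None, 24, 24, 8)
print(m_same.output_shape)   # (None, 28, 28, 8)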

2. Activation functions

An activation function can be applied through a separate Activation layer, or by passing the activation argument when constructing the layer:

from keras.layers import Activation, Dense

model.add(Dense(64))
model.add(Activation('tanh'))

Equivalent to:

model.add(Dense(64, activation='tanh'))

You can also pass an element-wise Theano/TensorFlow/CNTK function as the activation function:

from keras import backend as K

model.add(Dense(64, activation=K.tanh))
model.add(Activation(K.tanh))

3. Pooling

For an understanding of the pooling layer, please refer to the link: Pooling

Next, let's look at how pooling is implemented in Keras, mainly MaxPooling2D, which is used for image data:

MaxPooling2D

keras.layers.MaxPooling2D(pool_size=(2, 2), strides=None, padding='valid', data_format=None)

Parameters

pool_size: integer, or tuple of 2 integers, the factors by which to downscale (vertical, horizontal). (2, 2) will halve both spatial dimensions of the input. If only one integer is specified, the same window length is used for both dimensions.

strides: integer, tuple of 2 integers, or None. Stride values. If None, it defaults to pool_size.

padding: "valid" or "same" (case sensitive).

data_format: string, one of channels_last (default) or channels_first, the ordering of the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height, width, channels), while channels_first corresponds to inputs with shape (batch, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, it will be channels_last.

Input shape

If data_format='channels_last': 4D tensor with shape (batch_size, rows, cols, channels)
If data_format='channels_first': 4D tensor with shape (batch_size, channels, rows, cols)

Output shape

If data_format='channels_last': 4D tensor with shape (batch_size, pooled_rows, pooled_cols, channels)
If data_format='channels_first': 4D tensor with shape (batch_size, channels, pooled_rows, pooled_cols)
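For instance, a small shape check (an illustrative sketch, assuming channels_last):

from keras.models import Sequential
from keras.layers import MaxPooling2D

# pool_size=(2, 2) halves rows and cols; channels are untouched
m = Sequential([MaxPooling2D(pool_size=(2, 2), input_shape=(28, 28, 16))])
print(m.output_shape)  # (None, 14, 14, 16)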

4. Flatten

The Flatten layer is implemented in the keras.layers.core.Flatten() class.

Purpose:

The Flatten layer is used to "flatten" the input, i.e. to turn a multi-dimensional input into a one-dimensional one. It is commonly used in the transition from the convolution layers to the fully connected layers. Flatten does not affect the batch size.

keras.layers.Flatten(data_format=None)

Parameters

data_format: a string, one of channels_last (default) or channels_first, the ordering of the dimensions in the inputs. The purpose of this argument is to preserve weight ordering when switching a model from one data format to another. channels_last corresponds to inputs with shape (batch, ..., channels), while channels_first corresponds to inputs with shape (batch, channels, ...). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, it will be channels_last.

For example:

model.add(Flatten())
model.add(Dense(500))
model.add(Activation('relu'))
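To see what Flatten does to the shape, here is an illustrative check (the layer sizes are arbitrary):

from keras.models import Sequential
from keras.layers import Conv2D, Flatten

# A 3x3 'valid' convolution maps (32, 32, 3) to (30, 30, 64);
# Flatten turns that into a vector of 30 * 30 * 64 = 57600,
# leaving the batch axis untouched
m = Sequential([Conv2D(64, 3, input_shape=(32, 32, 3)), Flatten()])
print(m.output_shape)  # (None, 57600)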

5. Fully connected layer

Each node of a fully connected layer is connected to all nodes of the previous layer, synthesizing the features extracted by the earlier layers. Because of this dense connectivity, the fully connected layers typically hold the most parameters.
Fully connected layers (FC) act as the "classifier" of a convolutional neural network: while the convolution layers, pooling layers and activation functions map the raw data into a hidden feature space, the fully connected layers map the learned "distributed feature representation" into the sample label space.

keras.layers.Dense(units, activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)

Dense is the fully connected layer you will use most often.
Example

# As the first layer of a Sequential model
model = Sequential()
model.add(Dense(32, input_shape=(16,)))
# Now the model takes arrays of shape (*, 16) as input
# and outputs arrays of shape (*, 32)

# After the first layer, you no longer need to specify the input size:
model.add(Dense(32))

Parameters

units: positive integer, dimensionality of the output space.
activation: activation function to use (see activations for details). If nothing is specified, no activation is applied (i.e. "linear" activation: a(x) = x).
use_bias: Boolean, whether the layer uses a bias vector.
kernel_initializer: initializer for the kernel weights matrix (see initializers for details).
bias_initializer: initializer for the bias vector (see initializers for details).
kernel_regularizer: regularizer function applied to the kernel weights matrix (see regularizer for details).
bias_regularizer: regularizer function applied to the bias vector (see regularizer for details).
activity_regularizer: regularizer function applied to the output of the layer (its "activation") (see regularizer for details).
kernel_constraint: constraint function applied to the kernel weights matrix (see constraints for details).
bias_constraint: constraint function applied to the bias vector (see constraints for details).

Input shape

nD tensor with shape (batch_size, ..., input_dim). The most common case is a 2D input with shape (batch_size, input_dim).

Output shape

nD tensor with shape (batch_size, ..., units). For example, for a 2D input with shape (batch_size, input_dim), the output has shape (batch_size, units).

Building LeNet

1. Model construction

# model
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Activation, Flatten, Dense
from keras.optimizers import Adam
from keras.utils import plot_model

NB_classes = 10
optimizer = Adam()
INPUT_SHAPE = (1, 28, 28)
# channels_first expects inputs shaped (channels, rows, cols), e.g. (3, 256, 256)
# channels_last expects inputs shaped (rows, cols, channels), e.g. (256, 256, 3)
model = Sequential()
# CONV -> RELU -> POOL
model.add(Conv2D(20, kernel_size=5, input_shape=INPUT_SHAPE, padding='same', data_format='channels_first'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), data_format='channels_first'))
# CONV -> RELU -> POOL
model.add(Conv2D(50, kernel_size=5, padding='same', data_format='channels_first'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), data_format='channels_first'))

# Flatten -> Dense -> RELU
# The Flatten layer "flattens" the multi-dimensional output of the convolution
# layers into one dimension, as a transition to the fully connected layers
model.add(Flatten())
model.add(Dense(500))
model.add(Activation('relu'))

# Dense -> softmax
model.add(Dense(NB_classes))
model.add(Activation('softmax'))
model.summary()
# Choose an appropriate loss function, optimizer and evaluation metric
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
plot_model(model, to_file='./2-Lenet-model_1.png', show_shapes=True)

After building the above model, we obtain the following model structure.
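The saved diagram is not reproduced here, but the layer shapes follow from the configuration above (channels_first, 'same' padding, 2x2 pooling); model.summary() should report something close to the following (layer names vary by Keras version; the zero-parameter Activation layers are omitted):

Conv2D, 20 filters of 5x5      (None, 20, 28, 28)    520 params
MaxPooling2D, 2x2              (None, 20, 14, 14)    0
Conv2D, 50 filters of 5x5      (None, 50, 14, 14)    25,050 params
MaxPooling2D, 2x2              (None, 50, 7, 7)      0
Flatten                        (None, 2450)          0
Dense, 500 units               (None, 500)           1,225,500 params
Dense, 10 units                (None, 10)            5,010 params

Total trainable parameters: 1,256,080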

2. Data acquisition and preprocessing

# data
import numpy as np
from keras.datasets import mnist
from keras.utils import np_utils

(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
# we need a 60K x [1 x 28 x 28] shape as input to the CONVNET
X_train = X_train[:, np.newaxis, :, :]
X_test = X_test[:, np.newaxis, :, :]
# normalize to [0, 1]
X_train /= 255
X_test /= 255
# one-hot encode the labels
y_train = np_utils.to_categorical(y_train, NB_classes)
y_test = np_utils.to_categorical(y_test, NB_classes)
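A quick sanity check on the resulting shapes (illustrative; the expected values follow directly from the reshaping above):

print(X_train.shape, y_train.shape)  # (60000, 1, 28, 28) (60000, 10)
print(X_test.shape, y_test.shape)    # (10000, 1, 28, 28) (10000, 10)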

3. Training

Use model.fit to train the model.

# train
batch_size = 128
NB_epoch = 20
history = model.fit(X_train, y_train, batch_size=batch_size, epochs=NB_epoch, validation_split=0.2, verbose=1)
score = model.evaluate(X_test, y_test, batch_size=batch_size, verbose=1)
print('loss:', score[0], '    accuracy:', score[1])
print('history:', history.history.keys())

4. Model evaluation

Use matplotlib to plot the loss and accuracy during training. (Note: recent Keras versions record the metrics under the keys accuracy/val_accuracy, as used below; older versions use acc/val_acc instead.)

import matplotlib.pyplot as plt

# summarize history for accuracy
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.savefig('lenet_acc.png')
plt.show()
# summarize history for loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'validation'], loc='upper left')
plt.savefig('lenet_loss.png')
plt.show()

With this, we have completed the construction and training of LeNet. Along the way, we took a closer look at convolution layers, activation functions, pooling layers, fully connected layers, and model training and evaluation. Now go and build your own.

Topics: AI neural networks Deep Learning convolution