Elementary neural network
The problem we want to solve here is to classify grayscale images of handwritten digits (28 pixels × 28 pixels) into 10 categories (0 ~ 9). We will use the MNIST dataset, which contains 60000 training images and 10000 test images.
Step 1: prepare data
1. The MNIST dataset contains 60000 training samples and 10000 test samples. Each sample consists of a picture and a label: the picture is a 28 * 28 pixel matrix, and the label is one of the 10 digits 0 ~ 9.
2. Define train_reader and test_reader for reading the MNIST dataset, with a batch size of 128, that is, 128 images are trained or verified at a time.
3. The paddle.dataset.mnist.train() and test() interfaces have already carried out grayscale processing, normalization, centering and other processing for us.
# Import required packages
import numpy as np
import paddle
import paddle.fluid as fluid
from PIL import Image  # Image processing library, provides the Image class
import matplotlib.pyplot as plt
import os

# train_reader is the data provider for training
# paddle.batch() groups every batch_size samples into one batch
train_reader = paddle.batch(paddle.reader.shuffle(paddle.dataset.mnist.train(), buf_size=512), batch_size=128)
# test_reader is the data provider for testing
test_reader = paddle.batch(paddle.dataset.mnist.test(), batch_size=128)
Supplementary notes:
- NumPy: an open source numerical computing extension package for Python. The as keyword gives the imported module a shorter alias for convenience.
- paddle is a deep learning framework; within it, fluid.data creates a data variable.
import paddle.fluid as fluid

# Define a two-dimensional data variable x with data type int64. The first dimension of x is 3,
# and the second dimension is unknown and is determined during program execution,
# so the shape of x can be specified as [3, None]
x = fluid.data(name="x", shape=[3, None], dtype="int64")

# Most networks organize data in batches. The batch size is not known at definition time,
# so the batch dimension (usually the first dimension) can be specified as None
batched_x = fluid.data(name="batched_x", shape=[None, 3, None], dtype='int64')
Use fluid.layers.fill_constant to create a constant:
import paddle.fluid as fluid
data = fluid.layers.fill_constant(shape=[3, 4], value=16, dtype='int64')
- matplotlib.pyplot (imported as plt) is Matplotlib's plotting module, used here to display images.
- PaddlePaddle provides interfaces for reading the MNIST training set and test set: paddle.dataset.mnist.train() and paddle.dataset.mnist.test(). paddle.reader.shuffle() caches buf_size data items at a time and shuffles them.
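To make the reader pipeline more concrete, here is a minimal pure-Python sketch of what a buffered shuffle and a batching step do. The names shuffle_reader, batch_reader and toy_reader are hypothetical stand-ins written for illustration, not Paddle's implementation; they only mirror the behaviour described above (cache buf_size items, shuffle them, then group items into batches).

```python
import random

def shuffle_reader(reader, buf_size, seed=None):
    """Hypothetical stand-in for a buffered shuffle: fill a buffer of
    buf_size items, shuffle it, emit its items, and repeat."""
    rng = random.Random(seed)

    def new_reader():
        buf = []
        for item in reader():
            buf.append(item)
            if len(buf) >= buf_size:
                rng.shuffle(buf)
                yield from buf
                buf = []
        if buf:  # leftover items that did not fill a whole buffer
            rng.shuffle(buf)
            yield from buf

    return new_reader

def batch_reader(reader, batch_size):
    """Hypothetical stand-in for batching: group items into lists."""
    def new_reader():
        batch = []
        for item in reader():
            batch.append(item)
            if len(batch) == batch_size:
                yield batch
                batch = []
        if batch:  # the last batch may be smaller than batch_size
            yield batch

    return new_reader

def toy_reader():
    # Ten fake samples standing in for (image, label) pairs
    for i in range(10):
        yield i

reader = batch_reader(shuffle_reader(toy_reader, buf_size=4, seed=0), batch_size=3)
batches = list(reader())
print(batches)
```

Note that in this sketch items are only shuffled within each buffer window, so a small buf_size gives a weaker shuffle than a large one.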
Print one sample to look at the MNIST data:
temp_reader = paddle.batch(paddle.dataset.mnist.train(), batch_size=1)
# Each 28 * 28 handwritten digit image is stored in vector form, as a vector of length 784
temp_data = next(temp_reader())
print(temp_data)
Step 2: configure network
The following code defines a simple multilayer perceptron with three layers in total: two hidden layers of size 100 and an output layer of size 10. Because the MNIST dataset consists of handwritten grayscale images of the digits 0 to 9, there are 10 categories, so the final output size is 10. The activation function of the output layer is Softmax, so the output layer acts as a classifier. Counting the input layer as well, the structure of the multilayer perceptron is: input layer --> hidden layer --> hidden layer --> output layer.
# Define multilayer perceptron
def multilayer_perceptron(input):
    # First fully connected layer, with ReLU activation
    hidden1 = fluid.layers.fc(input=input, size=100, act='relu')
    # Second fully connected layer, with ReLU activation
    hidden2 = fluid.layers.fc(input=hidden1, size=100, act='relu')
    # Fully connected output layer of size 10, with softmax activation
    prediction = fluid.layers.fc(input=hidden2, size=10, act='softmax')
    return prediction
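To show what this network computes, here is a minimal NumPy sketch of the same forward pass (784 inputs, two ReLU hidden layers of 100 units, a softmax output of 10 units). It is an illustration only: the weights are random placeholders rather than trained parameters, and the helper names relu, softmax and forward are assumptions, not Paddle APIs.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    # Subtract the row-wise max for numerical stability
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
sizes = [784, 100, 100, 10]  # input, hidden1, hidden2, output
# Random placeholder weights; a trained model would hold learned values
weights = [rng.normal(0.0, 0.05, size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    h = x.reshape(x.shape[0], -1)               # flatten [N, 1, 28, 28] to [N, 784]
    h = relu(h @ weights[0] + biases[0])        # hidden layer 1
    h = relu(h @ weights[1] + biases[1])        # hidden layer 2
    return softmax(h @ weights[2] + biases[2])  # 10 class probabilities per image

images = rng.uniform(-1.0, 1.0, size=(4, 1, 28, 28))  # 4 fake images
probs = forward(images)
print(probs.shape)        # (4, 10)
print(probs.sum(axis=1))  # each row sums to 1 up to rounding
```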
Define the input layer; the input is image data. The image is a 28 * 28 grayscale image, so the input shape is [1, 28, 28]. If the image were a 32 * 32 color image, the input shape would be [3, 32, 32], because a grayscale image has only one channel, while a color image has three RGB channels.
# Define input / output layer
image = fluid.layers.data(name='image', shape=[1, 28, 28], dtype='float32')  # Single channel, 28 * 28 pixel values
label = fluid.layers.data(name='label', shape=[1], dtype='int64')  # Picture label
Supplementary notes:
1. About paddle.fluid.data
paddle.fluid.data() is an OP (operator) that creates a global variable which can be accessed by operators in the computation graph and serves as a placeholder for data input.
name is the name of the global variable created by paddle.fluid.data(); it is the prefix identifier of the input layer's output.
shape declares the dimension information of the global variable created by paddle.fluid.data().
None in the shape marks a dimension whose number of elements is not yet known; it is determined during program execution.
-1 in the shape may only appear in the first position and means the variable adapts to any batch size.
dtype is the data type of the global variable created by paddle.fluid.data(); the supported types are {bool, float16, float32, float64, int8, int16, int32, int64}.
The data fed by the user must have the same shape as the variable created by paddle.fluid.data(). Although the fed data type is unsigned byte, softmax regression needs floating-point operations, so the data type is converted to float32.
2. About paddle.fluid.layers.fc
paddle.fluid.layers.fc() is an OP that builds a fully connected layer. It creates a weight variable for each input Tensor, that is, a fully connected weight matrix from each input unit to each output unit.
The FC layer multiplies each input Tensor by its corresponding weights to obtain an output Tensor of shape [M, size], where M is batch_size. If there are multiple input Tensors, the resulting Tensors of shape [M, size] are summed to form the final output.
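The shape arithmetic described above can be checked with a small NumPy sketch. This is an illustration under stated assumptions, not Paddle code: the feature widths 4 and 6 are arbitrary, the weights are random, and the bias and activation are left out to focus on the multiply-and-sum step.

```python
import numpy as np

rng = np.random.default_rng(1)
M, size = 5, 3                   # batch size and output width
x1 = rng.normal(size=(M, 4))     # first input tensor with 4 features
x2 = rng.normal(size=(M, 6))     # second input tensor with 6 features
w1 = rng.normal(size=(4, size))  # one weight matrix per input tensor
w2 = rng.normal(size=(6, size))

# Each product has shape [M, size]; the products are summed
out = x1 @ w1 + x2 @ w2
print(out.shape)  # (5, 3)

# Equivalent view: concatenate the inputs and stack the weight matrices
stacked = np.concatenate([x1, x2], axis=1) @ np.concatenate([w1, w2], axis=0)
print(np.allclose(out, stacked))  # True
```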
Here, call the defined network to obtain the classifier:
# Get classifier
model = multilayer_perceptron(image)
Next, define the loss function. This time we use the cross entropy loss function, which is commonly used in classification tasks. The loss function returns one loss value per sample in the batch, so we also take its mean to get the loss of the batch. At the same time, we define an accuracy function, which outputs the classification accuracy during training.
# Get loss function and accuracy function
# The cross entropy loss describes the difference between the true sample label and the predicted probability
cost = fluid.layers.cross_entropy(input=model, label=label)  # Loss of each sample
avg_cost = fluid.layers.mean(cost)  # Average loss of a batch
acc = fluid.layers.accuracy(input=model, label=label)
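As a rough illustration of what these three quantities mean, the following NumPy sketch computes per-sample cross entropy, its batch mean, and the accuracy for a tiny hand-made batch. The probabilities and labels are made up for the example, and only 4 classes are used to keep it readable.

```python
import numpy as np

# Predicted probabilities for 3 samples over 4 classes (each row sums to 1)
probs = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.2, 0.3, 0.4],
])
labels = np.array([0, 1, 2])  # true class of each sample

# Per-sample cross entropy: -log of the probability assigned to the true class
cost = -np.log(probs[np.arange(len(labels)), labels])
avg_cost = cost.mean()                         # mean loss over the batch
acc = (probs.argmax(axis=1) == labels).mean()  # fraction predicted correctly

print(cost)
print(avg_cost, acc)
```

Here the third sample is misclassified (class 3 has the highest probability but the label is 2), so the accuracy is 2/3 and that sample contributes the largest loss.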
Next, we define the optimization method. This time, we use the Adam optimization method, and specify the learning rate as 0.001.
# Define optimization method
optimizer = fluid.optimizer.AdamOptimizer(learning_rate=0.001)  # Optimize with the Adam algorithm
opts = optimizer.minimize(avg_cost)
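For readers who want to see what an Adam step looks like, here is a minimal sketch of the standard Adam update rule applied to a one-dimensional toy problem. The function adam_minimize is a hypothetical helper, the values beta1=0.9, beta2=0.999 and eps=1e-8 are the commonly used defaults and are assumed here, and the learning rate 0.1 is chosen for the toy problem rather than taken from the tutorial.

```python
import math

def adam_minimize(grad_fn, theta, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Hypothetical helper: run `steps` Adam updates on a scalar parameter."""
    m = 0.0  # running mean of gradients (first moment)
    v = 0.0  # running mean of squared gradients (second moment)
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)  # bias correction for the zero initialization
        v_hat = v / (1.0 - beta2 ** t)
        theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta

def grad(theta):
    # Gradient of the toy loss f(theta) = (theta - 3)^2
    return 2.0 * (theta - 3.0)

one_step = adam_minimize(grad, theta=0.0, lr=0.1, steps=1)
theta = adam_minimize(grad, theta=0.0, lr=0.1, steps=500)
print(one_step, theta)
```

The first step moves the parameter by almost exactly the learning rate, because the bias-corrected ratio m_hat / sqrt(v_hat) equals the sign of the gradient at t = 1.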
Step 3: model training & Step 4: model evaluation
Then define an executor and initialize the parameters:
# Define an executor that uses the CPU
place = fluid.CPUPlace()
exe = fluid.Executor(place)
# Parameter initialization
exe.run(fluid.default_startup_program())
Supplementary notes:
- When CPUPlace() is used, the CPU is used. If CUDAPlace() is used, the GPU is used.
- Only the executor can run a program. There are two programs by default: default_startup_program() and default_main_program().
- default_startup_program() defines operations such as model parameter initialization, optimizer parameter initialization and reader initialization.
- default_main_program() defines operations such as the neural network model, forward and backward computation, model parameter updates and optimizer parameter updates.
The input data consists of the image data and the label corresponding to each image. Each category of image must correspond to a label, which is an integer value counting up from 0.
# Define input data dimensions
# DataFeeder converts the data format:
# the place parameter says whether data such as numpy arrays passed in from Python
# should be converted to a LoDTensor on the GPU or the CPU;
# the feed_list parameter is the list of variables to feed
feeder = fluid.DataFeeder(place=place, feed_list=[image, label])
Finally, we can start training. We train for 5 passes this time. Since we defined an accuracy function above, we print the current accuracy during training. The accuracy is computed simply by comparing the predicted results with the true labels. After each training pass, we run a test on the test set and compute the average cost and accuracy.
# Start training and testing
for pass_id in range(5):  # pass_id runs from 0 to 4
    # Training
    for batch_id, data in enumerate(train_reader()):  # Iterate over train_reader
        train_cost, train_acc = exe.run(program=fluid.default_main_program(),  # Run the main program
                                        feed=feeder.feed(data),                # Feed data to the model
                                        fetch_list=[avg_cost, acc])            # Variables to fetch: loss and accuracy
        # Print the loss and accuracy every 100 batches
        if batch_id % 100 == 0:
            print('Pass:%d, Batch:%d, Cost:%0.5f, Accuracy:%0.5f' %
                  (pass_id, batch_id, train_cost[0], train_acc[0]))

    # Testing: one evaluation per training pass
    test_accs = []
    test_costs = []
    for batch_id, data in enumerate(test_reader()):  # Iterate over test_reader
        # Caution: this runs the main program, which includes the optimizer ops,
        # so the parameters are also updated on test batches
        test_cost, test_acc = exe.run(program=fluid.default_main_program(),
                                      feed=feeder.feed(data),      # Feed data
                                      fetch_list=[avg_cost, acc])  # Fetch loss and accuracy
        test_accs.append(test_acc[0])    # Record the accuracy of each batch
        test_costs.append(test_cost[0])  # Record the loss of each batch
    # Average the test results
    test_cost = (sum(test_costs) / len(test_costs))  # Average loss for this pass
    test_acc = (sum(test_accs) / len(test_accs))     # Average accuracy for this pass
    print('Test:%d, Cost:%0.5f, Accuracy:%0.5f' % (pass_id, test_cost, test_acc))

# Save the model
model_save_dir = "/home/aistudio/data/hand.inference.model"
# Create the save path if it does not exist
if not os.path.exists(model_save_dir):
    os.makedirs(model_save_dir)
print('save models to %s' % (model_save_dir))
fluid.io.save_inference_model(model_save_dir,  # Path where the inference model is saved
                              ['image'],       # Names of the variables that inference needs to be fed
                              [model],         # Variables that hold the inference results
                              exe)             # Executor used to save the inference model
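One detail of averaging per-batch results is worth a small sketch. With 10000 test images and a batch size of 128, the last batch is smaller than the others (assuming the partial batch is kept), so a plain mean over batches weights its samples more heavily than a mean over samples would. The helper batch_sizes and the per-batch accuracies below are hypothetical and chosen only to make the difference visible.

```python
def batch_sizes(num_samples, batch_size):
    """Sizes of the batches produced when the last partial batch is kept."""
    full, rest = divmod(num_samples, batch_size)
    return [batch_size] * full + ([rest] if rest else [])

sizes = batch_sizes(10000, 128)
print(len(sizes), sizes[-1])  # 79 batches, the last one holds 16 images

# Hypothetical per-batch accuracies: perfect except for the small last batch
accs = [1.0] * (len(sizes) - 1) + [0.5]

plain_mean = sum(accs) / len(accs)  # what averaging per-batch values computes
weighted_mean = sum(a * n for a, n in zip(accs, sizes)) / sum(sizes)  # per-sample accuracy
print(plain_mean, weighted_mean)
```

In practice the gap is small for MNIST, but weighting by batch size gives the exact per-sample figure.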
Supplementary notes:
- In for i, b in enumerate(a), both i and b are assigned on every iteration: i is the index of the current element, and b is the current element itself.
- paddle.fluid.io.save_inference_model(dirname, feeded_var_names, target_vars, executor, main_program=None, model_filename=None, params_filename=None, export_for_deployment=True, program_only=False)
- append adds a new object to the end of the list, modifying the original list in place.
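The two Python idioms from these notes can be tried on their own with a few made-up loss values; nothing here depends on Paddle.

```python
costs = []
for batch_id, value in enumerate([0.9, 0.5, 0.3]):
    # batch_id is the index (0, 1, 2); value is the element at that index
    costs.append(value)  # append changes costs in place and returns None
    if batch_id % 2 == 0:
        print('Batch:%d, Cost:%0.5f' % (batch_id, value))

average = sum(costs) / len(costs)
print(costs, average)
```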
Step 5: model prediction
Before prediction, the image must be preprocessed in the same way as during training: first convert it to grayscale, then resize it to 28 * 28, then convert it into a float32 array of shape [1, 1, 28, 28], and finally normalize the pixel values.
# Preprocess pictures
def load_image(file):
    im = Image.open(file).convert('L')  # Convert RGB to grayscale; 'L' means grayscale, pixel values 0 ~ 255
    im = im.resize((28, 28), Image.ANTIALIAS)  # High-quality resize to 28 * 28
    im = np.array(im).reshape(1, 1, 28, 28).astype(np.float32)  # Convert to a numpy array shaped to match the data feed format
    # print(im)
    im = im / 255.0 * 2.0 - 1.0  # Normalize to [-1, 1]
    print(im)
    return im

# Use Matplotlib to display the image
img = Image.open('data/data27012/6.png')
plt.imshow(img)  # Draw the image
plt.show()       # Display the image
Supplementary notes:
- The img returned by Image.open() is an Image object.
- img.resize((width, height), Image.ANTIALIAS)
The first parameter, (width, height), sets the width and height of the resized picture.
The second parameter selects the resampling filter:
Image.NEAREST: low quality
Image.BILINEAR: bilinear
Image.BICUBIC: cubic spline interpolation
Image.ANTIALIAS: high quality (removed in Pillow 10, where Image.LANCZOS is the equivalent name)
- The astype function converts the numeric type of an array.
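The reshape and normalization steps of load_image can be checked without an image file. The sketch below uses a synthetic 28 x 28 array of pixel values as a stand-in for the resized grayscale image; the array contents are made up for the example.

```python
import numpy as np

# Synthetic 28 x 28 grayscale image with values 0..255 (stand-in for a resized PIL image)
pixels = (np.arange(28 * 28) % 256).astype(np.uint8).reshape(28, 28)

im = pixels.reshape(1, 1, 28, 28).astype(np.float32)  # [batch, channel, height, width]
im = im / 255.0 * 2.0 - 1.0                           # map 0..255 linearly to -1..1

print(im.shape, im.dtype)
print(im.min(), im.max())
```

A pixel value of 0 maps to -1 and 255 maps to 1, which matches the range produced by the MNIST reader used for training.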
# Create an executor for prediction
infer_exe = fluid.Executor(place)
inference_scope = fluid.core.Scope()  # Create a new scope
Finally, the preprocessed image is passed to the model through the feed parameter and predicted. The value of fetch_list is the final classifier layer of the network, so the output is the probability of each of the 10 labels, and these probabilities sum to 1.
Start prediction
With fluid.io.load_inference_model, the predictor reads the trained model from params_dirname (model_save_dir) in order to predict on data it has never seen.
# Load the model and start prediction
with fluid.scope_guard(inference_scope):  # fluid.scope_guard switches to the given scope via the with statement
    # Load the trained inference model from the specified directory
    [inference_program,   # Inference program
     feed_target_names,   # List of str: names of the variables in the inference program that need data
     fetch_targets] = fluid.io.load_inference_model(  # fetch_targets: list of variables that hold the inference results
         model_save_dir,  # Path where the model was saved
         infer_exe)       # Executor that runs the inference model
    img = load_image('data/data27012/6.png')
    results = infer_exe.run(program=inference_program,         # Run the inference program
                            feed={feed_target_names[0]: img},  # Feed the image to be predicted
                            fetch_list=fetch_targets)          # Fetch the prediction results
- Supplementary note: fluid.io.load_inference_model returns a tuple of three elements.
After getting the probability of each label, we pick the label with the highest probability and print it.
# Obtain the label with the highest probability
lab = np.argsort(results)  # argsort returns the indices that sort the values in ascending order
# print(lab)
print("The predicted label of this picture is: %d" % lab[0][0][-1])  # Index -1 selects the last element, i.e. the most probable label
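To see why the last index of the argsort result is the predicted digit, here is a small NumPy sketch with a made-up result. The variable results below is a hand-written stand-in with the same nesting as the fetched output (a list holding one array of shape [1, 10]), not real model output; np.argmax is shown as an equivalent, more direct way to get the same index.

```python
import numpy as np

# Stand-in for the fetched results: a list holding one [1, 10] probability array
results = [np.array([[0.01, 0.02, 0.03, 0.04, 0.05, 0.05, 0.70, 0.04, 0.03, 0.03]])]

lab = np.argsort(results)       # shape (1, 1, 10), indices in ascending order of probability
top_by_argsort = lab[0][0][-1]  # last index is the most probable class
top_by_argmax = int(np.argmax(results[0][0]))

print(lab.shape)
print(top_by_argsort, top_by_argmax)  # 6 6
```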