2.1 Build a neural network using PyTorch
Learning objectives:
- Master the basic process of constructing a neural network with PyTorch
- Master the implementation steps of constructing a neural network with PyTorch
About torch.nn:
- PyTorch builds neural networks with the tools in the torch.nn package
- nn relies on autograd to define models and compute their gradients automatically, as the short sketch below illustrates
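A minimal sketch (using a small made-up layer, not the network defined later in this section): an nn layer registers learnable parameters, and autograd tracks their gradients for you.

import torch
import torch.nn as nn

layer = nn.Linear(4, 2)              # an nn layer registers learnable weight and bias parameters
print(layer.weight.requires_grad)    # True: autograd will track gradients for them

x = torch.randn(1, 4)
y = layer(x).sum()                   # a scalar built from the layer's output
y.backward()                         # autograd fills layer.weight.grad automatically
print(layer.weight.grad.shape)       # torch.Size([2, 4])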
Typical process of constructing a neural network
- Define a neural network with learnable parameters
- Iterate over the training dataset
- Feed the input data through the network
- Compute the loss value
- Backpropagate the gradients to the network parameters
- Update the network weights according to a rule such as SGD (a compact sketch of the whole loop follows this list)
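Before walking through these steps one by one, here is a compact, hypothetical sketch of how they combine into a training loop. The names dataloader, criterion, and optimizer are placeholders; the loss function and optimizer are introduced later in this section.

# Sketch only: net, dataloader, criterion and optimizer are assumed to already exist
for data, target in dataloader:        # 2. iterate over the training dataset
    optimizer.zero_grad()              # clear gradients accumulated from the previous step
    output = net(data)                 # 3. feed the input data through the network
    loss = criterion(output, target)   # 4. compute the loss value
    loss.backward()                    # 5. backpropagate gradients to the parameters
    optimizer.step()                   # 6. update the network weights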
Define a neural network implemented with PyTorch
A five-layer CNN
import torch
import torch.nn as nn
import torch.nn.functional as F

# Define a simple network class
class Net(nn.Module):
    def __init__(self):
        """A five-layer neural network."""
        super(Net, self).__init__()
        # First convolution layer: input channels = 1, output channels = 6, kernel size 3 * 3
        self.conv1 = nn.Conv2d(1, 6, 3)
        # Second convolution layer: input channels = 6, output channels = 16, kernel size 3 * 3
        self.conv2 = nn.Conv2d(6, 16, 3)
        # Three fully connected layers
        self.fc1 = nn.Linear(16 * 6 * 6, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # First convolution layer -> relu -> max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # Second convolution layer -> relu -> pooling; a single number means a square window
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        # Flatten the feature maps into two dimensions (batch, features)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        """Compute the flattened feature size of the convolved tensor x,
        excluding the batch dimension (dim 0).
        E.g. remaining dimensions (3, 4) give 3*4 = 12; here conv2's
        per-sample output is (16, 6, 6), i.e. 576 features.
        """
        # All dimensions except the batch dimension 0
        size = x.size()[1:]
        num_features = 1
        for s in size:
            num_features *= s
        return num_features

net = Net()
print("the structure of net:", net)

# output
the structure of net: Net(
  (conv1): Conv2d(1, 6, kernel_size=(3, 3), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(3, 3), stride=(1, 1))
  (fc1): Linear(in_features=576, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
Obtain all trainable parameters in the model
net.parameters()
params = list(net.parameters())
print("len of params:", len(params))  # 10: a weight and a bias for each of the five layers
print(params[0].size())               # conv1's weight
print(params[0])

# output
len of params: 10
torch.Size([6, 1, 3, 3])
Parameter containing:
......
Suppose the input size of the image is 32 * 32
input = torch.randn(1, 1, 32, 32)
out = net(input)
print(out)

# output
tensor([[ 4.0471e-02,  8.6946e-02, -1.8538e-02,  5.5225e-02, -9.2332e-02,
          3.0920e-02, -1.9139e-04, -2.0026e-01, -4.5713e-02,  2.6573e-02]],
       grad_fn=<AddmmBackward0>)
Once we have the output tensor, gradient zeroing and back propagation can be performed, as in the sketch below
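A minimal sketch of those two operations, backpropagating a random gradient as in the official PyTorch tutorial (illustration only, not part of the training code that follows):

net.zero_grad()                    # zero the gradient buffers of all parameters
out.backward(torch.randn(1, 10))   # backpropagate a random gradient of the same shape as out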
Note:
- Neural networks built with torch.nn only accept mini-batches as input, not a single sample
- For example, nn.Conv2d expects a 4D tensor of shape (nSamples, nChannels, Height, Width). If you only have a single sample, call input.unsqueeze(0) to add a fake batch dimension and expand the 3D tensor into a 4D one, as in the sketch below
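For example, with a hypothetical single sample for the network defined above:

single = torch.randn(1, 32, 32)    # one sample: (channels, height, width)
batched = single.unsqueeze(0)      # add a fake batch dimension -> (1, 1, 32, 32)
out = net(batched)                 # now acceptable to nn.Conv2d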
Loss function
- A loss function takes an (output, target) pair as input and computes a value that estimates how far the output is from the target
- torch.nn provides several different loss functions. For example, nn.MSELoss computes the mean squared error between the input and the target
- An example of computing a loss with nn.MSELoss:
input = torch.randn(1, 1, 32, 32)
output = net(input)
target = torch.randn(10)
# Reshape the target into a two-dimensional tensor to match the output
target = target.view(1, -1)
criterion = nn.MSELoss()
loss = criterion(output, target)
print(loss)
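As a sanity check with small made-up tensors (not the network output above), nn.MSELoss is simply the mean of the squared element-wise differences:

o = torch.tensor([1.0, 2.0, 3.0])
t = torch.tensor([1.0, 1.0, 1.0])
print(nn.MSELoss()(o, t))          # tensor(1.6667)
print(((o - t) ** 2).mean())       # tensor(1.6667), the same value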
The chain of computations from input to loss, which back propagation traverses in reverse:
input -> conv2d -> relu -> maxpool2d
      -> conv2d -> relu -> maxpool2d
      -> view -> linear -> relu -> linear -> relu -> linear
      -> MSELoss
      -> loss
When loss.backward() is called, the whole computation graph is differentiated with respect to the loss; all tensors with requires_grad=True take part in the gradient computation and accumulate their gradients into their .grad attribute
output = net(input)
target = torch.randn(10)
# Reshape the target into a two-dimensional tensor to match the output
target = target.view(1, -1)
criterion = nn.MSELoss()
loss = criterion(output, target)
print(loss)
print(loss.grad_fn)
print(loss.grad_fn.next_functions[0][0])
print(loss.grad_fn.next_functions[0][0].next_functions[0][0])

# output
tensor(0.6516, grad_fn=<MseLossBackward0>)
<MseLossBackward0 object at 0x0000021826CEEE88>
<AddmmBackward0 object at 0x00000218117E8908>
<AccumulateGrad object at 0x0000021826CEEE88>
Back propagation
- Performing back propagation in PyTorch is very easy; the whole operation is loss.backward()
- Before performing back propagation, the gradients must be cleared first, otherwise gradients will accumulate across batches (a small standalone example follows the demo)
- A back propagation demo:
# Code for performing gradient zeroing in PyTorch
net.zero_grad()

print('conv1.bias.grad before backward')
print(net.conv1.bias.grad)

# Code that performs back propagation in PyTorch
loss.backward()

print("conv1.bias.grad after backward")
print(net.conv1.bias.grad)
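The need for clearing gradients can be seen in a tiny standalone example (made up here, unrelated to the network above): calling backward() twice without zeroing doubles the stored gradient.

w = torch.tensor(1.0, requires_grad=True)
(w * 2).backward()
print(w.grad)        # tensor(2.)
(w * 2).backward()
print(w.grad)        # tensor(4.)  -- the gradients accumulated
w.grad.zero_()
(w * 2).backward()
print(w.grad)        # tensor(2.)  -- cleared first, so no accumulation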
Updating the parameters
- The simplest algorithm for updating parameters is SGD (stochastic gradient descent)
- The update rule is: weight = weight - learning_rate * gradient
- First, SGD can be implemented in plain Python code as follows:
# SGD implemented with plain Python code
learning_rate = 0.01
for f in net.parameters():
    f.data.sub_(f.grad.data * learning_rate)
- Use the standard code officially recommended by PyTorch as follows:
# First import the optimizer package; optim contains several common
# optimization algorithms, such as SGD, Adam, etc.
import torch.optim as optim

# Create an optimizer object through optim
optimizer = optim.SGD(net.parameters(), lr=0.01)

# The optimizer performs the gradient-zeroing operation
# (this plays the same role as net.zero_grad() in the demo above)
optimizer.zero_grad()

output = net(input)
loss = criterion(output, target)

# Perform back propagation on the loss value
loss.backward()

# The parameter update is performed with a single line of standard code
optimizer.step()
print("DONE")
Subsection summary
The typical process of building a neural network:
- Define a neural network with learnable parameters
- Iterate over the training dataset
- Feed the input data through the network
- Compute the loss value
- Backpropagate the gradients to the network parameters
- Update the network weights according to a rule such as SGD
The definition of the loss function:
- In the demo above we used torch.nn.MSELoss() to compute the mean squared error
- When back propagation is performed via loss.backward(), the whole computation graph is differentiated with respect to the loss; all tensors with requires_grad=True take part in the gradient computation and accumulate their gradients into their .grad attribute
How back propagation is computed:
- Performing back propagation in PyTorch is very easy; the whole operation is loss.backward()
- Before performing back propagation, clear the gradients first, otherwise gradients will accumulate across batches
net.zero_grad()
loss.backward()
How parameters are updated:
- Define an optimizer to perform parameter optimization and updating
optimizer = optim.SGD(net.parameters(), lr=0.01)
- The actual parameter update is performed through the optimizer:
optimizer.step()