MNIST handwritten numeral recognition based on PyTorch

Posted by freelancedeve on Fri, 22 Oct 2021 05:01:40 +0200

I have recently been learning about neural networks, and this is a small practice exercise. This article is reprinted from another source.

1. Environment configuration

PyTorch is used here to train the model. Which PyTorch build to install depends on the machine's graphics card (CPU-only or a specific CUDA version), so follow an installation tutorial that matches your hardware. Once the environment is set up, write the code in a Jupyter notebook. First, import PyTorch.

import torch
import torchvision
from torch.utils.data import DataLoader

2. Prepare data sets

Define some hyperparameters:

n_epochs = 3            # number of passes over the entire training set
batch_size_train = 64
batch_size_test = 1000
learning_rate = 0.01    # optimizer hyperparameter
momentum = 0.5          # optimizer hyperparameter
log_interval = 10       # log the training loss every 10 batches

random_seed = 1
torch.backends.cudnn.enabled = False
torch.manual_seed(random_seed)

For repeatable experiments, the random number generator must be seeded, here with torch.manual_seed(). In addition, cuDNN, which uses nondeterministic algorithms, can be disabled.
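As a small illustration of what seeding buys you, the sketch below seeds the generator twice with the same value and draws random numbers each time. The variable names first_draw and second_draw are only for this example; on the CPU the two draws should match.

```python
import torch

torch.manual_seed(1)          # same value as random_seed in the text
first_draw = torch.rand(3)

torch.manual_seed(1)          # reseeding restarts the generator
second_draw = torch.rand(3)

# Identical seeds give identical sequences of random numbers.
print(torch.equal(first_draw, second_draw))
```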
Next we need DataLoaders for the dataset. TorchVision makes it easy to load MNIST. We use a batch size of 64 for training and 1000 for testing. The values 0.1307 and 0.3081 passed to Normalize() are the global mean and standard deviation of the MNIST dataset, taken here as fixed values.

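train_loader = torch.utils.data.DataLoader(
  torchvision.datasets.MNIST('/files/', train=True, download=True,
    transform=torchvision.transforms.Compose([
      torchvision.transforms.ToTensor(),
      torchvision.transforms.Normalize((0.1307,), (0.3081,))])),
  batch_size=batch_size_train, shuffle=True)

test_loader = torch.utils.data.DataLoader(
  torchvision.datasets.MNIST('/files/', train=False, download=True,
    transform=torchvision.transforms.Compose([
      torchvision.transforms.ToTensor(),
      torchvision.transforms.Normalize((0.1307,), (0.3081,))])),
  batch_size=batch_size_test, shuffle=True)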
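To make the normalization concrete, here is a minimal sketch of the per-pixel arithmetic, assuming pixels have already been scaled to the range [0, 1] as ToTensor() does. The helper normalize() is a hypothetical name for this example and not part of TorchVision. A black pixel ends up at roughly -0.42 and a white pixel at roughly 2.82.

```python
MNIST_MEAN = 0.1307   # values quoted in the text
MNIST_STD = 0.3081

def normalize(pixel, mean=MNIST_MEAN, std=MNIST_STD):
    """Map a pixel in [0, 1] to a standardized value: (pixel - mean) / std."""
    return (pixel - mean) / std

black = normalize(0.0)   # darkest possible pixel
white = normalize(1.0)   # brightest possible pixel
print(round(black, 4), round(white, 4))
```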

Now look at what a batch of test data consists of. example_data holds the images and example_targets holds the digit label that actually corresponds to each image. The batch of images is a single tensor whose shape is torch.Size([1000, 1, 28, 28]): 1000 examples of 28x28-pixel grayscale images (one channel, i.e. no RGB channels).

examples = enumerate(test_loader)
batch_idx, (example_data, example_targets) = next(examples)
print(example_data.shape)  # shape of one batch of test images
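The batch layout explains the double indexing used when plotting individual images. The sketch below uses a tensor of zeros as a stand-in for a real batch (fake_batch is a name made up for this example) and shows how indexing removes first the batch dimension and then the channel dimension.

```python
import torch

# Stand-in for one test batch; real batches come from test_loader.
fake_batch = torch.zeros(1000, 1, 28, 28)

one_example = fake_batch[0]      # shape [1, 28, 28]: channel dimension still present
one_image = fake_batch[0][0]     # shape [28, 28]: a plain 2D image suitable for imshow

print(fake_batch.shape, one_example.shape, one_image.shape)
```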

You can use matplotlib to draw some of these pictures.

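import matplotlib.pyplot as plt

fig = plt.figure()
for i in range(6):
  plt.subplot(2, 3, i + 1)  # 2x3 grid, one panel per image
  plt.tight_layout()
  plt.imshow(example_data[i][0], cmap='gray', interpolation='none')
  plt.title("Ground Truth: {}".format(example_targets[i]))
  plt.xticks([])
  plt.yticks([])
plt.show()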

3. Build a network

Now start building the network. We will use two 2D convolution layers, followed by two fully connected (or linear) layers. We will use rectified linear units (ReLUs) as the activation function and two dropout layers as a means of regularization. First, import some submodules to make the code more readable.

import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

In PyTorch, a good way to build a network is to create a new class for the network you want to build.

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
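        x = F.dropout(x, training=self.training)  # active only in training mode
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)  # log-probabilities over the 10 classes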
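The input size of 320 for the first linear layer follows from the layer shapes. The sketch below traces the image width through both convolution and pooling stages, assuming the defaults used above (stride 1 and no padding for the convolutions, pooling stride equal to the window size). The helpers conv_out() and pool_out() are hypothetical names for this example.

```python
def conv_out(size, kernel_size):
    """Output width of a convolution with stride 1 and no padding."""
    return size - kernel_size + 1

def pool_out(size, window):
    """Output width of max pooling whose stride equals its window."""
    return size // window

size = 28                                   # MNIST images are 28x28
size = pool_out(conv_out(size, 5), 2)       # conv1 + pooling: 28 -> 24 -> 12
size = pool_out(conv_out(size, 5), 2)       # conv2 + pooling: 12 -> 8 -> 4
flattened = 20 * size * size                # conv2 produces 20 channels
print(size, flattened)
```

The result, 20 channels of 4x4 feature maps, is what x.view(-1, 320) flattens into one vector per example.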

Then initialize the network and optimizer.

network = Net()
optimizer = optim.SGD(network.parameters(), lr=learning_rate,
                      momentum=momentum)
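To give a feel for what the learning rate and momentum do, here is a simplified scalar sketch of SGD with momentum, assuming no dampening, no Nesterov momentum, and no weight decay. The function sgd_momentum_step() is a hypothetical helper written for this example; it is not how the PyTorch optimizer is called.

```python
def sgd_momentum_step(param, grad, velocity, lr=0.01, momentum=0.5):
    """One scalar update: accumulate the gradient, then step along the velocity."""
    velocity = momentum * velocity + grad
    param = param - lr * velocity
    return param, velocity

# Minimize f(x) = x**2, whose gradient is 2 * x, starting from x = 1.
x, v = 1.0, 0.0
history = []
for _ in range(2):
    x, v = sgd_momentum_step(x, 2 * x, v)
    history.append(x)
print(history)
```

The second step is larger than plain gradient descent would take, because half of the previous gradient is carried over in the velocity.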

4. Model training

Now set up the training loop. First, make sure the network is in training mode. Then iterate over all the training data once per epoch. Loading the individual batches is handled by the DataLoader.
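Training mode matters here because of the dropout layers. As a minimal sketch, the standalone nn.Dropout layer below (not the network from this article) behaves differently after train() and eval(): in training mode it randomly zeroes elements and rescales the rest, and in evaluation mode it passes the input through unchanged.

```python
import torch
import torch.nn as nn

layer = nn.Dropout(p=0.5)
x = torch.ones(8)

layer.train()                 # training mode: elements are randomly zeroed
train_out = layer(x)          # survivors are scaled by 1 / (1 - p) = 2

layer.eval()                  # evaluation mode: dropout does nothing
eval_out = layer(x)

print(train_out)
print(eval_out)
```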

For each batch, first call optimizer.zero_grad() to reset the gradients to zero manually, because PyTorch accumulates gradients by default. Then run the forward pass to produce the network output and compute the negative log-likelihood loss between the output and the ground-truth labels. Calling loss.backward() then computes a new set of gradients, and optimizer.step() uses them to update each network parameter.
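The accumulation behavior is easy to see on a single tensor. In this sketch, loss_fn() is a toy loss invented for the example, and the manual w.grad.zero_() call plays a role similar to optimizer.zero_grad() for one parameter.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

def loss_fn(w):
    return (w ** 2).sum()     # gradient is 2 * w

loss_fn(w).backward()
first = w.grad.clone()        # gradient of one backward pass

loss_fn(w).backward()         # no zeroing in between: gradients add up
accumulated = w.grad.clone()

w.grad.zero_()                # reset, similar in effect to optimizer.zero_grad()
loss_fn(w).backward()
fresh = w.grad.clone()

print(first, accumulated, fresh)
```

Without the reset, the second backward pass doubles the stored gradient, which would distort every parameter update.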

We will also print some output to track progress. To plot a training curve later, we create two lists that store the training and test losses. The x-axis will show the number of training examples the network has seen during training.

train_losses = []
train_counter = []
test_losses = []
test_counter = [i*len(train_loader.dataset) for i in range(n_epochs + 1)]

Before starting the training, we will run one test loop to see what accuracy and loss the randomly initialized network parameters achieve. First, define the training function.

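def train(epoch):
  network.train()  # training mode: enables dropout
  for batch_idx, (data, target) in enumerate(train_loader):
    optimizer.zero_grad()
    output = network(data)
    loss = F.nll_loss(output, target)
    loss.backward()
    optimizer.step()
    if batch_idx % log_interval == 0:
      print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
        epoch, batch_idx * len(data), len(train_loader.dataset),
        100. * batch_idx / len(train_loader), loss.item()))
      train_losses.append(loss.item())
      train_counter.append(
        (batch_idx*64) + ((epoch-1)*len(train_loader.dataset)))
      torch.save(network.state_dict(), './model.pth')
      torch.save(optimizer.state_dict(), './optimizer.pth')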

Neural network modules and optimizers can save and load their internal state using .state_dict(). This way, training can later be continued from a previously saved state dict if necessary: just call .load_state_dict(state_dict).
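A minimal sketch of such a round trip, using a small nn.Linear layer and an in-memory buffer as a stand-in for a file such as model.pth (both are choices made for this example only):

```python
import io
import torch
import torch.nn as nn

source = nn.Linear(4, 2)
buffer = io.BytesIO()                      # stands in for a file on disk
torch.save(source.state_dict(), buffer)

buffer.seek(0)
restored = nn.Linear(4, 2)                 # fresh layer with its own random weights
restored.load_state_dict(torch.load(buffer))

# After loading, both layers hold the same parameters.
print(torch.equal(source.weight, restored.weight))
```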

Now for the test loop. Here the test loss is summed up and the number of correctly classified digits is tracked to compute the accuracy of the network.

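def test():
  network.eval()  # evaluation mode: disables dropout
  test_loss = 0
  correct = 0
  with torch.no_grad():
    for data, target in test_loader:
      output = network(data)
      # sum the per-example losses; the average is taken after the loop
      test_loss += F.nll_loss(output, target, reduction='sum').item()
      pred = output.max(1, keepdim=True)[1]  # index of the highest log-probability
      correct += pred.eq(target.view_as(pred)).sum().item()
  test_loss /= len(test_loader.dataset)
  test_losses.append(test_loss)
  print('\nTest set: Avg. loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
    test_loss, correct, len(test_loader.dataset),
    100. * correct / len(test_loader.dataset)))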
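The loss and accuracy bookkeeping of the test loop can be tried on toy values. In the sketch below, the logits and targets are invented for illustration, with three examples and three classes instead of MNIST's ten.

```python
import torch
import torch.nn.functional as F

# Made-up scores for 3 examples and 3 classes.
logits = torch.tensor([[2.0, 0.5, 0.1],
                       [0.2, 0.1, 3.0],
                       [1.5, 1.0, 0.3]])
target = torch.tensor([0, 2, 1])

log_probs = F.log_softmax(logits, dim=1)
summed_loss = F.nll_loss(log_probs, target, reduction='sum').item()

pred = log_probs.max(1, keepdim=True)[1]               # predicted class, shape [3, 1]
correct = pred.eq(target.view_as(pred)).sum().item()   # number of matches

print(summed_loss, pred.flatten().tolist(), correct)
```

The third example is predicted as class 0 while its label is 1, so two of the three predictions count as correct.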

Using the no_grad() context manager, we avoid tracking the computations that produce the network output in the computation graph.
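The effect can be checked on a single tensor: the same computation is tracked for gradients outside the context manager and untracked inside it. The variable names here are only for this example.

```python
import torch

w = torch.ones(3, requires_grad=True)

y_tracked = (w * 2).sum()          # recorded in the computation graph

with torch.no_grad():
    y_untracked = (w * 2).sum()    # same values, but nothing is recorded

print(y_tracked.requires_grad, y_untracked.requires_grad)
```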

Start training! Before looping over n_epochs, we manually add a test() call to evaluate the model with randomly initialized parameters.

test()
for epoch in range(1, n_epochs + 1):
  train(epoch)
  test()

5. Evaluate the performance of the model

After three epochs of training, the accuracy on the test set is already very high. Let's plot the training curve.

fig = plt.figure()
plt.plot(train_counter, train_losses, color='blue')
plt.scatter(test_counter, test_losses, color='red')
plt.legend(['Train Loss', 'Test Loss'], loc='upper right')
plt.xlabel('number of training examples seen')
plt.ylabel('negative log likelihood loss')

Judging from the training curve, we could even continue training for a few more epochs. But before that, let's look at a few more examples, as we did before, and compare them with the output of the model.

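with torch.no_grad():
  output = network(example_data)
fig = plt.figure()
for i in range(6):
  plt.subplot(2, 3, i + 1)  # 2x3 grid, one panel per image
  plt.tight_layout()
  plt.imshow(example_data[i][0], cmap='gray', interpolation='none')
  plt.title("Prediction: {}".format(
    output.max(1, keepdim=True)[1][i].item()))
  plt.xticks([])
  plt.yticks([])
plt.show()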

The prediction of these examples by the model seems to be correct.

6. Continue training from checkpoints

Now let's continue training the network, or rather see how to continue training from the state_dicts saved during the first training run. Initialize a new network and a new optimizer.

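continued_network = Net()
continued_optimizer = optim.SGD(continued_network.parameters(), lr=learning_rate, momentum=momentum)

Using .load_state_dict(), we can now load the internal state of the network and the optimizer as they were when we last saved them.

network_state_dict = torch.load('model.pth')
continued_network.load_state_dict(network_state_dict)
optimizer_state_dict = torch.load('optimizer.pth')
continued_optimizer.load_state_dict(optimizer_state_dict)

Running the training loop again should now resume right where the previous training left off. To check this, we keep using the same lists as before to track the loss values. Because of the way the test counter was built from the number of training examples seen, we have to extend it manually here. Note that train() and test() refer to the global names network and optimizer, so these names are pointed at the restored objects first.

network, optimizer = continued_network, continued_optimizer
for i in range(4, 9):
  test_counter.append(i*len(train_loader.dataset))
  train(i)
  test()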

Again, the accuracy on the test set improves from one epoch to the next, although much more slowly than before. Let's plot the losses again to check the training progress.

fig = plt.figure()
plt.plot(train_counter, train_losses, color='blue')
plt.scatter(test_counter, test_losses, color='red')
plt.legend(['Train Loss', 'Test Loss'], loc='upper right')
plt.xlabel('number of training examples seen')
plt.ylabel('negative log likelihood loss')

Topics: Python Pytorch Deep Learning