Module - Neural network training and inference

Posted by jblallement on Sun, 07 Jun 2020 05:50:38 +0200

Training a neural network involves many steps: you need to specify how to feed in the training data, initialize the model parameters, run forward and backward passes through the network, update the weights based on the computed gradients, checkpoint the model, and so on. Most of these steps are then repeated during prediction. All of this can be daunting for novices and experienced developers alike.
Fortunately, MXNet modularizes the code commonly needed for training and inference in the module package. Module provides both a high-level and an intermediate-level interface for executing predefined networks, and the two can be used interchangeably. In this tutorial we show how to use both.

 

Preliminaries

First, the basic setup: configure logging, seed the random number generators, download the UCI letter-recognition dataset, and wrap it in training and validation iterators:

import logging
import random
logging.getLogger().setLevel(logging.INFO)

import mxnet as mx
import numpy as np

mx.random.seed(1234)
np.random.seed(1234)
random.seed(1234)

# download the UCI letter-recognition dataset (CSV: letter label, then 16 features)
fname = mx.test_utils.download('https://s3.us-east-2.amazonaws.com/mxnet-public/letter_recognition/letter-recognition.data')
data = np.genfromtxt(fname, delimiter=',')[:,1:]                              # features only
label = np.array([ord(l.split(',')[0])-ord('A') for l in open(fname, 'r')])   # letters -> 0..25

batch_size = 32
ntrain = int(data.shape[0]*0.8)    # 80/20 train/validation split
train_iter = mx.io.NDArrayIter(data[:ntrain, :], label[:ntrain], batch_size, shuffle=True)
val_iter = mx.io.NDArrayIter(data[ntrain:, :], label[ntrain:], batch_size)
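As a quick sanity check, the arrays should contain 20,000 samples with 16 features each and labels 0 through 25 (a sketch; the expected numbers assume the standard UCI file):

# Sketch: sanity-check the parsed arrays
print(data.shape, label.shape)    # expected: (20000, 16) (20000,)
print(label.min(), label.max())   # expected: 0 25, i.e. letters 'A'..'Z'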

Next, define the network with the Symbol API: a multilayer perceptron with a 64-unit hidden layer and a 26-way softmax output:

net = mx.sym.Variable('data')
net = mx.sym.FullyConnected(net, name='fc1', num_hidden=64)
net = mx.sym.Activation(net, name='relu1', act_type="relu")
net = mx.sym.FullyConnected(net, name='fc2', num_hidden=26)
net = mx.sym.SoftmaxOutput(net, name='softmax')
mx.viz.plot_network(net, node_attrs={"shape":"oval","fixedsize":"false"})

Creating a Module

A module is represented by the Module class. It can be constructed by specifying the following parameters:

  • symbol: network definition
  • context: device (or device list) for execution
  • data_names: list of input data variable names
  • label_names: list of input label variable names

For the net defined above, there is a single input named data and a single label, automatically named softmax_label; the latter comes from the name softmax we gave the SoftmaxOutput layer.

mod = mx.mod.Module(symbol=net,
                    context=mx.cpu(),
                    data_names=['data'],
                    label_names=['softmax_label'])
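If you are ever unsure which names a symbol expects, you can inspect it directly (a quick sketch):

# Sketch: list the symbol's arguments; 'data' and 'softmax_label' are the
# inputs, the rest are learnable parameters
print(net.list_arguments())
# ['data', 'fc1_weight', 'fc1_bias', 'fc2_weight', 'fc2_bias', 'softmax_label']
print(net.list_outputs())    # ['softmax_output']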

 

Intermediate-level Interface

We have created the module. Now let's look at how to run training and inference with the module's intermediate-level API. This API lets developers step through the computation by calling forward and backward explicitly, which offers flexibility and is also useful for debugging. Training a module requires the following steps (implemented in the code after this list):

  • bind: prepares the environment for computation by allocating memory.
  • init_params: assigns and initializes the parameters.
  • init_optimizer: initializes the optimizer; defaults to sgd.
  • metric.create: creates an evaluation metric from an input metric name.
  • forward: runs the forward computation.
  • update_metric: evaluates and accumulates the metric on the outputs of the last forward computation.
  • backward: runs the backward computation, producing gradients.
  • update: updates the parameters according to the installed optimizer and the gradients computed in the previous forward-backward batch.

The specific implementation is as follows:

# allocate memory given the input data and label shapes
mod.bind(data_shapes=train_iter.provide_data, label_shapes=train_iter.provide_label)
# initialize parameters with uniform random numbers
mod.init_params(initializer=mx.init.Uniform(scale=.1))      # mxnet.initializer
# use SGD with learning rate 0.1 to train
mod.init_optimizer(optimizer='sgd', optimizer_params=(('learning_rate', 0.1), ))   # mxnet.optimizer
# use accuracy as the metric
metric = mx.metric.create('acc')       # mxnet.metric
# train 5 epochs, i.e. five passes over the data iterator
for epoch in range(5):
    train_iter.reset()       # reset the data iterator to the beginning
    metric.reset()           # reset the evaluation metric
    for batch in train_iter:
        mod.forward(batch, is_train=True)       # compute predictions
        mod.update_metric(metric, batch.label)  # accumulate prediction accuracy
        mod.backward()                          # compute gradients
        mod.update()                            # update parameters
    print('Epoch %d, Training %s' % (epoch, metric.get()))
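The same step-by-step calls also cover inference; for example, a sketch that computes validation accuracy without updating any parameters:

# Sketch: evaluate on the validation set with the intermediate-level API
metric.reset()
val_iter.reset()
for batch in val_iter:
    mod.forward(batch, is_train=False)      # forward pass only, no gradients
    mod.update_metric(metric, batch.label)  # accumulate validation accuracy
print('Validation %s' % (metric.get(),))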

Be careful: module.bind and symbol.bind are not the same thing.
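A minimal sketch of the difference, assuming the net defined above with its 16 input features: symbol.bind is low-level and hands you back an Executor that you drive yourself, while module.bind allocates memory inside the Module, which manages the Executor for you.

# Sketch: symbol.bind builds an Executor from explicit argument arrays
arg_shapes, out_shapes, aux_shapes = net.infer_shape(data=(batch_size, 16))
arg_arrays = [mx.nd.zeros(shape) for shape in arg_shapes]
exe = net.bind(ctx=mx.cpu(), args=arg_arrays, grad_req='null')   # forward-only Executor
outputs = exe.forward()
# module.bind, by contrast, returns nothing to manage: the Module owns the Executor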

Note: MXNet defines several abbreviations: mx.symbol = mx.sym;    mx.initializer = mx.init;    mx.module = mx.mod

Epoch 0, Training ('accuracy', 0.434625)
Epoch 1, Training ('accuracy', 0.6516875)
Epoch 2, Training ('accuracy', 0.6968125)
Epoch 3, Training ('accuracy', 0.7273125)
Epoch 4, Training ('accuracy', 0.7575625)

High-level Interface

The previous section used the intermediate-level API, which involves many steps. In this section we use the high-level API instead: the fit function.

1. Training

# reset train_iter to the beginning
train_iter.reset()

# create a module
mod = mx.mod.Module(symbol=net,
                    context=mx.cpu(),
                    data_names=['data'],
                    label_names=['softmax_label'])

# fit (train) the module
mod.fit(train_iter,
        eval_data=val_iter,
        optimizer='sgd',
        optimizer_params={'learning_rate':0.1},
        eval_metric='acc',
        num_epoch=7)
INFO:root:Epoch[0] Train-accuracy=0.325437
INFO:root:Epoch[0] Time cost=0.550
INFO:root:Epoch[0] Validation-accuracy=0.568500
INFO:root:Epoch[1] Train-accuracy=0.622188
INFO:root:Epoch[1] Time cost=0.552
INFO:root:Epoch[1] Validation-accuracy=0.656500
INFO:root:Epoch[2] Train-accuracy=0.694375
INFO:root:Epoch[2] Time cost=0.566
INFO:root:Epoch[2] Validation-accuracy=0.703500
INFO:root:Epoch[3] Train-accuracy=0.732187
INFO:root:Epoch[3] Time cost=0.562
INFO:root:Epoch[3] Validation-accuracy=0.748750
INFO:root:Epoch[4] Train-accuracy=0.755375
INFO:root:Epoch[4] Time cost=0.484
INFO:root:Epoch[4] Validation-accuracy=0.761500
INFO:root:Epoch[5] Train-accuracy=0.773188
INFO:root:Epoch[5] Time cost=0.383
INFO:root:Epoch[5] Validation-accuracy=0.715000
INFO:root:Epoch[6] Train-accuracy=0.794687
INFO:root:Epoch[6] Time cost=0.378
INFO:root:Epoch[6] Validation-accuracy=0.802250

By default, fit uses the following parameter values: eval_metric is set to accuracy, optimizer to sgd, and optimizer_params to (('learning_rate', 0.01),).
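So a call that relies entirely on those defaults is very short; a sketch (note that it would train with the default learning rate of 0.01 rather than the 0.1 used above):

# Sketch: fit with all defaults (eval_metric='acc', optimizer='sgd', lr=0.01)
mod.fit(train_iter, eval_data=val_iter, num_epoch=7)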

 

2. Prediction and evaluation
To run prediction on a test or validation set, call the predict() function; it collects and returns all of the prediction results:
y = mod.predict(val_iter)
assert y.shape == (4000, 26)
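Each row of y holds the 26 class probabilities for one sample; to recover the predicted letters, take the argmax of each row (a sketch):

# Sketch: convert class probabilities into predicted letters
pred = y.asnumpy().argmax(axis=1)                 # class index 0..25 per sample
letters = [chr(ord('A') + int(i)) for i in pred]  # map back to 'A'..'Z'
print(letters[:10])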

Sometimes we don't care about the predicted values themselves and only want to know the metric on the test set. In that case we can call the score() function, which evaluates the module according to the metric you provide:

score = mod.score(val_iter, ['acc'])
print("Accuracy score is %f" % (score[0][1]))
assert score[0][1] > 0.76, "Achieved accuracy (%f) is less than expected (0.76)" % score[0][1]
Accuracy score is 0.802250

Of course, other metrics can be used as well: top_k_acc (top-k accuracy), F1, RMSE, MSE, MAE, and ce (cross-entropy); two of them are sketched below. For more metrics, see Evaluation metric.
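A sketch scoring with top-3 accuracy, and with several of the metrics above combined:

# Sketch: score with top-3 accuracy
print(mod.score(val_iter, mx.metric.create('top_k_acc', top_k=3)))
# Sketch: score with several metrics at once via a composite metric
print(mod.score(val_iter, mx.metric.create(['acc', 'ce'])))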

 
 
3. Save and load model
You can save the module's parameters after each epoch by registering a checkpoint callback:
# construct a callback function to save checkpoints
model_prefix = 'mx_mlp'
checkpoint = mx.callback.do_checkpoint(model_prefix)

mod = mx.mod.Module(symbol=net)
mod.fit(train_iter, num_epoch=5, epoch_end_callback=checkpoint)    # the epoch_end_callback saves a checkpoint after every epoch
INFO:root:Epoch[0] Train-accuracy=0.098437
INFO:root:Epoch[0] Time cost=0.421
INFO:root:Saved checkpoint to "mx_mlp-0001.params"
INFO:root:Epoch[1] Train-accuracy=0.257437
INFO:root:Epoch[1] Time cost=0.520
INFO:root:Saved checkpoint to "mx_mlp-0002.params"
INFO:root:Epoch[2] Train-accuracy=0.457250
INFO:root:Epoch[2] Time cost=0.562
INFO:root:Saved checkpoint to "mx_mlp-0003.params"
INFO:root:Epoch[3] Train-accuracy=0.558187
INFO:root:Epoch[3] Time cost=0.434
INFO:root:Saved checkpoint to "mx_mlp-0004.params"
INFO:root:Epoch[4] Train-accuracy=0.617750
INFO:root:Epoch[4] Time cost=0.414
INFO:root:Saved checkpoint to "mx_mlp-0005.params"

To load the saved model parameters, call the load_checkpoint function. It loads the Symbol and the associated parameters, which you can then set on the module:

sym, arg_params, aux_params = mx.model.load_checkpoint(model_prefix, 3)
assert sym.tojson() == net.tojson()

# assign the loaded parameters to the module
mod.set_params(arg_params, aux_params)
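If the checkpoint is only needed for inference, a common pattern (a sketch, assuming the 16-feature input used throughout) is to build a fresh module around the loaded symbol, bind it for prediction only, and set the loaded parameters:

# Sketch: an inference-only module built from the loaded checkpoint
pred_mod = mx.mod.Module(symbol=sym, context=mx.cpu(), label_names=None)
pred_mod.bind(for_training=False, data_shapes=[('data', (batch_size, 16))])
pred_mod.set_params(arg_params, aux_params)
probs = pred_mod.predict(val_iter)   # per-class probabilities, as before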

If you just want to resume training from a saved model, you can skip set_params() and pass the parameters directly to fit(). fit() then knows to start from those parameters instead of initializing them randomly for training from scratch. You can also set begin_epoch so that fit() knows we are resuming from a given epoch.

mod = mx.mod.Module(symbol=sym)
mod.fit(train_iter,
        num_epoch=21,
        arg_params=arg_params,
        aux_params=aux_params,
        begin_epoch=3)
score = mod.score(val_iter, ['acc'])
assert score[0][1] > 0.77, "Achieved accuracy (%f) is less than expected (0.77)" % score[0][1]
INFO:root:Epoch[3] Train-accuracy=0.555438
INFO:root:Epoch[3] Time cost=0.377
INFO:root:Epoch[4] Train-accuracy=0.616625
INFO:root:Epoch[4] Time cost=0.457
INFO:root:Epoch[5] Train-accuracy=0.658438
INFO:root:Epoch[5] Time cost=0.518
...........................................
INFO:root:Epoch[18] Train-accuracy=0.788687
INFO:root:Epoch[18] Time cost=0.532
INFO:root:Epoch[19] Train-accuracy=0.789562
INFO:root:Epoch[19] Time cost=0.531
INFO:root:Epoch[20] Train-accuracy=0.796250
INFO:root:Epoch[20] Time cost=0.531
