AirSim learning note 3: model training

Posted by thinkaboutit on Wed, 05 Jan 2022 18:22:46 +0100

Open source projects:
Project address:
Localization project:

Step 1 - model training

Now that we have some sense of the data being processed, we can start designing our model. In this notebook, we will define the network architecture and train the model. We will also discuss some transformations on the data in response to our observations in the data exploration section of the notebook.

Let's start by importing some libraries and defining some paths.

from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential, Model
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense, Lambda, Input, concatenate
from keras.layers.normalization import BatchNormalization
from keras.layers.advanced_activations import ELU
from keras.optimizers import Adam, SGD, Adamax, Nadam
from keras.callbacks import ReduceLROnPlateau, ModelCheckpoint, CSVLogger, EarlyStopping
import keras.backend as K
from keras.preprocessing import image

from keras_tqdm import TQDMNotebookCallback

import json
import os
import numpy as np
import pandas as pd
from Generator import DriveDataGenerator
from Cooking import checkAndCreateDir
import h5py
from PIL import Image, ImageDraw
import math
import matplotlib.pyplot as plt

# Directory containing the cooked data from the previous step
COOKED_DATA_DIR = '../../AirSim/EndToEndLearningRawData/data_cooked/'

# Directory in which the model output (checkpoints and training log) will be placed.
# 'model' matches the checkpoint paths printed in the training log.
MODEL_OUTPUT_DIR = 'model'
Using TensorFlow backend.

Let's read in the datasets produced in the exploration phase. If they do not exist, run the code in the previous notebook (DataExplorationAndPreparation.ipynb) to generate them.

train_dataset = h5py.File(os.path.join(COOKED_DATA_DIR, 'train.h5'), 'r')
eval_dataset = h5py.File(os.path.join(COOKED_DATA_DIR, 'eval.h5'), 'r')
test_dataset = h5py.File(os.path.join(COOKED_DATA_DIR, 'test.h5'), 'r')

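num_train_examples = train_dataset['image'].shape[0]
num_eval_examples = eval_dataset['image'].shape[0]
num_test_examples = test_dataset['image'].shape[0]

# Batch size used by the generators and by fit_generator.
# The post never defines it; 32 is an assumed value.
batch_size = 32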


For image data, it is too expensive to load the entire data set into memory. Fortunately, Keras has the concept of data generators. The DataGenerator is just an iterator that reads data from disk in blocks. This allows you to keep your CPU and GPU busy and increase throughput.
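
The idea of reading data in chunks can be sketched without Keras. The function below is only an illustration under simple assumptions, not the DriveDataGenerator from Generator.py: `batch_generator` is a hypothetical name, and the small in-memory arrays stand in for the HDF5 datasets (anything that supports len() and slicing would work the same way).

```python
import numpy as np

def batch_generator(images, labels, batch_size):
    """Yield (images, labels) batches forever, one slice at a time.

    Only the current slice is converted to an array, so `images` can be
    any object that supports len() and slicing, such as an on-disk dataset.
    """
    num_examples = len(images)
    while True:  # generators used for training loop indefinitely
        for start in range(0, num_examples, batch_size):
            stop = min(start + batch_size, num_examples)
            yield np.asarray(images[start:stop]), np.asarray(labels[start:stop])

# Hypothetical stand-in data: 10 tiny "images" and their labels
images = np.zeros((10, 4, 4, 3), dtype=np.float32)
labels = np.arange(10, dtype=np.float32)

gen = batch_generator(images, labels, batch_size=4)
first_x, first_y = next(gen)    # examples 0..3
second_x, second_y = next(gen)  # examples 4..7
third_x, third_y = next(gen)    # examples 8..9 (short final batch)
```

After the short final batch, the generator wraps around and starts again from the first example, which is why the training call needs an explicit number of steps per epoch.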
We made some observations during the exploration phase. Now, let's come up with a strategy to incorporate them into our training algorithm:

  • Only a small part of the image is of interest (the ROI) - when generating batches, we can crop away the parts of the image that are not of interest.
  • The dataset exhibits vertical flip tolerance - when generating batches, we can randomly flip some images and labels around the Y axis so that the model has new data to learn from.
  • The dataset should be invariant to changes in lighting - when generating batches, we can randomly add or remove brightness from the images so that the model learns that global changes in lighting should be ignored.
  • The dataset has a high proportion of zero-valued labels - when generating batches, we can randomly drop a percentage of the data points whose steering angle is zero so that the model sees a balanced dataset during training.
  • The model needs examples of swerving from our dataset so that it can learn how to turn sharply - we took care of this in the preprocessing phase.

Although Keras has some standard built-in image transformations, they are not enough for our purposes. For example, when using horizontal_flip = True in the standard ImageDataGenerator, the signs of the labels are not inverted. Fortunately, we can extend the ImageDataGenerator class and implement our own transformation logic. The code to do this is in Generator.py - it is straightforward, but too long to include in this notebook.

Here, we will initialize the generator with the following parameters:

  • Zero_Drop_Percentage: 0.95 - that is, we will randomly drop 95% of the data points with label = 0.
  • Brighten_Range: 0.4 - that is, the brightness of each image will be modified by up to 40%. To compute the "brightness", we convert the image from RGB space to HSV space, scale the "V" coordinate up or down, and then convert it back to RGB space.
  • ROI: [76,135,0,255] - this is the x1, x2, y1, y2 rectangle that represents the region of interest of the image.
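
The transformations listed above can be sketched in a few lines of NumPy. This is a rough illustration, not the code in Generator.py: all function names are hypothetical, the input frames are random stand-ins, and the ROI is assumed to index rows first and columns second (which is consistent with the 59 x 255 input shape reported in the model summary).

```python
import colorsys
import numpy as np

def crop_roi(batch, roi):
    """Keep rows roi[0]:roi[1] and columns roi[2]:roi[3] of every image.

    Assumption: the first ROI pair indexes rows, the second pair columns.
    """
    r0, r1, c0, c1 = roi
    return batch[:, r0:r1, c0:c1, :]

def flip_with_labels(batch, labels, flip_mask):
    """Mirror the selected images left-right and negate their steering labels."""
    batch, labels = batch.copy(), labels.copy()
    batch[flip_mask] = batch[flip_mask][:, :, ::-1, :]
    labels[flip_mask] = -labels[flip_mask]
    return batch, labels

def drop_zero_labels(labels, drop_fraction, rng):
    """Return a keep-mask that discards about drop_fraction of the zero labels."""
    is_zero = np.isclose(labels, 0.0)
    drop = is_zero & (rng.random(labels.shape) < drop_fraction)
    return ~drop

def brighten_pixel(rgb, factor):
    """Scale the V channel of one RGB pixel (components in [0, 1]) by factor."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    return colorsys.hsv_to_rgb(h, s, min(1.0, v * factor))

rng = np.random.default_rng(0)
batch = rng.random((4, 144, 256, 3))        # hypothetical 144x256 RGB frames
labels = np.array([0.0, 0.25, 0.0, -0.5])

cropped = crop_roi(batch, [76, 135, 0, 255])
flipped, flipped_labels = flip_with_labels(
    cropped, labels, np.array([False, True, False, True]))
keep = drop_zero_labels(labels, 0.95, rng)
brighter = brighten_pixel((0.2, 0.4, 0.1), 1.4)
```

The important detail is in `flip_with_labels`: mirroring an image turns a left turn into a right turn, so the label must change sign together with the pixels.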

Thinking Exercise 1.1
Try fiddling with these parameters to see if you can get better results.

data_generator = DriveDataGenerator(rescale=1./255., horizontal_flip=True, brighten_range=0.4)
train_generator = data_generator.flow\
    (train_dataset['image'], train_dataset['previous_state'], train_dataset['label'], batch_size=batch_size, zero_drop_percentage=0.95, roi=[76,135,0,255])
eval_generator = data_generator.flow\
    (eval_dataset['image'], eval_dataset['previous_state'], eval_dataset['label'], batch_size=batch_size, zero_drop_percentage=0.95, roi=[76,135,0,255])    

Let's take a look at a batch of samples. The steering angle is indicated by the red line in the figure:

def draw_image_with_label(img, label, prediction=None):
    theta = label * 0.69 #Steering range for the car is +- 40 degrees -> 0.69 radians
    line_length = 50
    line_thickness = 3
    label_line_color = (255, 0, 0)
    prediction_line_color = (0, 0, 255)
    pil_image = image.array_to_img(img, K.image_data_format(), scale=True)
    print('Actual Steering Angle = {0}'.format(label))
    draw_image = pil_image.copy()
    image_draw = ImageDraw.Draw(draw_image)
    first_point = (int(img.shape[1]/2),img.shape[0])
    second_point = (int((img.shape[1]/2) + (line_length * math.sin(theta))), int(img.shape[0] - (line_length * math.cos(theta))))
    image_draw.line([first_point, second_point], fill=label_line_color, width=line_thickness)
    if (prediction is not None):
        print('Predicted Steering Angle = {0}'.format(prediction))
        print('L1 Error: {0}'.format(abs(prediction-label)))
        theta = prediction * 0.69
        second_point = (int((img.shape[1]/2) + (line_length * math.sin(theta))), int(img.shape[0] - (line_length * math.cos(theta))))
        image_draw.line([first_point, second_point], fill=prediction_line_color, width=line_thickness)
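    del image_draw
    # Display the annotated image; without this the drawing is never shown
    plt.figure()
    plt.imshow(draw_image)
    plt.show()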

[sample_batch_train_data, sample_batch_test_data] = next(train_generator)
for i in range(0, 3, 1):
    draw_image_with_label(sample_batch_train_data[0][i], sample_batch_test_data[i])
Actual Steering Angle = [0.011892]

Actual Steering Angle = [0.033569]

Actual Steering Angle = [0.00726667]
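
To see where the line ends up for a given label, the geometry in draw_image_with_label can be isolated into a small helper. This is a sketch for illustration; `steering_line` is a hypothetical name, and the 255 x 59 image size is taken from the cropped input shape shown in the model summary.

```python
import math

MAX_STEERING_RADIANS = 0.69  # scale factor used in draw_image_with_label

def steering_line(width, height, label, line_length=50):
    """Return the (start, end) pixel coordinates of the steering line."""
    theta = label * MAX_STEERING_RADIANS
    start = (int(width / 2), height)  # bottom centre of the image
    end = (int(width / 2 + line_length * math.sin(theta)),
           int(height - line_length * math.cos(theta)))
    return start, end

straight = steering_line(255, 59, 0.0)    # line points straight up
full_right = steering_line(255, 59, 1.0)  # line leans fully to the right
```

A label of 0 gives a vertical line, and a label of 1 tilts the line by 0.69 radians (roughly 39.5 degrees) to the right. The sample labels printed above are all close to zero, so their lines are nearly vertical.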

Next, let's define the network architecture. We will use a standard stack of convolution / max pooling layers to process the images (we cannot go into the details of what each layer does here, but you should check out the book mentioned in the readme file if you do not understand what is going on). Then, we will inject the vehicle's last known state into the dense layer as an additional feature. The layer sizes and optimization parameters were determined experimentally - try tweaking them and see what happens!

image_input_shape = sample_batch_train_data[0].shape[1:]
state_input_shape = sample_batch_train_data[1].shape[1:]
activation = 'relu'

#Create the convolutional stacks
pic_input = Input(shape=image_input_shape)

img_stack = Conv2D(16, (3, 3), name="convolution0", padding='same', activation=activation)(pic_input)
img_stack = MaxPooling2D(pool_size=(2,2))(img_stack)
img_stack = Conv2D(32, (3, 3), activation=activation, padding='same', name='convolution1')(img_stack)
img_stack = MaxPooling2D(pool_size=(2, 2))(img_stack)
img_stack = Conv2D(32, (3, 3), activation=activation, padding='same', name='convolution2')(img_stack)
img_stack = MaxPooling2D(pool_size=(2, 2))(img_stack)
img_stack = Flatten()(img_stack)
img_stack = Dropout(0.2)(img_stack)

#Inject the state input
state_input = Input(shape=state_input_shape)
merged = concatenate([img_stack, state_input])

# Add a few dense layers to finish the model
merged = Dense(64, activation=activation, name='dense0')(merged)
merged = Dropout(0.2)(merged)
merged = Dense(10, activation=activation, name='dense2')(merged)
merged = Dropout(0.2)(merged)
merged = Dense(1, name='output')(merged)

adam = Nadam(lr=0.0001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
model = Model(inputs=[pic_input, state_input], outputs=merged)
model.compile(optimizer=adam, loss='mse')

Let's look at a summary of our model:

model.summary()

Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            (None, 59, 255, 3)   0                                            
convolution0 (Conv2D)           (None, 59, 255, 16)  448         input_1[0][0]                    
max_pooling2d_1 (MaxPooling2D)  (None, 29, 127, 16)  0           convolution0[0][0]               
convolution1 (Conv2D)           (None, 29, 127, 32)  4640        max_pooling2d_1[0][0]            
max_pooling2d_2 (MaxPooling2D)  (None, 14, 63, 32)   0           convolution1[0][0]               
convolution2 (Conv2D)           (None, 14, 63, 32)   9248        max_pooling2d_2[0][0]            
max_pooling2d_3 (MaxPooling2D)  (None, 7, 31, 32)    0           convolution2[0][0]               
flatten_1 (Flatten)             (None, 6944)         0           max_pooling2d_3[0][0]            
dropout_1 (Dropout)             (None, 6944)         0           flatten_1[0][0]                  
input_2 (InputLayer)            (None, 4)            0                                            
concatenate_1 (Concatenate)     (None, 6948)         0           dropout_1[0][0]                  
dense0 (Dense)                  (None, 64)           444736      concatenate_1[0][0]              
dropout_2 (Dropout)             (None, 64)           0           dense0[0][0]                     
dense2 (Dense)                  (None, 10)           650         dropout_2[0][0]                  
dropout_3 (Dropout)             (None, 10)           0           dense2[0][0]                     
output (Dense)                  (None, 1)            11          dropout_3[0][0]                  
Total params: 459,733
Trainable params: 459,733
Non-trainable params: 0

That's quite a few parameters! Fortunately, we have our data augmentation strategies, so the network has a chance of converging. Try adding / removing layers or changing their widths to see how that affects the number of trainable parameters in the network.
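
The numbers in the summary can be reproduced by hand, which makes it easier to predict the effect of changing a layer. The sketch below recomputes the shapes and parameter counts from the layer definitions above; the helper names are hypothetical and the formulas assume the standard layout of one bias per filter or unit.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(in_features, units):
    """Weight matrix plus one bias per unit."""
    return (in_features + 1) * units

def max_pool_2x2(height, width):
    """2x2 pooling without padding floors each spatial dimension."""
    return height // 2, width // 2

height, width = 59, 255          # cropped input image
params = {}

params['convolution0'] = conv2d_params(3, 3, 3, 16)
height, width = max_pool_2x2(height, width)
params['convolution1'] = conv2d_params(3, 3, 16, 32)
height, width = max_pool_2x2(height, width)
params['convolution2'] = conv2d_params(3, 3, 32, 32)
height, width = max_pool_2x2(height, width)

flattened = height * width * 32  # output of Flatten
merged = flattened + 4           # plus the 4 state inputs

params['dense0'] = dense_params(merged, 64)
params['dense2'] = dense_params(64, 10)
params['output'] = dense_params(10, 1)

total_params = sum(params.values())
```

Almost all of the parameters sit in dense0, because it is connected to every one of the 6,948 flattened features. Adding another pooling stage or narrowing dense0 therefore shrinks the model far more than changing the convolution layers.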

A nice feature of Keras is the ability to declare callbacks. These are functions that get executed after each epoch of training. We will define several callbacks:

  • ReduceLROnPlateau - if the model is near a minimum and the learning rate is too high, the model will circle around that minimum without ever reaching it. This callback lets us reduce the learning rate when the validation loss stops improving, so that we can reach the optimal point.
  • CSVLogger - this lets us log the output of the model after each epoch, so that we can track the progress without needing the console.
  • ModelCheckpoint - generally, we want to use the model with the lowest loss on the validation set. This callback saves the model each time the validation loss improves.
  • EarlyStopping - we want to stop training when the validation loss stops improving; otherwise, we risk overfitting. This callback detects when the validation loss stops improving and stops the training process when that occurs.

plateau_callback = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3, min_lr=0.0001, verbose=1)
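checkpoint_filepath = os.path.join(MODEL_OUTPUT_DIR, 'models', '{0}_model.{1}-{2}.h5'.format('model', '{epoch:02d}', '{val_loss:.7f}'))
# Make sure the checkpoint directory exists before training starts
checkAndCreateDir(checkpoint_filepath)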
checkpoint_callback = ModelCheckpoint(checkpoint_filepath, save_best_only=True, verbose=1)
csv_callback = CSVLogger(os.path.join(MODEL_OUTPUT_DIR, 'training_log.csv'))
early_stopping_callback = EarlyStopping(monitor='val_loss', patience=10, verbose=1)
callbacks=[plateau_callback, csv_callback, checkpoint_callback, early_stopping_callback, TQDMNotebookCallback()]
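
To get a feel for how the factor, patience and min_lr settings interact, the reduce-on-plateau rule can be simulated in plain Python. This is a simplified model of the behaviour, not the Keras implementation: it ignores cooldown and minimum-delta settings, the exact epoch counting may differ between Keras versions, and the loss values are hypothetical.

```python
def simulate_reduce_on_plateau(val_losses, initial_lr, factor, patience, min_lr):
    """Return the learning rate in effect after each epoch (simplified model)."""
    lr = initial_lr
    best = float('inf')
    wait = 0
    history = []
    for loss in val_losses:
        if loss < best:
            best = loss   # improvement: remember it and reset the counter
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)  # never go below the floor
                wait = 0
        history.append(lr)
    return history

# Hypothetical validation losses that stop improving after the second epoch
plateau = [0.5, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4]

# Same settings as the callback above: the floor equals the starting rate
floor_at_start = simulate_reduce_on_plateau(plateau, 1e-4, 0.5, 3, 1e-4)

# Same settings but with a lower floor
lower_floor = simulate_reduce_on_plateau(plateau, 1e-4, 0.5, 3, 1e-6)
```

Under this simplified model, the settings used above never change the learning rate, because min_lr (0.0001) is equal to the optimizer's initial rate (0.0001). With a lower floor, the rate is halved after every three epochs without improvement. This may be worth keeping in mind when experimenting with the callback parameters.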

It's time to train the model! With the default setup, the model takes about 45 minutes to train on an NVIDIA GTX 970 GPU. Note: sometimes the model gets stuck at a constant validation loss for up to 7 epochs. If left to run, the model should terminate with a validation loss of approximately 0.0003.

history = model.fit_generator(train_generator, steps_per_epoch=num_train_examples//batch_size, epochs=500, callbacks=callbacks,\
                   validation_data=eval_generator, validation_steps=num_eval_examples//batch_size, verbose=2)
Epoch 1/500
Epoch 00001: val_loss improved from inf to 0.02272, saving model to model\models\model_model.01-0.0227211.h5
 - 255s - loss: 0.0227 - val_loss: 0.0227

Epoch 2/500
Epoch 00002: val_loss did not improve
 - 263s - loss: 0.0224 - val_loss: 0.0227

Epoch 3/500
Epoch 00003: val_loss improved from 0.02272 to 0.02272, saving model to model\models\model_model.03-0.0227178.h5
 - 261s - loss: 0.0225 - val_loss: 0.0227

Epoch 4/500
Epoch 00004: val_loss improved from 0.02272 to 0.01308, saving model to model\models\model_model.04-0.0130766.h5
 - 247s - loss: 0.0214 - val_loss: 0.0131

Epoch 5/500
Epoch 00005: val_loss improved from 0.01308 to 0.00285, saving model to model\models\model_model.05-0.0028456.h5
 - 255s - loss: 0.0072 - val_loss: 0.0028

Epoch 6/500
Epoch 00006: val_loss improved from 0.00285 to 0.00107, saving model to model\models\model_model.06-0.0010740.h5
 - 275s - loss: 0.0036 - val_loss: 0.0011

Epoch 7/500
Epoch 00007: val_loss improved from 0.00107 to 0.00070, saving model to model\models\model_model.07-0.0006958.h5
 - 276s - loss: 0.0027 - val_loss: 6.9578e-04

Epoch 8/500
Epoch 00008: val_loss improved from 0.00070 to 0.00051, saving model to model\models\model_model.08-0.0005139.h5
 - 269s - loss: 0.0024 - val_loss: 5.1388e-04

Epoch 9/500
Epoch 00009: val_loss improved from 0.00051 to 0.00047, saving model to model\models\model_model.09-0.0004663.h5
 - 256s - loss: 0.0020 - val_loss: 4.6628e-04

Epoch 10/500
Epoch 00010: val_loss improved from 0.00047 to 0.00032, saving model to model\models\model_model.10-0.0003200.h5
 - 254s - loss: 0.0019 - val_loss: 3.1998e-04

Epoch 11/500
Epoch 00011: val_loss did not improve
 - 260s - loss: 0.0018 - val_loss: 4.4795e-04

Epoch 12/500
Epoch 00012: val_loss improved from 0.00032 to 0.00030, saving model to model\models\model_model.12-0.0003030.h5
 - 249s - loss: 0.0017 - val_loss: 3.0302e-04

Epoch 13/500
Epoch 00013: val_loss improved from 0.00030 to 0.00024, saving model to model\models\model_model.13-0.0002441.h5
 - 245s - loss: 0.0017 - val_loss: 2.4407e-04

Epoch 14/500
Epoch 00014: val_loss did not improve
 - 241s - loss: 0.0017 - val_loss: 2.6870e-04

Epoch 15/500
Epoch 00015: val_loss did not improve
 - 237s - loss: 0.0017 - val_loss: 2.5549e-04

Epoch 16/500
Epoch 00016: val_loss did not improve
 - 237s - loss: 0.0017 - val_loss: 2.9856e-04

Epoch 17/500
Epoch 00017: val_loss did not improve
 - 237s - loss: 0.0017 - val_loss: 2.6887e-04

Epoch 18/500
Epoch 00018: val_loss did not improve
 - 237s - loss: 0.0016 - val_loss: 2.9193e-04

Epoch 19/500
Epoch 00019: val_loss did not improve
 - 238s - loss: 0.0016 - val_loss: 3.0518e-04

Epoch 20/500


KeyboardInterrupt: training was interrupted manually during epoch 20 (traceback through the Keras and TensorFlow internals omitted).
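
The run above was interrupted by hand before the early stopping callback had a chance to fire. A small replay of the patience rule shows how close it was. This is a simplified sketch of the rule, not the Keras implementation; the function name is hypothetical, and the validation losses are copied from epochs 10 to 19 of the log above.

```python
def replay_early_stopping(val_losses_by_epoch, patience):
    """Return (best_epoch, epochs_without_improvement, stop_epoch).

    stop_epoch is None when the patience budget was never used up.
    """
    best_epoch, best_loss, wait = None, float('inf'), 0
    for epoch, loss in val_losses_by_epoch:
        if loss < best_loss:
            best_epoch, best_loss, wait = epoch, loss, 0
        else:
            wait += 1
            if wait >= patience:
                return best_epoch, wait, epoch  # training would stop here
    return best_epoch, wait, None

# Validation losses from the training log (epochs 10 to 19)
logged = [
    (10, 3.1998e-04), (11, 4.4795e-04), (12, 3.0302e-04), (13, 2.4407e-04),
    (14, 2.6870e-04), (15, 2.5549e-04), (16, 2.9856e-04), (17, 2.6887e-04),
    (18, 2.9193e-04), (19, 3.0518e-04),
]

with_patience_10 = replay_early_stopping(logged, patience=10)
with_patience_5 = replay_early_stopping(logged, patience=5)
```

With patience=10, as configured above, the best model is the one saved at epoch 13 and six epochs have passed without improvement, so training would have continued for at most four more epochs unless the validation loss improved again. With a patience of 5 the same log would have ended training at epoch 18.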


Let's do a quick sanity check. We will load a few training images and compare the labels with the predictions. If our model has learned properly, these values should be very close.

[sample_batch_train_data, sample_batch_test_data] = next(train_generator)
predictions = model.predict([sample_batch_train_data[0], sample_batch_train_data[1]])
for i in range(0, 3, 1):
    draw_image_with_label(sample_batch_train_data[0][i], sample_batch_test_data[i], predictions[i])
Actual Steering Angle = [-0.03708]
Predicted Steering Angle = [-0.02686657]
L1 Error: [0.01021343]

Actual Steering Angle = [-0.02100967]
Predicted Steering Angle = [-0.01909251]
L1 Error: [0.00191716]

Actual Steering Angle = [-0.03047433]
Predicted Steering Angle = [-0.03524271]
L1 Error: [0.00476837]
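
To put these errors into perspective, they can be converted into steering degrees. The snippet below is a small worked example using the three pairs printed above; it assumes, as draw_image_with_label does, that a label of 1 corresponds to 0.69 radians of steering.

```python
import math

MAX_STEERING_RADIANS = 0.69  # scale factor used in draw_image_with_label

# (actual, predicted) pairs copied from the output above
samples = [
    (-0.03708, -0.02686657),
    (-0.02100967, -0.01909251),
    (-0.03047433, -0.03524271),
]

l1_errors = [abs(predicted - actual) for actual, predicted in samples]
mean_l1 = sum(l1_errors) / len(l1_errors)

# Convert the mean error from normalized label units to degrees
mean_error_degrees = math.degrees(mean_l1 * MAX_STEERING_RADIANS)
```

The mean error over these three samples is about 0.0056 in label units, which is roughly a quarter of a degree of steering. Three samples are far too few to judge the model, but they are enough to confirm that the predictions are in the right range.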

Looks good! Let's move on to the next notebook, where the model is actually run in AirSim.


Topics: Machine Learning Computer Vision Deep Learning Autonomous vehicles