Open source projects:
Project address: https://github.com/Microsoft/AutonomousDrivingCookbook
Localization project: https://gitee.com/zhoushimin123/autonomous-driving-cookbook
Step 1 - model training
Now that we have a sense of the data we are working with, we can start designing our model. In this notebook, we will define the network architecture and train the model. We will also apply some transformations to the data in response to the observations we made in the data exploration section of the notebook.
Let's start by importing some libraries and defining some paths.
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential, Model
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense, Lambda, Input, concatenate
from keras.layers.normalization import BatchNormalization
from keras.layers.advanced_activations import ELU
from keras.optimizers import Adam, SGD, Adamax, Nadam
from keras.callbacks import ReduceLROnPlateau, ModelCheckpoint, CSVLogger, EarlyStopping
import keras.backend as K
from keras.preprocessing import image
from keras_tqdm import TQDMNotebookCallback
import json
import os
import numpy as np
import pandas as pd
from Generator import DriveDataGenerator
from Cooking import checkAndCreateDir
import h5py
from PIL import Image, ImageDraw
import math
import matplotlib.pyplot as plt

# << The directory containing the dataset cooked in the previous step >>
COOKED_DATA_DIR = '../../AirSim/EndToEndLearningRawData/data_cooked/'

# << The directory in which the trained model output (checkpoints) will be placed >>
MODEL_OUTPUT_DIR = 'model'
Using TensorFlow backend.
(TensorFlow also prints several numpy FutureWarning messages here about deprecated dtype construction; they can be safely ignored.)
Let's read in the datasets we created during the exploration phase. If they do not exist, run the code in the previous notebook (data exploration and preparation) to generate them.
train_dataset = h5py.File(os.path.join(COOKED_DATA_DIR, 'train.h5'), 'r')
eval_dataset = h5py.File(os.path.join(COOKED_DATA_DIR, 'eval.h5'), 'r')
test_dataset = h5py.File(os.path.join(COOKED_DATA_DIR, 'test.h5'), 'r')

num_train_examples = train_dataset['image'].shape[0]
num_eval_examples = eval_dataset['image'].shape[0]
num_test_examples = test_dataset['image'].shape[0]

batch_size = 32
For image data, it is too expensive to load the entire dataset into memory. Fortunately, Keras has the concept of data generators. A data generator is just an iterator that reads data from disk in chunks, which lets us keep both the CPU and the GPU busy and increases throughput.
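To make the idea concrete, here is a minimal sketch of such a generator. This is purely illustrative (it is not the Generator.py used below) and only assumes the same 'image'/'label' layout as our h5 files:

import h5py
import numpy as np

def simple_batch_iterator(h5_path, batch_size=32):
    # Purely illustrative: stream (image, label) batches from disk instead of
    # loading the whole dataset into memory at once.
    dataset = h5py.File(h5_path, 'r')
    num_examples = dataset['image'].shape[0]
    while True:  # Keras generators are expected to loop forever
        for start in range(0, num_examples, batch_size):
            end = min(start + batch_size, num_examples)
            images = np.asarray(dataset['image'][start:end], dtype=np.float32) / 255.0
            labels = np.asarray(dataset['label'][start:end])
            yield images, labels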
We made some observations during the exploration phase. Now, let's come up with a strategy to incorporate them into our training algorithm:
- Only a small part of the image is of interest (the ROI) - when generating batches, we can crop away the parts of each image that are not of interest.
- The dataset tolerates flips around the vertical axis - when generating batches, we can randomly mirror some images around the Y axis (and negate their steering labels) so that the model gets new data points to learn from.
- The model should be invariant to global changes in illumination - when generating batches, we can randomly brighten or darken images so that the model learns to ignore global lighting changes.
- The dataset has a high proportion of zero-steering images - when generating batches, we can randomly drop a percentage of the data points whose steering angle is zero so that the model sees a more balanced dataset during training.
- The dataset needs to include examples of sharp turns for the model to learn how to turn sharply - we dealt with this in the preprocessing phase.
Although Keras has some standard built-in image transformations, they are not enough for our purposes. For example, when using horizontal_flip=True in the standard ImageDataGenerator, the sign of the label is not flipped along with the image. Fortunately, we can extend the ImageDataGenerator class and implement our own transformation logic. The code for this is in Generator.py - it is straightforward, but too long to include in this notebook.
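To see why the label has to change along with the image, here is a hedged sketch of the flip transform (not the actual Generator.py implementation):

import numpy as np

def flip_example(image, steering_angle):
    # Illustrative only: mirror the image around the vertical (Y) axis and
    # negate the steering label, since a left turn becomes a right turn.
    flipped_image = np.fliplr(image)
    flipped_label = -steering_angle
    return flipped_image, flipped_label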
Here, we will initialize the generator with the following parameters (a rough sketch of what they do follows this list):
- Zero_Drop_Percentage: 0.95 - that is, we will randomly discard 95% of the data points whose label is 0.
- Brighten_Range: 0.4 - that is, the brightness of each image will be modified by up to 40%. To compute the "brightness", we convert the image from RGB space to HSV space, scale the "V" coordinate up or down, and convert back to RGB space.
- ROI: [76,135,0,255] - this is the x1, x2, y1, y2 rectangle that defines the region of interest in each image.
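Below is a rough sketch of what these parameters do. The real logic lives in Generator.py; everything here is only an illustrative approximation (it uses matplotlib's RGB/HSV helpers, not necessarily what Generator.py actually uses):

import numpy as np
from matplotlib.colors import rgb_to_hsv, hsv_to_rgb

def crop_to_roi(image, roi=(76, 135, 0, 255)):
    # Keep only the region of interest: rows roi[0]:roi[1], columns roi[2]:roi[3].
    x1, x2, y1, y2 = roi
    return image[x1:x2, y1:y2, :]

def random_brighten(image, brighten_range=0.4):
    # Scale the V channel in HSV space by a random factor, then convert back to RGB.
    # Assumes pixel values have already been rescaled to [0, 1].
    factor = 1.0 + np.random.uniform(-brighten_range, brighten_range)
    hsv = rgb_to_hsv(image)
    hsv[..., 2] = np.clip(hsv[..., 2] * factor, 0.0, 1.0)
    return hsv_to_rgb(hsv)

def keep_example(label, zero_drop_percentage=0.95):
    # Randomly drop most examples whose steering label is exactly zero,
    # so batches are less dominated by straight-line driving.
    if float(label) == 0.0:
        return np.random.uniform() > zero_drop_percentage
    return True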
Thinking Exercise 1.1
Try fiddling with these parameters to see if you can get better results.
data_generator = DriveDataGenerator(rescale=1./255., horizontal_flip=True, brighten_range=0.4)

train_generator = data_generator.flow(
    train_dataset['image'], train_dataset['previous_state'], train_dataset['label'],
    batch_size=batch_size, zero_drop_percentage=0.95, roi=[76, 135, 0, 255])

eval_generator = data_generator.flow(
    eval_dataset['image'], eval_dataset['previous_state'], eval_dataset['label'],
    batch_size=batch_size, zero_drop_percentage=0.95, roi=[76, 135, 0, 255])
Let's take a look at a batch of samples. The steering angle is indicated by the red line in the figure:
def draw_image_with_label(img, label, prediction=None):
    theta = label * 0.69  # Steering range for the car is +/- 40 degrees -> 0.69 radians
    line_length = 50
    line_thickness = 3
    label_line_color = (255, 0, 0)
    prediction_line_color = (0, 0, 255)
    pil_image = image.array_to_img(img, K.image_data_format(), scale=True)
    print('Actual Steering Angle = {0}'.format(label))
    draw_image = pil_image.copy()
    image_draw = ImageDraw.Draw(draw_image)
    first_point = (int(img.shape[1]/2), img.shape[0])
    second_point = (int((img.shape[1]/2) + (line_length * math.sin(theta))),
                    int(img.shape[0] - (line_length * math.cos(theta))))
    image_draw.line([first_point, second_point], fill=label_line_color, width=line_thickness)

    if (prediction is not None):
        print('Predicted Steering Angle = {0}'.format(prediction))
        print('L1 Error: {0}'.format(abs(prediction - label)))
        theta = prediction * 0.69
        second_point = (int((img.shape[1]/2) + (line_length * math.sin(theta))),
                        int(img.shape[0] - (line_length * math.cos(theta))))
        image_draw.line([first_point, second_point], fill=prediction_line_color, width=line_thickness)

    del image_draw
    plt.imshow(draw_image)
    plt.show()

[sample_batch_train_data, sample_batch_test_data] = next(train_generator)
for i in range(0, 3, 1):
    draw_image_with_label(sample_batch_train_data[0][i], sample_batch_test_data[i])
Actual Steering Angle = [0.011892]
Actual Steering Angle = [0.033569]
Actual Steering Angle = [0.00726667]
Next, let's define the network architecture. We will use a standard stack of convolution / max pooling layers to process the images (we can't go into the details of what each layer does here, but you should check out the reading material referenced in the project README if you don't understand what is going on). Then we will inject the last known state of the vehicle into the dense layers as additional features. The layer sizes and optimization parameters were determined experimentally - try adjusting them and see what happens!
image_input_shape = sample_batch_train_data[0].shape[1:]
state_input_shape = sample_batch_train_data[1].shape[1:]
activation = 'relu'

# Create the convolutional stacks
pic_input = Input(shape=image_input_shape)

img_stack = Conv2D(16, (3, 3), name="convolution0", padding='same', activation=activation)(pic_input)
img_stack = MaxPooling2D(pool_size=(2, 2))(img_stack)
img_stack = Conv2D(32, (3, 3), activation=activation, padding='same', name='convolution1')(img_stack)
img_stack = MaxPooling2D(pool_size=(2, 2))(img_stack)
img_stack = Conv2D(32, (3, 3), activation=activation, padding='same', name='convolution2')(img_stack)
img_stack = MaxPooling2D(pool_size=(2, 2))(img_stack)
img_stack = Flatten()(img_stack)
img_stack = Dropout(0.2)(img_stack)

# Inject the state input
state_input = Input(shape=state_input_shape)
merged = concatenate([img_stack, state_input])

# Add a few dense layers to finish the model
merged = Dense(64, activation=activation, name='dense0')(merged)
merged = Dropout(0.2)(merged)
merged = Dense(10, activation=activation, name='dense2')(merged)
merged = Dropout(0.2)(merged)
merged = Dense(1, name='output')(merged)

adam = Nadam(lr=0.0001, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
model = Model(inputs=[pic_input, state_input], outputs=merged)
model.compile(optimizer=adam, loss='mse')
(TensorFlow prints a couple of keep_dims deprecation warnings here; they can be safely ignored.)
Let's take a look at a summary of our model:
model.summary()
__________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
==================================================================================================
input_1 (InputLayer)             (None, 59, 255, 3)    0
__________________________________________________________________________________________________
convolution0 (Conv2D)            (None, 59, 255, 16)   448         input_1[0][0]
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)   (None, 29, 127, 16)   0           convolution0[0][0]
__________________________________________________________________________________________________
convolution1 (Conv2D)            (None, 29, 127, 32)   4640        max_pooling2d_1[0][0]
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D)   (None, 14, 63, 32)    0           convolution1[0][0]
__________________________________________________________________________________________________
convolution2 (Conv2D)            (None, 14, 63, 32)    9248        max_pooling2d_2[0][0]
__________________________________________________________________________________________________
max_pooling2d_3 (MaxPooling2D)   (None, 7, 31, 32)     0           convolution2[0][0]
__________________________________________________________________________________________________
flatten_1 (Flatten)              (None, 6944)          0           max_pooling2d_3[0][0]
__________________________________________________________________________________________________
dropout_1 (Dropout)              (None, 6944)          0           flatten_1[0][0]
__________________________________________________________________________________________________
input_2 (InputLayer)             (None, 4)             0
__________________________________________________________________________________________________
concatenate_1 (Concatenate)      (None, 6948)          0           dropout_1[0][0]
                                                                   input_2[0][0]
__________________________________________________________________________________________________
dense0 (Dense)                   (None, 64)            444736      concatenate_1[0][0]
__________________________________________________________________________________________________
dropout_2 (Dropout)              (None, 64)             0          dense0[0][0]
__________________________________________________________________________________________________
dense2 (Dense)                   (None, 10)             650        dropout_2[0][0]
__________________________________________________________________________________________________
dropout_3 (Dropout)              (None, 10)             0          dense2[0][0]
__________________________________________________________________________________________________
output (Dense)                   (None, 1)              11         dropout_3[0][0]
==================================================================================================
Total params: 459,733
Trainable params: 459,733
Non-trainable params: 0
__________________________________________________________________________________________________
That's a lot of parameters! Fortunately, we have our data augmentation strategy, so the network has a chance of converging. Try adding/removing layers or changing their widths to see how that affects the number of trainable parameters in the network.
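As a quick sanity check, the parameter counts in the summary follow directly from the layer shapes; for example:
- convolution0: (3 × 3 × 3 + 1) × 16 = 448 parameters (a 3×3 kernel over 3 input channels plus one bias, for each of the 16 filters).
- dense0: (6948 + 1) × 64 = 444,736 parameters (6,944 flattened image features plus the 4 vehicle-state features, plus one bias, for each of the 64 units) - this single layer holds the overwhelming majority of the network's weights.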
A nice feature of Keras is the ability to declare callbacks. These are functions that get executed as training progresses (typically at the end of each epoch). We will define several callbacks:
- ReduceLROnPlateau - if the model is near a minimum and the learning rate is too high, the model will circle around that minimum without ever reaching it. This callback lets us reduce the learning rate when the validation loss stops improving, allowing us to reach the optimal point.
- CSVLogger - this lets us log the model's metrics after each epoch, so we can track progress without needing to use the console.
- ModelCheckpoint - typically, we want to use the model that achieves the lowest loss on the validation set. This callback saves the model every time the validation loss improves.
- EarlyStopping - we want to stop training when the validation loss stops improving; otherwise we risk overfitting. This callback detects when the validation loss stops improving and halts the training process when that happens.
plateau_callback = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3, min_lr=0.0001, verbose=1)

# Checkpoint filenames encode the epoch number and the validation loss, e.g. model_model.13-0.0002441.h5
checkpoint_filepath = os.path.join(MODEL_OUTPUT_DIR, 'models', '{0}_model.{1}-{2}.h5'.format('model', '{epoch:02d}', '{val_loss:.7f}'))
checkAndCreateDir(checkpoint_filepath)
checkpoint_callback = ModelCheckpoint(checkpoint_filepath, save_best_only=True, verbose=1)
csv_callback = CSVLogger(os.path.join(MODEL_OUTPUT_DIR, 'training_log.csv'))
early_stopping_callback = EarlyStopping(monitor='val_loss', patience=10, verbose=1)

callbacks = [plateau_callback, csv_callback, checkpoint_callback, early_stopping_callback, TQDMNotebookCallback()]
It's time to train the model! By default, training takes about 45 minutes on an NVIDIA GTX 970 GPU. Note: sometimes the model gets stuck at a constant validation loss for up to 7 epochs. If left to keep running, it should finish with a validation loss of approximately 0.0003.
history = model.fit_generator(train_generator, steps_per_epoch=num_train_examples//batch_size, epochs=500,
                              callbacks=callbacks, validation_data=eval_generator,
                              validation_steps=num_eval_examples//batch_size, verbose=2)
Epoch 1/500
Epoch 00001: val_loss improved from inf to 0.02272, saving model to model\models\model_model.01-0.0227211.h5
 - 255s - loss: 0.0227 - val_loss: 0.0227
Epoch 2/500
Epoch 00002: val_loss did not improve
 - 263s - loss: 0.0224 - val_loss: 0.0227
Epoch 3/500
Epoch 00003: val_loss improved from 0.02272 to 0.02272, saving model to model\models\model_model.03-0.0227178.h5
 - 261s - loss: 0.0225 - val_loss: 0.0227
Epoch 4/500
Epoch 00004: val_loss improved from 0.02272 to 0.01308, saving model to model\models\model_model.04-0.0130766.h5
 - 247s - loss: 0.0214 - val_loss: 0.0131
Epoch 5/500
Epoch 00005: val_loss improved from 0.01308 to 0.00285, saving model to model\models\model_model.05-0.0028456.h5
 - 255s - loss: 0.0072 - val_loss: 0.0028
Epoch 6/500
Epoch 00006: val_loss improved from 0.00285 to 0.00107, saving model to model\models\model_model.06-0.0010740.h5
 - 275s - loss: 0.0036 - val_loss: 0.0011
Epoch 7/500
Epoch 00007: val_loss improved from 0.00107 to 0.00070, saving model to model\models\model_model.07-0.0006958.h5
 - 276s - loss: 0.0027 - val_loss: 6.9578e-04
Epoch 8/500
Epoch 00008: val_loss improved from 0.00070 to 0.00051, saving model to model\models\model_model.08-0.0005139.h5
 - 269s - loss: 0.0024 - val_loss: 5.1388e-04
Epoch 9/500
Epoch 00009: val_loss improved from 0.00051 to 0.00047, saving model to model\models\model_model.09-0.0004663.h5
 - 256s - loss: 0.0020 - val_loss: 4.6628e-04
Epoch 10/500
Epoch 00010: val_loss improved from 0.00047 to 0.00032, saving model to model\models\model_model.10-0.0003200.h5
 - 254s - loss: 0.0019 - val_loss: 3.1998e-04
Epoch 11/500
Epoch 00011: val_loss did not improve
 - 260s - loss: 0.0018 - val_loss: 4.4795e-04
Epoch 12/500
Epoch 00012: val_loss improved from 0.00032 to 0.00030, saving model to model\models\model_model.12-0.0003030.h5
 - 249s - loss: 0.0017 - val_loss: 3.0302e-04
Epoch 13/500
Epoch 00013: val_loss improved from 0.00030 to 0.00024, saving model to model\models\model_model.13-0.0002441.h5
 - 245s - loss: 0.0017 - val_loss: 2.4407e-04
Epoch 14/500
Epoch 00014: val_loss did not improve
 - 241s - loss: 0.0017 - val_loss: 2.6870e-04
Epoch 15/500
Epoch 00015: val_loss did not improve
 - 237s - loss: 0.0017 - val_loss: 2.5549e-04
Epoch 16/500
Epoch 00016: val_loss did not improve
 - 237s - loss: 0.0017 - val_loss: 2.9856e-04
Epoch 17/500
Epoch 00017: val_loss did not improve
 - 237s - loss: 0.0017 - val_loss: 2.6887e-04
Epoch 18/500
Epoch 00018: val_loss did not improve
 - 237s - loss: 0.0016 - val_loss: 2.9193e-04
Epoch 19/500
Epoch 00019: val_loss did not improve
 - 238s - loss: 0.0016 - val_loss: 3.0518e-04
Epoch 20/500
(Training was interrupted manually here with a KeyboardInterrupt.)
Let's do a quick sanity check. We will load a few training images and compare the labels against the predictions. If the model has learned properly, these values should be very close to each other.
[sample_batch_train_data, sample_batch_test_data] = next(train_generator)
predictions = model.predict([sample_batch_train_data[0], sample_batch_train_data[1]])
for i in range(0, 3, 1):
    draw_image_with_label(sample_batch_train_data[0][i], sample_batch_test_data[i], predictions[i])
Actual Steering Angle = [-0.03708]
Predicted Steering Angle = [-0.02686657]
L1 Error: [0.01021343]

Actual Steering Angle = [-0.02100967]
Predicted Steering Angle = [-0.01909251]
L1 Error: [0.00191716]

Actual Steering Angle = [-0.03047433]
Predicted Steering Angle = [-0.03524271]
L1 Error: [0.00476837]
Looks good! Let's move on to the next notebook, where we will use this model to drive the car in AirSim.
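Before moving on, note that the checkpoints saved by ModelCheckpoint can be reloaded later with Keras's load_model. A minimal sketch is shown below; the filename is just the best checkpoint from the training run above, so substitute whichever file your own run produced:

from keras.models import load_model

# Load the checkpoint with the lowest validation loss from the run above
# (the exact filename depends on your own training run).
best_model = load_model('model/models/model_model.13-0.0002441.h5')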