Eye State Recognition with Deep Learning and Drawing a Confusion Matrix

Posted by gabrielkolbe on Mon, 03 Jan 2022 22:35:26 +0100

This experiment uses a CNN that I built myself to classify eye state. The original plan was to use transfer learning with the VGG16 network, but the results were very poor and training was very slow, which is probably my own mistake. The self-built CNN, by contrast, is accurate and fast. This article focuses on drawing the confusion matrix, a topic not covered in the earlier posts.

1. Import Library

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import os,pathlib,PIL
from tensorflow import keras
from tensorflow.keras import layers,models,Sequential

2. Data Loading

The file path where the data is located

data_dir = "E:/tmp/.keras/datasets/Eye_photos"
data_dir = pathlib.Path(data_dir)
img_count = len(list(data_dir.glob('*/*.jpg')))#Total number of pictures

Setup of Hyperparameters

height = 224
width = 224
epochs = 10
batch_size = 64

Build an ImageDataGenerator. In previous experiments I applied data augmentation at this step, such as horizontal flipping and rotation by an angle. It is not done in this experiment: the eye states to be recognized are left, right, forward and closed, so a horizontal flip would turn a left look into a right look while keeping the old label. Instead of augmenting the data, this would introduce label noise, which is not worth it.
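The label problem with flipping can be seen on a toy example. The sketch below is not part of the original experiment: it uses a hypothetical 3x5 array in which a single bright pixel stands in for the pupil, and shows that a left-right flip moves it to the opposite side, so the original label would no longer fit the image.

```python
import numpy as np

# Hypothetical toy "eye" image: the single 1 stands in for the pupil position.
left_look = np.array([
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
])

# Left-right flip, the operation a horizontal-flip augmentation applies.
flipped = np.fliplr(left_look)

pupil_col_before = int(np.argmax(left_look[1]))
pupil_col_after = int(np.argmax(flipped[1]))
print(pupil_col_before, pupil_col_after)  # 1 3
```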

train_data_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    validation_split=0.2)#Divide into training and test sets at a ratio of 8:2

Split the data into training and test sets

train_ds = train_data_gen.flow_from_directory(
    directory=data_dir,
    target_size=(height,width),
    batch_size=batch_size,
    shuffle=True,
    class_mode='categorical',
    subset='training'
)
test_ds = train_data_gen.flow_from_directory(
    directory=data_dir,
    target_size=(height,width),
    batch_size=batch_size,
    shuffle=True,
    class_mode='categorical',
    subset='validation'
)
Found 3448 images belonging to 4 classes.
Found 859 images belonging to 4 classes.

View labels

all_images_paths = [path for path in data_dir.glob('*') if path.is_dir()]# One subdirectory per class
# Use the directory names as labels; sorting matches the class order used by flow_from_directory
all_label_names = sorted(path.name for path in all_images_paths)
['close_look', 'forward_look', 'left_look', 'right_look']
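The generators returned by flow_from_directory also expose a class_indices dictionary that maps each class name to its integer index. As an alternative sketch (not the approach used in this post), the label list can be derived from that mapping. The dictionary below is a hand-written stand-in, assumed to match what train_ds.class_indices would contain for the four classes listed above, and names_in_index_order is a made-up helper name.

```python
# Hand-written stand-in for train_ds.class_indices (assumed values for illustration).
class_indices = {
    "close_look": 0,
    "forward_look": 1,
    "left_look": 2,
    "right_look": 3,
}

def names_in_index_order(mapping):
    """Return class names ordered by their integer index."""
    return [name for name, _ in sorted(mapping.items(), key=lambda item: item[1])]

label_names = names_in_index_order(class_indices)
print(label_names)
```

Reading the order from the generator avoids any mismatch between the label list and the one-hot columns the model is trained on.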

3. CNN Network Setup

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16,3,padding="same",activation="relu",input_shape=(height,width,3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32,3,padding="same",activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64,3,padding="same",activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1024,activation="relu"),
    tf.keras.layers.Dense(512,activation="relu"),
    tf.keras.layers.Dense(4,activation="softmax")
])
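To see where the parameters of this network sit, the layer sizes can be worked out by hand. This is a back-of-the-envelope sketch, assuming the default MaxPooling2D pool size of 2; the helper names are made up for illustration, and model.summary() can be used to check the numbers against the real model.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weights plus biases of a Conv2D layer with a square kernel."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(in_units, out_units):
    """Weights plus biases of a Dense layer."""
    return in_units * out_units + out_units

size, channels, total = 224, 3, 0
for filters in (16, 32, 64):
    total += conv_params(3, channels, filters)  # padding="same" keeps the spatial size
    channels = filters
    size //= 2  # each MaxPooling2D() halves height and width

flat = size * size * channels  # length of the vector produced by Flatten()
units = flat
for out_units in (1024, 512, 4):
    total += dense_params(units, out_units)
    units = out_units

print(size, flat, total)  # 28 50176 51931684
```

Almost all of the roughly 52 million parameters belong to the first Dense layer, which connects the 50176 flattened values to 1024 units.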

Optimizer settings. For the underlying principles, see the earlier blog post on license plate recognition.

initial_learning_rate = 1e-4
lr_sch = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=initial_learning_rate,
    decay_rate=0.96,
    decay_steps=20,
    staircase=True
)
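With staircase=True, the schedule multiplies the learning rate by decay_rate once every decay_steps optimizer steps. The plain-Python sketch below reproduces that formula to show how far the rate falls during training. It assumes 3448 training images (from the output above), a batch size of 64 and 10 epochs; staircase_decay is a made-up helper, not a Keras function.

```python
import math

def staircase_decay(step, initial_lr=1e-4, decay_rate=0.96, decay_steps=20):
    """Learning rate after `step` optimizer updates with staircase=True."""
    return initial_lr * decay_rate ** (step // decay_steps)

steps_per_epoch = math.ceil(3448 / 64)  # 54 batches per epoch
total_steps = steps_per_epoch * 10      # epochs = 10

print(staircase_decay(0))            # initial rate
print(staircase_decay(19))           # still the initial rate
print(staircase_decay(20))           # first decay
print(staircase_decay(total_steps))  # rate at the end of training
```

Under these assumptions the rate ends at roughly a third of its initial value.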

How the loss value is calculated is described in the previous blog post.

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=lr_sch),
    loss=tf.keras.losses.CategoricalCrossentropy(),
    metrics=['accuracy']
)
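As a quick reminder of what the categorical cross-entropy measures, here is a minimal single-sample sketch in plain Python. The two prediction vectors are hypothetical softmax outputs, and the eps clipping is my own guard against log(0); Keras additionally averages the per-sample values over the batch.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy of one sample: -sum(y_true * log(y_pred))."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

y_true = [0, 0, 1, 0]                 # one-hot label for class index 2
confident = [0.05, 0.05, 0.85, 0.05]  # hypothetical output, mostly correct
unsure = [0.25, 0.25, 0.25, 0.25]     # hypothetical output, uniform guess

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident, loss_unsure)
```

The loss is small when the probability assigned to the true class is close to 1 and grows as that probability shrinks.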

history = model.fit(
    train_ds,
    validation_data=test_ds,
    epochs=epochs
)

The results are as follows:

With epochs=20, the model reaches an accuracy of about 93%, which is quite good. (Note that the hyperparameter settings above use epochs = 10.)

Save the model:

model.save("E:/tmp/.keras/datasets/model.h5")

Load Model

new_model = tf.keras.models.load_model("E:/tmp/.keras/datasets/model.h5")

Use the model to predict images:

plt.figure(figsize=(10,5))
plt.suptitle("Prediction Results")
for images,labels in test_ds:
    for i in range(8):
        ax = plt.subplot(2,4,i+1)
        plt.imshow(images[i])
        img_array = tf.expand_dims(images[i],0)# Add a batch dimension
        pre = new_model.predict(img_array)
        plt.title(all_label_names[np.argmax(pre)])
        plt.axis("off")
    break
plt.show()

4. Confusion Matrix

A confusion matrix, also known as an error matrix, is a standard format for accuracy evaluation, expressed as a matrix with n rows and n columns, where n is the number of classes. In image classification it is mainly used to compare the classification results with the actual values. Each cell counts the samples of one actual class that were assigned to one predicted class, so correct predictions lie on the diagonal and misclassifications lie off it.
The most familiar confusion matrix is the two-class confusion matrix:

TP = True Positive; FP = False Positive

FN = False Negative; TN = True Negative
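The usual summary metrics follow directly from these four counts. A minimal sketch with hypothetical numbers (the counts are chosen only for illustration and are not results from this experiment):

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from the four cells of a 2x2 confusion matrix."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall

# Hypothetical counts, chosen only for illustration.
accuracy, precision, recall = binary_metrics(tp=40, fp=10, fn=5, tn=45)
print(accuracy, precision, recall)
```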

As for the multiclass confusion matrix, it is similar to the two-class confusion matrix. Let's draw a confusion matrix for eye state recognition.
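Before drawing it, here is a small sketch of how a multiclass matrix is counted, written in plain Python with hypothetical labels. It follows the same convention as sklearn's confusion_matrix: rows are the true classes and columns are the predicted classes. count_confusion is a made-up helper name.

```python
def count_confusion(true_labels, pred_labels, class_names):
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(class_names)}
    matrix = [[0] * len(class_names) for _ in class_names]
    for true, pred in zip(true_labels, pred_labels):
        matrix[index[true]][index[pred]] += 1
    return matrix

class_names = ["close_look", "forward_look", "left_look", "right_look"]
# Hypothetical labels, chosen only for illustration.
true_labels = ["close_look", "left_look", "left_look", "right_look", "forward_look"]
pred_labels = ["close_look", "left_look", "right_look", "right_look", "forward_look"]

matrix = count_confusion(true_labels, pred_labels, class_names)
for row in matrix:
    print(row)
```

The single off-diagonal entry, in the left_look row and right_look column, is the one left look that was predicted as a right look.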

The main tool for drawing the confusion matrix is sns.heatmap, a function in the seaborn package with the following signature:

seaborn.heatmap(data, vmin=None, vmax=None, cmap=None, center=None, robust=False, annot=None, fmt='.2g', annot_kws=None, linewidths=0, linecolor='white', cbar=True, cbar_kws=None, cbar_ax=None, square=False, xticklabels='auto', yticklabels='auto', mask=None, ax=None, **kwargs)

All parameters except the first one, data, have default values and can be omitted. If data is a plain two-dimensional NumPy array, the rows and columns are labeled 0, 1, 2, and so on; if it is a DataFrame, its index and column names are used as the labels.

Libraries needed

from sklearn.metrics import confusion_matrix
import seaborn as sns
import pandas as pd

Define a function to draw the confusion matrix

# Draw the confusion matrix
def plot_cm(labels,pre):
    # Count true vs. predicted classes; labels= fixes the class order and keeps the matrix 4x4
    conf_numpy = confusion_matrix(labels,pre,labels=all_label_names)
    # Wrap the counts in a DataFrame so the heatmap axes show the class names
    conf_df = pd.DataFrame(conf_numpy,index=all_label_names,columns=all_label_names)
    plt.figure(figsize=(8,7))

    sns.heatmap(conf_df,annot=True,fmt="d",cmap="BuPu")# Draw the counts as a heatmap
    plt.title('Confusion Matrix',fontsize = 15)
    plt.ylabel('True Value',fontsize = 14)
    plt.xlabel('Predicted Value',fontsize = 14)
    plt.show()

Get predicted and actual values

test_pre = []
test_label = []
for images,labels in test_ds:
    for image,label in zip(images,labels):
        img_array = tf.expand_dims(image,0)# Add a batch dimension
        pre = new_model.predict(img_array)# Predict the class probabilities
        test_pre.append(all_label_names[np.argmax(pre)])# Store the predicted class name
        test_label.append(all_label_names[np.argmax(label)])# Store the true class name
    break# Because of hardware limits, only one batch (64 images) is used here
plot_cm(test_label,test_pre)# Draw the confusion matrix
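Calling predict once per image is slow. A possible speed-up, sketched here under assumptions rather than taken from the experiment above, is to pass a whole batch to new_model.predict(images) and decode all rows at once with argmax along axis 1. The arrays below are hypothetical stand-ins for the model output and the one-hot labels of a batch, and decode_batch is a made-up helper name.

```python
import numpy as np

def decode_batch(probabilities, one_hot_labels, class_names):
    """Turn a batch of softmax outputs and one-hot labels into class-name lists."""
    pred_idx = np.argmax(probabilities, axis=1)
    true_idx = np.argmax(one_hot_labels, axis=1)
    predicted = [class_names[i] for i in pred_idx]
    actual = [class_names[i] for i in true_idx]
    return actual, predicted

class_names = ["close_look", "forward_look", "left_look", "right_look"]
# Hypothetical stand-ins for new_model.predict(images) and labels from one batch.
probabilities = np.array([[0.7, 0.1, 0.1, 0.1],
                          [0.1, 0.2, 0.3, 0.4],
                          [0.2, 0.5, 0.2, 0.1]])
one_hot_labels = np.array([[1, 0, 0, 0],
                           [0, 0, 1, 0],
                           [0, 1, 0, 0]])

actual, predicted = decode_batch(probabilities, one_hot_labels, class_names)
print(actual)
print(predicted)
```

The two lists can be passed to plot_cm in the same way as test_label and test_pre. To cover the whole test set, one could repeat this for len(test_ds) batches instead of stopping after the first, assuming len(test_ds) returns the number of batches as it does for Keras directory iterators.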


Keep working hard!

Topics: Python Machine Learning AI Deep Learning