TensorFlow realizes Denoising Autoencoder

Posted by kcpaige89 on Sun, 16 Jan 2022 04:13:03 +0100

Denoising autoencoder (DAE)

Before introducing the denoising autoencoder (DAE), first introduce an example of the use scene of DAE. When we take photos at night or in other dark environments, our photos are always filled with a lot of noise, which seriously affects the image quality, and the purpose of DAE is to remove the noise in these images. In order to better explain the DAE, we use a simple MNIST data set for demonstration to focus on the knowledge of DAE. As shown in the figure below, three sets of MNIST numbers are displayed. The top row of each group is original images; The middle line shows the input of DAE (noisy images). These inputs are the original images damaged by noise. When there is too much noise, it will be difficult for us to understand the damaged numbers; The last line shows the output of the DAE (Denoised Images).

Tips: if you don't know much about the self encoder, you can refer to it Detailed explanation and implementation of self encoder model (implemented by tensorflow2.x).

Next, let's actually build a DAE to eliminate noise in the image.

DAE model architecture

According to the introduction of DAE, the input can be defined as:
x = x o r i g + n o i s e x = x_{orig} + noise x=xorig+noise
among x o r i g x_{orig} xorig ， indicates noise n o i s e noise noise destroys the original MNIST image, and the purpose of the encoder is to learn the latent vector z z z. The loss function of DAE is expressed as:
L ( x o r i g , x ~ ) = M S E = 1 m ∑ i = 1 i = m ( x o r i g i − x ~ i ) 2 \mathcal L(x_{orig}, \tilde x)=MSE=\frac 1 m \sum_{i=1} ^{i=m}(x_{orig_i}-\tilde x_i)^2 L(xorig,x~)=MSE=m1i=1∑i=m(xorigi−x~i)2

Among them, m m m is the dimension of the output, for example, in the MNIST dataset, m = w i d t h × h e i g h t × c h a n n e l s = 28 × 28 × 1 = 784 m=width × height×channels=28 × 28 × 1 = 784 m=width×height×channels=28×28×1=784. x o r i g i x_{orig_i} xorigi and x i x_i xi ， respectively x o r i g x_{orig} xorig # and x ~ \tilde x Elements in x ~.

DAE implementation

Data preprocessing

In order to realize DAE, we first need to construct a training data set. The input data is the MNIST number with noise, and the training output data is the original clean MNIST number. The added noise needs to meet the Gaussian distribution and mean value μ = 0.5 μ = 0.5 μ= 0.5, standard deviation σ = 0.5 σ = 0.5 σ= 0.5. Since adding random noise may produce invalid pixel values less than 0 or greater than 1, the pixel values need to be trimmed to the range of [0.0, 1.0].

import numpy as np
import tensorflow as tf
from tensorflow import keras
from matplotlib import pyplot as plt
from PIL import Image

# Data loading
(x_train,_),(x_test,_) = keras.datasets.mnist.load_data()

# Data preprocessing
image_size = x_train.shape[1]
x_train = np.reshape(x_train,[-1,image_size,image_size,1])
x_test = np.reshape(x_test,[-1,image_size,image_size,1])
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.

# Generate Gaussian noise
noise = np.random.normal(loc=0.5,scale=0.5,size=x_train.shape)
x_train_noisy = x_train + noise
noise = np.random.normal(loc=0.5,scale=0.5,size=x_test.shape)
x_test_noisy = x_test + noise

# Crop pixel values to within [0.0, 1.0]
x_train_noisy = np.clip(x_train_noisy,0.0,1.0)
x_test_noisy = np.clip(x_test_noisy,0.0,1.0)

Model construction and model training

# Super parameter
input_shape = (image_size,image_size,1)
batch_size = 32
kernel_size = 3
latent_dim = 16
layer_filters = [32,64]

"""
Model
"""
#encoder
inputs = keras.layers.Input(shape=input_shape,name='encoder_input')
x = inputs
for filters in layer_filters:
    x = keras.layers.Conv2D(filters=filters,
            kernel_size=kernel_size,
            strides=2,
            activation='relu',
            padding='same')(x)
shape = keras.backend.int_shape(x)

x = keras.layers.Flatten()(x)
latent = keras.layers.Dense(latent_dim,name='latent_vector')(x)
encoder = keras.Model(inputs,latent,name='encoder')
encoder.summary()

# decoder
latent_inputs = keras.layers.Input(shape=(latent_dim,),name='decoder_input')
x = keras.layers.Dense(shape[1]*shape[2]*shape[3])(latent_inputs)
x = keras.layers.Reshape((shape[1],shape[2],shape[3]))(x)
for filters in layer_filters[::-1]:
    x = keras.layers.Conv2DTranspose(filters=filters,
            kernel_size=kernel_size,
            strides=2,
            padding='same',
            activation='relu')(x)
outputs = keras.layers.Conv2DTranspose(filters=1,
        kernel_size=kernel_size,
        padding='same',
        activation='sigmoid',
        name='decoder_output')(x)
decoder = keras.Model(latent_inputs,outputs,name='decoder')
decoder.summary

autoencoder = keras.Model(inputs,decoder(encoder(inputs)),name='autoencoder')
autoencoder.summary()

# Model compilation and training
autoencoder.compile(loss='mse',optimizer='adam')
autoencoder.fit(x_train_noisy,
        x_train,validation_data=(x_test_noisy,x_test),
        epochs=10,
        batch_size=batch_size)

# Model test
x_decoded = autoencoder.predict(x_test_noisy)

rows,cols = 3,9
num = rows * cols
imgs = np.concatenate([x_test[:num],x_test_noisy[:num],x_decoded[:num]])
imgs = imgs.reshape((rows * 3, cols, image_size, image_size))
imgs = np.vstack(np.split(imgs,rows,axis=1))
imgs = imgs.reshape((rows * 3,-1,image_size,image_size))
imgs = np.vstack([np.hstack(i) for i in imgs])
imgs = (imgs * 255).astype(np.uint8)
plt.figure()
plt.axis('off')
plt.imshow(imgs,interpolation='none',cmap='gray')
plt.show()

Effect display

As shown in the figure above, when the noise level changes from σ = 0.5 σ=0.5 σ= 0.5 to σ = 0.75 σ=0.75 σ= 0.75 and σ = 1.0 σ=1.0 σ= At 1.0, DAE has certain robustness and can better restore the original image. However, in σ = 1.0 σ=1.0 σ= At 1.0, some numbers were not restored correctly.

Topics: TensorFlow Computer Vision Deep Learning

Programmer Think