Tensorflow similarity study notes 5

Posted by RedRasper on Fri, 24 Dec 2021 00:10:23 +0100


2021SC@SDUSC
Learning content: summary and study of TensorFlow concepts

Feature extraction

Recall what the previous steps revealed about the data:
The images are not all the same size;
There are 62 labels or target values (the labels run from 0 to 61);
The distribution of the traffic sign values is quite unequal; the signs that appear most often in the dataset are not actually related to each other.
Now process the data so that it is ready to feed into a neural network, or whatever model you want to use. Start by extracting some features: rescale the images, and convert the images stored in the images array to grayscale. You make this color conversion mainly because color matters little in classification problems like the one you are trying to answer now. For detection, however, color does play a big part, so in those cases the conversion is not necessary.
Rescale the images to deal with the differing image sizes. This is easy with the scikit-image (skimage) library, a collection of algorithms for image processing.
In this case, the transform module comes in handy because it offers a resize() function; you use a list comprehension (again!) to resize each image to 28 x 28 pixels. Once more, note how the list is actually formed: for each image found in the images array, the conversion operation borrowed from the skimage library is performed. Finally, store the result in the images28 variable:

# Import the `transform` module from `skimage` 
from skimage import transform 
# Rescale the images in the `images` array 
images28 = [transform.resize(image, (28, 28)) for image in images]

Note that the data is now four-dimensional: if you convert images28 to an array and inspect its shape attribute, the printout tells you that the dimensions of images28 are (4575, 28, 28, 3), that is, 4575 images of 28 x 28 pixels with 3 color channels. After the grayscale conversion below, each 28 x 28 image flattens to 784 values.
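A quick way to verify this, as a minimal check assuming the images28 variable from the code above:

# Convert the list of images to an array and inspect its dimensions
import numpy as np
print(np.array(images28).shape)   # expected: (4575, 28, 28, 3)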
You can reuse the plotting code from earlier in this series to draw 4 random images with the help of the traffic_signs variable and check the result of the rescaling operation; just change all references to images into images28.

Because of the rescaling, the minimum and maximum values have also changed; they now all lie in the same range, so there is no need to normalize the data separately.
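You can convince yourself of this with a small check (again assuming images28 from above); skimage's resize() returns floating-point images, so the values fall in the [0, 1] range:

# Check the value range after rescaling
print(np.array(images28).min(), np.array(images28).max())   # roughly 0.0 and 1.0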
When you are trying to answer a classification question, the colors in the pictures matter less. That is why you also go through the trouble of converting the images to grayscale.
Just as with the rescaling, you can rely on the scikit-image library; in this case, the color module with its rgb2gray() function gets you where you need to be.
Don't forget to convert the images28 variable back to an array first, because the rgb2gray() function does expect an array as its argument.

# Import `rgb2gray` from `skimage.color` 
from skimage.color import rgb2gray 
# Import `numpy`, needed for the array conversion below
import numpy as np
# Convert `images28` to an array 
images28 = np.array(images28) 
# Convert `images28` to grayscale 
images28 = rgb2gray(images28)

Double-check the result of the grayscale conversion by plotting a few images; you can reuse, with a slight adjustment, the code from before for displaying the rescaled images.
You must specify a color map, or cmap, and set it to "gray" to plot the images in grayscale. This is because imshow() uses a heatmap-like color map by default.
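A minimal sketch of that check; the four indices in traffic_signs are arbitrary picks for illustration, and images28 is assumed from the code above:

# Import `matplotlib` for plotting
import matplotlib.pyplot as plt

# Four arbitrary image indices to inspect
traffic_signs = [300, 2250, 3650, 4000]

for i in range(len(traffic_signs)):
    plt.subplot(1, 4, i + 1)
    plt.axis('off')
    # `cmap="gray"` is required; imshow() defaults to a heatmap-like color map
    plt.imshow(images28[traffic_signs[i]], cmap="gray")
    plt.subplots_adjust(wspace=0.5)

plt.show()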

These two steps are very basic ones; other operations you could try on the data include data augmentation (rotation, blurring, shifting, brightness changes, ...). You can also set up an entire pipeline of data operations through which the images are sent; a small sketch of two such augmentations follows below.
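As a minimal sketch, assuming the grayscale images28 array from above, here is what two such augmentations look like with scikit-image; the angle and sigma values are arbitrary choices for illustration:

# Import augmentation helpers from `skimage`
from skimage.transform import rotate
from skimage.filters import gaussian

# Rotate an image by 15 degrees (the corners are filled with zeros by default)
rotated = rotate(images28[0], angle=15)

# Blur an image with a Gaussian filter; a larger `sigma` means more blur
blurred = gaussian(images28[0], sigma=1)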

TensorFlow for deep learning

Neural network modeling
Start by importing tensorflow under its conventional alias tf, and initialize a graph with the help of Graph(). You use this graph to define your computations. Note that with a Graph you do not compute anything, and it does not hold any values; it just defines the operations that you want to run later.
In this case, as_default() returns a context manager that makes this particular Graph the default graph. You use this method if you want to create multiple graphs in the same process: operations created inside the context are added to this graph. If you do not explicitly create a new graph, there is a global default graph to which all operations are added.
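A minimal sketch of this pattern (TensorFlow 1.x API, matching the code later in this section):

import tensorflow as tf

graph = tf.Graph()           # an empty graph: no values, only a definition
with graph.as_default():     # make this graph the default inside this block
    c = tf.constant(3.0)     # operations created here are added to `graph`

# Nothing has been computed yet; `c` merely lives in the graph
assert c.graph is graph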
Now build up the model. You need to define a loss function, an optimizer, and a metric; with TensorFlow, each of these is an explicit step:
First, define placeholders for the inputs and labels, because you are not putting in the "real" data yet. Placeholders are values that are unassigned and that will be initialized by the session when you run it. So when you finally run the session, these placeholders will receive the values of the dataset that you pass in the run() function!
Then, build up the network. First flatten the input with the help of the flatten() function, which gives you an array of shape [None, 784] instead of [None, 28, 28], the shape of the grayscale images.
After flattening the input, construct a fully connected layer that generates logits of size [None, 62]. Logits are the unscaled outputs of the previous layer, on a relative, linear scale.
With the multi-layer perceptron built, you can define the loss function. Which loss function you choose depends on the task; in this case, use:

sparse_softmax_cross_entropy_with_logits()

This computes the sparse softmax cross entropy between the logits and the labels. In other words, it measures the probability error in discrete classification tasks in which the classes are mutually exclusive: each entry belongs to exactly one class, and here a traffic sign can only carry a single label. Remember that while regression is used to predict continuous values, classification is used to predict discrete values or classes of data points. Wrap this function with reduce_mean(), which computes the mean of the elements across the dimensions of a tensor.
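As a small illustration of what this loss measures, here is a hypothetical 3-class example (not part of the original tutorial): softmax turns the unscaled logits into probabilities, and the cross entropy is the negative log-probability assigned to the true class.

import numpy as np

# Hypothetical logits for one sample, and its true class index
logits_ex = np.array([2.0, 1.0, 0.1])
label_ex = 0

# softmax: exponentiate, then normalize to probabilities
probs = np.exp(logits_ex) / np.exp(logits_ex).sum()

# cross entropy: negative log-probability of the true class
print(-np.log(probs[label_ex]))   # roughly 0.417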
You also need to define a training optimizer; some of the most popular optimization algorithms are stochastic gradient descent (SGD), ADAM and RMSprop. Depending on the algorithm you choose, you need to tune certain parameters, such as the learning rate or momentum. In this case, choose the ADAM optimizer and set the learning rate to 0.001.

# Import `tensorflow` 
import tensorflow as tf 

# Initialize placeholders 
x = tf.placeholder(dtype = tf.float32, shape = [None, 28, 28])
y = tf.placeholder(dtype = tf.int32, shape = [None])

# Flatten the input data
images_flat = tf.contrib.layers.flatten(x)

# Fully connected layer 
logits = tf.contrib.layers.fully_connected(images_flat, 62, tf.nn.relu)

# Define a loss function
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels = y, 
                                                                    logits = logits))
# Define an optimizer 
train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)

# Convert logits to label indexes
correct_pred = tf.argmax(logits, 1)

# Define an accuracy metric: compare the predicted label indexes with the
# true labels, then take the mean of the resulting 0/1 values
accuracy = tf.reduce_mean(tf.cast(tf.equal(correct_pred, tf.cast(y, tf.int64)), tf.float32))

Running the neural network

Now that the model is built layer by layer, it is time to actually run it. First initialize a session with the help of Session(), passing it the graph you defined in the previous section. Next, run the session with run(), passing it the variable initialization operation tf.global_variables_initializer().
With the session initialized, you can start the epochs, or training loops. Loop 201 times so that the loss value of the final epoch, epoch 200, is still logged. In the loop, run the session with the training optimizer and the loss metric as fetches, and pass in a feed_dict argument to feed the data to the model. Every 10 epochs, a log line is printed that gives you more insight into the loss, or cost, of the model.
You do not need to close the session right away, because the evaluation below reuses it. But when you are done, or want to experiment with different settings, remember that if you defined the session as sess, you can release its resources with sess.close(), as shown at the very end of this section.

tf.set_random_seed(1234)
sess = tf.Session()

# Initialize all variables defined in the graph
sess.run(tf.global_variables_initializer())

for i in range(201):
    print('EPOCH', i)
    # Run the optimizer step and fetch the loss value for this epoch
    _, loss_value = sess.run([train_op, loss], feed_dict={x: images28, y: labels})
    if i % 10 == 0:
        print("Loss: ", loss_value)
    print('DONE WITH EPOCH')

Evaluating the neural network

You still need to evaluate your neural network. Here you can already get a first idea of the model's performance by picking 10 random images and comparing the predicted labels with the real labels.
You can simply print them, or you can use matplotlib to plot the traffic signs and compare them visually:

# Import `matplotlib`
import matplotlib.pyplot as plt
import random

# Pick 10 random images
sample_indexes = random.sample(range(len(images28)), 10)
sample_images = [images28[i] for i in sample_indexes]
sample_labels = [labels[i] for i in sample_indexes]

# Run the "correct_pred" operation
predicted = sess.run([correct_pred], feed_dict={x: sample_images})[0]
                        
# Print the real and predicted labels
print(sample_labels)
print(predicted)

# Display the predictions and the ground truth visually.
fig = plt.figure(figsize=(10, 10))
for i in range(len(sample_images)):
    truth = sample_labels[i]
    prediction = predicted[i]
    plt.subplot(5, 2, 1 + i)
    plt.axis('off')
    color = 'green' if truth == prediction else 'red'
    plt.text(40, 10, "Truth:        {0}\nPrediction: {1}".format(truth, prediction), 
             fontsize=12, color=color)
    plt.imshow(sample_images[i],  cmap="gray")

plt.show()
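To round things off, you can compute an overall accuracy over the full set you trained on; note that this is only a rough sanity check assuming the variables from above, not a proper evaluation on held-out test data. This is also the place to close the session, as mentioned earlier:

# Predict labels for the whole set and count the matches
predicted_all = sess.run(correct_pred, feed_dict={x: images28})
match_count = sum(int(p == t) for p, t in zip(predicted_all, labels))
print("Accuracy: {:.3f}".format(match_count / len(labels)))

# Release the session's resources now that you are done
sess.close()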

Topics: Python Machine Learning TensorFlow