A case study of highway passenger and freight volume prediction based on a BP neural network

Posted by igebert on Tue, 28 Dec 2021 10:44:04 +0100

A programmer's view of love: love is an infinite loop; once it starts executing, you fall into it and cannot get out. Falling in love with someone is a memory leak: you can never release it. Truly loving someone is a constant: it will never change. A girlfriend is a private variable that only my class can call. A lover is a pointer: you must handle it carefully, or it will bring great disaster.

A neural network is the foundation of deep learning. As I understand it, a neural network is a structure that can fit any kind of generalized linear model. The BP algorithm is the algorithm for solving the weight parameters w, and it is the basic weight-learning algorithm for neural networks. The signal is first propagated forward (FP) to compute the loss, then the error is propagated backward (BP); the weights of each layer are adjusted according to the error, and the process is iterated.
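Written compactly (my notation, chosen to match the code later in this post), one FP/BP iteration for a single-hidden-layer network is

O_1 = \sigma(W_1 x + b_1), \quad O_2 = W_2 O_1 + b_2, \quad E = \tfrac{1}{2}(O_2 - y)^2, \quad W \leftarrow W - \eta\,\frac{\partial E}{\partial W}

where \sigma is the sigmoid activation and \eta is the learning rate.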


Output layer error: E = 1/2 * (d - O)^2, where O is the predicted output and d is the true value; the coefficient 1/2 is included only to make the derivative cleaner.

Like any feed-forward neural network, a BP neural network first performs FP, i.e. forward propagation. In this case only one hidden layer is used, so there are two parameter layers: W1, b1 and W2, b2. For each weight matrix W, the number of rows equals the number of neurons in the layer it feeds (its output), and the number of columns equals the number of neurons feeding into it (its input).
Hidden-layer output: O1 = sigmoid(a1) = sigmoid(W1·x.T + b1); the hidden layer uses the sigmoid activation function.
Network output: O2 = a2 = W2·O1 + b2; the last layer uses no activation function.
The loss function is cost = 1/2*(O2-y)^2, i.e. half the square of (predicted value minus actual value), while the error term we usually work with is err = y - O2. err could just as well be defined as O2 - y, so the chosen sign of err determines whether a minus sign has to be added later on. Because the partial derivative of cost with respect to O2 is (O2 - y), the relationship between err and (O2 - y) needs special attention.
Here we define err = y - O2, so dcost/dO2 = -err.
Taking the partial derivative of the loss function with respect to each Wi gives the gradient; the derivation uses the chain rule.
The gradient equals the layer's error term multiplied by the (transposed) output of the previous layer. For the last layer, dcost/dW2 = -(y - O2)·O1.T = -err·O1.T. Because the last layer applies no activation function to a2, the derivative of O2 with respect to a2 is 1 and simply drops out.

Taking the partial derivative of the loss function with respect to each pre-activation ai gives that layer's error vector; the derivation again uses the chain rule.
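For reference, here are the backward-pass quantities collected in one place (my write-up; the symbols mirror the variables in the code below):

\delta_2 = \partial E/\partial a_2 = O_2 - y = -err
\delta_1 = \partial E/\partial a_1 = (W_2^T \delta_2) \odot O_1 \odot (1 - O_1)
\partial E/\partial W_2 = \delta_2 O_1^T, \quad \partial E/\partial W_1 = \delta_1 x^T, \quad \partial E/\partial b_2 = \delta_2, \quad \partial E/\partial b_1 = \delta_1

These are the per-sample forms (x is the input column vector and \odot is element-wise multiplication); in the batch code below the W gradients become matrix products over all samples and the b gradients are summed over the samples.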


import numpy as np
import  pandas as pd
import matplotlib as mpl
import  matplotlib.pyplot as  plt
from sklearn.preprocessing import  MinMaxScaler

df=pd.read_csv('traffic_data.csv',encoding='GBK')
print(df.head())  #df.head is a method; call it to preview the first rows

x=df[['Population','Number of motor vehicles','Highway area']]
y=df[['Passenger volume','the volume of freight transport']]

print(x)
print(y)

#Min-max normalize the data to the range [-1, 1]
x_scaler=MinMaxScaler(feature_range=(-1,1))
y_scaler=MinMaxScaler(feature_range=(-1,1))

x=x_scaler.fit_transform(x)
y=y_scaler.fit_transform(y)

#Transpose the sample and perform matrix operation
sample_in=x.T
sample_out=y.T
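#After the transpose, sample_in has shape (input features, samples) and sample_out has shape (output targets, samples), so each column is one sample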

#BP neural network parameters
max_epochs=60000 #Number of loop iterations
learn_rate=0.035  #Learning rate
mse_final=6.5e-4  #Mean-square-error threshold; stop iterating once the error falls below it
sample_number=x.shape[0]  #Number of samples
input_number=x.shape[1]  #Input feature number
output_number=y.shape[1]  #Number of output targets
hidden_units=8 #Number of hidden layer neurons
print(sample_number,input_number,output_number)


#Define the sigmoid activation function and its derivative
def sigmoid(z):
    return 1/(1+np.exp(-z))

def sigmoid_delta(z): #derivative of sigmoid with respect to the pre-activation z
    return np.exp(-z)/((1+np.exp(-z))**2)

print(sigmoid(0),sigmoid_delta(0))
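#Note (my addition): for o = sigmoid(z), the same derivative can be written as o*(1-o);
#the training loop below uses that form, because during BP only the activated output is at hand.
o=sigmoid(0.3)
print(np.isclose(sigmoid_delta(0.3),o*(1-o)))  #should print True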

#A hidden layer
#W1 matrix: M rows and N columns, M is equal to the number of neurons in this layer, and N is equal to the number of input features
W1=0.5*np.random.rand(hidden_units,input_number)-0.1
b1=0.5*np.random.rand(hidden_units,1)-0.1

W2=0.5*np.random.rand(output_number,hidden_units)-0.1
b2=0.5*np.random.rand(output_number,1)-0.1
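#The weights start as small random values in [-0.1, 0.4), which breaks the symmetry between hidden units;
#because the start is random, every training run can converge to a different result (see the note at the end)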

mse_history=[]  #Empty list to store iteration errors
#Training loop; note that the output layer uses no activation function
for i in range(max_epochs):
    #FP
    hidden_out=sigmoid(np.dot(W1,sample_in)+b1)  #np.dot is matrix multiplication; hidden_out has shape (hidden_units, sample_number)
    network_out=np.dot(W2,hidden_out)+b2  #W2 has shape (output_number, hidden_units), so network_out has shape (output_number, sample_number)
    #error
    err=sample_out-network_out
    mse_err=np.average(np.square(err)) #Mean square error
    mse_history.append(mse_err)
    if mse_err<mse_final:
        break
    #BP
    #Error vector
    delta2=-err #Error vector of the output layer: dcost/da2 = O2-y = -err
    delta1=np.dot(W2.transpose(),delta2)*hidden_out*(1-hidden_out)  #Error vector of the hidden layer: back-propagate delta2 through W2, then multiply element-wise by the sigmoid derivative; since hidden_out=sigmoid(a1), that derivative is hidden_out*(1-hidden_out)
    #Gradient: partial derivative of loss function
    delta_W2=np.dot(delta2,hidden_out.transpose())
    delta_W1=np.dot(delta1,sample_in.transpose())
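    #Multiplying by a column of ones sums the per-sample error vectors, giving the bias gradients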
    delta_b2=np.dot(delta2,np.ones((sample_number,1)))
    delta_b1=np.dot(delta1,np.ones((sample_number,1)))
    W2-=learn_rate*delta_W2
    b2-=learn_rate*delta_b2
    W1-=learn_rate*delta_W1
    b1-=learn_rate*delta_b1
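#Optional (my addition): report how the run ended
print('stopped after',len(mse_history),'iterations, final MSE:',mse_history[-1])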


#Loss value drawing
print(mse_history)
loss=np.log10(mse_history)
min_mse=min(loss)
plt.plot(loss,label='loss')
plt.plot([0,len(loss)],[min_mse,min_mse],label='min_mse')
plt.xlabel('iteration')
plt.ylabel('MSE loss')
plt.title('Log10 MSE History',fontdict={'fontsize':18,'color':'red'})
plt.legend()
plt.show()


#Comparison between model predicted output and actual output
hidden_out=sigmoid(np.dot(W1,sample_in)+b1)
network_out=np.dot(W2,hidden_out)+b2

#Inverse-transform the outputs back to the original scale:
network_out=y_scaler.inverse_transform(network_out.T)
sample_out=y_scaler.inverse_transform(y)
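#Optional (my addition): quantify the fit on the original scale with the mean absolute error per target
mae=np.mean(np.abs(network_out-sample_out),axis=0)
print('MAE (passenger volume, freight volume):',mae)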

#Make Chinese characters and minus signs display correctly in matplotlib figures
plt.rcParams['font.sans-serif'] = [u'SimHei']
plt.rcParams['axes.unicode_minus'] = False

plt.figure(figsize=(8, 6))
plt.plot(network_out[:,0],label='pred')
plt.plot(sample_out[:,0],'r.',label='actual')
plt.title('Passenger volume ',)
plt.legend()
plt.show()



plt.figure(figsize=(8, 6))
plt.plot(network_out[:,1],label='pred')
plt.plot(sample_out[:,1],'r.',label='actual')
plt.title('the volume of freight transport ')
plt.legend()
plt.show()

The simulation results are shown below.
This is the result of one training run:


It can be seen that the fit is not very good; the predicted values deviate considerably from the actual values.
A run with better model results is shown below.

The traffic_data.csv data is as follows:

Question:
1. It is clear from this that the initial parameter values matter a great deal. A practical workaround is to train several times and keep the run with the best result, as sketched below.
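A minimal sketch of that idea, assuming the training loop above is wrapped into a hypothetical helper train_once(seed) that returns (W1, b1, W2, b2, final_mse); that wrapper is not part of the code above:

def train_many(n_runs=10):
    best=None
    for seed in range(n_runs):
        np.random.seed(seed)      #different random initial weights for each run
        result=train_once(seed)   #hypothetical wrapper around the training loop above
        if best is None or result[-1]<best[-1]:
            best=result           #keep the run with the lowest final MSE
    return best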
Information:
Neural network, understanding and derivation of the BP algorithm: https://zhuanlan.zhihu.com/p/45190898
BP algorithm of neural networks, a hands-on prediction case: https://www.bilibili.com/video/BV1a7411J7SR?t=2522

Topics: Python AI neural networks Deep Learning