# LSTM time series prediction

Posted by caspert_ghost on Fri, 14 Feb 2020 09:43:26 +0100

# 1. Problem description

Using any RNN (simple RNN, GRU, LSTM), according to 1000 time series data provided, 1001-1500 data are predicted.
Data set provided:
A3 ﹐ train.txt, including 1000 lines, each line is a float type value. The contents of the txt file are as follows:

# 2. Data preprocessing

## ① Read the file, and convert 1000 pieces of data in the file to an array with a shape of (1000,)

```# Read the file and construct 1000 pieces of data into an array with the shape of (1000,)
import os
import numpy as np
from matplotlib import pyplot as plt
data_dir = r"C:\Users\NOTEBOOK\DeepLearningNote\train_data.txt"
f = open(data_dir)
f.close()

samples = [float(i) for i in content.split("\n")]
samples = np.array(samples)
plt.plot(range(len(samples)),samples)
```

Show 1000 pieces of data

## ② Define a function to generate train? Data and train? Labels

This function can generate (900100) - like array, as train_data, as train_labels

```#Define the split ﹣ sequence function. The return value X contains the values before n ﹣ steps. y is the value of the next time
def split_sequence(sequence, n_steps):
X, y = list(), list()
for i in range(len(sequence)):
end_ix = i + n_steps
if end_ix > len(sequence)-1:
break
seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
X.append(seq_x)
y.append(seq_y)
return np.array(X), np.array(y)
```

## ③ All the data and labels are obtained by the split ﹣ sequence function, and the train ﹣ data is transformed into a form that can be processed by LSTM

```data,labels = split_sequence(samples,100)
data = data.reshape((len(data),100,1))
```

# 3. Model building

The model consists of three LSTM layers and one Dense layer

```#Modeling
from keras.models import Sequential
from keras.layers import GRU,LSTM,Dropout,Dense
model = Sequential()
model.summary()
```

# 4. Training model

Take 20% of 1000 data as validation data

```history = model.fit(data,labels,epochs=20,batch_size=20,validation_split=0.2)
```

Since the initial weights are random, the best training process is as follows

Print the loss and val Ou loss of its training process

```loss = history.history['loss']
val_loss = history.history['val_loss']
plt.plot(range(20),loss,"bo",label="loss")
plt.plot(range(20),val_loss,"b",label='val_loss')
plt.legend()
plt.show()
```

# 5. forecast

## ① Define prediction function

This function is used to get the 500 prediction results of 1001-1500
This function first obtains the 1001st prediction result by using the 100 data of 9011000, then stores the 1001st prediction result in the results, and adds the 100 data of 9021001 into the original data. The next round of prediction will take this as the input to obtain the 1002nd prediction result, which will go down successively until 500 times of operation, that is to say, the 1500th prediction result

```#Define methods to predict results from 1001 to 1500
def predict1001_1500(samples):
results = []
for i in range(500):

#         print(samples[-1].shape)
result = model.predict(samples[-1].reshape(1,100,1))

results.append(result)

sample = np.append(samples[-1][1:],result)

sample = sample.reshape((1,100,1))
#         print(sample.shape)
samples = np.concatenate((samples,sample),axis=0)
return results
```

## ② The 500 prediction results of 1001-1500 are obtained

```results = predict1001_1500(data)
```

Print out the 1500 prediction results

```plt.plot(range(1000),samples)
plt.plot(range(1000,1500),np.array(results).reshape(500,))
```

Print 500 forecast results separately

```plt.plot(range(500),np.array(results).reshape(500,))
```

Published 10 original articles, won praise 4, visited 2222