前言
本文主要是采用lstm对汇率涨跌进行预测,是一个二分类的预测问题。
步骤解析
数据构造
原始数据是单变量数据
import pandas as pd
file_path = r"./huilv.csv"
data = pd.read_csv(file_path, usecols=[1],encoding='gbk')
data['level'] = -1
美元 level
0 7.2996 -1
1 7.2775 -1
2 7.2779 -1
3 7.2695 -1
4 7.2791 -1
... ... ...
3655 6.9475 -1
3656 6.9131 -1
3657 6.8926 -1
3658 6.8912 -1
3659 6.8265 -1
3660 rows × 2 columns
构造成两列数据,后一列为涨跌情况
for i in range(len(data)-1):
if data.iloc[i+1,0] - data.iloc[i,0] > 0:
data.iloc[i+1,1] = 1
else :
data.iloc[i+1,1] = 0
美元 level
0 7.2996 -1
1 7.2775 0
2 7.2779 1
3 7.2695 0
4 7.2791 1
... ... ...
3655 6.9475 0
3656 6.9131 0
3657 6.8926 0
3658 6.8912 0
3659 6.8265 0
3660 rows × 2 columns
构造训练和测试集
def scaler(X_train,X_test):
"""
数据归一化
"""
mm = MinMaxScaler()
mm.fit(X_train)
mmX_train = mm.transform(X_train)
mmX_test = mm.transform(X_test)
return mmX_train,mmX_test
def create_dataset(dataset, dataset_label, look_back=1):
dataX, dataY = [], []
for i in range(len(dataset_label)-look_back-1):
a = dataset[i:(i + look_back), :]
dataX.append(a)
b = dataset_label[i+look_back, 0]
dataY.append(b)
print('a: %s, b: %s' % (a, b))
return numpy.array(dataX), numpy.array(dataY)
构造模型
# create and fit the LSTM network
model = Sequential()
model.add(LSTM(64, input_shape=(look_back,1))) # 与上面的重构格式对应,要改都改,才能跑通代码
model.add(Dense(1))
model.compile(loss='binary_crossentropy', optimizer='adam',metrics=['acc'])
history = model.fit(trainX, trainY, epochs=100, batch_size=1000, validation_split=0.2)
损失函数图像
预测
# 模型预测并计算结果
from sklearn.metrics import confusion_matrix
y_pred = model.predict(testX)
# 计算混淆矩阵
cm = confusion_matrix(testY, y_pred.argmax(axis=1))
print('Confusion Matrix')
print(cm)
# 评价指标
print("Precision: ", (cm[0][0] + cm[1][1])/ (cm[0][0]+cm[1][0]+cm[0][1]+cm[1][1]))
print("Accuracy: ", (cm[0][0] + cm[0][1])/ (cm[0][0]+cm[1][0]+cm[0][1]+cm[1][1]))
print("Recall", (cm[0][0] )/ (cm[0][0]+cm[1][0]))
23/23 [==============================] - 0s 1ms/step
Confusion Matrix
[[384 0]
[344 0]]
Precision: 0.5274725274725275
Accuracy: 0.5274725274725275
Recall 0.5274725274725275
总结
- lstm 二分类预测,效果看起来一般,算是第一次接触分类问题
- 结果有更进一步的优化,本案例只做跑通整个流程
备注
需要完整源代码和数据集,或者想要沟通交流,请私聊,谢谢.