Stock Price Forecast with TensorFlow (Multilayer Perceptron: MLP) ~ Stock Forecast Part 2 ~

Introduction

I am starting to study stock investment, so I am keeping notes as I go.

This is a continuation of the previous article, Stock Price Forecast with TensorFlow (LSTM) ~ Stock Forecast Part 1 ~.

Last time, we forecast the stock price itself.

This time, we will use a Multilayer Perceptron (MLP) to classify whether the stock price will rise or fall, and use that as the forecast.

[Supplement]

In addition to stocks, we operate siva, an AI-based horse racing prediction. Quinella predictive value: approximately 86%. Recovery rate: approximately 136%.

I have also started a Twitter account. Please follow me.

Simple specifications

In the previous article, an LSTM predicted the next day's stock price from the closing prices of the preceding 10 days.

This time, we feed in the closing prices of the preceding 100 days and classify whether the next day's stock price will rise or fall (a binary classification).
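
As a rough illustration of this specification, the sketch below turns a series of closing prices into fixed-length windows and up/down labels. The function name `make_windows` and the toy prices are made up for this example; the labeling convention (0 when the next close is lower than the last close in the window, 1 otherwise) follows the script later in this article.

```python
import numpy as np

def make_windows(close, n_prev=100):
    """Build (window, label) pairs from a 1-D sequence of closing prices.

    Label 0: the next day's close is lower than the last close in the window.
    Label 1: the next day's close is equal or higher.
    """
    close = np.asarray(close, dtype=float)
    X, y = [], []
    for i in range(len(close) - n_prev):
        window = close[i:i + n_prev]      # n_prev consecutive closes
        next_close = close[i + n_prev]    # the day right after the window
        X.append(window)
        y.append(0 if window[-1] > next_close else 1)
    return np.array(X), np.array(y)

# Toy series (made-up numbers) with a 3-day window instead of 100.
prices = [10.0, 11.0, 12.0, 11.5, 13.0, 12.0]
X, y = make_windows(prices, n_prev=3)
print(X.shape)  # (3, 3)
print(y)        # [0 1 0]
```

With 100-day windows, a series of N closing prices yields N - 100 samples, so the first 100 days of each split never appear as prediction targets.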

First, a stock prediction experiment

The data comes from the same source as in the previous article (here).

The downloaded data covers the Nikkei 225 from 2007 to 2017. Each row contains the date and the open, high, low, and close prices.

Approach

Since this is a binary classification of whether the next day's stock price goes up or down, based on the closing prices of the past 100 days, the activation function is the logistic function (sigmoid) and the loss function is the cross-entropy error.
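
To make these two functions concrete, here is a minimal sketch in plain Python. The function names and the clipping constant `eps` are choices made for this example, not something taken from Keras; Keras computes the same quantities internally when `activation='sigmoid'` and `loss='binary_crossentropy'` are used.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # avoids overflow for very negative z
    return e / (1.0 + e)

def binary_cross_entropy(y_true, p, eps=1e-12):
    """Cross-entropy error for one sample with label y_true in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

p_up = sigmoid(0.0)
print(p_up)                           # 0.5
print(binary_cross_entropy(1, p_up))  # log(2), about 0.693
# A confident correct prediction is penalized less than a confident wrong one.
print(binary_cross_entropy(1, 0.9) < binary_cross_entropy(1, 0.1))  # True
```

An output of exactly 0.5 corresponds to a loss of log(2), so a training loss that stays near 0.693 suggests the network is not doing better than a coin flip.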

About Multilayer Perceptron

Multi-Layer Perceptron (MLP) is a feedforward neural network in which neurons are arranged in multiple layers.
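
The forward pass of such a network can be sketched with NumPy alone. The layer widths below (100 inputs, then 64, 128, 64, and 1 unit) mirror the model built later in this article; the random weights and random inputs are placeholders for illustration only, and `mlp_forward` is a name invented for this sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, layers):
    """Propagate a batch x of shape (batch, features) through the layers.

    layers is a list of (W, b) pairs; each layer computes sigmoid(a @ W + b).
    """
    a = x
    for W, b in layers:
        a = sigmoid(a @ W + b)
    return a

rng = np.random.default_rng(0)
sizes = [100, 64, 128, 64, 1]  # input width followed by the layer widths
layers = [(rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

x = rng.normal(size=(5, 100))  # 5 random "windows" of 100 values (not real prices)
out = mlp_forward(x, layers)
print(out.shape)  # (5, 1)
```

Each row of `out` is a single number between 0 and 1, which the classifier reads as the probability of the "up" class.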

Programmatic experiment

As with last time, I will prioritize trying things out first and quickly write a program using the Nikkei average. It outputs the accuracy (rate of correct answers) of the classification.

`stock_mlp.py`


# -*- coding: utf-8 -*-
import numpy
import pandas
from sklearn import preprocessing

from keras.models import Sequential
from keras.layers import Dense


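#
# Generate model
# (the class keeps the name StockCNN, but the network it builds is an MLP)
#
class StockCNN:
  def __init__(self):
    # Number of past closing prices fed to the network
    self.length_of_sequences = 100

  def load_data(self, date, data, n_prev=100):
    # Slide a window of n_prev closing prices over the series.
    # Label 0: the next day's close is lower than the last close in the window.
    # Label 1: the next day's close is equal or higher.
    # (.values is used because as_matrix() was removed in pandas 1.0)
    label = []
    X, Y = [], []
    close = data['close'].values
    for i in range(len(data) - n_prev):
      label.append(date.iloc[i+n_prev].values)
      X.append(close[i:(i+n_prev)])
      if close[i+n_prev-1] > close[i+n_prev]:
        Y.append([0])
      else:
        Y.append([1])

    ret_label = numpy.array(label)
    retX = numpy.array(X)
    retY = numpy.array(Y)
    return ret_label, retX, retY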

  def create_model(self):
    # Three sigmoid hidden layers (64, 128, 64 units) and one sigmoid
    # output unit that gives the probability of label 1 (price does not fall).
    model = Sequential()
    model.add(Dense(64, input_dim=self.length_of_sequences, activation='sigmoid'))
    model.add(Dense(128, activation='sigmoid'))
    model.add(Dense(64, activation='sigmoid'))
    model.add(Dense(1, activation='sigmoid'))

    model.compile(loss='binary_crossentropy',
                  optimizer='rmsprop',
                  metrics=['accuracy'])

    return model

if __name__ == "__main__":

  stock = StockCNN()
  # Read one CSV per year (2007-2017) and concatenate them
  data = None
  for year in range(2007, 2018):
    data_ = pandas.read_csv('csv/indices_I101_1d_' + str(year) + '.csv', encoding="shift-jis")
    data = data_ if (data is None) else pandas.concat([data, data_])
  data.columns = ['date', 'open', 'high', 'low', 'close']
  data['date'] = pandas.to_datetime(data['date'], format='%Y-%m-%d')
  # Standardize the closing price (zero mean, unit variance).
  # Note: the statistics are computed over the whole period, test part included.
  data['close'] = preprocessing.scale(data['close'])
  data = data.sort_values(by='date')
  data = data.reset_index(drop=True)
  data = data.loc[:, ['date', 'close']]

  # Data preparation: first 90% for training, last 10% for testing
  split_pos = int(len(data) * 0.9)
  x_label, x_train, y_train = stock.load_data(data[['date']].iloc[0:split_pos],
                                              data[['close']].iloc[0:split_pos], stock.length_of_sequences)
  x_tlabel, x_test, y_test = stock.load_data(data[['date']].iloc[split_pos:],
                                             data[['close']].iloc[split_pos:], stock.length_of_sequences)

  model = stock.create_model()

  
  # Train (epochs replaces the nb_epoch argument of Keras 1)
  model.fit(x_train, y_train, epochs=1000, batch_size=10)



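  # Evaluate: count test samples whose thresholded prediction matches the label
  good = 0
  for index, values in enumerate(x_test):
    y = y_test[index][0]
    predict = model.predict(numpy.array([values]))[0][0]
    print(x_tlabel[index][0])
    print(y)
    print(predict)
    # predict is the sigmoid output: below 0.5 means class 0 (down), otherwise class 1 (up)
    predicted_class = 0 if predict < 0.5 else 1
    if predicted_class == y:
      good += 1
  print("accuracy = {0:.2f}".format(float(good) / len(x_test)))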
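
The per-sample evaluation loop in the script can also be expressed in vectorized form. The sketch below assumes the predicted probabilities are already available as an array (for instance from a single `model.predict(x_test)` call); the helper name `accuracy_from_probs` and the sample numbers are invented for illustration. It uses the same rule as the script: a probability below 0.5 means class 0, otherwise class 1.

```python
import numpy as np

def accuracy_from_probs(probs, y_true, threshold=0.5):
    """Share of samples whose thresholded probability matches the label."""
    preds = (np.asarray(probs).ravel() >= threshold).astype(int)
    return float((preds == np.asarray(y_true).ravel()).mean())

# Made-up probabilities and labels, shaped like Keras output (n, 1).
probs = np.array([[0.2], [0.7], [0.5], [0.4]])
labels = np.array([[0], [1], [1], [1]])
print(accuracy_from_probs(probs, labels))  # 0.75
```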

Result

Since there are only two classes, UP and DOWN, I ran the experiment several times to check the results.

Run	Accuracy
1st	57%
2nd	59%
3rd	53%

Hmm... That is better than random guessing, but I'm not sure it really counts as predictive... I will keep at it.
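
One rough way to judge whether accuracies like these differ from guessing is to compare them with the spread expected from chance alone. The sketch below uses the normal approximation to the binomial distribution. Two things are assumptions here: the baseline of 0.5 (a fairer baseline would be the share of the majority class in the test set) and the test-set size of 170, which is a hypothetical figure, not one reported in this article.

```python
import math

def accuracy_z_score(accuracy, n_samples, baseline=0.5):
    """z-score of an observed accuracy against a baseline rate,
    using the normal approximation to the binomial distribution."""
    std_err = math.sqrt(baseline * (1.0 - baseline) / n_samples)
    return (accuracy - baseline) / std_err

# n_samples=170 is a hypothetical test-set size, used only for illustration.
for acc in (0.57, 0.59, 0.53):
    print(acc, round(accuracy_z_score(acc, n_samples=170), 2))
```

A z-score of about 2 or more would be unusual under pure guessing, while values near 1 or below are well within what chance produces, so the number of test samples matters as much as the accuracy itself.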

Conclusion

For now, I am prioritizing trying things out and will work toward the best approach from there. Next time, I will try statistical calculation, CNN, and reinforcement learning.
