Stock Price Forecast with TensorFlow (LSTM) ~ Stock Forecast Part 1 ~

Introduction

I am starting to study stock investing, so I will keep notes on it here.

Goal: predict stock prices using machine learning and deep learning.

Before starting to study, I first checked the following books.

It seems that the conditions for a stock to rise are "good earnings", "low PER", and "a good chart shape". I would like to collect each of these by scraping and use them to make predictions.

[Supplement]

Besides stocks, I also run the horse racing forecast service siva. Quinella hit rate: approximately 86%. Recovery rate: approximately 136%.

I have started using Twitter. Please follow me.

Why stocks?

I chose stocks for the following reasons.

◆ Gambling such as horse racing
The outcome is all-or-nothing: the return can be large, but so is the risk.
◆ FX
For everyone who makes money there is someone who loses it, which does not suit my temperament.
◆ Bitcoin
Its value has not been established yet, so there is a possibility of a crash.
◆ Stocks
With stocks, it is possible for everyone to profit.

First, a stock prediction experiment

Before collecting data by scraping, let's start by experimenting with data that can simply be downloaded from a site.

The downloaded data is Nikkei 225 data for 2007-2017. It contains the date and the open, high, low, and close prices.

This time, only the closing price is used.

Think about the approach

For stocks, predicting with an RNN (Recurrent Neural Network) (*2), which works on the time series directly, is considered better than statistically analyzing past performance (*1). So let's try LSTM (Long Short-Term Memory), an extension of the RNN.

*1 The method of learning from past results on the chart will be tried in a later installment.
*2 An RNN is a type of deep learning model. Unlike an ordinary neural network, it takes its own previous state as input in addition to the current input value. The input and output of an RNN are shown in the figure below. Given temporally continuous data x = (x_1, x_2, x_3, …, x_n) as input, the previous state s_{t-1} is fed in together with x_t. This "previous state" carries the past information up to that point.

Schematic diagram of RNN (source)
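
The recurrence described in note 2 can be sketched in a few lines of NumPy. This is a plain RNN, not the LSTM used later, and the function name `rnn_forward`, the weight names `W_x`, `W_s`, `b`, and the tanh activation are assumptions chosen for illustration rather than anything taken from the article.

```python
import numpy as np

def rnn_forward(xs, W_x, W_s, b):
    """Run a plain RNN over a sequence and return the state at every step.

    xs:  inputs x_1 ... x_n, shape (n_steps, input_dim)
    W_x: input-to-state weights, shape (state_dim, input_dim)
    W_s: state-to-state weights, shape (state_dim, state_dim)
    b:   bias, shape (state_dim,)
    """
    s = np.zeros(W_s.shape[0])  # s_0: no past information yet
    states = []
    for x_t in xs:
        # The new state depends on the current input AND the previous state.
        s = np.tanh(W_x @ x_t + W_s @ s + b)
        states.append(s)
    return np.array(states)

rng = np.random.default_rng(0)
xs = rng.normal(size=(5, 1))         # five time steps of a 1-D series
W_x = rng.normal(size=(3, 1)) * 0.5  # random, untrained weights
W_s = rng.normal(size=(3, 3)) * 0.5
b = np.zeros(3)

states = rnn_forward(xs, W_x, W_s, b)
print(states.shape)  # (5, 3)
```

Because each state is computed from the one before it, the last state is a function of every earlier input, which is what "contains past information up to that point" means in practice.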

Anyway, let's experiment with a program

I prioritized just trying it out, so I quickly wrote a program using the Nikkei average.

# -*- coding: utf-8 -*-
import numpy
import pandas
import matplotlib.pyplot as plt

from sklearn import preprocessing
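from keras.models import Sequential
# keras.layers.core and keras.layers.recurrent no longer exist in current
# Keras releases; the layers can be imported from keras.layers directly.
from keras.layers import Dense, Activation, LSTM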

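class Prediction:

  def __init__(self):
    self.length_of_sequences = 10  # number of past time steps fed to the LSTM
    self.in_out_neurons = 1        # one feature in (close), one value out
    self.hidden_neurons = 300      # number of LSTM units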


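  def load_data(self, data, n_prev=10):
    # Build (window of n_prev values, next value) pairs from the series
    X, Y = [], []
    for i in range(len(data) - n_prev):
      # .as_matrix() was removed in pandas 1.0; .values returns the same array
      X.append(data.iloc[i:(i+n_prev)].values)
      Y.append(data.iloc[i+n_prev].values)
    retX = numpy.array(X)
    retY = numpy.array(Y)
    return retX, retY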


  def create_model(self):
    model = Sequential()
    # input_shape omits the batch dimension: (time steps, features)
    model.add(LSTM(self.hidden_neurons,
              input_shape=(self.length_of_sequences, self.in_out_neurons),
              return_sequences=False))
    model.add(Dense(self.in_out_neurons))
    model.add(Activation("linear"))
    model.compile(loss="mape", optimizer="adam")
    return model


  def train(self, X_train, y_train):
    model = self.create_model()
    # Training (the nb_epoch argument was renamed to epochs in Keras 2)
    model.fit(X_train, y_train, batch_size=10, epochs=100)
    return model


if __name__ == "__main__":

  prediction = Prediction()

  # Data preparation: read one CSV per year.
  # Note that range(2007, 2017) covers 2007 through 2016.
  data = None
  for year in range(2007, 2017):
    data_ = pandas.read_csv('csv/indices_I101_1d_' + str(year) + '.csv')
    data = data_ if (data is None) else pandas.concat([data, data_])
  data.columns = ['date', 'open', 'high', 'low', 'close']
  data['date'] = pandas.to_datetime(data['date'], format='%Y-%m-%d')
  # Standardize the closing price (zero mean, unit variance)
  data['close'] = preprocessing.scale(data['close'])
  data = data.sort_values(by='date')
  data = data.reset_index(drop=True)
  data = data.loc[:, ['date', 'close']]

  # Use the first 80% for training and the last 20% as test data
  split_pos = int(len(data) * 0.8)
  x_train, y_train = prediction.load_data(data[['close']].iloc[0:split_pos], prediction.length_of_sequences)
  x_test,  y_test  = prediction.load_data(data[['close']].iloc[split_pos:], prediction.length_of_sequences)

  model = prediction.train(x_train, y_train)

  predicted = model.predict(x_test)
  result = pandas.DataFrame(predicted)
  result.columns = ['predict']
  # y_test has shape (n, 1); flatten it so it fits a single column
  result['actual'] = y_test.ravel()
  result.plot()
  plt.show()
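
To make the windowing done by `load_data` easier to see, here is a minimal sketch on a toy series. The helper name `make_windows` and the toy values are assumptions for illustration. It works on a plain list instead of a DataFrame, but produces arrays with the same layout: X has shape (samples, time steps, features) and Y has shape (samples, 1).

```python
import numpy as np

def make_windows(series, n_prev):
    """Split a 1-D series into (window, next value) training pairs."""
    series = np.asarray(series, dtype=float)
    X, Y = [], []
    for i in range(len(series) - n_prev):
        X.append(series[i:i + n_prev])  # n_prev consecutive values
        Y.append(series[i + n_prev])    # the value right after the window
    # Keras recurrent layers expect (samples, time steps, features).
    return np.array(X)[..., np.newaxis], np.array(Y)[:, np.newaxis]

toy = [1, 2, 3, 4, 5, 6]
X, Y = make_windows(toy, n_prev=3)
print(X.shape, Y.shape)    # (3, 3, 1) (3, 1)
print(X[0].ravel(), Y[0])  # [1. 2. 3.] [4.]
```

A series of length N yields N - n_prev pairs, so in the script above the first 10 rows of the test split serve only as inputs and never appear as targets.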

Result

It predicts fairly well ...

[Figure: plot of the predicted and actual values]
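
The plotted values are in standardized units, because the script scales the closing price before training. To read predictions as prices again, the scaling has to be inverted. The sketch below is one possible way to do that, not the method used in the script: it computes the mean and standard deviation by hand (the same formula `preprocessing.scale` applies, using the population standard deviation), fits them on the training part only, and reuses them for the inverse transform. The prices are made up, and `predicted_scaled` is a stand-in for the output of `model.predict`.

```python
import numpy as np

# Hypothetical closing prices, made up purely for illustration.
close = np.array([17000.0, 17250.0, 16900.0, 17400.0, 17600.0,
                  17550.0, 17800.0, 18100.0, 17950.0, 18300.0])

split_pos = int(len(close) * 0.8)
train, test = close[:split_pos], close[split_pos:]

# Fit the statistics on the training part only, so that nothing from the
# test period leaks into the inputs the model is trained on.
mean, std = train.mean(), train.std()
train_scaled = (train - mean) / std
test_scaled = (test - mean) / std

# Stand-in for model.predict(...): pretend the model predicted perfectly.
predicted_scaled = test_scaled

# Undo the scaling to get values in the original price units.
predicted_price = predicted_scaled * std + mean
print(predicted_price)
```

Keeping `mean` and `std` around is the important part; `preprocessing.scale` returns only the scaled array, so the statistics needed for the inverse are lost unless they are computed separately.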

It seems that good predictions could be made by recasting the problem as predicting UP / DOWN.
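
One possible reading of the UP / DOWN idea is to compare the direction of the predicted move with the direction of the actual move. The following is a minimal sketch under that assumption; the helper `to_up_down` and the sample numbers are hypothetical and not taken from the experiment above.

```python
import numpy as np

def to_up_down(values):
    """Return 1 where the series rose from the previous step, else 0."""
    values = np.asarray(values, dtype=float)
    return (np.diff(values) > 0).astype(int)

# Hypothetical actual and predicted values, made up for illustration.
actual = np.array([100.0, 102.0, 101.0, 103.0, 106.0, 104.0])
predicted = np.array([100.5, 101.5, 101.8, 102.5, 105.0, 106.5])

# Actual direction: did the value rise compared with the previous step?
actual_dir = to_up_down(actual)
# Predicted direction: is the prediction above the previous ACTUAL value,
# which is what would be known at the time the prediction is made?
predicted_dir = (predicted[1:] > actual[:-1]).astype(int)

accuracy = (actual_dir == predicted_dir).mean()
print(actual_dir, predicted_dir, accuracy)
```

Directional accuracy is a stricter check than eyeballing the plot: a prediction that simply repeats the previous value looks close on a chart but carries no information about direction.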

In closing

This time I prioritized getting something running, but I will continue to write articles. I have also started using Twitter, so please follow me.

In addition, I operate the horse racing forecast service siva. Please follow it as well.