Since this is the fourth article in the series, please refer to the past three articles for background.
The original purpose of this project was to use deep learning to make money in the equity market, or in financial markets more broadly. Achieving that goal involves several factors, listed below, and I believe profitability will increase by improving these factors little by little. This time I'd like to list those factors, describe the steps I've taken so far, and outline my plans for improving them.
To get better results, we need to prevent overfitting. There are two main ways to do this: increase the amount of data, or introduce regularization. The latter belongs to item 2, the fit between the training data and the deep learning layers, so here I focus on increasing the amount of data, which has its own difficulties. You can increase the amount simply by adding all kinds of stock price data to the training set, but in my experience, training on a mixture of US stocks, Japanese stocks, and other data does not give good results. On the other hand, if the data is narrowed down to only a few types (for example, only Japanese stocks, or only ETFs), the amount of data becomes insufficient. Is there a way to increase the amount of data while avoiding this diversification of the training data?
As a result of trial and error, I came up with a method.
As you can see from this figure, until now the original data was simply divided into training samples, so there was a limit to how much training data could be produced. With the new method, many more training samples can be extracted from a single original series, so the amount of training data can be increased as needed. The "Data" shown abstractly in the figure is, for example, image data like the following.
There is another benefit to this method. To improve the quality of the training data, it helps to exclude samples with little price fluctuation, and because the method generates more samples in the first place, it offsets the reduction in data that this filtering causes. Going forward, I will narrow down the types of data and remove samples that seem unhelpful for learning, while increasing the amount of data with the method above, aiming to get the most out of the data.
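As a rough sketch of these two ideas, the snippet below extracts many overlapping windows from a single price series and drops windows whose price range is too small. The window size, step, and volatility threshold are hypothetical values, and in practice each window would then be rendered into a chart image before training.

```python
import numpy as np

def extract_windows(prices, window_size=60, step=5, min_range_pct=0.02):
    """Slice one long price series into many overlapping training windows.

    prices        : 1-D array of closing prices for a single instrument
    window_size   : number of days per training sample (hypothetical value)
    step          : offset between consecutive windows; smaller = more samples
    min_range_pct : windows whose high-low range is below this fraction of the
                    starting price are dropped as too flat to be informative
    """
    windows = []
    for start in range(0, len(prices) - window_size + 1, step):
        w = prices[start:start + window_size]
        # Skip windows with very little price movement (quality filter).
        if (w.max() - w.min()) / w[0] < min_range_pct:
            continue
        windows.append(w)
    return np.array(windows)

# Example: 500 days of synthetic prices yield far more samples than a simple
# non-overlapping split of the same series would.
prices = 100 + np.cumsum(np.random.randn(500))
samples = extract_windows(prices)
print(samples.shape)  # (number_of_windows, 60)
```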
I think this part is the most difficult and the most worth studying. At first I trained with a Sequential model that I had picked up from somewhere, without understanding it, but that only goes so far, so I studied a little of the theory behind it. Perhaps the most important point is the basic order of the layers, which is as follows:
Also, Convolution, ReLU, and Pooling can be repeated as a group. I don't fully understand why, but repeating them as follows tends to improve the results.
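As a minimal sketch of such a model, assuming chart images of a placeholder size (80x80 RGB) and a binary up/down label, a Keras Sequential network with the Convolution → ReLU → Pooling group repeated twice might look like this. The layer counts and sizes are illustrative, not the exact model used here.

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential([
    # Block 1: Convolution -> ReLU -> Pooling
    Conv2D(32, (3, 3), activation='relu', input_shape=(80, 80, 3)),
    MaxPooling2D((2, 2)),
    # Block 2: the same Convolution -> ReLU -> Pooling group repeated
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    # Flatten and classify: will the price go up or down?
    Flatten(),
    Dense(64, activation='relu'),
    Dropout(0.5),  # regularization against overfitting
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()
```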
Please note that I may still be using a haphazard Sequential model that I don't fully understand. Going forward, I will do my best to understand the theory of the layers as deeply as possible and to build an efficient Sequential model suited to my purpose.
Until now I have put off live operation and backtesting, so I haven't made much progress here. I introduced "democratic" decision-making by AI, a scheme for securing some degree of credibility and accuracy in the AI's judgments, but when the AI is run (that is, when it reads the stock price chart and makes its decision) and what kind of buying and selling behavior is triggered as a result will have a large influence on profits.
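As a minimal sketch of the majority-vote idea, assuming three independently trained models that each output an up-probability for the same chart image (the model names and the 0.5 threshold are placeholders):

```python
import numpy as np

def majority_vote(models, chart_image, threshold=0.5):
    """Ask each model for an up/down call and act on the majority opinion.

    models      : list of trained Keras models, e.g. three independently trained copies
    chart_image : single preprocessed chart image, shape (height, width, channels)
    threshold   : probability above which a model's vote counts as "up"
    """
    batch = np.expand_dims(chart_image, axis=0)  # add batch dimension
    votes = [int(m.predict(batch, verbose=0)[0][0] > threshold) for m in models]
    ups = sum(votes)
    decision = "buy" if ups > len(models) / 2 else "hold/sell"
    return votes, decision

# Hypothetical usage with three trained models:
# votes, decision = majority_vote([model_a, model_b, model_c], chart_image)
```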
This time, stock prices had plummeted under the influence of the coronavirus, so I recently showed the price chart for the following range to three AIs and checked what result the majority vote would produce.
Two of the three AIs predicted that prices would rise, and the result was good. What will happen to the stock market next week?