Chapter 3 of Advances in Financial Machine Learning introduces the Triple-Barrier Method and meta-labeling, which together form a two-step machine learning method. Starting from the current price, the method predicts whether, within a time limit, the price will exceed the upper Triple-Barrier threshold, fall below the lower Triple-Barrier threshold, or stay between the upper and lower thresholds. An example of applying this method to Bitcoin to improve the F1 score is given in the article "Financial Machine Learning Part 1: Labels". Since the concept itself is difficult, I would like to write an article that works through that example step by step.
Reference articles
・ Improvement of performance metrics by 2-step learning model
・ Financial Machine Learning Part 1: Labels
The goal is not to predict the future price itself, but to predict which of the following three cases will occur: ① the price exceeds the upper threshold, ② the price falls below the lower threshold, or ③ the price stays between the upper and lower thresholds.
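The three cases can be sketched as a small labeling function. This is only an illustration under stated assumptions: the function name `triple_barrier_label`, the use of fractional returns for the thresholds, and the "first barrier touched wins" rule are choices made here, not taken from the notebook.

```python
from typing import Sequence


def triple_barrier_label(
    prices: Sequence[float],
    start: int,
    upper: float,
    lower: float,
    horizon: int,
) -> int:
    """Return 1, -1, or 0 for the price path starting at prices[start].

    upper and lower are fractional returns relative to prices[start]
    (for example 0.02 and -0.02); horizon is the time limit in bars.
    All of these parameter conventions are assumptions for illustration.
    """
    entry = prices[start]
    end = min(start + horizon, len(prices) - 1)
    for t in range(start + 1, end + 1):
        ret = prices[t] / entry - 1.0
        if ret >= upper:
            return 1   # case 1: upper threshold reached first
        if ret <= lower:
            return -1  # case 2: lower threshold reached first
    return 0           # case 3: stayed between the thresholds


# Made-up prices, only to show the call.
prices = [100.0, 101.0, 103.0, 99.0, 96.0, 100.0]
labels = [triple_barrier_label(prices, i, 0.02, -0.02, 3) for i in range(3)]
print(labels)
```

The time limit matters: with a short horizon most paths end in case ③, while a long horizon lets almost every path touch one of the barriers.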
Starting from any given time, record how the price of Bitcoin moved relative to the Triple-Barrier (the upper threshold was reached, the lower threshold was reached, or the price stayed within the Triple-Barrier) and use these records as training data. Input the training data to the 1st machine learning model (logistic regression) and train it. Then train the 2nd machine learning model using the output of the 1st machine learning model together with the training data.
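The data flow between the two models can be sketched as follows. This is one possible reading of the two-step setup, not necessarily what the notebook does: the helper `build_meta_dataset` is hypothetical, and the rule "target is 1 when the 1st model's prediction matched the actual label" is an assumption about how the 2nd model's targets are built.

```python
from typing import List, Sequence, Tuple


def build_meta_dataset(
    features: Sequence[Sequence[float]],
    primary_pred: Sequence[int],
    true_label: Sequence[int],
) -> Tuple[List[List[float]], List[int]]:
    """Build inputs and targets for the 2nd model from the 1st model's output."""
    meta_x: List[List[float]] = []
    meta_y: List[int] = []
    for x, pred, actual in zip(features, primary_pred, true_label):
        # Input of the 2nd model: original features plus the 1st model's output.
        meta_x.append(list(x) + [float(pred)])
        # Target of the 2nd model (assumed): 1 if the 1st model was right, else 0.
        meta_y.append(1 if pred == actual else 0)
    return meta_x, meta_y


# Made-up values, only to show the shapes involved.
features = [[0.1], [0.2], [-0.3]]
primary_pred = [1, 0, -1]   # output of the 1st model
true_label = [1, -1, -1]    # Triple-Barrier labels
meta_x, meta_y = build_meta_dataset(features, primary_pred, true_label)
print(meta_x)
print(meta_y)
```

Any classifier can then be trained on `meta_x` and `meta_y`; its output says how much to trust each prediction of the 1st model.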
Accuracy can be improved by using a two-step machine learning model.
Applying the "two-step machine learning model" improved prediction accuracy on Bitcoin price data from December 4, 2018 to December 9, 2018.
● Bitcoin price from December 4, 2018 to December 9, 2018
● Accuracy and F1-score could be improved by applying the "two-step machine learning model". (Prediction accuracy has improved.) In other words, applying the "two-step machine learning model" increased the frequency of the following outcomes:
[TN] Predicted that the price would stay within the Triple-Barrier → the price actually stayed within the Triple-Barrier
[TP] Predicted that the price would go up → the price actually went up
[TP] Predicted that the price would go down → the price actually went down
The training data and test data were input to the trained machine learning models, and the confusion matrix was calculated. TP (True Positive) is a profit pattern, FP (False Positive) is a loss-cut (stop-loss) pattern, and TN (True Negative) and FN (False Negative) are patterns in which nothing is done, so it is preferable to increase the number of TPs and reduce the number of FPs.
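Accuracy and F1-score follow directly from the four confusion-matrix counts. The sketch below uses the standard definitions; the function name `classification_metrics` and the counts in the example are made up for illustration and are not results from the notebook.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, not taken from the article's experiment.
metrics = classification_metrics(tp=30, fp=10, tn=50, fn=10)
print(metrics)
```

Precision is the share of trades taken that ended in profit, so fewer FPs at the same number of TPs raises both precision and F1.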
Comparing the 2nd model with the 1st model:
・ On the training data: TP increased and FP decreased, so the probability of making a profit improved. Accuracy and F1-score also improved.
・ On the test data: both TP and FP decreased. Since Accuracy and F1-score improved, the predictions of the machine learning model are more often correct, but because TP and FP both decreased, which model is more profitable depends on the situation. (Perhaps the result should be evaluated as a decrease in the probability of a loss-cut.)
--A two-step machine learning model was applied to predict future price movements of Bitcoin.
--Accuracy and F1-score could be improved by applying the "two-step machine learning model". (Prediction accuracy has improved.)
--On the training data, the "two-step machine learning model" is preferable to the "one-step machine learning model" because the profit probability improved and the loss-cut probability decreased.
--However, on the test data, the "two-step machine learning model" has both a lower profit probability and a lower loss-cut probability than the "one-step machine learning model", so which of the two models is more profitable depends on the situation. The probability of losing money does decrease, however.
The Jupyter notebook implementing the above has been uploaded at the link below.
https://github.com/fdfpy/studyresult/tree/master/3-5
Calculate the standard deviation of daily returns according to the flow shown in the figure below.
Let volstd($t_{i}$) be the standard deviation of daily returns at time $t_{i}$. Also, let the price at time $t_{i}$ be $c[t_{i}]$.
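A minimal sketch of volstd, assuming a plain rolling sample standard deviation of simple returns computed from $c[t_{i}]$. The window length and the function name `daily_return_std` are assumptions; the flow in the figure may differ, for example by using an exponentially weighted standard deviation.

```python
import statistics
from typing import List, Optional, Sequence


def daily_return_std(close: Sequence[float], window: int) -> List[Optional[float]]:
    """volstd(t_i): sample std of the last `window` returns ending at t_i.

    Entries without enough history are None. `window` must be at least 2.
    """
    # returns[k] is the return from close[k] to close[k + 1].
    returns = [close[i] / close[i - 1] - 1.0 for i in range(1, len(close))]
    volstd: List[Optional[float]] = [None] * len(close)
    for i in range(window, len(close)):
        # returns[i - window:i] are the `window` returns ending at time t_i.
        volstd[i] = statistics.stdev(returns[i - window:i])
    return volstd


# Made-up closing prices, only to show the call.
close = [100.0, 110.0, 99.0, 108.9]
vol = daily_return_std(close, window=2)
print(vol)
```

Scaling the upper and lower thresholds by this value lets the barriers widen when the market is volatile and tighten when it is calm.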
\begin{eqnarray}
Label
=
\begin{cases}
1 & ( vol \geqq Vthu ) \\
0 & ( Vthd \leqq vol \lt Vthu ) \\
-1 & (vol \lt Vthd)
\end{cases}
\end{eqnarray}
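The piecewise definition of Label maps to a few lines of code. The names `vol`, `vthu` and `vthd` mirror the symbols in the formula; the function name `label_from_vol` and the handling of the boundary case `vol == vthd` are assumptions made here.

```python
def label_from_vol(vol: float, vthu: float, vthd: float) -> int:
    """Return the label for one observation, following the formula above."""
    if vol >= vthu:
        return 1    # at or above the upper threshold Vthu
    if vol < vthd:
        return -1   # below the lower threshold Vthd
    return 0        # in between (vol == vthd is treated as 0, an assumption)


# Made-up values, only to show the three outcomes.
print([label_from_vol(v, vthu=0.02, vthd=-0.02) for v in (0.03, 0.0, -0.05)])
# → [1, 0, -1]
```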
Assign each prediction to a cell of the confusion matrix according to the figure below.
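One way to write the assignment in code is shown below. The mapping is inferred from the TP, FP, TN and FN descriptions earlier in the article (TP = profit, FP = loss-cut, TN and FN = do nothing) and is an assumption; the figure may define the cells differently. The function name `confusion_cell` is hypothetical.

```python
from collections import Counter


def confusion_cell(predicted: int, actual: int) -> str:
    """Map one (predicted, actual) label pair to TP, FP, TN or FN."""
    if predicted == 0:
        # No trade is taken: correct if the price really stayed inside the barriers.
        return "TN" if actual == 0 else "FN"
    # A trade is taken: profit if the direction was right, loss-cut otherwise.
    return "TP" if predicted == actual else "FP"


# Made-up labels, only to show the call.
predicted = [1, -1, 0, 0, 1]
actual = [1, 1, 0, -1, 0]
cells = [confusion_cell(p, a) for p, a in zip(predicted, actual)]
counts = Counter(cells)
print(cells)
# → ['TP', 'FP', 'TN', 'FN', 'FP']
```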