Aidemy 2020/11/10
Hello, this is Yope! I am a humanities student, but I became interested in the possibilities of AI, so I enrolled in the AI-specialized school "Aidemy" to study. I am summarizing what I learned there on Qiita so I can share it with you. I am very happy that many people have read my previous summary articles. Thank you! This is the first post on RNN/LSTM. Nice to meet you.

What to learn this time
・Review of deep learning basics
・RNN/LSTM and stock price prediction
(For deep learning, see "Deep Learning Basics")
・Deep learning refers to models in which the __intermediate (hidden) layers__ of a neural network are stacked deeply.
・A neural network is a machine learning method that mimics signal transmission in the brain, and consists of an __input layer__, __intermediate layers__, and an __output layer__.
・Passing information from left to right yields the output value to be predicted, while "learning", which passes information from right to left and adjusts the parameters, improves the accuracy of the model.
・Deepening the layers can reduce the number of parameters compared to a shallow network of similar capacity, which has the advantage of __better learning efficiency__.
・Among the things to consider when building a neural network model are the __"number of layers"__ and the __"number of units"__. There is no formula for determining these, so they must be decided exploratorily.
・The most obvious is the number of units (dimensions) in the __output layer__: it can be set directly to the number of categories (classes) to classify.
・For the other layers, it is customary to start with __"(input layer + output layer) * 2/3"__ units, then adjust while checking the learning results.
・Regarding the __number of intermediate layers__, add a layer when the number of units would otherwise grow too large in the above procedure.
・The first countermeasure against overfitting, that is, __generalization__ method, is __Dropout__. It randomly excludes a fixed proportion of nodes from each training step; a ratio of __50%__ is commonly used.
・Dropout can be applied with __"model.add(Dropout(ratio))"__.
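As a minimal sketch (assuming the Keras `Sequential` API used elsewhere in this course; the layer sizes are illustrative), Dropout is inserted between layers like this:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

# A small fully connected model with 50% dropout after the hidden layer
model = Sequential()
model.add(Dense(64, activation="relu", input_shape=(10,)))
model.add(Dropout(0.5))  # randomly drop 50% of the units during training
model.add(Dense(3, activation="softmax"))
model.compile(optimizer="adam", loss="categorical_crossentropy")
```

Note that Dropout is only active during training; at prediction time all units are used.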
・Another generalization method is __"EarlyStopping"__. It prevents overfitting by stopping training when __accuracy stops improving__ over repeated epochs.
・Accuracy can be considered best when the __errors on the training data and the test data are approximately equal__.
・EarlyStopping is performed with __"EarlyStopping()"__. As parameters, __"monitor='val_loss'"__ sets the error used as the accuracy criterion, __"patience"__ specifies how many past epochs to look at when judging whether the error has stopped improving, and __"mode"__ specifies whether convergence means the monitored value has stopped decreasing ('min') or increasing ('max').
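A sketch of how these parameters come together with the Keras `EarlyStopping` callback (the patience value of 5 is an illustrative assumption):

```python
from tensorflow.keras.callbacks import EarlyStopping

# Stop training once val_loss has not improved for 5 consecutive epochs.
# mode="min" means "a smaller val_loss is better".
early_stopping = EarlyStopping(monitor="val_loss", patience=5, mode="min")

# The callback is then passed to fit(), e.g.:
# model.fit(X_train, y_train, validation_split=0.2, epochs=100,
#           callbacks=[early_stopping])
```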
RNN/LSTM
(RNNs were also covered in "Time Series Analysis")
・An RNN is a __neural network that can handle time-series data__. Past information is retained inside the model in order to incorporate the concept of time into the network.
・However, as the sequence gets longer, the gradient may vanish, or the amount of computation may grow explosively (__"gradient explosion"__), so plain RNNs have the drawback of being __unsuited to long-term learning__.
・For gradient explosion there is a countermeasure called __"gradient clipping"__: if the gradient value exceeds a threshold, rescale it back within the threshold.
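In Keras, gradient clipping can be requested directly on the optimizer. A minimal sketch (the threshold 0.5 is an illustrative assumption):

```python
from tensorflow.keras.optimizers import SGD

# clipvalue clips each gradient component into [-0.5, 0.5];
# clipnorm would instead rescale the whole gradient vector when
# its norm exceeds the threshold.
optimizer = SGD(learning_rate=0.01, clipvalue=0.5)

# The optimizer is then used as usual:
# model.compile(optimizer=optimizer, loss="mean_squared_error")
```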
・A solution to the above RNN problems is the __"LSTM"__. By replacing the intermediate layer of the RNN with an LSTM block, context can be maintained over long periods.
・For the data, __"UNIQLO stock price data"__ will be used. This time we will "predict the __next day's__ closing price from the past __15 days__ of data".
・In the code, first create an __"apply_window()"__ function that splits the data into windows of n days (window_size).
・Next, create a __"split_train_test()"__ function that splits the data into training data and test data. This time, __70%__ is used as training data by default.
・Create a __"data_load()"__ function that reads the data. After reading, treat the 'Date' column as date data and sort by it. Finally, extract the __closing price__ (the 'Close' column) in that order.
・Once this is done, create __"train_model()"__, which defines the model and trains it. The unit size of the input layer is __15__, because 15 days of data are input this time. The model uses __Sequential()__, and the time-series data is handled by an __LSTM layer__. After defining the layers, compile the model, and this time train it for 10 epochs.
・Finally, create a __"predict()"__ function that makes predictions with the model, returning them as a __one-dimensional NumPy array__ via __"pred.flatten()"__.
・Code![Screenshot 2020-11-08 19.58.38.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/698700/db6ea786-8bb8-0e75-ba35-7560071141e8.png)
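Since the screenshot may be hard to read, here is a sketch of what functions with these names might look like. The file name `uniqlo.csv` and the LSTM unit count are assumptions; the actual code is in the screenshot above:

```python
import numpy as np
import pandas as pd
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

def apply_window(data, window_size):
    """Slide a window of window_size over the series and stack the results."""
    sequences = [data[i:i + window_size]
                 for i in range(len(data) - window_size + 1)]
    return np.array(sequences)

def split_train_test(data, train_size=0.7):
    """Use the first 70% of the series for training, the rest for testing."""
    pos = int(round(len(data) * train_size))
    return data[:pos], data[pos:]

def data_load(path="uniqlo.csv"):  # file name is an assumption
    """Read the CSV, sort by the 'Date' column, and return the 'Close' prices."""
    df = pd.read_csv(path)
    df["Date"] = pd.to_datetime(df["Date"])
    df = df.sort_values("Date")
    return df["Close"].values

def train_model(X_train, y_train, window_size=15, epochs=10):
    """Define an LSTM model whose input is 15 days of data and train it."""
    model = Sequential()
    model.add(LSTM(20, input_shape=(window_size, 1)))  # unit count assumed
    model.add(Dense(1))
    model.compile(optimizer="adam", loss="mean_squared_error")
    model.fit(X_train, y_train, epochs=epochs, verbose=0)
    return model

def predict(model, X_test):
    """Return predictions as a one-dimensional NumPy array."""
    pred = model.predict(X_test, verbose=0)
    return pred.flatten()
```

Note that `train_model()` expects `X_train` reshaped to `(samples, 15, 1)`, since Keras LSTM layers take three-dimensional input.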
・Implement the model using the functions above. After loading the data with __"data_load()"__ and extracting the closing prices, split the data with __"split_train_test()"__, then __scale the data before passing it to the model__. This time we standardize it.
・For standardization, create an instance with __"StandardScaler()"__. For the training data, use __"fit_transform()"__ to compute the mean and variance and standardize in one step; for the test data, only standardize with __"transform()"__ (if the test data were also used to fit the scaler, the evaluation would be less reliable).
・After standardization, split the data with __"apply_window()"__. window_size is 15 this time, but since we want each window to also include the "next day's closing price", the argument is __"window_size + 1"__. The last element (-1) of each window is passed as the correct label (y_train), and the other 15 values are passed as training data (X_train) to __"train_model()"__. After that, predict with "predict()" and you are done.
・Furthermore, if you want to compare the predictions with the test data in a chart, restore the data to the original scale with __"inverse_transform()"__ and then plot it with __"plt.plot()"__.
・Code![Screenshot 2020-11-08 19.59.41.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/698700/480715aa-ab35-a829-9d3c-fd0f8d167ca7.png)
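The preprocessing flow above (standardize with fit_transform/transform, then window with window_size + 1 and split into inputs and labels) can be sketched as follows, with a synthetic series standing in for the closing prices:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

window_size = 15

# Synthetic stand-in for the closing-price series
close = np.sin(np.linspace(0, 20, 300)) * 100 + 1000

# First 70% = training data, rest = test data (as in the article)
pos = int(round(len(close) * 0.7))
train, test = close[:pos], close[pos:]

# Fit the scaler on the training data only, then reuse it for the test data
scaler = StandardScaler()
train_std = scaler.fit_transform(train.reshape(-1, 1)).flatten()
test_std = scaler.transform(test.reshape(-1, 1)).flatten()

def apply_window(data, size):
    return np.array([data[i:i + size] for i in range(len(data) - size + 1)])

# window_size + 1 so that each row also contains the "next day's close"
train_windows = apply_window(train_std, window_size + 1)
X_train = train_windows[:, :-1]  # first 15 values: model input
y_train = train_windows[:, -1]   # last value (-1): correct label

# Keras LSTM layers expect (samples, timesteps, features)
X_train = X_train.reshape(-1, window_size, 1)

# After predicting, map predictions back to prices with
# scaler.inverse_transform(pred.reshape(-1, 1)) before plt.plot().
```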
・Result![Screenshot 2020-11-08 20.00.06.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/698700/968fe309-e9e8-1b95-a559-fcc04896b00c.png)
・In the previous section, only one day's closing price could be predicted, but by shifting the prediction forward one day at a time and repeating it __ten times__, a model that predicts the closing price __10 days ahead__ can be created.
・Concretely, create a __"predict_ten_days()"__ function that predicts 10 times, and pass the data and the model to it. Here, the position in the test data to start from (anywhere up to "365 - window_size", so that at least window_size days remain) is defined by __"start_point"__, and data is used from that position onward.
・Code (additional parts only; start_point is optional)![Screenshot 2020-11-08 20.00.32.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/698700/06cccc3c-ee9b-b899-fc52-707d067f2bdb.png)
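The rolling idea behind predict_ten_days() — predict one day, append the prediction to the window, drop the oldest day, repeat — can be sketched like this. A simple moving-average function stands in for the trained LSTM, and the names are assumptions:

```python
import numpy as np

window_size = 15

def predict_ten_days(model_fn, series, start_point):
    """Predict 10 consecutive days by feeding each prediction back in."""
    window = list(series[start_point:start_point + window_size])
    preds = []
    for _ in range(10):
        next_value = model_fn(np.array(window))  # one-day-ahead prediction
        preds.append(next_value)
        window = window[1:] + [next_value]       # slide the window forward
    return np.array(preds)

# Stand-in for model.predict(): the mean of the current window
dummy_model = lambda window: float(window.mean())

test_series = np.linspace(100, 200, 365)
preds = predict_ten_days(dummy_model, test_series, start_point=0)
```

With the real model, `model_fn` would reshape the window to `(1, window_size, 1)` and call `predict()`; the loop structure stays the same.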
・Result![Screenshot 2020-11-08 20.01.32.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/698700/7c8466ff-487c-66b3-6e70-7c101d483e7d.png)
・As a generalization method, there is also "EarlyStopping" in addition to the "Dropout" seen so far. It prevents overfitting by automatically stopping training when the accuracy stops improving.
・When creating a model that makes a prediction from the past n days of data, split the data into windows of n and set the model's input size accordingly.
・If you want to forecast further into the future, such as the stock price several days ahead, you can build such a model by repeating the one-day-ahead prediction.
That's all for this time. Thank you for reading to the end.