In this article, we describe how to fit given time series data to AR, MA, and ARMA models using Python.
The function statsmodels.tsa.arima_model.ARMA.fit is used. Click here for details (https://www.statsmodels.org/devel/generated/statsmodels.tsa.arima_model.ARMA.fit.html#statsmodels.tsa.arima_model.ARMA.fit)
As an example, we will fit an AR(1) model to data generated from the following process.
$$ y_t = 1 + 0.5 y_{t-1} + \epsilon_t $$
Here, $ \epsilon_t $ is Gaussian white noise with variance 1, and the initial value is $ y_0 = 2 $.
# Boilerplate: import the modules and make the graphs look good
import matplotlib as mpl
import matplotlib.pyplot as plt
plt.style.use('seaborn')
mpl.rcParams['font.family'] = 'serif'
%matplotlib inline
import numpy as np
import statsmodels.api as sm
import statsmodels.tsa.api as smt
p = print
# Generate the time series to plot
# Here we generate 1000 observations
y = np.zeros(1000)
np.random.seed(42)
epsilon = np.random.standard_normal(1000)
y[0] = 2
for t in range(1,1000):
    y[t] = 1 + 0.5 * y[t-1] + epsilon[t]
# Plot the time series
plt.plot(y)
plt.xlabel('time')
plt.ylabel('value')
plt.title('time-value plot');
The following graph is plotted.
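As a quick sanity check before fitting, the simulated series can be compared with the stationary mean and variance of an AR(1) process $ y_t = c + \phi y_{t-1} + \epsilon_t $, which are $ c / (1 - \phi) $ and $ \sigma^2 / (1 - \phi^2) $. The following is a minimal sketch using only NumPy; the variable names `c`, `phi`, and `sigma2` are introduced here for illustration and do not appear in the original code.

```python
import numpy as np

# Parameters of the data-generating process (names are illustrative)
c, phi, sigma2 = 1.0, 0.5, 1.0

# Regenerate the same series as in the article
np.random.seed(42)
epsilon = np.random.standard_normal(1000)
y = np.zeros(1000)
y[0] = 2
for t in range(1, 1000):
    y[t] = c + phi * y[t - 1] + epsilon[t]

# Stationary moments of an AR(1) process with |phi| < 1
theoretical_mean = c / (1 - phi)
theoretical_var = sigma2 / (1 - phi ** 2)

print('mean: sample = %.3f, theory = %.3f' % (y.mean(), theoretical_mean))
print('var : sample = %.3f, theory = %.3f' % (y.var(), theoretical_var))
```

The sample values should land near the theoretical ones, but they will not match exactly because the series is finite.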
Next, fit an AR(1) model to this data.
mdl = smt.ARMA(y, order=(1, 0)).fit()
p(mdl.summary())
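The result is as follows. The estimated constant term (const) is 2.0336 and the estimated AR coefficient is 0.4930. Note that the const reported by statsmodels is the mean of the process, $ \mu = c / (1 - \phi) $, not the intercept $ c = 1 $ in the equation above. The true mean is $ 1 / (1 - 0.5) = 2 $, so both estimates are close to the true values of 2 and 0.5, and the implied intercept is $ 2.0336 \times (1 - 0.4930) \approx 1.03 $. This fit includes a constant term, but if the constant term is known to be 0, it is sufficient to write
mdl = smt.ARMA(y, order=(1, 0)).fit(trend='nc')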
To fit other models, all you have to do is change the order= argument. For example, to fit an MA(1) model, write
mdl = smt.ARMA(y, order=(0, 1)).fit()
To fit an ARMA(1, 1) model, write
mdl = smt.ARMA(y, order=(1, 1)).fit()
The function sm.tsa.arma_order_select_ic can be used to select the order that is optimal according to an information criterion. Click here for details on the function (https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.arma_order_select_ic.html)
We estimate the order using the time series data from the AR(1) model described above. In other words, the estimation is successful if the selected order is (1, 0).
from statsmodels.tsa.arima_process import arma_generate_sample
y = np.zeros(1000)
np.random.seed(42)
epsilon = np.random.standard_normal(1000)
y[0] = 2
for t in range(1,1000):
    y[t] = 1 + 0.5 * y[t-1] + epsilon[t]
sm.tsa.arma_order_select_ic(y)
The result is as follows.
The order (1, 0) was selected as optimal. The matrix contains the BIC values: the rows correspond to the AR order and the columns to the MA order.
If you want to use AIC, or to evaluate with both AIC and BIC, write as follows.
#When using AIC
sm.tsa.arma_order_select_ic(y, ic='aic')
#When evaluating with both information criteria at once
sm.tsa.arma_order_select_ic(y, ic=['aic', 'bic'])
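For reference, both criteria trade off goodness of fit against the number of parameters: AIC is $ 2k - 2 \ln L $ and BIC is $ k \ln n - 2 \ln L $, where $ L $ is the maximized likelihood, $ k $ the number of estimated parameters, and $ n $ the number of observations. The sketch below just evaluates these formulas; the log-likelihood value is hypothetical, and the way `k` is counted here (constant, AR coefficient, noise variance) is an assumption, since libraries differ on which parameters they count.

```python
import math


def aic(llf, k):
    """Akaike information criterion from a log-likelihood and parameter count."""
    return 2 * k - 2 * llf


def bic(llf, k, n):
    """Bayesian information criterion; the penalty per parameter is log(n)."""
    return k * math.log(n) - 2 * llf


llf = -1400.0  # hypothetical maximized log-likelihood, for illustration only
k = 3          # assumed count: constant, AR coefficient, noise variance
n = 1000       # number of observations

print('AIC = %.3f' % aic(llf, k))
print('BIC = %.3f' % bic(llf, k, n))
```

Because $ \ln 1000 \approx 6.9 $ is larger than 2, BIC penalizes each extra parameter more heavily than AIC for a series of this length, so it tends to select smaller orders.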
It is also possible to change the maximum order that is searched, or to select the order under the assumption that the constant term is 0.