Among articles on state space models, I could find very few (in fact, none) that explain custom models in Japanese, so I am posting the results of my own research.
Defining a model in statsmodels, especially a custom state-space model, is not straightforward, and I struggled quite a bit with it.
I would appreciate it if you could point out any incorrect descriptions.
The model consists of an observation equation (some documents call this the observation model) and a state equation (some documents call this the system model).
Observation equation: $y_t = Z_t x_t + d_t + \epsilon_t$

State equation: $x_{t+1} = T_t x_t + c_t + R_t \eta_t$
Although a detailed explanation is omitted here, a state space model produces estimates for the past, present, and future: smoothing (past states), filtering (the current state), and forecasting (future states).
The custom model used in this post is as follows.

Observation equation: $y_t = x_t + \epsilon_t$

State equation: $x_{t+1} = T_t x_t + A \cdot \exp\left(\frac{B}{z_t}\right) + \eta_t$

The differences from the general equations above are that $Z_t$ in the observation equation is fixed (to the identity), the constant term $d_t$ is omitted, and the constant term $c_t$ in the state equation is replaced with $A \cdot \exp\left(\frac{B}{z_t}\right)$, where:
A: Constant
B: Constant
z: exogenous variable
What is an exogenous variable? In time series analysis, a variable (explanatory variable) created from past values of the quantity you want to predict (the objective variable) is generally called an endogenous variable, while all other explanatory variables are called exogenous variables.
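As a concrete illustration (the variable names and numbers below are made up for this example, not part of the model in this post), in SARIMAX the objective variable is passed as `endog` and the external explanatory variable as `exog`:

```python
import numpy as np
import statsmodels.api as sm

rs = np.random.RandomState(0)
temperature = rs.normal(20, 5, size=100)          # exogenous variable (external driver)
sales = 2.0 * temperature + rs.normal(size=100)   # endogenous variable (what we predict)

# The series to predict goes in endog, the external variable in exog
mod_ex = sm.tsa.statespace.SARIMAX(endog=sales, exog=temperature, order=(1, 0, 0))
```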
Now here is the Python code.
```python
import numpy as np
import pandas as pd
import datetime
import statsmodels.api as sm
import matplotlib.pyplot as plt
%matplotlib inline

plt.rc("figure", figsize=(16, 8))
plt.rc("font", size=15)
```
Create data for testing. (This step is unnecessary when applying the model to real data, so a detailed explanation is omitted.)
```python
def gen_data_for_model1():
    nobs = 1000

    rs = np.random.RandomState(seed=93572)

    # True parameter values used to simulate the data
    Ht = 5       # observation noise variance
    Tt = 1.001   # transition coefficient
    A = 0.1
    B = -0.007
    Qt = 0.01    # state noise variance
    var_z = 0.1  # variance of the exogenous variable's increments

    et = rs.normal(scale=Ht**0.5, size=nobs)               # observation noise
    z = np.cumsum(rs.normal(size=nobs, scale=var_z**0.5))  # exogenous variable (random walk)
    Et = rs.normal(scale=Qt**0.5, size=nobs)               # state noise

    xt_1 = 50
    x = []
    for i in range(nobs):
        xt = Tt * xt_1
        x.append(xt)
        xt_1 = xt
    xt = np.array(x)
    xt = xt + A * np.exp(B / z) + Et  # add the exogenous term and the state noise
    yt = xt + et                      # observations

    return yt, xt, z

yt, xt, z = gen_data_for_model1()

_ = plt.plot(yt, color="r")
_ = plt.plot(xt, color="b")
```
```python
df = pd.DataFrame()
df['y'] = yt
df['x'] = xt
df['z'] = z

# Build a daily date index starting from 2001/1/1
st = datetime.datetime.strptime("2001/1/1 0:00", '%Y/%m/%d %H:%M')
date = []
for i in range(1000):
    if i == 0:
        d = st
    dt = d.strftime('%Y/%m/%d %H:%M')
    date.append(dt)
    d += datetime.timedelta(days=1)

df['date'] = date
df['date'] = pd.to_datetime(df['date'])
df = df.set_index("date")
df
```
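As an aside, pandas can build the same daily index in one line with `pd.date_range`; this is just an equivalent alternative to the loop above:

```python
# Equivalent, more concise construction of the daily index
df.index = pd.date_range("2001-01-01", periods=1000, freq="D")
df.index.name = "date"
```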
```python
class custom(sm.tsa.statespace.MLEModel):
    # Names of the parameters to estimate
    param_names = ['T', 'A', 'B', 'Ht', 'Qt']
    # Initial values of the parameters to estimate
    start_params = [1., 1., 0., 1., 1.]

    def __init__(self, endog, exog):
        exog = np.squeeze(exog)
        super().__init__(endog, exog=exog, k_states=1, initialization='diffuse')
        self.k_exog = 1
        # Z = I: Zt in the equations above is fixed to the identity here
        self['design', 0, 0] = 1.
        # R = I: Rt in the equations above is fixed to the identity here
        self['selection', 0, 0] = 1.
        # Make c_t time-varying (one entry per time point)
        self['state_intercept'] = np.zeros((1, self.nobs))

    def clone(self, endog, exog, **kwargs):
        return self._clone_from_init_kwds(endog, exog=exog, **kwargs)

    def transform_params(self, params):
        # The variances must be positive, so square the unconstrained values
        params = params.copy()
        params[3:] = params[3:]**2
        return params

    def untransform_params(self, params):
        # Inverse of transform_params: take the square root
        params = params.copy()
        params[3:] = params[3:]**0.5
        return params

    def update(self, params, **kwargs):
        # Write the current parameter values into the system matrices
        params = super().update(params, **kwargs)
        # T
        self['transition', 0, 0] = params[0]
        # c_t = A * exp(B / z_t)
        self['state_intercept', 0, :] = params[1] * np.exp(params[2] / self.exog)
        # Ht
        self['obs_cov', 0, 0] = params[3]
        # Qt
        self['state_cov', 0, 0] = params[4]
```
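Before fitting, it can help to instantiate the model on a small made-up series and check that the fixed system matrices look as intended; a minimal sketch (the toy arrays are invented for illustration):

```python
# Sanity check on a tiny made-up series (illustration only)
toy_y = np.array([1.0, 2.0, 3.0])
toy_z = np.array([1.0, 1.0, 1.0])
toy = custom(toy_y, toy_z)
print(toy['design'])           # Z, fixed to 1
print(toy['selection'])        # R, fixed to 1
print(toy['state_intercept'])  # c_t, zeros until update() fills it in
```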
```python
# Use the first 800 points for fitting and keep the last 200 for forecasting
df_train = df.iloc[:800, :]
df_test = df.iloc[800:, :]

mod = custom(df_train['y'], df_train['z'])
res = mod.fit()
print(res.summary())
```
The coef column contains the estimated values, and they are close to the true values used to generate the data at the beginning.
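As a quick check, the estimates can be lined up against the true values defined in `gen_data_for_model1` (a small sketch reusing the `mod` and `res` objects above):

```python
# True values taken from gen_data_for_model1
true_params = {'T': 1.001, 'A': 0.1, 'B': -0.007, 'Ht': 5, 'Qt': 0.01}
for name, est in zip(mod.param_names, np.asarray(res.params)):
    print(f"{name}: estimated = {est:.4f}, true = {true_params[name]}")
```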
```python
ss = pd.DataFrame(res.smoothed_state.T, columns=['x'], index=df_train.index)
predict = res.get_prediction()
forecast = res.get_forecast(df.index[-1], exog=df_test['z'].values)
```
When forecasting, you need to specify how far ahead to forecast (here, via df.index[-1]). Also, when the model uses exogenous variables, their future values must be supplied through the "exog=" argument at prediction time. In this post they are all dummy data, but to use them in a real forecast you would need to construct future values of the exogenous variables from past data. Data with irregular time spacing cannot be used for forecasting. (The same applies to SARIMAX, which also uses exogenous variables.)
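For example, the horizon can also be given as a number of steps; a minimal sketch (reusing `res` and `df_test` from above, with a hypothetical 10-step horizon):

```python
# Forecast 10 steps ahead, supplying exactly 10 future values of z
n_steps = 10
fc10 = res.get_forecast(steps=n_steps, exog=df_test['z'].values[:n_steps])
print(fc10.predicted_mean.head())
```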
```python
fig, ax = plt.subplots()

y = df['y']
y.plot(ax=ax, label='y')

# In-sample (one-step-ahead) prediction and its confidence interval
predict.predicted_mean.plot(label='x')
predict_ci = predict.conf_int(alpha=0.05)
predict_index = predict_ci.index
ax.fill_between(predict_index[2:], predict_ci.iloc[2:, 0], predict_ci.iloc[2:, 1], alpha=0.1)

# Out-of-sample forecast and its confidence interval
forecast.predicted_mean.plot(ax=ax, style='r', label='forecast')
forecast_ci = forecast.conf_int()
forecast_index = forecast_ci.index
ax.fill_between(forecast_index, forecast_ci.iloc[:, 0], forecast_ci.iloc[:, 1], alpha=0.1)

legend = ax.legend(loc='best')
fig.savefig('custom_statespace.png')
```
The light blue band is the 95% confidence interval of the in-sample estimates, and the pink band is the 95% confidence interval of the forecast.
This article was written with reference to the following information.
Thanks to Chad Fulton, the developer of statsmodels!
I will also introduce some recommended books other than those listed above, in rough order of readability.
*Searching for invisible things: that is Bayes, practical Bayesian statistics*: This is probably the most accessible Japanese book describing the state space model, and I think it is the best way to get started and get a feel for the subject.
*Basics of statistical modeling for forecasting: from introduction to application of Bayesian statistics*: This book was written by Professor Tomoyuki Higuchi, the current director of the Institute of Statistical Mathematics. The title does not suggest that it covers the state space model, but the book explains it carefully.
*Basics of time series analysis and state space models: theory and implementation learned with R and Stan*: This is Mr. Baba's book, which has appeared many times in this post. It is a very clear, well-written book, recommended for those who want to learn hands-on by coding.
[Time series analysis: autoregressive models, state space models, and anomaly detection](https://www.kyoritsu-pub.co.jp/bookdetail/9784320125018): This is perhaps the only book in Japan that describes the state space model in Python. It will get you writing code, but the explanations are a little rough and can be hard to follow. It also does not cover custom models like the one in this post.
October 3, 2020: First edition