In this post I will try Gaussian process regression using the shampoo sales data available on the following site.
https://machinelearningmastery.com/time-series-datasets-for-machine-learning/
A Gaussian process is a typical nonparametric regression model. Its strengths are that it handles nonlinear data well and that it lets us quantify the uncertainty of its estimates.
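In the usual notation, a Gaussian process places a prior directly on the regression function:

f(x) ~ GP(m(x), k(x, x'))

where m(x) is the mean function and k(x, x') is the covariance (kernel) function. In the model below, m is a linear function and k is the exponentiated quadratic (RBF) kernel.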
import numpy as np
import pandas as pd
import pymc3 as pm
import matplotlib.pyplot as plt

# Load the monthly shampoo sales data
shampoo_df = pd.read_csv('../data/shampoo.csv')
shampoo_df
shampoo_df.plot.line(x='Month', y='Sales', c="k");
In the following, a marginal-likelihood GP model is built: the kernel hyperparameters are given prior distributions, and the latent function is integrated out analytically so that the marginal likelihood of the observations can be evaluated directly.
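With Gaussian observation noise, this marginal likelihood has a closed form, obtained by integrating out the latent function f:

p(y | X) = ∫ p(y | f, X) p(f | X) df = N(y | m(X), K(X, X) + σ²I)

Maximizing (or sampling) this quantity with respect to the hyperparameters ρ, η, and σ is exactly what pm.gp.Marginal implements.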
with pm.Model() as shampoo_model:
    # ρ: length scale — the larger it is, the stronger the autocorrelation and the smoother the curve
    ρ = pm.HalfCauchy('ρ', 1)
    # η: output scale — controls the overall magnitude of the fluctuations
    η = pm.HalfCauchy('η', 1)
    # Linear mean function; coeffs is its slope
    M = pm.gp.mean.Linear(coeffs=(shampoo_df.index / shampoo_df.Sales).mean())
    # Exponentiated quadratic (RBF) covariance on 1-D inputs
    K = (η**2) * pm.gp.cov.ExpQuad(1, ρ)
    # σ: observation noise
    σ = pm.HalfCauchy('σ', 1)
    gp = pm.gp.Marginal(mean_func=M, cov_func=K)
    gp.marginal_likelihood("sales", X=shampoo_df.index.values.reshape(-1, 1),
                           y=shampoo_df.Sales.values, noise=σ)
    trace = pm.sample(1000)
pm.traceplot(trace);
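For reference, the exponentiated quadratic (ExpQuad) kernel used above is

k(x, x') = η² exp(-(x - x')² / (2ρ²))

so ρ (the length scale) controls how quickly the correlation between two months decays with their distance, and η controls the amplitude of the variation around the linear mean.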
Maximum likelihood estimation fits the model by maximizing the likelihood function p(X | θ) with respect to the parameter θ, but it makes no use of the prior distribution p(θ). When a prior is included, the corresponding point estimate is called maximum a posteriori (MAP) estimation.
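Written as formulas, the two estimates differ only by the prior factor:

θ_ML  = argmax_θ p(X | θ)
θ_MAP = argmax_θ p(X | θ) p(θ)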
For details, see Atsushi Suyama, Machine Learning Startup Series: Introduction to Machine Learning by Bayesian Inference (Japanese edition), Kindle Edition, locations 2154–2156.
Let's obtain the optimal parameters by MAP estimation.
with shampoo_model:
    marginal_post = pm.find_MAP()
marginal_post
{'ρ_log__': array(2.88298779), 'η_log__': array(5.63830367), 'σ_log__': array(4.13876033), 'ρ': array(17.86757799), 'η': array(280.98567051), 'σ': array(62.72501496)}
The `_log__` entries are the log-transformed values that PyMC3 optimizes internally for positive-constrained parameters; the untransformed ρ, η, and σ are listed alongside.
# Prediction grid: the observed months plus about 10 extra steps into the future
X_pred = np.arange(shampoo_df.index.max() + 10).reshape(-1, 1)
with shampoo_model:
    shampoo_pred = gp.conditional("shampoo_", X_pred)
    samples = pm.sample_ppc([marginal_post], vars=[shampoo_pred], samples=100)
from pymc3.gp.util import plot_gp_dist

fig = plt.figure(figsize=(10, 5))
ax = fig.gca()
# Shaded bands of the posterior predictive distribution over the prediction grid
plot_gp_dist(ax, samples["shampoo_"], X_pred)
# Observed data: a connecting line plus black point markers
ax.plot(shampoo_df.index, shampoo_df.Sales, 'dodgerblue', alpha=0.5)
ax.plot(shampoo_df.index, shampoo_df.Sales, 'ok', ms=5, alpha=0.5)
plt.show()
pm.sample_ppc draws samples from the posterior predictive distribution, here conditioned on the MAP parameter values stored in marginal_post.
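The predictive draws can also be summarized numerically instead of plotted as bands. A minimal sketch, assuming the samples and X_pred objects from the cells above (the pointwise mean and the 90% interval here are my own additions, not part of the original post):

# Summarize the posterior predictive draws: pointwise mean and a 90% interval
pred = samples["shampoo_"]            # shape: (n_samples, len(X_pred))
mean = pred.mean(axis=0)
low, high = np.percentile(pred, [5, 95], axis=0)

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(X_pred.ravel(), mean, 'k', label='predictive mean')
ax.fill_between(X_pred.ravel(), low, high, color='gray', alpha=0.3, label='90% interval')
ax.plot(shampoo_df.index, shampoo_df.Sales, 'ok', ms=5, alpha=0.5, label='observed')
ax.legend()
plt.show()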