[Pyro] Statistical modeling with the probabilistic programming language Pyro ③ ~ Analysis of variance model, normal linear model ~

Introduction

In the previous articles, I introduced the basics of the probabilistic programming language Pyro, how to use it, and how to perform Bayesian inference of the parameters of a simple regression model with Pyro. This time, I will introduce how to implement with Pyro the "analysis of variance model" that appears in Chapter 6 of Part 3 of the reference book and the "normal linear model" that appears in Chapter 7.

ANOVA model

Usage data & example introduction

As in the previous article, the data handled this time is also beer sales data. Last time the explanatory variable was temperature, but this time the explanatory variable is **weather**.

|   | sales | weather |
|---|-------|---------|
| 0 | 48.5 | cloudy |
| 1 | 64.8 | cloudy |
| 2 | 85.8 | cloudy |
| 3 | 45.0 | cloudy |
| 4 | 60.8 | cloudy |

The qualitative variable weather is used as the explanatory variable, and the **analysis of variance model** is used as the model to explain beer sales. The reference book explains it as follows.

This model assumes that the average value of the response variables changes depending on the weather. A model that uses qualitative data as explanatory variables and assumes a normal distribution for the probability distribution is called an analysis of variance model.

(Shinya Baba. Practical Data Science Series: Introduction to Data Analysis by Bayesian Statistical Modeling Starting with R and Stan (Japanese Edition) (Kindle Locations 3608-3610). Kindle Edition.)

In this problem, there are three types of weather, {sunny, rainy, cloudy}, so a dummy variable ($sunny$) representing sunny days and a dummy variable ($rainy$) representing rainy days are incorporated, and the model is as follows. We estimate the latent variables ($Intercept, \beta_1, \beta_2, \sigma^2$).

$$ sales_i \sim \text{Normal}(Intercept + \beta_1 \, sunny_i + \beta_2 \, rainy_i,\ \sigma^2) $$
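For reference, here is a minimal sketch of what this dummy-variable encoding looks like (a toy example added for illustration; the weather values cloudy / rainy / sunny match the data above, and with `drop_first=True` cloudy becomes the baseline category encoded as all zeros).

import pandas as pd

# Toy example: encode the three weather categories as two dummy variables
weather = pd.Series(["cloudy", "rainy", "sunny"], name="weather")
dummies = pd.get_dummies(weather, drop_first=True).astype(int)
print(dummies)
#    rainy  sunny
# 0      0      0   <- cloudy: both dummies are 0 (baseline)
# 1      1      0   <- rainy
# 2      0      1   <- sunny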

Implementation

The following shows an implementation that estimates the posterior distribution by MCMC.

import pandas as pd
import torch

import pyro
import pyro.distributions as dist
import pyro.infer as infer


def model(sunny: torch.Tensor, rainy: torch.Tensor, sales: torch.Tensor = None):
    # Weakly informative priors for the regression coefficients
    intercept = pyro.sample("intercept", dist.Normal(0, 1000))
    beta1 = pyro.sample("beta1", dist.Normal(0, 1000))
    beta2 = pyro.sample("beta2", dist.Normal(0, 1000))
    # Prior for the noise scale (used as the standard deviation of the Normal below)
    sigma2 = pyro.sample("sigma2", dist.Uniform(0, 1000))
    with pyro.plate("plate", size=sunny.size(0)):
        y_ = intercept + beta1 * sunny + beta2 * rainy
        y = pyro.sample("y", dist.Normal(y_, sigma2), obs=sales)
    return y

# Read the data
beer_sales_3 = pd.read_csv('https://github.com/logics-of-blue/book-r-stan-bayesian-model-intro/raw/master/book-data/3-6-1-beer-sales-3.csv')
# Convert the weather column into dummy variables (cloudy is the baseline)
rainy_sunny = pd.get_dummies(beer_sales_3.weather, drop_first=True)
sunny = torch.tensor(rainy_sunny.sunny.values, dtype=torch.float32)
rainy = torch.tensor(rainy_sunny.rainy.values, dtype=torch.float32)
sales = torch.tensor(beer_sales_3.sales.values, dtype=torch.float32)

# Estimate the posterior distribution by MCMC
nuts_kernel = infer.NUTS(model, adapt_step_size=True, jit_compile=True, ignore_jit_warnings=True)
mcmc = infer.MCMC(nuts_kernel, num_samples=5000, warmup_steps=200, num_chains=1)
mcmc.run(sunny, rainy, sales)
mcmc.summary()

|           | mean  | std  | median | 5.0%  | 95.0% | n_eff   | r_hat |
|-----------|-------|------|--------|-------|-------|---------|-------|
| intercept | 63.06 | 2.4  | 9.03   | 59.12 | 67    | 2435.11 | 1 |
| beta1     | 20.02 | 3.39 | 9.03   | 14.12 | 25.27 | 2729.79 | 1 |
| beta2     | -0.36 | 3.34 | 9.03   | -5.43 | 5.42  | 2716.48 | 1 |
| sigma2    | 16.95 | 0.97 | 16.91  | 15.25 | 18.42 | 3628.93 | 1 |

You can see that the results are consistent with those in the book. The credible interval for $\beta_1$ (between the 5% and 95% quantiles) ranges from 14.12 to 25.27, indicating that sales tend to be higher on sunny days than on cloudy days. On the other hand, since the credible interval for $\beta_2$ straddles 0, there is no clear difference between rainy days and cloudy days.
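As a side note, the posterior samples themselves can be retrieved from the `mcmc` object, so the same quantile-based intervals can be computed directly. A minimal sketch (the site names follow the model defined above):

import numpy as np

# Posterior draws keyed by sample-site name ("intercept", "beta1", "beta2", "sigma2")
samples = {name: value.numpy() for name, value in mcmc.get_samples().items()}

# Central interval between the 5% and 95% quantiles, matching the summary columns above
lower, upper = np.quantile(samples["beta1"], [0.05, 0.95])
print(f"beta1 5%-95% interval: [{lower:.2f}, {upper:.2f}]")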

Normal linear model

Usage data & example introduction

In the simple regression model, we built a model that explains sales by temperature, and in the analysis of variance model, a model that explains sales by weather. The modeling results so far show that both temperature and weather are factors related to sales. Here, in order to explain sales, we consider building a model that incorporates both the quantitative variable temperature and the qualitative variable weather. The **normal linear model** is used in this situation. The reference book explains it as follows.

Regardless of whether the data are quantitative or qualitative, models that use multiple explanatory variables in the linear predictor, the identity function as the link function, and the normal distribution as the probability distribution are generally called normal linear models.

The model is described as follows. We estimate the latent variables ($Intercept, \beta_1, \beta_2, \beta_3, \sigma^2$).

$$ sales_i \sim \text{Normal}(Intercept + \beta_1 \, sunny_i + \beta_2 \, rainy_i + \beta_3 \, temperature_i,\ \sigma^2) $$

|   | sales   | weather | temperature |
|---|---------|---------|-------------|
| 0 | 40.6433 | cloudy  | 13.7 |
| 1 | 99.5527 | cloudy  | 24.0 |
| 2 | 85.3268 | cloudy  | 21.5 |
| 3 | 69.2879 | cloudy  | 13.4 |
| 4 | 71.0994 | cloudy  | 28.9 |

Implementation

This time, instead of MCMC, the posterior is estimated by variational inference with an AutoDelta guide, which amounts to MAP (point) estimation of each parameter.

from collections import defaultdict

import pandas as pd
import torch
from tqdm import tqdm

import pyro
import pyro.distributions as dist
import pyro.infer as infer


def model(sunny: torch.Tensor, rainy: torch.Tensor, temperature: torch.Tensor, sales: torch.Tensor = None):
    # Weakly informative priors for the regression coefficients
    intercept = pyro.sample("intercept", dist.Normal(0, 1000))
    beta1 = pyro.sample("beta1", dist.Normal(0, 1000))
    beta2 = pyro.sample("beta2", dist.Normal(0, 1000))
    beta3 = pyro.sample("beta3", dist.Normal(0, 1000))
    # Prior for the noise scale (used as the standard deviation of the Normal below)
    sigma2 = pyro.sample("sigma2", dist.Uniform(0, 1000))
    with pyro.plate("plate", size=sunny.size(0)):
        y_ = intercept + beta1 * sunny + beta2 * rainy + beta3 * temperature
        y = pyro.sample("y", dist.Normal(y_, sigma2), obs=sales)
    return y

# Read the data
beer_sales_4 = pd.read_csv('https://github.com/logics-of-blue/book-r-stan-bayesian-model-intro/raw/master/book-data/3-7-1-beer-sales-4.csv')
# Convert the weather column into dummy variables (cloudy is the baseline)
rainy_sunny = pd.get_dummies(beer_sales_4.weather, drop_first=True)
sunny = torch.tensor(rainy_sunny.sunny.values, dtype=torch.float32)
rainy = torch.tensor(rainy_sunny.rainy.values, dtype=torch.float32)
temperature = torch.tensor(beer_sales_4.temperature.values, dtype=torch.float32)
sales = torch.tensor(beer_sales_4.sales.values, dtype=torch.float32)
# Estimate by variational inference (AutoDelta guide = MAP point estimates)
guide = infer.autoguide.guides.AutoDelta(model)
adam_params = {"lr": 1e-1, "betas": (0.95, 0.999)}
optimizer = pyro.optim.Adam(adam_params)

svi = infer.SVI(model, guide, optimizer, loss=infer.JitTrace_ELBO())

n_steps = 100000
losses = []
param_dict = defaultdict(list)
for step in tqdm(range(n_steps)):
    loss = svi.step(sunny, rainy, temperature, sales)
    losses.append(loss)
    # Record the current value of each parameter so its trajectory can be inspected later
    for name in pyro.get_param_store():
        param_dict[name].append(pyro.param(name).item())

# Display the estimation results
for name in pyro.get_param_store():
    print("{}: {}".format(name, pyro.param(name)))

AutoDelta.intercept: 20.227853775024414
AutoDelta.beta1: 29.45983123779297
AutoDelta.beta2: -3.535917043685913
AutoDelta.beta3: 2.547081708908081
AutoDelta.sigma2: 15.715911865234375

The intercept is about 20.2, $\beta_1$ is about 29.5, $\beta_2$ is about -3.54, and $\beta_3$ is about 2.55, which is consistent with the MCMC estimation results in the reference book.
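Because variational inference only gives reliable estimates once the optimization has converged, it is worth checking the ELBO loss recorded during training before trusting these values. A minimal sketch using the `losses` list collected in the loop above (assuming matplotlib is available):

import matplotlib.pyplot as plt

# The ELBO loss should flatten out once the optimization has converged
plt.plot(losses)
plt.xlabel("step")
plt.ylabel("ELBO loss")
plt.show()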

Summary

In the previous article and this one, I have introduced how to implement in Pyro linear regression models (the simple regression model, the analysis of variance model, and the normal linear model) that assume a normal distribution for the response variable. From the next article, I will introduce how to implement with Pyro a Poisson regression model, which assumes a Poisson distribution for the response variable.
