Previous ・ Previous, the basics of the stochastic programming language Pyro I introduced how to use it and how to Bayesian infer the parameters of a simple regression model with Pyro. This time, the "Analysis of Variance Model" that appears in Chapter 6 of Part 3 of Reference Books and the "Normal Linear Model" that appears in Chapter 7 I will introduce how to realize "" with Pyro.
Similar to Previous article, the data handled this time is also related to beer sales. Last time, the explanatory variable was temperature, but this time the explanatory variable is ** weather **.
sales | weather | |
---|---|---|
0 | 48.5 | cloudy |
1 | 64.8 | cloudy |
2 | 85.8 | cloudy |
3 | 45 | cloudy |
4 | 60.8 | cloudy |
The qualitative variable of weather is used as an explanatory variable, and the ** analysis of variance model ** is used as a model to explain beer sales. In the reference book, it is explained as follows.
This model assumes that the average value of the response variables changes depending on the weather. A model that uses qualitative data as explanatory variables and assumes a normal distribution for the probability distribution is called an analysis of variance model.
(Shinya Baba. Practical Data Science Series Beginning with R and Stand Introduction to Data Analysis by Bayesian Statistical Modeling (Japanese Edition) (Kindle Position No.3608-3610). Kindle Edition.)
In this problem, there are three types of weather, {sunny, rainy, cloudy}, so a dummy variable ($ \ it {sunny}
sales_i \sim \rm{Normal}(Intercept + \beta_1\it{sunny_i} + \beta_2\it{rainy}_i , sigma^2)
The following shows an implementation that estimates the posterior distribution by MCMC.
import pyro
import pyro.distributions as dist
import pyro.infer as infer
def model(sunny: torch.Tensor, rainy: torch.Tensor, sales: torch.Tensor=None):
intercept = pyro.sample("intercept", dist.Normal(0, 1000))
beta1 = pyro.sample("beta1", dist.Normal(0, 1000))
beta2 = pyro.sample("beta2", dist.Normal(0, 1000))
sigma2 = pyro.sample("sigma2", dist.Uniform(0, 1000))
with pyro.plate("plate", size=sunny.size(0)):
y_ = intercept + beta1 * sunny + beta2 * rainy
y = pyro.sample("y", dist.Normal(y_, sigma2), obs=sales)
return y
#Data read
beer_sales_3 = pd.read_csv('https://github.com/logics-of-blue/book-r-stan-bayesian-model-intro/raw/master/book-data/3-6-1-beer-sales-3.csv')
#Dummy variable
rainy_sunny = pd.get_dummies(beer_sales_3.weather, drop_first=True)
sunny = torch.Tensor(rainy_sunny.sunny)
rainy = torch.Tensor(rainy_sunny.rainy)
sales = torch.Tensor(beer_sales_3.sales)
#Estimating posterior distribution by MCMC
nuts_kernel = infer.NUTS(model, adapt_step_size=True, jit_compile=True, ignore_jit_warnings=True)
mcmc = infer.MCMC(nuts_kernel, num_samples=5000, warmup_steps=200, num_chains=1)
mcmc.run(sunny, rainy, sales)
mcmc.summary()
mean | std | median | 5.0% | 95.0% | n_eff | r_hat | |
---|---|---|---|---|---|---|---|
intercept | 63.06 | 2.4 | 9.03 | 59.12 | 67 | 2435.11 | 1 |
beta1 | 20.02 | 3.39 | 9.03 | 14.12 | 25.27 | 2729.79 | 1 |
beta2 | -0.36 | 3.34 | 9.03 | -5.43 | 5.42 | 2716.48 | 1 |
sigma2 | 16.95 | 0.97 | 16.91 | 15.25 | 18.42 | 3628.93 | 1 |
You can see that it is consistent with the results in the book. The 95% confidence interval for $ beta_1 $ ranges from 14.12 to 25.27, indicating that sunny weather tends to increase sales compared to cloudy weather. On the other hand, since the confidence interval of $ beta_2 $ straddles 0, it can be seen that there is no big difference when it is raining than when it is cloudy.
In the simple regression model, we assumed a model that explains sales by temperature, and in the analysis of variance model, we assumed a model that explains sales by weather. From the modeling results so far, it was found that both temperature and weather are factors related to sales. Here, in order to explain sales, we consider building a model that incorporates both the quantitative variable of temperature and the qualitative variable of weather. The ** normal linear model ** is used in this situation. The explanation by the reference book is as follows.
Regardless of quantitative or qualitative data, models that use multiple explanatory variables for linear predictors, identity functions for link functions, and normal distributions for probability distributions are generally called normal linear models. ..
The model is described as follows. Estimate the latent variables ($ Intercept, \ beta_1, \ beta_2, \ beta_3, sigma ^ 2 $).
sales_i \sim \rm{Normal}(Intercept + \beta_1\it{sunny_i} + \beta_2\it{rainy_i} + \beta_3\it{temperature_i}, sigma^2)
sales | weather | temperature | |
---|---|---|---|
0 | 40.6433 | cloudy | 13.7 |
1 | 99.5527 | cloudy | 24 |
2 | 85.3268 | cloudy | 21.5 |
3 | 69.2879 | cloudy | 13.4 |
4 | 71.0994 | cloudy | 28.9 |
import pyro
import pyro.distributions as dist
import pyro.infer as infer
def model(sunny: torch.Tensor, rainy: torch.Tensor, temperature: torch.Tensor, sales: torch.Tensor=None):
intercept = pyro.sample("intercept", dist.Normal(0, 1000))
beta1 = pyro.sample("beta1", dist.Normal(0, 1000))
beta2 = pyro.sample("beta2", dist.Normal(0, 1000))
beta3 = pyro.sample("beta3", dist.Normal(0, 1000))
sigma2 = pyro.sample("sigma2", dist.Uniform(0, 1000))
with pyro.plate("plate", size=sunny.size(0)):
y_ = intercept + beta1 * sunny + beta2 * rainy + beta3 * temperature
y = pyro.sample("y", dist.Normal(y_, sigma2), obs=sales)
return y
#Data read
beer_sales_4 = pd.read_csv('https://github.com/logics-of-blue/book-r-stan-bayesian-model-intro/raw/master/book-data/3-7-1-beer-sales-4.csv')
#Data transformation
rainy_sunny = pd.get_dummies(beer_sales_4.weather, drop_first=True)
sunny = torch.Tensor(rainy_sunny.sunny)
rainy = torch.Tensor(rainy_sunny.rainy)
temperature = torch.Tensor(beer_sales_4.temperature)
sales = torch.Tensor(beer_sales_4.sales)
#Estimate by variational reasoning
guide = infer.autoguide.guides.AutoDelta(model)
adam_params = {"lr": 1e-1, "betas": (0.95, 0.999)}
oprimizer = pyro.optim.Adam(adam_params)
svi = infer.SVI(model, guide, oprimizer, loss=infer.JitTrace_ELBO(),)
n_steps = 100000
losses = []
param_dict = defaultdict(list)
for step in tqdm(range(n_steps)):
loss = svi.step(sunny, rainy, temperature ,sales,)
update_param_dict(param_dict)
losses.append(loss)
#Display of estimation results
for name in pyro.get_param_store():
print("{}: {}".format(name, pyro.param(name)))
AutoDelta.intercept: 20.227853775024414 AutoDelta.beta1: 29.45983123779297 AutoDelta.beta2: -3.535917043685913 AutoDelta.beta3: 2.547081708908081 AutoDelta.sigma2: 15.715911865234375
The intercept is about 20.2, $ beta_1 $ is about 29.5, $ beta_2 $ is about -3.54, and $ beta_3 $ is about 2.55, which are consistent with the MCMC estimation results in the reference book.
Last time and in this article, Pyro is a linear regression model (simple regression model, ANOVA model, normal linear model) that assumes a normal distribution for the response variables. I have introduced how to realize it in. From the next time, I will introduce a method to realize modeling by Poisson regression model assuming Poisson distribution for the distribution of response variables with Pyro.
Recommended Posts