[Psychometrics] Bayesian estimation of "illusion amount" in Müller-Lyer illusion

Introduction

This article is a loose study memo written by a statistics beginner trying to keep up with a psychometrics class on Bayesian statistics, so it may contain mistakes. I would appreciate it if you could point them out. If you are in the same situation, let's do our best together. Content will be added as appropriate.

TL;DR: **Point estimation** and **interval estimation** of the parameters of the **posterior distribution** of the illusion amount in a **Müller-Lyer illusion experiment** were performed by **Bayesian estimation**.

Explanation of terms

What is the Müller-Lyer illusion?

[Figure: the Müller-Lyer illusion]

It is a kind of optical illusion in which line segments of the same length are perceived as longer when sandwiched between outward-pointing diagonal lines (figure B) than when sandwiched between inward-pointing diagonal lines (figure A). The difference between the actual line segment length and the perceived line segment length is called the **illusion amount**.

What is Bayesian inference?

Before explaining Bayesian estimation, let us take **maximum likelihood estimation**, a representative conventional method, as a point of comparison and describe how the two differ in framing the relationship between the model and the data.

Maximum likelihood estimation method

In maximum likelihood estimation, the relationship between the model and the data is framed as follows: there is one **true value** (a true model), and the data are generated probabilistically from that true model, so the observed data vary stochastically depending on how they happen to be sampled. Taking a coin toss as an example, even if the true probability of heads is 0.5, the observed proportion may come out as 0.8 or 0.3. However, if you repeat the toss many times and take the average, it should approach 0.5.

When the data are fixed, we ask: under which model would these data be most likely to be obtained? This is the idea of likelihood, and the method of estimating the model that maximizes the likelihood is maximum likelihood estimation.
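As a toy illustration (my own sketch with made-up numbers, not from the class), the maximum likelihood estimate of a coin's heads probability can be found numerically, and for the binomial model it coincides with the observed proportion:

```python
from scipy import optimize, stats

# Hypothetical data: 8 heads out of 10 tosses
n, heads = 10, 8

# Negative log-likelihood of p under the binomial model
def neg_log_likelihood(p):
    return -stats.binom.logpmf(heads, n, p)

# Maximizing the likelihood = minimizing its negative over 0 < p < 1
res = optimize.minimize_scalar(neg_log_likelihood,
                               bounds=(1e-6, 1 - 1e-6), method='bounded')
print(res.x)  # approximately 0.8, i.e. heads / n
```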

Bayesian inference

In Bayesian estimation, on the other hand, the true value is treated as a **probability distribution**. The data are regarded simply as information: as more data are added and the distribution is updated, we can estimate a distribution (the distribution of the true value) that explains the phenomenon better. The probability distribution expressing our subjective belief about where the population parameter lies before seeing the data is called the **prior distribution**, and the distribution of the population parameter updated after obtaining the data is called the **posterior distribution**. Bayesian statistics models phenomena using Bayes' theorem. The posterior distribution is computed by the following simple formula.

**Posterior distribution** = (prior distribution × likelihood) / distribution of the data (normalization constant)
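As a concrete instance of this formula (a hypothetical coin-toss example of mine), a Beta prior combined with a binomial likelihood yields a Beta posterior in closed form, so the update can be carried out without computing any integral:

```python
from scipy import stats

# Hypothetical data: 8 heads out of 10 tosses
n, heads = 10, 8

# Prior: Beta(1, 1), i.e. uniform (a non-informative choice)
a_prior, b_prior = 1, 1

# Conjugate update: posterior = Beta(prior alpha + heads, prior beta + tails)
posterior = stats.beta(a_prior + heads, b_prior + (n - heads))

print(posterior.mean())               # posterior mean (EAP): 9/12 = 0.75
print(posterior.ppf([0.025, 0.975]))  # 95% credible interval
```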

Maximum likelihood estimation estimates the true value from the likelihood alone (strictly speaking, a uniform prior is implicitly applied), whereas in Bayesian estimation the posterior distribution is estimated from the product of the likelihood and the prior distribution, so the result is affected by the prior. Therefore, to keep the analysis objective and fair, it is a good idea to choose a **non-informative prior distribution** that influences the posterior as little as possible.

For details, an article by a university faculty member specializing in social psychology (listed in the references) explains the differences from conventional maximum likelihood estimation, along with the advantages and disadvantages, in an easy-to-understand way, so please refer to it (~~the explanation above is almost entirely cribbed from the cited article~~).

MCMC method

The posterior distribution can be derived as (prior distribution × likelihood) / distribution of the data. However, the distribution of the data (the integral giving the normalization constant) is generally difficult to solve analytically. In Bayesian estimation, therefore, an algorithm called the **MCMC method** is used to **generate random numbers for the parameters that follow the posterior distribution**, and the posterior distribution is obtained as if it were a data distribution. The resulting sequence of random numbers is called a **chain**.
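To make the idea concrete, here is a minimal random-walk Metropolis sampler I wrote for illustration (the PyMC3 code later uses a more sophisticated sampler such as NUTS; the fixed sigma below is an assumption for the toy example). The key point is that the acceptance rule only involves a ratio of posterior densities, so the intractable normalization constant cancels out:

```python
import numpy as np

def metropolis(log_posterior, start, n_samples=10000, step=0.5):
    """Generate a chain of random numbers that follow the target posterior."""
    rng = np.random.default_rng(0)
    chain = np.empty(n_samples)
    current, current_lp = start, log_posterior(start)
    for i in range(n_samples):
        proposal = current + rng.normal(0, step)  # propose a nearby value
        proposal_lp = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio); the normalization
        # constant cancels in the ratio, so it never has to be computed
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain[i] = current
    return chain

# Toy target: posterior of a normal mean with a flat prior and sigma fixed at 2
data = np.array([23, 21, 19, 22, 20, 17, 22, 18, 20, 25])
log_post = lambda mu: -np.sum((data - mu) ** 2) / (2 * 2.0 ** 2)
chain = metropolis(log_post, start=10.0)
print(chain[2000:].mean())  # discard the early part of the chain (burn-in)
```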

What I want to do this time

Having briefly explained the terms, let me confirm what I want to do this time.

This time, for the parameters of the posterior distribution estimated from the data observed in the Müller-Lyer illusion experiment, I want to perform:

1. **Point estimation** of $\mu$ (what is the mean $\mu$ of the "illusion amount"?)
2. **Interval estimation** of $\mu$ (between which mm values does the mean $\mu$ of the "illusion amount" lie?)
3. **Point estimation** of $\sigma$ (what is the perceptual dispersion $\sigma$ of the "illusion amount"?)
4. **Interval estimation** of $\sigma$ (between which mm values does the perceptual dispersion $\sigma$ of the "illusion amount" lie?)

So we will analyze the data using Bayesian estimation.

Setting

A participant performed 10 trials of the Müller-Lyer illusion task, and the **illusion amount** (the difference between the actual line segment length and the perceived length, in mm) for figure A came out as 23, 21, 19, 22, 20, 17, 22, 18, 20, 25. We will use this data for the analysis.

Operating environment

Analysis

Preparation

import numpy as np
import scipy as sp
import scipy.stats as stats
import pandas as pd

from IPython.core.pylabtools import figsize
from matplotlib import pyplot as plt

Data preparation

observed_data_list =  [23,21,19,22,20,17,22,18,20,25]
observed_data = pd.Series(observed_data_list)

Sampling of posterior distribution

A Python library called PyMC3 is used to sample from the posterior distribution. This time we assume a normal distribution model, so there are two parameters to estimate: $\mu$ and $\sigma$.

Recall that in Bayesian estimation the parameters are treated not as fixed values but as probability distributions. Therefore, the parameters $\mu$ and $\sigma$ also need prior distributions of some kind. Here, the priors of the population mean $\mu$ and the population standard deviation $\sigma$ are assumed to be uniform, with ranges set sufficiently wide so as not to inject subjectivity.

In addition, 25,000 random numbers are generated per chain this time, with 5 chains.

import pymc3 as pm

with pm.Model() as model:
  # Prior distributions: uniform over a sufficiently wide range
  mu = pm.Uniform('mu', 10, 35)
  sigma = pm.Uniform('sigma', 1, 9)
  # Likelihood: normal model for the observed illusion amounts
  ml = pm.Normal('ml', mu=mu, sd=sigma, observed=observed_data)
  # Sampling: 25,000 draws per chain, 5 chains (as stated above)
  trace_g = pm.sample(25000, chains=5)

Burn-in

Since the first 5,000 random numbers generated are likely not yet following the posterior distribution, they are discarded (burn-in).

chain_g = trace_g[5000:]

Illustration of the estimated posterior distribution

pm.traceplot(chain_g)
plt.show()

[Figure: trace plots and posterior distributions of $\mu$ and $\sigma$]

Calculation of posterior distribution summary statistics

pm.summary(chain_g)

[Table: summary statistics of the posterior samples for $\mu$ and $\sigma$ (mean, sd, 2.5% and 97.5% points, $\hat{R}$)]

You can use pm.summary() to calculate the summary statistics of the posterior distribution. $\hat{R}$ is an index for judging whether the generated random numbers have converged to the target posterior distribution, and apparently the closer it is to 1, the better.
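As a cross-check (my own addition, not part of the original procedure), the same point and interval estimates reported below can be computed directly from the raw samples in the chain:

```python
import numpy as np

# EAP estimates: means of the posterior samples
mu_eap = chain_g['mu'].mean()
sigma_eap = chain_g['sigma'].mean()

# 95% credible intervals: 2.5% and 97.5% sample quantiles
mu_ci = np.percentile(chain_g['mu'], [2.5, 97.5])
sigma_ci = np.percentile(chain_g['sigma'], [2.5, 97.5])

print(mu_eap, mu_ci)        # should be close to the table values
print(sigma_eap, sigma_ci)
```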

Results of the analysis

1. Point estimation of the mean value of the posterior distribution

The mean of the posterior distribution is called the **expected a posteriori (EAP)** estimate. From the table, the EAP of $\mu$ was **$\hat{\mu}_{EAP} = 21.39$ mm**.

2. Interval estimation of the mean value of the posterior distribution

Reading the 2.5% and 97.5% points from the table, the two-sided 95% credible interval of $\mu$ was **[18.83 mm, 24.85 mm]**.

3. Point estimation of standard deviation of posterior distribution

From the table, the EAP of $\sigma$ was **$\hat{\sigma}_{EAP} = 4.00$ mm**.

4. Interval estimation of standard deviation of posterior distribution

Reading the 2.5% and 97.5% points from the table, the two-sided 95% credible interval of $\sigma$ was **[2.00 mm, 6.61 mm]**.

In closing

I started studying Bayesian statistics as a break from hypothesis testing, but while reading books, terms like **chain** and **burn-in** flew around, and because the procedure for obtaining the parameters is complicated, I often found myself thinking **"what is this actually doing?"** along the way. Coming from someone who had always just mashed the run button in statistical software and mindlessly read off the results, I thought **"do I really have to go through all this effort just to estimate population parameters...?"** and my brain was on the verge of bursting.

[Addendum] Apparently there are ongoing debates in the field of Bayesian statistics stemming from differences in philosophy and interpretation. Please treat the content of this article as nothing more than one study memo.

References

Hideki Toyoda, ["Psychological Statistics: Escape from Significance Testing (New Revised Edition)" (The Open University of Japan Teaching Materials)](https://www.amazon.co.jp/%E5%BF%83%E7%90%86%E7%B5%B1%E8%A8%88%E6%B3%95%E2%80%95%E6%9C%89%E6%84%8F%E6%80%A7%E6%A4%9C%E5%AE%9A%E3%81%8B%E3%82%89%E3%81%AE%E8%84%B1%E5%8D%B4-%E6%94%BE%E9%80%81%E5%A4%A7%E5%AD%A6%E6%95%99%E6%9D%90-%E8%B1%8A%E7%94%B0-%E7%A7%80%E6%A8%B9/dp/4595317050/ref=sr_1_1?__mk_ja_JP=%E3%82%AB%E3%82%BF%E3%82%AB%E3%83%8A&keywords=%E5%BF%83%E7%90%86%E7%B5%B1%E8%A8%88%E6%B3%95&qid=1580456205&sr=8-1)

https://norimune.net/708
