I used Python to visualize asymptotic properties of probability distributions, the central limit theorem being the typical example. The properties covered are mainly excerpted from the book "Statistics for the Officially Certified Statistical Test Level 1 of the Japan Statistical Society".
Suppose the random variables $X_i\ (i = 1, \cdots, n)$ are independent and identically distributed with mean $\mu$ and variance $\sigma^2$. Then, as $n \to \infty$, $\frac{\sum_{i=1}^n X_i}{n}$ approximately follows a normal distribution with mean $\mu$ and variance $\frac{\sigma^2}{n}$.
Since we are at it, we run the experiment with a skewed distribution. Let the density function of $X_i\ (i = 1, \cdots, n)$ be
f(x) = 11 x^{10}\ \ \ (0\leq x\leq1)
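For reference, direct integration of this density gives the mean, variance, and cumulative distribution function that appear in the code below:
E[X]=\int_0^1 11x^{11}dx=\frac{11}{12},\qquad V[X]=\frac{11}{13}-\Big(\frac{11}{12}\Big)^2,\qquad F(x)=x^{11},\qquad F^{-1}(u)=u^{1/11}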
The blue histogram shows the numerically obtained density of $\frac{\sum_{i=1}^n X_i}{n}$, and the solid orange line shows the density of the normal distribution to which it theoretically converges.
Result for $n = 10$:
The Python source code for the $n = 1$ case is shown below.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
# Probability density function: mean 11/12, variance 11/13 - (11/12)^2
def f(x):
    return 11 * (x ** 10)
mu = 11 / 12
sigma2 = 11 / 13 - (11 / 12) ** 2
# Cumulative distribution function
def F(x):
    return x ** 11
# Inverse of the cumulative distribution function (used for inverse-transform sampling)
def F_inv(x):
    return x ** (1 / 11)
n = 1
xmin = 0.6
xmax = 1.2
meanX = []
for _ in range(100000):
    # Draw n samples by inverse-transform sampling and record their mean
    meanX.append(np.sum(F_inv(np.random.rand(n))) / n)
plt.hist(meanX, bins=200, density=True)
s = np.linspace(xmin, xmax, 100)
plt.plot(s, norm.pdf(s, mu, (sigma2 / n) ** 0.5), linewidth=4)
plt.xlim(xmin, xmax)
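As a quick sanity check (not part of the original code, reusing F_inv, mu, and sigma2 defined above), the inverse-transform samples can be compared against the theoretical moments:
samples = F_inv(np.random.rand(100000))
print(np.mean(samples), mu)      # both should be close to 11/12
print(np.var(samples), sigma2)   # both should be close to 11/13 - (11/12)^2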
Suppose $(X_1, X_2, \cdots, X_m)$ follows a multinomial distribution with $n$ trials and probabilities $(p_1, p_2, \cdots, p_m)$. Then, as $n \to \infty$,
\sum_{i=1}^m\frac{(X_i-np_i)^2}{np_i}
follows the $\chi^2$ distribution with $(m-1)$ degrees of freedom.
Consider the case where the probabilities are $\big(\frac{1}{16}, \frac{1}{4}, \frac{1}{4}, \frac{7}{16}\big)$. The numerically obtained distribution of
\sum_{i=1}^4\frac{(X_i-np_i)^2}{np_i}
is shown by the blue histogram, and the $\chi^2$ distribution with 3 degrees of freedom, to which it theoretically converges, is shown by the solid orange line.
The Python source code for the $n = 4$ case is shown below.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import chi2
p = [1/16, 1/4, 1/4, 7/16]
n = 4
xmin = 0
xmax = 15
xx = []
for _ in range(100000):
    r = np.random.rand(1, n)
    # Count how many of the n uniform draws fall into each of the four categories
    x = [0] * 4
    x[0] = np.sum(r < sum(p[:1]))
    x[1] = np.sum(np.logical_and(sum(p[:1]) <= r, r < sum(p[:2])))
    x[2] = np.sum(np.logical_and(sum(p[:2]) <= r, r < sum(p[:3])))
    x[3] = np.sum(sum(p[:3]) <= r)
    # Pearson's chi-square statistic
    xx.append(sum([(x[i] - n * p[i]) ** 2 / (n * p[i]) for i in range(4)]))
plt.hist(xx, bins=int(max(xx))*5, density=True)
s = np.linspace(xmin, xmax, 100)
plt.plot(s, chi2.pdf(s, 3), linewidth=4)
plt.xlim(xmin, xmax)
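As a cross-check (not in the original code), the category counts could also be drawn directly with NumPy's multinomial sampler and the statistic computed in vectorized form; a minimal sketch:
import numpy as np
p = np.array([1/16, 1/4, 1/4, 7/16])
n = 4
counts = np.random.multinomial(n, p, size=100000)       # one row of category counts per trial
stat = np.sum((counts - n * p) ** 2 / (n * p), axis=1)  # Pearson's chi-square statistic
# For large n, stat is approximately chi-square distributed with 3 degrees of freedom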
Suppose the true probabilities $p_i\ (i = 1, 2, \cdots, m)$ are unknown, but are known to be expressible in terms of an $l$-dimensional parameter $\boldsymbol{\theta}$ ($l \leq m-2$). Let $\hat{p}_i$ be the maximum likelihood estimator of $p_i$. Then, as $n \to \infty$,
\sum_{i=1}^m\frac{(X_i-n\hat{p}_i)^2}{n\hat{p}_i}
follows the $\chi^2$ distribution with $(m-l-1)$ degrees of freedom.
Consider again the probabilities $\big(\frac{1}{16}, \frac{1}{4}, \frac{1}{4}, \frac{7}{16}\big)$. The true $p$ is unknown, but we do know that $p_2 = p_3$, so $p$ can be written as $p = [q, r, r, 1-2r-q]$, and $q$ and $r$ are found by maximum likelihood estimation.
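Maximizing the multinomial log-likelihood $X_1\log q + (X_2+X_3)\log r + X_4\log(1-2r-q)$ gives
\hat{q} = \frac{X_1}{n},\qquad \hat{r} = \frac{X_2+X_3}{2n}
so that $\hat{p} = \big[\frac{X_1}{n}, \frac{X_2+X_3}{2n}, \frac{X_2+X_3}{2n}, \frac{X_4}{n}\big]$, which is what the code below computes. The numerically obtained distribution of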
\sum_{i=1}^m\frac{(X_i-n\hat{p}_i)^2}{n\hat{p}_i}
is shown by the blue histogram, and the $\chi^2$ distribution with 1 degree of freedom, to which it theoretically converges, is shown by the solid orange line.
The Python source code for the $n = 4$ case is shown below.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import chi2
p = [1/16, 1/4, 1/4, 7/16]  # true category probabilities (needed below to generate the samples)
n = 4
xmin = 0
xmax = 3
xx = []
for _ in range(100000):
    r = np.random.rand(1, n)
    # Count how many of the n uniform draws fall into each of the four categories
    x = [0] * 4
    x[0] = np.sum(r < sum(p[:1]))
    x[1] = np.sum(np.logical_and(sum(p[:1]) <= r, r < sum(p[:2])))
    x[2] = np.sum(np.logical_and(sum(p[:2]) <= r, r < sum(p[:3])))
    x[3] = np.sum(sum(p[:3]) <= r)
    # Maximum likelihood estimates under the constraint p = [q, r, r, 1-2r-q]
    p_ = [0] * 4
    p_[0] = x[0] / sum(x)
    p_[1] = (x[1] + x[2]) / (2 * sum(x))
    p_[2] = (x[1] + x[2]) / (2 * sum(x))
    p_[3] = x[3] / sum(x)
    # Chi-square statistic (note: the denominator uses the true p[i] rather than the estimate p_[i],
    # which can be zero for small n)
    xx.append(sum([(x[i] - n * p_[i]) ** 2 / (n * p[i]) for i in range(4)]))
plt.hist(xx, bins=int(max(xx))*20, density=True)
s = np.linspace(xmin, xmax, 100)
plt.plot(s, chi2.pdf(s, 1), linewidth=4)
plt.xlim(xmin, xmax)
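For a single sample, the statistic in the formula above (and a p-value) could also be obtained with scipy.stats.chisquare; a minimal sketch, assuming the counts x and estimates p_ from the last iteration of the loop above, and that none of the estimated expected counts is zero:
from scipy.stats import chisquare
# ddof=2 accounts for the two estimated parameters q and r, giving 4 - 1 - 2 = 1 degree of freedom
stat, pval = chisquare(f_obs=x, f_exp=[n * pi for pi in p_], ddof=2)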
Consider random variables $X_i\ (i = 1, \cdots, n)$ whose distribution is characterized by a parameter $\theta$. Let $\theta_0$ be the true parameter and $\hat\theta$ the maximum likelihood estimator, and define the Fisher information $J_n(\theta)$ by
J_n(\theta)=E_\theta\Big[\Big(\frac{\partial}{\partial\theta}\log f(X_1,...,X_n;\theta)\Big)^2\Big]
As $n \to \infty$, $\hat\theta$ approximately follows a normal distribution with mean $\theta_0$ and variance $J_n(\theta_0)^{-1}$.
Let the density function of $X_i\ (i = 1, \cdots, n)$ be
f(x) = (\theta+1) x^{\theta}\ \ \ (0\leq x\leq1)
with the true value of $\theta$ equal to 10.
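For this model the log-likelihood is $\log f(x_1,\cdots,x_n;\theta)=n\log(\theta+1)+\theta\sum_{i=1}^n\log x_i$, so the maximum likelihood estimator and the Fisher information are
\hat\theta=-\frac{n}{\sum_{i=1}^n\log x_i}-1,\qquad J_n(\theta)=\frac{n}{(\theta+1)^2}
and the asymptotic variance at $\theta_0 = 10$ is $J_n(10)^{-1}=\frac{11^2}{n}$, which is the value used in the code below.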
The experimentally obtained distribution of $\hat\theta$ is shown by the blue histogram, and the normal distribution to which it theoretically converges is shown in orange.
The Python source code for the $n = 1$ case is shown below.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
# Probability density function: mean 11/12, variance 11/13 - (11/12)^2
def f(x):
    return 11 * (x ** 10)
mu = 11 / 12
sigma2 = 11 / 13 - (11 / 12) ** 2
# Cumulative distribution function
def F(x):
    return x ** 11
# Inverse of the cumulative distribution function
def F_inv(x):
    return x ** (1 / 11)
n = 1
theta_min = -20
theta_max = 40
theta = []
for _ in range(100000):
    x = F_inv(np.random.rand(n))
    theta.append(- n / sum(np.log(x)) - 1)  # maximum likelihood estimator
theta = np.array(theta)
#theta histogram
th = np.linspace(theta_min, theta_max, 100)
hi = np.array([np.sum(np.logical_and(th[i] < theta, theta <= th[i + 1])) for i in range(100 - 1)]) / (len(theta) * (th[1] - th[0]))
plt.bar((th[:-1] + th[1:]) / 2, hi, width=(th[1] - th[0]))
#normal distribution
s = np.linspace(theta_min, theta_max, 100)
plt.plot(s, norm.pdf(s, 10, (11 ** 2 / n) ** 0.5), 'tab:orange', linewidth=4)
plt.xlim(theta_min, theta_max)
plt.ylim(0.0, 0.1)
Let $g$ be a function that is differentiable at $\theta_0$. As $n \to \infty$, $g(\hat\theta)$ approximately follows a normal distribution with mean $g(\theta_0)$ and variance $g^\prime(\theta_0)^2 J_n(\theta_0)^{-1}$. The definitions of $\theta$, $\hat\theta$, and $\theta_0$ are the same as in Part 1.
We continue the experiment of Part 1 and define $g$ by the following formula.
g(\theta)=\frac{1}{\theta}
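Since $g^\prime(\theta)=-\frac{1}{\theta^2}$, the asymptotic mean and variance at $\theta_0 = 10$ are
g(\theta_0)=\frac{1}{10},\qquad g^\prime(\theta_0)^2 J_n(\theta_0)^{-1}=\frac{1}{10^4}\cdot\frac{11^2}{n}
which are the parameters passed to norm.pdf in the code below.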
The experimentally obtained distribution of $g(\hat\theta)$ is shown by the blue histogram, and the normal distribution to which it theoretically converges is shown in orange.
The Python source code for the $n = 1$ case is shown below.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
# Probability density function: mean 11/12, variance 11/13 - (11/12)^2
def f(x):
    return 11 * (x ** 10)
mu = 11 / 12
sigma2 = 11 / 13 - (11 / 12) ** 2
# Cumulative distribution function
def F(x):
    return x ** 11
# Inverse of the cumulative distribution function
def F_inv(x):
    return x ** (1 / 11)
n = 1
theta_min = -0.2
theta_max = 0.6
theta = []
for _ in range(100000):
    x = F_inv(np.random.rand(n))
    theta.append(1 / (- n / sum(np.log(x)) - 1))  # g(theta_hat) = 1 / theta_hat
theta = np.array(theta)
th = np.linspace(theta_min, theta_max, 100)
hi = np.array([np.sum(np.logical_and(th[i] < theta, theta <= th[i + 1])) for i in range(100 - 1)]) / (len(theta) * (th[1] - th[0]))
plt.bar((th[:-1] + th[1:]) / 2, hi, width=(th[1] - th[0]))
s = np.linspace(theta_min, theta_max, 100)
plt.plot(s, norm.pdf(s, 1/10, (11 ** 2 / n / 10 ** 4) ** 0.5), 'tab:orange', linewidth=4)
plt.xlim(theta_min, theta_max)
Suppose there is a probability distribution $f(x;\theta)$ characterized by a parameter $\theta$. For the hypothesis testing problem $H_0: \theta \in \Theta_0$ versus $H_1: \theta \in \Theta_1$, the likelihood ratio $L$ is defined by the following equation.
L=\frac{\sup_{\theta\in\Theta_1} f(x_1,\cdots,x_n;\theta)}{\sup_{\theta\in\Theta_0} f(x_1,\cdots,x_n;\theta)}
Under the null hypothesis, $2\log L$ follows the $\chi^2$ distribution with $p$ degrees of freedom as $n \to \infty$, where $p = \dim(\Theta_1) - \dim(\Theta_0)$.
Here, let the density function be
f(x;\theta) = (\theta+1) x^{\theta}\ \ \ (0\leq x\leq1)
and consider the hypothesis testing problem $H_0: \theta = 10$ versus $H_1: \theta \neq 10$ (so $p = 1$).
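With $\hat\theta$ the unrestricted maximum likelihood estimator from the previous sections, the statistic can be written explicitly as
2\log L=2\Big(n\log\frac{\hat\theta+1}{11}+(\hat\theta-10)\sum_{i=1}^n\log x_i\Big)
which is what the code computes through the two densities $f(x;\hat\theta)$ and $f(x;10)$.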
Under the null hypothesis, the numerically obtained density of $2\log L$ is shown by the blue histogram, and the density of the $\chi^2$ distribution to which it theoretically converges is shown by the solid orange line.
Result for $n = 1$:
Result for $n = 10$:
The Python source code for the $n = 1$ case is shown below.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import chi2
# True distribution: density 11 x^10 (mean 11/12, variance 11/13 - (11/12)^2),
# i.e. the parametric model below with theta = 10
def f_true(x):
    return 11 * (x ** 10)
# Parametric model
def f(x, theta):
    return (theta + 1) * (x ** theta)
# Cumulative distribution function of the true distribution
def F(x):
    return x ** 11
# Inverse of the cumulative distribution function
def F_inv(x):
    return x ** (1 / 11)
n = 1
l_min = 0
l_max = 5
l = []
for _ in range(100000):
    x = F_inv(np.random.rand(n))
    theta = - n / sum(np.log(x)) - 1  # maximum likelihood estimator
    l.append(2 * np.log(np.prod(f(x, theta)) / np.prod(f(x, 10))))  # 2 log L
l = np.array(l)
#l histogram
th = np.linspace(l_min, l_max, 100)
hi = np.array([np.sum(np.logical_and(th[i] < l, l <= th[i + 1])) for i in range(100 - 1)]) / (len(l) * (th[1] - th[0]))
plt.bar((th[:-1] + th[1:]) / 2, hi, width=(th[1] - th[0]))
#Chi-square distribution
s = np.linspace(l_min, l_max, 100)
plt.plot(s, chi2.pdf(s, 1), 'tab:orange', linewidth=4)
plt.xlim(l_min, l_max)
plt.ylim(0, 2)
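As a small usage example (not in the original code, assuming the array l computed above), the asymptotic result can be used to check the size of the likelihood ratio test at the 5% level:
print(np.mean(l > chi2.ppf(0.95, 1)))  # empirical rejection rate under H0; should approach 0.05 as n grows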