Let's try again: maximum likelihood estimation and fitting of a model (probability distribution) ② Continuous probability distributions

Introduction

Previous article: maximum likelihood estimation and model fitting for discrete probability distributions. This time, we apply maximum likelihood estimation and model (probability distribution) fitting to a continuous probability distribution.

Continuous probability distribution: normal distribution

The probability density function of the normal distribution is as follows: $P(y_i) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(y_i - \mu)^2}{2\sigma^2}}$ ($\mu, \sigma$ = parameters)
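As a quick check of this formula, the density can be evaluated directly with NumPy. A minimal sketch (the helper name normal_pdf is ours, not from the article's code):

Python


import numpy as np

def normal_pdf(y, mu, sigma):
    """Density of the normal distribution at point y (the formula above)."""
    return (1 / np.sqrt(2 * np.pi * sigma**2)) * np.exp(-(y - mu)**2 / (2 * sigma**2))

print(normal_pdf(0.0, 0.0, 1.0))  # ≈ 0.3989, the peak of the standard normal density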

As before, the test data for estimating and fitting the model is generated in advance from the target probability distribution, so that the estimates can be checked against the true values.

Python


"""Random numbers from the normal distribution"""
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(seed=10)
norm_values = np.random.normal(0, 1, size=1000) #Average 0,1000 extracts from normal distribution with standard deviation 1
#drawing
plt.hist(norm_values, bins=50, range=[-5, 5], normed=True)
bin_edges

[Figure: histogram of the sampled values] Since the variable is continuous, the values cannot be counted directly as in the discrete case. Instead, they are binned into intervals of width 0.2 (the range [-5, 5] divided into 50 bins).
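To see the binning explicitly, np.histogram returns the bar heights and bin edges that plt.hist draws (a small sketch using the same 50 bins):

Python


counts, bin_edges = np.histogram(norm_values, bins=50, range=[-5, 5], density=True)
print(bin_edges[:3])  # [-5.  -4.8 -4.6]: bin edges spaced 0.2 apart
print(counts[:3])     # normalized bar heights (densities) of the first three bins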

Now, let's apply a normal distribution to such data and estimate its parameters with maximum likelihood.

Python


"""Parameter estimation by partial differentiation of the log-likelihood function"""
import sympy
#Define variables (v=σ**2)
sympy.var('μ v y')
#Likelihood p(Parameters|x)Define
fe=(1/sympy.sqrt(2*sympy.pi*v))*sympy.exp(-(y-μ)**2/(2*v))
#Logarithmic
logf=sympy.log(fe)
#Partially differentiate f and expand the equation
pdff1 = sympy.expand(sympy.diff(logf, μ)) #Partial differential with respect to μ
pdff2 = sympy.expand(sympy.diff(logf, v)) #Partial differential with respect to v

The expression pdff1, the partial derivative of the log-likelihood with respect to μ, is $\frac{y}{v} - \frac{\mu}{v}$, and the expression pdff2, the partial derivative with respect to v, is $-\frac{1}{2v} + \frac{y^2}{2v^2} - \frac{y\mu}{v^2} + \frac{\mu^2}{2v^2}$. Since $\sum \text{pdff1} = \sum \text{pdff2} = 0$ at the maximum, μ and v can be found by solving these equations.
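For the normal distribution these equations can also be solved by hand (a standard derivation, added here for completeness). Summing the derivative for μ over the $n$ observations gives $\sum_i \left(\frac{y_i}{v} - \frac{\mu}{v}\right) = 0$, so $\hat{\mu} = \frac{1}{n}\sum_i y_i$, the sample mean. Summing the derivative for v gives $\sum_i \left(-\frac{1}{2v} + \frac{(y_i - \hat{\mu})^2}{2v^2}\right) = 0$, so $\hat{v} = \frac{1}{n}\sum_i (y_i - \hat{\mu})^2$, the (biased) sample variance. The sympy code below reaches the same result numerically.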

Python


def L_sympy(fmu, fs, var, values):
    likelihood_mu = 0  # initial value of the score equation for μ
    likelihood_s = 0   # initial value of the score equation for v
    for i in np.arange(len(values)):
        # accumulate the score equations over the data
        likelihood_mu += fmu.subs(var, values[i])  # substitute the observed value for y
        likelihood_s += fs.subs(var, values[i])    # substitute the observed value for y
    param = sympy.solve([likelihood_mu, likelihood_s])  # solve the simultaneous equations
    return param

parameters = L_sympy(pdff1, pdff2, y, norm_values)
parameters[0]["σ"] = sympy.sqrt(parameters[0][v])
parameters

[{v: 0.879764754284410, μ: -0.0145566356154705, 'σ': 0.937957757196138}]

The estimates are very close to the true values μ = 0 and σ = 1 used to generate the data.
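This agrees with the closed-form solution derived above: the MLE of μ is the sample mean and the MLE of v is the biased sample variance, which NumPy confirms directly (a quick check, not part of the original code):

Python


print(np.mean(norm_values))         # sample mean: should match the estimated μ above
print(np.var(norm_values, ddof=0))  # biased sample variance: should match the estimated v
print(np.std(norm_values, ddof=0))  # its square root: should match the estimated σ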

When the statistical model becomes complicated

For details, see the previous article ①. Here we use numerical optimization to solve the (nonlinear) likelihood equations.

Python


"""Probability density function"""
def probability_function(x,param):
    from scipy.special import factorial
    # param[0]s,param[1]mu:Parameters
    # y:Data y
    # return:Probability density P of data y(y|Parameters) 
    return (1/np.sqrt(2*np.pi*param[1]**2))*np.exp(-0.5*(y-param[0])**2/param[1]**2)

"""Log-likelihood function (continuous)"""
def L_func_c(param,y):       
    likelihood = 0 #Initial value of likelihood
    for i in np.arange(len(y)):
        # model output 
        p = probability_function(y[i], param)
        #likelihood likelihood
        likelihood += -np.log(p) #Likelihood
    return likelihood

"""Parameter estimation"""
"""
Quasi-Newton method (BFGS, L)-BFGS method: Memory can be saved in complicated cases)
"""
from scipy import optimize
x0 = [0,0.1] #Initial values of parameters
bound = [(-100, 100),(0, None)]#,(0,None)] #Scope of optimal parameter search(min,max)

params_MLE = optimize.minimize(L_func_c,x0,args=(norm_values),method='l-bfgs-b',
                        jac=None, bounds=bound, tol=None, callback=None,
                        options={'disp': None, 'maxls': 20, 'iprint': -1,
                                 'gtol': 1e-05, 'eps': 1e-08, 'maxiter': 15000,
                                 'ftol': 2.220446049250313e-09, 'maxcor': 10,
                                 'maxfun': 15000})
#Maximum likelihood estimation parameters
print('Parameter μ,σ:',params_MLE.x)
#Number of parameters
k=3
#AIC (the model with the smallest value is a good fit)
print('AIC:',params_MLE.fun*(2)+2*k)

Parameter μ,σ: [-0.01455655  0.93795781]
AIC: 2713.7763344850605
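Incidentally, the same negative log-likelihood can be computed without a Python loop via scipy.stats.norm.logpdf; a vectorized sketch equivalent to L_func_c above:

Python


from scipy.stats import norm

def neg_log_likelihood(param, y):
    # param[0] = μ, param[1] = σ; vectorized over all observations at once
    return -np.sum(norm.logpdf(y, loc=param[0], scale=param[1]))

print(neg_log_likelihood(params_MLE.x, norm_values))  # should match params_MLE.fun

This function could be passed to optimize.minimize in place of L_func_c, with the same result but much faster.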

I want to know the parameters quickly ...

scipy.stats also provides one-line parameter estimation via the fit method.

Python


"""scipy.stats.There is also a parameter estimation with fit"""
from scipy.stats import norm
fit_parameter = norm.fit(norm_values)
fit_parameter

#Parameter μ,σ
(-0.014556635615470447, 0.9379577571961389)

Apply the model to the data and plot it

Let's apply the model (normal distribution) to the data with the estimated parameters μ = -0.01455655, σ = 0.93795781.

Python


acc_mle = probability_function(np.sort(norm_values), params_MLE.x)
plt.hist(norm_values, bins=50, range=[-5, 5], density=True)  # density=True replaces the deprecated normed=True
plt.plot(np.sort(norm_values), acc_mle)  # overlay the fitted normal density curve
plt.show()

[Figure: histogram of the data with the fitted normal density curve overlaid]

The fitted curve matches the data well.
