Advantages and disadvantages of maximum likelihood estimation

Introduction

This is my first post. I will summarize the points that interested me in "Pattern Recognition and Machine Learning" (PRML), which I am currently reading (Chapter 2, Section 2.1, p. 66 onward).

Table of contents

  1. What is the Bernoulli distribution?
  2. What is maximum likelihood estimation?
  3. Disadvantages of maximum likelihood estimation

1. What is the Bernoulli distribution?

Let's start with the definition. When the random variable $ x $ follows a Bernoulli distribution with mean $ u $, it satisfies

P(x=1|u) = u, \quad P(x=0|u) = 1-u

Combining the two, this can also be written as

P(x|u) = u^x (1-u)^{1-x}

A simple example is a coin that comes up heads ($ x = 1 $) with probability $ u $. The following sections also use a coin as the example.
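As a minimal sketch, the combined formula $ u^x (1-u)^{1-x} $ can be written as a small Python function. The name `bernoulli_pmf` and the value $ u = 0.3 $ are only illustrative assumptions, not something taken from PRML.

```python
def bernoulli_pmf(x: int, u: float) -> float:
    """Return P(x | u) = u^x * (1 - u)^(1 - x) for x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must lie in [0, 1]")
    return u ** x * (1.0 - u) ** (1 - x)


# Illustrative coin with an assumed heads probability of 0.3.
u = 0.3
p_heads = bernoulli_pmf(1, u)  # the exponent on (1 - u) is 0, so this is u
p_tails = bernoulli_pmf(0, u)  # the exponent on u is 0, so this is 1 - u
print(p_heads, p_tails)
```

Plugging in $ x = 1 $ or $ x = 0 $ switches off one of the two factors, which is why the single expression reproduces both cases of the definition.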

2. What is maximum likelihood estimation?

Maximum likelihood estimation is a way to estimate the mean $ u $ from a given sample. Given $ N $ independent samples

x_1, x_2, \dots, x_N

the likelihood function $ L $ is defined as

L(u) = \prod_{i=1}^N u^{x_i}(1-u)^{1-x_i}

The value of $ u $ that maximizes $ L $ is taken as the maximum likelihood estimator $ u_{ML} $ of the true mean $ u $.

Let's find the $ u $ that actually maximizes the likelihood function $ L $. First, to simplify the equation, we take the logarithm of $ L $. Because the logarithm is monotonically increasing, this does not change the maximizing $ u $.

\log L(u) = \sum_{i=1}^N \left\{ x_i \log u + (1-x_i) \log(1-u) \right\}

The $ u $ that maximizes $ \log L(u) $ is $ u_{ML} $. Setting the derivative with respect to $ u $ to zero gives

u_{ML} = \frac{1}{N} \sum_{i=1}^N x_i

In other words, if $ x = 1 $ is observed $ m $ times in $ N $ trials, then

u_{ML} = \frac{m}{N}
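For reference, the maximization step can be spelled out as follows. This is a standard calculus sketch rather than a quotation from PRML, and it assumes $ 0 < m < N $, where $ m = \sum_{i=1}^N x_i $ is the number of trials with $ x_i = 1 $. Differentiating the log-likelihood with respect to $ u $:

\frac{d}{du} \log L(u) = \sum_{i=1}^N \left( \frac{x_i}{u} - \frac{1-x_i}{1-u} \right) = \frac{m}{u} - \frac{N-m}{1-u}

Setting this to zero and multiplying through by $ u(1-u) $:

m(1-u) - (N-m)u = 0 \\
m - Nu = 0 \\
u = \frac{m}{N}

The second derivative, $ -\frac{m}{u^2} - \frac{N-m}{(1-u)^2} $, is negative, so this stationary point is a maximum. When $ m = 0 $ or $ m = N $, the log-likelihood is monotonic in $ u $ and the maximum sits at the boundary, which still equals $ \frac{m}{N} $.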

Let's try maximum likelihood estimation with the coin example. Suppose you want to know the probability that a coin comes up heads. Tossing it 10 times gave the following results.

Heads: 3 times
Tails: 7 times

Follow the method above to find the $ u_{ML} $ that maximizes the likelihood function.

u_{ML} = \frac{1}{N} \sum_{i=1}^N x_i \\
= \frac{1}{10} \sum_{i=1}^{10} x_i \\
= \frac{3}{10}

Therefore, we estimate that "the probability that this coin comes up heads is $ \frac{3}{10} $".
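The same calculation can be checked numerically. The sketch below encodes the 10 tosses as a list of 0s and 1s, computes the sample mean, and compares it against a brute-force search over a grid of candidate values of $ u $. The helper names `bernoulli_mle` and `log_likelihood` and the grid spacing of 0.01 are assumptions made for this illustration.

```python
import math


def bernoulli_mle(samples):
    """Maximum likelihood estimate of u: the fraction of samples equal to 1."""
    if not samples:
        raise ValueError("need at least one sample")
    return sum(samples) / len(samples)


def log_likelihood(u, samples):
    """Bernoulli log-likelihood: sum of x*log(u) + (1-x)*log(1-u), for 0 < u < 1."""
    return sum(x * math.log(u) + (1 - x) * math.log(1 - u) for x in samples)


# 3 heads (x = 1) and 7 tails (x = 0), as in the example above.
tosses = [1] * 3 + [0] * 7
u_ml = bernoulli_mle(tosses)

# Brute-force check: evaluate the log-likelihood on the grid 0.01, 0.02, ..., 0.99
# and keep the candidate with the largest value.
grid = [k / 100 for k in range(1, 100)]
best_on_grid = max(grid, key=lambda u: log_likelihood(u, tosses))

print(u_ml, best_on_grid)
```

Both approaches land on the same value, which is what the closed-form result $ u_{ML} = \frac{m}{N} $ predicts.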

3. Disadvantages of maximum likelihood estimation

In the previous section, we found that the maximum likelihood estimate for the Bernoulli distribution depends only on how many times the event occurred in the trials. The drawback is that if a coin is tossed three times and comes up heads every time, the estimate becomes "the probability that this coin comes up heads is 1". In other words, maximum likelihood estimation overfits when the number of trials is small.
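A small sketch makes the problem concrete. It first computes the estimate for three heads in a row, then enumerates all 8 possible outcomes of three tosses to count how many of them produce an extreme estimate of exactly 0 or 1. The fair coin ($ u = 0.5 $, so all 8 outcomes are equally likely) is an assumption chosen for illustration, and `bernoulli_mle` is a hypothetical helper name.

```python
from itertools import product


def bernoulli_mle(samples):
    """Maximum likelihood estimate of u: the fraction of samples equal to 1."""
    return sum(samples) / len(samples)


# Three tosses, all heads: the fitted model rules out tails entirely.
all_heads = [1, 1, 1]
u_ml = bernoulli_mle(all_heads)
p_tails_under_fit = 1.0 - u_ml

# Enumerate every outcome of three tosses. For a fair coin (assumed here)
# each outcome has the same probability, so the fraction of outcomes equals
# the probability of ending up with an extreme estimate.
outcomes = list(product([0, 1], repeat=3))
extreme = [o for o in outcomes if bernoulli_mle(o) in (0.0, 1.0)]
fraction_extreme = len(extreme) / len(outcomes)

print(u_ml, p_tails_under_fit, fraction_extreme)
```

Under that assumption, only the all-heads and all-tails outcomes give an extreme estimate, so even a perfectly fair coin would be judged "always heads" or "always tails" in 2 of the 8 cases.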
