Introduction

This series is a brief explanation of "Basics of Modern Mathematical Statistics" by Tatsuya Kubokawa, and let's implement the contents in python. I used Google Colaboratory (hereinafter referred to as Colab) for implementation. If you have any suggestions, I would appreciate it if you could write them in the comment section. It may not be suitable for those who want to understand all the contents of the book properly because it is written with a stance that it would be nice if it could be output by touching only the part that I thought needed explanation. Please note that if the formula numbers and proposition / definition indexes are written according to the book, the numbers may be skipped in this article.

Overview of Chapter 1

It starts with defining the probability in set theory. Even those who don't care about such a definition can read this book. The author himself focuses on mathematical statistics and only briefly touches on the theoretical system of probability theory. The contents of Chapter 1 are mainly ・ Definition of probability ・ Explanation of terms ・ Bayes' theorem is. 1.3 In Evolutionary Matters, we use what is written on the previous page to prove the proposition (easy to follow).

word

$$ Probability is the mathematical description of random phenomena in the world. Let's take a cubic dice as an example. Trial: I threw the dice once. All events / sample space $ \ Omega $: $ \ Omega $ = \ {1,2,3,4,5,6 } Event: A subset of $ \ Omega $, such as \ {1 }, \ {3,6 }. In this book, it is represented by $ A $ or $ B $. Intersection: $ A \ cap B = $ \ {$ x | x \ in A and x \ in B $ } $$ Union: $ A \ cup B = $ \ {$ x | x \ in A or x \ in B $ } $$ ・・・ The complement set, the difference set, the symmetric difference, etc. are easy to understand by drawing a Venn diagram.

Definition

Probability is defined by the following three. >

(P2)P(\Omega)=1. (P3) If $ A_k \ in \ mathcal {B}, k = 1,2, ..., $ are contradictory to each other, that is, $ A_i \ cap A_j = \ emptyset, i \ neq j $, then $ P (\ bigcup_ {k = 1} ^ {\ infinty} {A_k}) = \ sum_ {n = 1} ^ \ infinty P (A_k) $ holds.

$ \ mathcal B $ is a measurable family of sets and satisfies the following properties. (M1) \emptyset\in\mathcal{B} , \Omega\in\mathcal{B}. (M2) If $ A \ in \ mathcal {B} $, then $ A ^ c \ in \ mathcal {B} $ ($ A ^ c $ is the complement of $ A $). (M3) $ A_k \ in \ mathcal {B}, k = 1,2,3 ..., $ then $ \ bigcup_ {k = 1} ^ {\ infinty} {A_k} \ in \ mathcal B $.

It seems to be difficult, but if you dig deeper here, you will not know the landing point, so I think $ \ mathcal B $ is a set consisting of a subset of the sample space $ \ Omega $, and $ A $ is a subset of $ \ Omega $. please. As with (P3), it will often be enough to represent it in a Venn diagram in the future. $ \ Bigcup $ is a symbol that takes the union of the family of sets. $ \bigcup_{k=1}^{n}{A_k} = A_1 \cup A_2 \cup ...\cup A_n$ In short, the probability $ P $ is a function of the set $ A $ (example: what is the probability of rolling a dice and getting 2? Answer: $ P (A = $ \ {2 } $) = 1/6 $), P1 ~ It meets P3. (P1) Probability is always 0 or more (P2) The probability that any of all events will occur is 1 (any will always occur) (P3) When the events being considered are contradictory to each other (the set is not covered when drawn in the Venn diagram), the probability that an event in the union part of the event will occur can be expressed by the sum of the probabilities that each event will occur. It is a way of thinking. I've written a lot after that, but I don't think it makes sense to read it until you've learned measure theory. I wrote a lot, but in the end it was the usual probability. I think it's good if you have the consciousness that probability is a function.

Bayes' theorem

There is something I need to explain to understand Bayes' theorem.

・ Conditional probability

Definition:

When there are two events $ A $ and $ B $ and $ P (B)> 0 , $ P(A|B)=\frac{P(A \cap B)}{P(B)} \tag{1.1a} $$ Is called the conditional probability of $ A $ when $ B $ is given.

$ P (A | B) $ means the probability that event $ A $ will occur under the condition that event $ B $ will occur. It also holds true if you swap $ A $ and $ B . Equation (1.1a) transposed $P(A \cap B) = P(A|B)P(B) \tag{1.1b}$$ Is also convenient.

・ Formula of all probabilities

When> $ B_1, B_2 ... $ are contradictory events and $ P (B_k)> 0, \ bigcup_ {k = 1} ^ {\ infinty} {B_k} = \ Omega $ is satisfied, the event $ A The probability of $ can be expressed as follows. $P(A)=\sum_{k=1}^\infty P(A|B_k)P(B_k) \tag{1.2}$

It says, but I think it's easy to understand if you look at the figure below. Try to prove it by using equation (1.1b) well.

(As $ k \ to \ infinty $)![795316b92fc766b0181f6fef074f03fa-9.png](https://qiita-image-store.s3.ap-northeast-1.amazonaws.com/0/594672/fb4318be-1240- c824-aec9-23633cdb55b6.png) (The image is quoted from https://bellcurve.jp/statistics/course/6444.html. Special thanks!)

・ Main subject

The Bayes' theorem can be proved by combining the conditional probability formula with the full probability formula. The exact expression of Bayes' theorem is as follows.

Let $ B_1, B_2 ... $ be a sequence of contradictory events, and satisfy $ P (B_k)> 0, \ bigcup_ {k = 1} ^ {\ infinty} {B_k} = \ Omega $. At this time, the conditional probability $ P (B_j | A) $ of $ B_j $ when $ A $ is given for any event $ A $ is expressed as follows. $P(B_j|A)= \frac{P(A|B_j)P(B_j)}{\sum_{k=1}^\infty P(A|B_k)P(B_k)}$

I think it's easier to understand if you draw a diagram. It can be derived by playing with the right-hand side of the conditional probability expression. Bayes' theorem estimates the cause from the result (when the event (result) of $ A $ occurs, what is the probability that it is caused by $ B_j (j = 1,2, ...) $) An important idea is.

That is all for the explanation of Chapter 1. I think that you can understand the example of the book by moving your hand by yourself, so if you have a book, please try it.

Let's run python

I made an example appropriately, so I will solve it with python.

Example: There are 20 corydoras, 7 guppy and 9 neon tetra in the aquarium. Normally, the probability of feeding a corydoras with insects is 0.1, the probability of eating guppy is 0.8, and the probability of neon tetra eating insects is 0.3. One day, when I closed my eyes and threw an insect at the time of feeding, the insect was eaten by a bite. What is the probability that a guppy ate an insect at this time?

Let's solve this

Corydoras = 0
Guppy  = 1
Neontetora = 2

#Probability that each fish usually eats insects
_p_eat = []
_p_eat.insert(Corydoras,0.1)
_p_eat.insert(Guppy,0.8)
_p_eat.insert(Neontetora,0.3)

#Fish ratio
_p_fish = []
_p_fish.insert(Corydoras,20/36)
_p_fish.insert(Guppy,7/36)
_p_fish.insert(Neontetora,9/36)

#Posterior probability
def prob_eat(fish):
   if int(fish) == 0 :
      return _p_eat[fish]*_p_fish[fish]
   elif int(fish) == 1 :
      return _p_eat[fish]*_p_fish[fish]
   else:
      return _p_eat[fish]*_p_fish[fish]

#Probability that 〇〇 ate an insect
def probability(fish):
    return prob_eat(fish) / (prob_eat(Corydoras) + prob_eat(Guppy) + prob_eat(Neontetora))#Ingenuity is required when the total number is large

print(round(probability(Guppy),2))

When you do this,

0.54

Then, the probability that the guppy ate it came out.

Chapter 1 looks like this. If you feel like it, I will do it in the next chapter and beyond. Thank you very much.

References

"Basics of Modern Mathematical Statistics" by Tatsuya Kubokawa

[Basics of Modern Mathematical Statistics with python] Chapter 1: Probability