First, consider the example of a fair (unbiased) six-sided die.
Since the faces 1 to 6 are equally likely (no roll is biased toward any face), each outcome has the following probability.
$
P(\text{rolling a 1}) = \frac{1}{6} \qquad
P(\text{rolling a 2}) = \frac{1}{6} \qquad
P(\text{rolling a 3}) = \frac{1}{6} \\
P(\text{rolling a 4}) = \frac{1}{6} \qquad
P(\text{rolling a 5}) = \frac{1}{6} \qquad
P(\text{rolling a 6}) = \frac{1}{6}
$
Here, define the random variable $X$ as follows.
$
X = \left\{
\begin{array}{ll}
1 & (\text{when a 1 is rolled}) \\
2 & (\text{when a 2 is rolled}) \\
3 & (\text{when a 3 is rolled}) \\
4 & (\text{when a 4 is rolled}) \\
5 & (\text{when a 5 is rolled}) \\
6 & (\text{when a 6 is rolled})
\end{array}\right.
$
A variable that fluctuates stochastically, such as $X$ here, is called a **random variable**. A value actually taken by the random variable is called a **realization** (realized value). With $X$ defined this way, the probabilities above can be written compactly:
$ P(X = x) = \frac{1}{6}, \qquad x = 1,2,3,4,5,6 $
Let's actually roll the die with Python.
import numpy as np
import matplotlib.pyplot as plt

np.random.seed()  # seed from OS entropy; pass an integer for reproducible results
dice_times = 10000  # number of rolls (must be defined before it is used)
dice = np.array([1, 2, 3, 4, 5, 6])
dice_data = np.random.choice(dice, dice_times)  # simulate the rolls
prob_dice = np.array([])
for i in range(1, 7):
    # relative frequency of face i among all rolls
    p = len(dice_data[dice_data == i]) / dice_times
    print(i, "Probability of appearing", p)
    prob_dice = np.append(prob_dice, p)
plt.bar(dice, prob_dice)
plt.grid(True)
plt.show()
In this run, the die is rolled 10,000 times and the probability of each face is estimated from its relative frequency. As the results show, the estimate for each face is close to $\frac{1}{6} = 0.1666\ldots$.
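To see how the estimate depends on the number of rolls, the following is a minimal sketch that repeats the experiment for several sample sizes. The helper name `relative_frequencies`, the fixed seed, and the chosen sample sizes are assumptions made for illustration and are not part of the original code.

```python
import numpy as np

# Assumption: a fixed seed (0) is used only to make this sketch repeatable.
rng = np.random.default_rng(0)


def relative_frequencies(n_rolls):
    """Roll a fair die n_rolls times and return the relative frequency of each face."""
    rolls = rng.integers(1, 7, size=n_rolls)      # upper bound 7 is exclusive
    counts = np.bincount(rolls, minlength=7)[1:]  # drop the unused bin for 0
    return counts / n_rolls


max_deviation = {}
for n in (100, 10_000, 1_000_000):
    freq = relative_frequencies(n)
    # largest distance of any face's relative frequency from 1/6
    max_deviation[n] = float(np.max(np.abs(freq - 1 / 6)))
    print(n, np.round(freq, 4), "max deviation from 1/6:", round(max_deviation[n], 4))
```

With many rolls the relative frequencies should sit very close to $\frac{1}{6}$, while a small sample can deviate noticeably.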
There are several kinds of random variables. $X$ is a **discrete random variable** when its possible values are finite or countably infinite (that is, discrete values such as 1, 2, 3, 4, 5, ...), and $X$ is a **continuous random variable** when it has a density function. In the discrete case, a probability is assigned to each value $x$, as with the die above. This function of $x$ is called the **probability function**, and it can be expressed as follows.
$
p(x) = P(X = x)
$
The probability function has the following properties. Here $\sum_{x}$ denotes the sum over all possible values $x$ of $X$.
$
p(x) \ge 0, \qquad \forall x \\
\sum_{x} p(x) = 1
$
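As a quick check, both properties can be verified for the fair die using exact fractions. This is only an illustrative sketch; the names `pmf` and `p` are assumptions introduced here.

```python
from fractions import Fraction

# Probability function of the fair die: p(x) = 1/6 for x = 1, ..., 6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}


def p(x):
    """Return P(X = x); values outside 1..6 have probability 0."""
    return pmf.get(x, Fraction(0))


is_nonnegative = all(p(x) >= 0 for x in pmf)  # p(x) >= 0 for every x
total = sum(pmf.values())                     # sum over all x of p(x)

print("non-negative:", is_nonnegative)
print("sum of p(x):", total)
```

Using `Fraction` avoids floating-point rounding, so the sum is exactly 1 rather than approximately 1.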
The cumulative sum of the probability function is called the **cumulative distribution function**, or simply the **distribution function**. The distribution function has the following properties, including monotonicity and right continuity.
$
F(x) = P(X \le x) = \sum_{y \le x} p(y) \\
(1) \quad \lim_{x \to -\infty} F(x) = 0 \\
(2) \quad \forall x, y \in \mathbb{R} \text{ with } x \ge y: \\
\qquad F(x) \ge F(y), \quad F(x) = \lim_{\varepsilon \to 0+} F(x + \varepsilon) \\
(3) \quad \lim_{x \to +\infty} F(x) = 1
$
Here, $F$ is right-continuous at every $x$, written $F(x+) = F(x)$: for any sequence $x_n$ that decreases monotonically and converges to $x$, $\lim_{x_n \to x+} F(x_n) = F(x)$. The notation $x+$ means that $x$ is approached from the right, through values larger than $x$; likewise, $x-$ means that $x$ is approached from the left. The probability function can then be obtained by taking the difference between the values of the cumulative distribution function of $X$ at $x$ and just before $x$, as shown below.
$ p(x) = F(x) - \lim_{x_n \to x-} F(x_n) = F(x) - F(x-) $
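For the fair die, this relationship can be checked numerically. The following is a small sketch, assuming the die's probability function from earlier; the variable names are chosen here for illustration.

```python
import numpy as np

faces = np.arange(1, 7)
pmf = np.full(6, 1 / 6)  # p(x) for x = 1, ..., 6

cdf = np.cumsum(pmf)     # F(x) = sum of p(y) over y <= x

# For the die, F(x-) is the CDF value at the previous face (0 before the first face).
cdf_left = np.concatenate(([0.0], cdf[:-1]))
recovered_pmf = cdf - cdf_left  # p(x) = F(x) - F(x-)

for x, F, px in zip(faces, cdf, recovered_pmf):
    print(x, round(float(F), 4), round(float(px), 4))
```

Each jump of the cumulative distribution function has size $\frac{1}{6}$, which recovers the probability function of the die.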
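Plotting a cumulative distribution function in Python looks like the following. Note that this example uses scipy.stats.norm to plot the distribution function of a normal distribution with mean 1500 and standard deviation 500, which is a continuous distribution rather than the die above.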
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

x = np.arange(0, 3000)
# CDF of a normal distribution with mean 1500 and standard deviation 500
y = norm.cdf(x, loc=1500, scale=500)
plt.plot(x, y)
plt.grid(True)
plt.xlabel("value")
plt.ylabel("cumulative probability")
plt.show()
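For comparison with the discrete case discussed earlier, the distribution function of the fair die can be evaluated with scipy.stats.randint, which models a discrete uniform distribution. This is a sketch under the assumption that the die is modeled as `randint(1, 7)`; the evaluation points are arbitrary.

```python
import numpy as np
from scipy.stats import randint

# Discrete uniform distribution on {1, ..., 6}; the upper bound 7 is exclusive.
die = randint(1, 7)

# Evaluate F(x) at arbitrary real points, including non-integers.
x = np.array([0.5, 1.0, 2.5, 3.0, 6.0, 10.0])
cdf_values = die.cdf(x)
for xi, Fi in zip(x, cdf_values):
    print(xi, round(float(Fi), 4))

# The jump of the CDF at a face equals the probability of that face.
jump_at_3 = die.cdf(3) - die.cdf(3 - 1e-9)
print("jump at 3:", round(float(jump_at_3), 4), "pmf(3):", round(float(die.pmf(3)), 4))
```

Unlike the smooth curve of the normal distribution, this distribution function is a step function: it stays constant between faces and jumps by $\frac{1}{6}$ at each face.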