Create a random number with an arbitrary probability density

TL;DR

To make a continuous random variable $ \ hat {X} $ according to the probability density $ f (x) $, the cumulative distribution function $ F (x) $ and $ 0 \ leq \ hat {R} <1 $ uniform random number You can use $ \ hat {R} $ to make $ \ hat {X} = F ^ {-1} (\ hat {R}) $ (inverse function. org / wiki /% E9% 80% 86% E9% 96% A2% E6% 95% B0% E6% B3% 95)).

Introduction

There are times when you want to create a random variable with an arbitrary probability density distribution. A typical example is when generating "points that are uniformly distributed on a sphere". This is necessary, for example, when the absolute value of velocity is the same in three-dimensional space but you want to give a random direction.

Generally, the (pseudo) random numbers obtained by a program are uniform random numbers from 0 to 1. In this article, let's see how to use this to create random numbers with an arbitrary probability density distribution and some concrete examples.

Probability density function and cumulative distribution function

Suppose you have a random variable $ \ hat {X} $ that takes a continuous value. The probability that this variable will take a value between $ a $ and $ b $

P(a \leq \hat{X} < b) = \int_a^b f(x) dx

When we can write, $ f (x) $ is called the probability density function. This narrows the range

P(x \leq \hat{X} < x+dx) \sim f(x)dx

You can read "The probability that $ \ hat {X} $ will take a value near the value $ x $ is proportional to $ f (x) $". In that sense, the meaning of the probability density function is easy to understand, but theoretically and numerically, the cumulative distribution function that integrates the probability density function is often easier to use.

The definition of the cumulative distribution function is as follows.

P(\hat{X} < x) = F(x)

That is, $ F (x) $ is the probability that the random variable $ \ hat {X} $ takes a value less than $ x $. When written using the probability density function

F(x) = \int_{-\infty}^x f(x) dx

Therefore, the cumulative distribution function is the integral of the probability density function.

For example, the probability density function and cumulative distribution function of uniform random numbers from 0 to 1

\begin{align}
f(x) &= 1 \\
F(x) &= x
\end{align}

(However, $ 0 \ leq x <1 $).

Proof

To create a random variable $ \ hat {X} $ with a probability density $ f (x) $ as mentioned at the beginning, $ 0 \ leq \ hat {R} <1 $ uniform random number $ \ hat {R } $ Converted using the inverse function $ F ^ {-1} (x) $ of the cumulative distribution function

\hat{X} = F^{-1}(\hat{R})

It is OK if you do it. To prove this, show that the cumulative distribution function of the random variable $ \ hat {X} $ created from uniform random numbers is $ F (x) $, or conversely the probability density function $ f (x). It is OK to show that converting the random variable $ \ hat {X} $ with $ to $ \ hat {R} = F (\ hat {X}) $ yields a uniform random number from 0 to 1. Personally, I think the latter fact is more important, but I will show it in both directions.

Proof (forward)

A uniform random number from 0 to 1 $ \ hat {R} $ is transformed using the inverse function of the cumulative distribution function $ F (x) $, and a new random variable $ \ hat {X} $ is created.

\hat{X} = F^{-1}(\hat{R})

Make by. At this time, the probability that $ \ hat {X} $ is smaller than the value $ x $ is [^ 3]

[^ 3]: Here we assume that the cumulative distribution function is a one-to-one mapping (proof required).

\begin{align}
P(\hat{X} < x) &= P(F^{-1}(\hat{R}) < x) \\
&= P(\hat{R} < F(x))
\end{align}

The cumulative distribution function $ F_R (r) $ of the uniform random number $ \ hat {R} $ is

F_R(r) \equiv P(\hat{R} < r) = r

So

P(\hat{R} < F(x)) = F(x)

From the above,

P(\hat{X} < x) = F(x)

This means that the cumulative distribution function of $ \ hat {X} $ is $ F (x) $, that is, the probability density function is $ f (x) $. This was what I wanted to prove.

Proof (reverse direction)

Now suppose you get a random variable $ \ hat {X} $ that has a probability density of $ f_X (x) $. The cumulative distribution function for this variable is $ F_X (x) $ and its definition is

F_X(x) = P(\hat{X} < x)

is. Now, let's transform this random variable using the inverse function $ F_X (x) $ of this cumulative distribution function to create a new random variable $ \ hat {Y} $.

\hat{Y} = F_X(\hat{X})

From the definition of the cumulative distribution function, $ 0 \ leq \ hat {Y} <1 $.

The probability that this random variable $ \ hat {Y} $ is smaller than $ y $, that is, the cumulative distribution function $ F_Y (y) $

F_Y(y) = P(\hat{Y} < y)

is. Well, because $ \ hat {Y} = F_X (\ hat {X}) $

F_Y(y) = P(F_X(\hat{X}) < y)

Here, the cumulative distribution function $ F_X (x) $ is a monotonically increasing function, so it is reversible as a map [^ 2]. So, if $ F_X (x) = y $, then $ x = F_X ^ {-1} (y) $. So

[^ 2]: Strictly speaking, I have to say a lot, but forgive me because it's troublesome.

P(F_X(\hat{X}) < y) = P(\hat{X} < F_X^{-1}(y))

Can be done. This is the definition of the cumulative distribution function itself, as the right-hand side is the probability that $ \ hat {X} $ has a value less than $ x = F_X ^ {-1} (y) $. Therefore

P(\hat{X} < F_X^{-1}(y)) = F_X(F_X^{-1}(y)) = y

From the above,

F_Y(y) = y

have become. Since $ 0 \ leq \ hat {Y} <1 $, the random variable $ \ hat {Y} $ is a uniform random number from 0 to 1. In other words, ** If you convert a random variable with an arbitrary probability density with a cumulative distribution function, it will be a uniform random number from 0 to 1 **.

Conversely, if you convert a uniform random number from 0 to 1 with $ F ^ {-1} (x) $, you can get a random variable with a probability density function $ f (x) $. This was what I wanted to prove.

Example

Points uniformly distributed on the surface of the sphere

Suppose you want to create points that are evenly distributed on the surface of the unit sphere. The coordinates of the surface of the unit sphere are calculated using two angles $ \ theta and \ phi $.

\begin{align}
x &= \sin{\theta} \cos{\phi}\\
y &= \sin{\theta} \sin{\phi}\\
z &= \cos{\theta}\\
\end{align}

Can be written. A common mistake is to take $ \ theta $ and $ \ phi $ as uniform random numbers from $ 0 $ to $ 2 \ pi $. let's do it.

import numpy as np
import matplotlib.pyplot as plt
import random
from math import pi, sin, cos, sqrt
from mpl_toolkits.mplot3d import Axes3D

X = []
Y = []
Z = []
for i in range(3000):
  phi = random.uniform(0, 2*pi)
  theta = random.uniform(0, pi)
  X.append(sin(theta)*cos(phi))
  Y.append(sin(theta)*sin(phi))
  Z.append(cos(theta))

fig = plt.figure()
ax = Axes3D(fig)
ax.plot(X,Y,Z,marker="o",markersize=1,linestyle='None')
plt.show()

The result is like this.

sphere_surface1.png

You can see that there are many points in the North and South Pole and few points near the equator.

When the coordinate variables are taken as $ \ phi $ and $ \ theta $, first $ \ phi $ can be a uniform random number, but $ \ theta $ is different. If you decide on an angle $ \ theta $, it will be a circle with a radius of $ \ sin {\ theta} $ on the sphere. If you want to make points on the sphere uniformly, you have to make points in proportion to the length of this circumference.

fig1.png

Since the length of the circumference is proportional to $ \ sin {\ theta} $, the probability density function of $ \ theta $ is

f(\theta) = \frac{1}{2}\sin{\theta}

is. However, standardization conditions

\int_0^\pi f(\theta) d\theta = 1

The coefficient $ 1/2 $ is written to satisfy.

Cumulative distribution function

F(\theta) = -\frac{1}{2}\cos{\theta} + \frac{1}{2}

It will be. To generate $ \ hat {\ Theta} $ to follow such a cumulative distribution function, use the uniform random number $ \ hat {R} $ to generate $ \ hat {\ Theta} = F ^ {-1} Must be (\ hat {R}) $. Conversely, if the random variable $ \ hat {\ Theta} $ follows the probability density function $ f (\ theta) $, then

\hat{R} = -\frac{1}{2}\cos{\hat{\Theta}} + \frac{1}{2}

The new random variable $ \ hat {R} $ obtained as is a uniform random number from 0 to 1.

Well, what I originally wanted was not the value of $ \ theta $ itself, but, for example, $ \ cos {\ theta} $, which is the $ z $ coordinate. From the above formula, we can see that $ \ cos {\ hat {\ Theta}} $ is a uniform random number from -1 to 1. In other words, if you make $ \ hat {Z} $ with a uniform random number from -1 to 1, when you make $ \ hat {\ Theta} = \ cos ^ {-1} {\ hat {Z}} $, You have generated $ \ hat {\ Theta} $ with the correct probability density. Also, from $ \ cos ^ 2 {\ theta} + \ sin ^ 2 {\ theta} = 1 $, $ \ sin {\ theta} = \ sqrt {1-z ^ 2} $ and $ \ sin {\ theta You can also find the value of} $.

When the above is implemented, it becomes like this.

X = []
Y = []
Z = []
for i in range(3000):
  phi = random.uniform(0, 2*pi)
  z = random.uniform(-1,1)
  X.append(sqrt(1-z**2)*cos(phi))
  Y.append(sqrt(1-z**2)*sin(phi))
  Z.append(z)

The execution result looks like this.

sphere_surface2.png

It looks like the dots are evenly scattered on the sphere.

Points uniformly distributed in the unit sphere

The points that are uniformly distributed in the unit sphere are

It is OK if you follow the procedure. To evenly distribute the points on a sphere with a radius of $ r $, determine $ \ theta, \ phi $ as before,

\begin{align}
x &= r \sin{\theta} \cos{\phi}\\
y &= r \sin{\theta} \sin{\phi}\\
z &= r \cos{\theta}\\
\end{align}

It is OK if you decide the coordinates of the point as. The problem is the distribution of $ r $. With a uniform distribution, the density of points will increase where $ r $ is small.

Now, the surface area of a sphere with radius $ r $ is proportional to $ r ^ 2 $. Of course, the number of points also needs to be generated in proportion to $ r ^ 2 $. So, if the probability density function of the random variable $ r $ includes the normalized constant

f(r) = 3 r^2

It will be. Cumulative distribution function

F(r) = r^3

is. This is intuitive because it means that points exist in proportion to $ r ^ 3 $ on spheres with a radius of $ r $ or less.

The inverse function is

F^{-1}(r) = r^{1/3}

is. Therefore, it is OK to use a uniform random number $ r $ from 0 to 1 and let $ r ^ {1/3} $ be the radius (the radius r and the random number r are the same, which is confusing, but confusing. Will not).

It looks like this in code.

X = []
Y = []
Z = []
for i in range(3000):
  phi = random.uniform(0, 2*pi)
  z = random.uniform(-1,1)
  r = random.uniform(0,1)
  X.append(r**(1.0/3.0)*sqrt(1-z**2)*cos(phi))
  Y.append(r**(1.0/3.0)*sqrt(1-z**2)*sin(phi))
  Z.append(r**(1.0/3.0)*z)

The execution result looks like this.

sphere.png

Perhaps the points are evenly distributed within the unit sphere.

Exponential distribution

Suppose you have a phenomenon that occurs $ \ lambda $ times per unit time. Now, the distribution of the time from the start of observation to the first occurrence of the phenomenon is an exponential distribution according to the parameter $ \ lambda $. A typical example of an exponential distribution is the decay probability of radioactive materials. In addition, the distribution of how much light can go straight without being scattered in a medium with uniform impurities is also an exponential distribution.

Now let's say you want to simulate an event that follows an exponential distribution. The time before an event that follows the exponential distribution $ \ lambda $ occurs $ \ hat {T} $ is a random variable. Since the probability that this time is less than $ t $ is the cumulative distribution function, from the definition of the exponential distribution

P(\hat{T}< t) \equiv F(t) = 1 - \exp(-t/\lambda)

It will be. The inverse function is

F^{-1}(t) = -\lambda \log(1-t)

So, using a uniform random number from 0 to 1 $ \ hat {R} $

\hat{T} = -\lambda \log(1-\hat{R})

Then you get the desired random variable. Practically, $ 1- \ hat {R} $ and $ \ hat {R} $ have the same distribution, so

\hat{T} = -\lambda \log(\hat{R})

It is often said.

Summary

I introduced how to create a random variable with an arbitrary probability density function from a uniform random number. It seems that the name "inverse function method" is given because it uses the inverse function of the cumulative distribution function.

The method of creating $ z = \ cos {\ theta} $ as a uniform random number when generating points that are uniformly distributed on a sphere is often introduced, but there is not much proof of "why that is okay". So I wrote it. The essence is that "converting an arbitrary random variable with its cumulative distribution function yields a uniform random number from 0 to 1."

If the random variables are discrete rather than continuous, it is easy to use the Walker's Alias method. In C ++, std :: discrete_distribution is (probably) implemented by the Walker's Alias method, so I think it's easier to use.

Recommended Posts

Create a random number with an arbitrary probability density
Create a 1MByte random number file
How to create a heatmap with an arbitrary domain in Python
Create a PDF file with a random page size
I made a random number graph with Numpy
[python] Create a date array with arbitrary increments with np.arange
Create an environment with virtualenv
Create an API with Django
Create a homepage with django
Create a heatmap with pyqtgraph
Create a directory with python
Create a PythonBox that outputs with Random after PEPPER Input
Create a Todo app with Django ① Build an environment with Docker
random French number generator with python
Create a virtual environment with Python!
Create an age group with pandas
Create an arbitrary machine learning environment with GCP + Docker + Jupyter Lab
Create a poisson stepper with numpy.random
Create a random string in Python
Create a file uploader with Django
Create a compatibility judgment program with the random module of python.
Create a socket with an Ethernet interface (eth0, eth1) (Linux, C, Raspberry Pi)
The story of the escape probability of a random walk on an integer grid
Create an application by classifying with Pygame
Create a Python function decorator with Class
Build a blockchain with Python ① Create a class
Create an image processing viewer with PySimpleGUI
Create a dummy image with Python + PIL.
[Python] Create a virtual environment with Anaconda
Let's create a free group with Python
Quickly create an excel file with Python #python
Create a large text file with shellscript
Create a star system with Blender 2.80 script
Create a virtual environment with Python_Mac version
Create a VM with a YAML file (KVM)
Create an update screen with Django Updateview
Create a simple web app with flask
Create a word frequency counter with Python 3.4
[Python] Quickly create an API with Flask
Create an add-in-enabled Excel instance with xlwings
Create a Connecting Nearest Neighbor with NetworkX
Create an English word app with python
Create a web service with Docker + Flask
Operate an oscilloscope with a Raspberry Pi
Random number generator with normal distribution N (0,1)
Create a private repository with AWS CodeArtifact
Create a car meter with raspberry pi
Create a devilish picture with Blender scripts
Create a matrix with PythonGUI (text box)
Create an upgradeable msi file with cx_Freeze
Create a graph with borders removed with matplotlib
Creating an environment for OSS-DB Silver # 1_Create a Linux environment (CentOS7 virtual environment) with VirtualBox/Vagrant
[kotlin] Create an app that recognizes photos taken with a camera on android
Create an app that guesses students with python
[Python] Make a game with Pyxel-Use an editor-
Create an academic society program with combinatorial optimization
Create a GUI executable file created with tkinter
Create a LINE BOT with Minette for Python
Create an image composition app with Flask + Pillow
Create a game UI from scratch with pygame2!
Create a page that loads infinitely with python