Speeding up numerical calculation using NumPy / SciPy: Application 2

Target

This article is written for beginners in Python and NumPy. It may be especially useful for people doing numerical calculations in physics who fall under "I can use C but only recently started Python" or "I don't know how to write things the Python way". Maybe.

Also, there may be incorrect statements due to my lack of study. Please forgive me.

Summary

Speeding up numerical calculation using NumPy: Basics
Speeding up numerical calculation using NumPy / SciPy: Application 1

This article is a continuation of those two. We continue along the same lines of basic numerical calculation methods, but this time with some more advanced content.

Algebraic equations / Transcendental equations

Algebraic equations are the equations you can solve by hand, so to speak. "Transcendental equation" is a rather grandiose name, but it simply means an equation that cannot be solved by algebraic manipulation.

\sin(x) = \frac{x}{2}

This equation is also called a "self-consistent equation", because it has the structure "to know $\sin(x)$ you need $\frac{x}{2}$, and to know $\frac{x}{2}$ you need $\sin(x)$". Even if it cannot be solved algebraically, it can of course be solved numerically. The standard methods are the Newton method and the bisection method. I will omit the details; they are not difficult, so please look them up if you don't know them. Their implementations are simple, but since both amount to successive substitution, a for loop is unavoidable in the end. The Newton method basically converges quickly, but it requires a well-behaved, differentiable function. The bisection method, on the other hand, always finds a solution if one exists in the specified interval, but needs many iterations to converge.

However, as you may have already noticed, **these are already implemented in a module called scipy.optimize.** The for loops are contained inside the functions and run fast.

Implementation

First, let's roughly locate the solution of the above equation:
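Such a plot can be made in a few lines; a minimal sketch, assuming numpy and matplotlib are available:

import numpy as np
import matplotlib.pyplot as plt

# plot sin(x) and x/2 and look for the crossing
x = np.linspace(0, 3, 200)
plt.plot(x, np.sin(x), label="sin(x)")
plt.plot(x, x / 2, label="x/2")
plt.legend()
plt.show()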

figure5.png

You can see that it is roughly $x \in (1.5, 2.0)$. Let's solve this with the bisection method:

import numpy as np
from scipy.optimize import bisect

# f(x) = 0 is equivalent to sin(x) = x/2
def f(x):
    return np.sin(x) - x/2

bisect(f, 1.5, 2)
>>> 1.895494267034337

It's very simple. Note that if the interval contains two or more solutions, you don't know which one you will get, so make sure the interval brackets exactly one. And of course, this works for algebraic equations as well.
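The Newton method mentioned earlier is also available as scipy.optimize.newton. A minimal sketch, continuing from the code above (note that without an analytic derivative it actually falls back to the secant method):

from scipy.optimize import newton

# needs an initial guess near the root instead of a bracketing interval
newton(f, x0=2.0)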

Fourier transform

Fourier transforms are most familiar from signal processing and the like, but here too I will proceed with physics, so please bear with the fair amount of mathematics. Recall the diffusion equation dealt with in the previous article:

\frac{\partial}{\partial t}f(x, t) = \frac{\partial^2}{\partial x^2}f(x, t)

This child could be solved by the Fourier transform. Let's see how to do it. We define the Fourier transform and the inverse Fourier transform as follows:

\tilde{f}(k, t) = \frac{1}{\sqrt{2\pi}}\int dx e^{-ikx}f(x, t) = {\cal F}[f(x, t)]\\
f(x, t) = \frac{1}{\sqrt{2\pi}}\int dk e^{ikx}\tilde{f}(k, t) = {\cal F}^{-1}[\tilde{f}(k, t)]

Let's substitute this definition into the diffusion equation; the $x$ derivative can then be carried out explicitly:

\frac{\partial}{\partial t}\left(\frac{1}{\sqrt{2\pi}}\int dk e^{ikx}\tilde{f}(k, t)\right) = \frac{\partial^2}{\partial x^2}\left(\frac{1}{\sqrt{2\pi}}\int dk e^{ikx}\tilde{f}(k, t)\right)\\
\frac{\partial}{\partial t}\left(\frac{1}{\sqrt{2\pi}}\int dk e^{ikx}\tilde{f}(k, t)\right) = -\left(\frac{1}{\sqrt{2\pi}}\int dk k^2e^{ikx}\tilde{f}(k, t)\right)\\

Since this equality must hold for each Fourier component, we can strip off the $k$ integral, leaving a simple first-order differential equation in $t$ that is easily solved:

\frac{\partial}{\partial t}\tilde{f}(k, t) = -k^2\tilde{f}(k, t)\\
\tilde{f}(k, t) = e^{-k^2 t}\tilde{f}(k, 0)

And I'll do an inverse Fourier transform of this:

\frac{1}{\sqrt{2\pi}}\int dk e^{ikx}\tilde{f}(k, t) = \frac{1}{\sqrt{2\pi}}\int dk e^{ikx}e^{-k^2 t}\tilde{f}(k, 0)\\
f(x, t) = \frac{1}{\sqrt{2\pi}}\int dk e^{ikx}e^{-k^2 t}\tilde{f}(k, 0)

It is unsatisfying to leave $\tilde{f}$ in the expression, so let's rewrite it in terms of $f$:

f(x, t) = \frac{1}{2\pi}\int dk e^{ikx}e^{-k^2 t}\int dx' e^{-ikx'}f(x', 0)

If the initial condition $f(x, 0)$ is known in closed form and the integrals can be done analytically [^1], we obtain an analytical solution. If they cannot be done analytically, we end up solving numerically after all.
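For example, with the Gaussian initial condition $f(x, 0) = e^{-x^2}$ (the same one used in the implementation below), the integrals are standard Gaussian integrals and yield the closed-form solution

f(x, t) = \frac{1}{\sqrt{1 + 4t}} e^{-x^2/(1 + 4t)}

which reduces to $e^{-x^2}$ at $t = 0$ and spreads out as $t$ grows.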

Now, a simple description of this flow is as follows:

f(x, t) = {\cal F}^{-1}[e^{-k^2 t}{\cal F}[f(x, 0)]]

Fourier transform $f(x, 0)$, multiply by $e^{-k^2 t}$, and inverse Fourier transform the product. The great thing about this is that **neither $x$ nor $t$ is ever differentiated.** In the earlier approach, derivatives were replaced by finite differences, so keeping the step widths small was an absolute requirement. Here, $x$ and $f(x)$ are still discretized, but **the order of $\Delta x$ does not limit the result in the same way, and $t$ is not differentiated in the first place.** In other words, the error does not accumulate no matter how far you advance $t$. This is quite amazing.
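As a quick sanity check of this claim, here is a minimal sketch (assuming numpy and scipy.fftpack) comparing a spectral second derivative with a central finite difference for a Gaussian, whose exact second derivative is $(4x^2 - 2)e^{-x^2}$:

import numpy as np
from scipy.fftpack import fft, ifft, fftfreq

N, L = 256, 40.0
x = np.linspace(-L/2, L/2, N, endpoint=False)  # periodic grid
dx = L / N
f = np.exp(-x**2)
exact = (4 * x**2 - 2) * np.exp(-x**2)

# spectral: multiply by -k^2 in k space
k = 2 * np.pi * fftfreq(N, d=dx)
spectral = ifft(-k**2 * fft(f)).real

# central finite difference (periodic via roll)
fd = (np.roll(f, -1) - 2 * f + np.roll(f, 1)) / dx**2

print(np.abs(spectral - exact).max())  # close to machine precision
print(np.abs(fd - exact).max())        # limited by O(dx^2) truncation error

The spectral derivative is accurate to machine precision, while the finite difference is stuck at its $O(\Delta x^2)$ truncation error.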

Now, the final question is what $k$ is. When $x$ is discretized into $N$ points $x_i$, what value does each $k_i$ take? This can be found by carefully working through the discrete Fourier transform of the $x_i$:

k_i = 
\begin{cases}
\frac{2\pi i}{N\Delta x}\hspace{0.5cm} (0 \le i < N/2)\\
-\frac{2\pi (N-i)}{N\Delta x}\hspace{0.5cm} (N/2 \le i < N)
\end{cases}

This looks a bit confusing. **But SciPy even has a function, fftfreq, that generates this $k_i$ for you.** Very thoughtful.
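As a quick check of the ordering, a minimal sketch:

from scipy.fftpack import fftfreq

# with d = 1/N the output is the index pattern [0, 1, ..., N/2-1, -N/2, ..., -1]
print(fftfreq(8, d=1/8))
>>> [ 0.  1.  2.  3. -4. -3. -2. -1.]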

Implementation

As I have said all along, the coding itself is simple. **Fourier transforms are available in scipy.fftpack.**

There are two points to note.

import numpy as np
import matplotlib.pyplot as plt
from scipy.fftpack import fft, ifft, fftfreq

def f(x):
    return np.exp(-x**2)

# set system
N, L = 256, 40.0
x = np.linspace(-L/2, L/2, N)

# set initial function
gaussian = f(x)
plt.plot(x, gaussian, label="t=0")

# set k_i
k = 2 * np.pi * fftfreq(N, d=1/N) / L
time = [2, 4, 8]

for t in time:
    # evolve: F^{-1}[exp(-k^2 t) F[f(x, 0)]]
    expK = np.exp(-k**2 * t)
    result = ifft(fft(gaussian) * expK)
    plt.plot(x, result.real, label="t=%d" % t)

plt.legend()
plt.show()

There is a for loop, but a harmless one: a loop over the output times like this is unavoidable, and here it only runs three times. One thing to be careful about, though: this does **not** advance the time sequentially as 2 s → 4 s → 8 s; **each time evolution is computed independently.** In other words, if you want the graph at 8 s, you can just plug in t = 8 directly. This is the biggest difference from the finite-difference method.

figure6.png

We get the same graph as last time. This method is very powerful: it can be applied even when there is a potential term, and it develops into the symplectic method using the Trotter decomposition... but that is getting too specialized, so let's stop here.

Nonlinear differential equations (advanced)

This part is a little technical. If you are not interested you can skip it, but it does contain one somewhat important point.

Nonlinear differential equations appear frequently in physics. Consider one such nonlinear equation [^2]:

\left(-\frac{1}{2}\frac{d^2}{dx^2} + g|\psi(x)|^2\right)\psi(x) = \mu\psi(x)

This is hard to solve in the way we used before, because **when discretized, the matrix contains $\psi(x)$ itself.** Concretely:

\left[
\frac{1}{2\Delta x^2}
	\begin{pmatrix}
	2&-1&0&\cdots&0\\ 
	-1&2 &-1&&0\\
	0 & -1&2&& \vdots\\
	\vdots&&&\ddots&-1\\
	0& \cdots& 0 &-1& 2
\end{pmatrix}
+g
\begin{pmatrix}
	|\psi_0|^2&0&0&\cdots&0\\ 
	0&|\psi_1|^2 &0&&0\\
	0 & 0&|\psi_2|^2&& \vdots\\
	\vdots&&&\ddots&0\\
	0& \cdots& 0 &0& |\psi_n|^2
\end{pmatrix}
\right]
\begin{pmatrix}
\psi_0\\
\psi_1\\
\vdots\\
\psi_{n-1}\\
\psi_n
\end{pmatrix}
= T(\psi)\psi
=\mu\psi

This means the equation cannot be written with a linear operator as $T\psi$, which is one interpretation of the word "nonlinear". Even when forcibly written in matrix form as above, it becomes $T(\psi)\psi$. We want to find $\psi$, yet the matrix needed to find it contains $\psi$: this is the same self-consistent structure mentioned earlier.

Now, let's regard the above differential equation as an eigenvalue equation with $\mu$ as the eigenvalue:

T(\psi)\psi = \mu_n \psi

The self-consistent structure then suggests the following iterative idea:

  1. Choose some function as the initial guess for $\psi$.

  2. Build the matrix $T(\psi)$ from the current $\psi$.

  3. Solve the eigenvalue problem $T(\psi)\psi = \mu_n\psi$.

  4. Adopt the eigenfunction with the lowest eigenvalue as the new $\psi$, and repeat from step 2 until $\mu$ stops changing.

The above nonlinear equation has soliton solutions; in particular, when $g < 0$ there is a Gaussian-like [^3] special solution called a bright soliton. So let's implement this idea, choosing a Gaussian-like function as the initial $\psi$.

Implementation

import numpy as np

# initial guess for psi
def h(x):
    return x * np.exp(-x**2)

# set system
L, N, g = 30, 200, -5
x, dx = np.linspace(-L/2, L/2, N), L / N

# set K matrix: finite-difference -1/2 d^2/dx^2 (tridiagonal 2, -1)
K = np.eye(N, N)
K_sub = np.vstack((K[1:], np.array([0] * N)))
K = (dx**-2 * (2 * K - K_sub - K_sub.T)) * 0.5

# set initial parameters
v, w, mu = np.array([h(x)]).reshape(N, 1), np.array([3]), 0

# self-consistent loop: repeat until the lowest eigenvalue mu converges
while abs(w[np.argmin(w)] - mu) > 1e-5:

    # update mu
    mu = w[np.argmin(w)]
    
    # set G matrix from the current lowest eigenfunction
    G = g * np.diag(np.abs(v.T[np.argmin(w)])**2)
    
    # solve the eigenvalue problem with the updated matrix
    T = K + G
    w, v = np.linalg.eigh(T)

This time I chose v.T[0], the eigenfunction with the lowest eigenvalue, as the function to iterate on; for this differential equation the 1-soliton solution seems to correspond to the lowest-energy eigenfunction [^4]. The convergence condition is imposed on $\mu$, and with this system the loop ran about 16 times. Let's plot the result:
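A minimal plotting sketch, assuming matplotlib:

import matplotlib.pyplot as plt

# density of the converged lowest eigenfunction
plt.plot(x, v.T[0]**2)
plt.show()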

figure7.png

It certainly looks like a soliton, but I'm worried it might actually be just a Gaussian. Let's try the general soliton solution,

f_s(x) = \frac{b}{\cosh(ax)}

and fit the numerical result with it, as well as with a Gaussian for comparison:

from scipy.optimize import curve_fit

# soliton density: (b / cosh(a x))^2
def bright_soliton(x, a, b):
    return (b / np.cosh(a * x))**2

# Gaussian: a * exp(-b x^2)
def gauss_function(x, a, b):
    return a * np.exp(-b * x**2)

# fit the converged density with both model functions
param_soliton = curve_fit(bright_soliton, x, v.T[0]**2)[0]
param_gauss = curve_fit(gauss_function, x, v.T[0]**2)[0]
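With the fitted parameters, the curves can be overlaid for comparison; a sketch continuing from the code above (assuming matplotlib is imported as plt):

plt.plot(x, v.T[0]**2, label="result")
plt.plot(x, bright_soliton(x, *param_soliton), "--", label="soliton fit")
plt.plot(x, gauss_function(x, *param_gauss), ":", label="gauss fit")
plt.legend()
plt.show()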

figure8.png

The numerical result matches bright_soliton exactly, while the Gaussian does not fit, so it really is a soliton.

The story got a little technical, but the point here is not the soliton itself. In code like this, self-consistent calculations where a matrix is rebuilt and the computation repeated while a value changes, initializing the matrix with explicit loops every time would be very costly: a self-consistent loop plus two loops for matrix initialization makes three nested loops. NumPy is indispensable for avoiding that. As I wrote in the earlier Basics article, NumPy "is very powerful in tasks where you initialize a huge matrix every time you change a parameter and then proceed with the calculation", and this calculation is exactly such a case.
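To make the point concrete, here is a sketch of the two ways of rebuilding the interaction matrix $G$ at every iteration (psi is a hypothetical array holding the current values of $\psi_i$):

# slow: two explicit loops per self-consistent iteration
G = np.zeros((N, N))
for i in range(N):
    for j in range(N):
        if i == j:
            G[i, j] = g * abs(psi[i])**2  # psi: hypothetical current values

# fast: one vectorized call; the loops run inside NumPy
G = g * np.diag(np.abs(psi)**2)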

In closing / personal notes

I have now covered the implementation of basic numerical calculations. Anything further would get rather specialized, so I will stop here. From this point on, I hope you will consult the NumPy / SciPy reference for whatever calculation you want to do. A few topics are left over, though, so I may summarize them in another article.

My motivation for writing these articles in the first place was the feeling that there is not much information about this kind of speedup on the net. I myself followed the path below.

  1. Be impressed by Python's simple syntax and its abundance of numerical libraries. Want to use it for my own research, but a direct port of my C++ code is very slow. Start searching: can Python really handle heavy numerical calculations?

  2. "NumPy is fast!" is said everywhere, but as a migrant from C I kept reaching for for / while statements. **I committed the folly of feeding a NumPy array through a for loop and concluded, "It's not fast at all!"** Ridiculous, when I think about it now.

  3. "For statement is slow!" Is also talked about in various fields, but ** People who have migrated from C do not know how to write numerical calculation code without using for / while statement. ** It is true that matrix operations are fast, but I was still worried about how to prepare the matrix, how to calculate other than the matrix, and so on.

  4. "List comprehensions are faster than plain for loops!" is also said, but honestly the difference was marginal.

  5. So the destination becomes **Cython, Boost.Python, Numba, and so on [^5].** These tools are certainly faster, but they come with more restrictions, and **the code becomes more and more C-like.**

  6. The conclusion becomes "Isn't C++ fine after all?" But once you are used to Python, the esoteric language specification of C++ puts you off.

  7. Return to 5.

I repeated that cycle for almost a year. Reading the NumPy manual tells you how to use each function, but explanations of numerical algorithms that explicitly avoid for statements are hard to find. If information like **"Avoid for statements as much as possible! Use NumPy's universal functions! Here are concrete examples!"** had been lying around in front of me, I don't think I would have taken such a long detour.

With that background, I have tried to write these articles as a cookbook of basic numerical algorithms. I hope they help people who are worried about the execution speed of their numerical calculations.

[^1]: Strictly speaking, the condition is not "$f$ can be integrated analytically" but "$f$ can be Fourier transformed analytically".

[^2]: This is called the Gross-Pitaevskii equation.

[^3]: It looks similar, but its properties are completely different from a Gaussian's.

[^4]: This may well differ depending on the model being solved, and trial and error may be needed.

[^5]: I hope to comment on these as well someday.
