This article is written for beginners in Python and NumPy. It may be especially useful for people doing numerical calculations in physics who would describe themselves as "I know C but only recently started Python" or "I don't yet know how to write code the Python way". Maybe.
Also, there may be incorrect statements due to my lack of study. Please forgive me.
Python has a very rich set of numerical-calculation libraries, and they are easy to use. Many of them are wrappers around legacy Fortran code, but they are far easier to use than calling that code from C/C++. For example, if you try to solve an eigenvalue problem by calling LAPACK[^1] from C/C++, you get something like
info = LAPACKE_zheevr(LAPACK_COL_MAJOR, 'V', 'A', 'L', H, (lapack_complex_double*)a, lda, vl, vu, il, iu, abstol, &mm, w, (lapack_complex_double*)z, ldz, isuppz);
There are simply too many arguments, and it gets complicated. More arguments do give you more control over the calculation, but it is not very friendly for beginners. In addition, complex numbers are not handled as seamlessly in C/C++, so types such as `lapack_complex_double` are defined inside the wrapper, which can be confusing. In Python, on the other hand, it is very concise:
w, v = numpy.linalg.eig(H)
Arguments like those found in LAPACK are also available[^2], but they all have default values, so if you just want to solve the problem without worrying about the details, this one line is enough. Simple and nice. Incidentally, the various solvers in SciPy tend to perform better than their NumPy counterparts.
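For example, for a Hermitian matrix, SciPy offers a dedicated solver that is just as easy to call. A minimal sketch (the small 2x2 matrix here is only a made-up example):

```python
import numpy as np
from scipy import linalg

# A small Hermitian matrix (made-up example)
H = np.array([[2.0, 1.0 - 1.0j],
              [1.0 + 1.0j, 3.0]])

w, v = linalg.eigh(H)  # eigenvalues in ascending order, eigenvectors as columns
print(w)  # -> [1. 4.]
print(v)
```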
This makes me want to leave all sorts of numerical calculations to Python, but it has one drawback: **for loops are deadly slow.** Triple loops appear all the time in matrix operations, and numerical calculation is basically "discretize finely and loop", so this is fairly fatal[^3]. Some slowdown is expected because Python is not a compiled language, but it even seems slower than other scripting languages. For a matrix product, if you write
for i in range(N):
    for j in range(N):
        for k in range(N):
            c[i][j] += a[i][k] * b[k][j]
then it is already game over: far too slow.
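The same product can instead be left to NumPy, which runs the loop in compiled code (backed by BLAS). A minimal sketch, assuming `a` and `b` are N x N ndarrays:

```python
import numpy as np

N = 100
a = np.random.rand(N, N)
b = np.random.rand(N, N)

c = np.dot(a, b)  # matrix product with no Python-level loop
# On Python 3.5+ / NumPy 1.10+ this can also be written as: c = a @ b
```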
So what we do is take advantage of NumPy's universal functions. **NumPy's built-in functions act on each element of a NumPy array (`ndarray`) and return a new `ndarray`.** This is very important, and it is what lets us drive for loops out of tensor operations. For example, `math.sin` (`cmath.sin`) only accepts scalars, but with `numpy.sin`,
>>> theta = numpy.array([numpy.pi/6, numpy.pi/4, numpy.pi])
>>> numpy.sin(theta)
array([ 5.00000000e-01, 7.07106781e-01, 1.22464680e-16])
and, as you can see, passing a NumPy array makes `numpy.sin` act on every element. Processing like this is often about as fast as C/C++: the core of NumPy is implemented in C (and Fortran), and BLAS[^4]/LAPACK do the hard work in the linear algebra routines.
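As a rough way to see the difference, here is a minimal timing sketch (the actual numbers depend on the machine, so none are quoted here):

```python
import math
import timeit
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 100000)

# Python-level loop over math.sin vs. a single call to the numpy.sin ufunc
t_loop = timeit.timeit(lambda: [math.sin(v) for v in x], number=10)
t_ufunc = timeit.timeit(lambda: np.sin(x), number=10)
print(t_loop, t_ufunc)
```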
The arithmetic operators defined on `ndarray` are similarly fast and versatile.
Let's write vector operations without a for loop. Dedicated functions such as the inner and outer product are of course provided, so below I mainly look at how sums, products, quotients, and so on are defined for `ndarray`. **Note that operations on an `ndarray` are completely different from operations on the Python built-in `list`.**
From here on, the usual alias `import numpy as np` is assumed.
A one-dimensional `ndarray` can be regarded as a kind of vector. A scalar (complex, float, int) acts on every element of the `ndarray`:
>>> a = np.array([1, 2, 3])
>>> a + 2
array([3, 4, 5])
>>> a * (3 + 4j)
array([ 3. +4.j, 6. +8.j, 9.+12.j])
Two `ndarray`s operate element by element on matching indices:
>>> a = np.array([1, 2, 3])
>>> b = -np.array([1, 2, 3])
>>> a + b
array([0, 0, 0])
>>> a * b
array([-1, -4, -9])
>>> a / b
array([-1., -1., -1.])
Interestingly, this product is neither an inner product nor an outer product, and even division is defined. It is often useful, for example, when applying a difference operation to a discretized function. The sum or product of `ndarray`s of different lengths is not defined.
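For example, a forward difference of a discretized function needs nothing but slicing and element-wise operations. A minimal sketch (the grid and the function are only an example):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 101)
f = np.sin(2.0 * np.pi * x)  # some function discretized on the grid

# forward difference df/dx ~ (f[i+1] - f[i]) / (x[i+1] - x[i]), no for loop
dfdx = (f[1:] - f[:-1]) / (x[1:] - x[:-1])
```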
`ndarray` has a method called `reshape`; for example, you can rearrange a length-3 vector into a 3x1 array. It looks like turning a row vector into a column vector, but because the product (*) is defined differently from that of a vector space, there is no simple correspondence. The operations with this reshaped vector are very interesting:
>>> c = a.reshape(3, 1)
>>> c
array([[1],
[2],
[3]])
>>> c + b
array([[ 0, -1, -2],
[ 1, 0, -1],
[ 2, 1, 0]])
>>> c * b
array([[-1, -2, -3],
[-2, -4, -6],
[-3, -6, -9]])
So there is a rule like (3 × 1) + or * (1 × 3) = (3 × 3), and, like the operations above, **these operations are commutative**. It may look strange at first glance, but if you work through it, you can see that it follows from the rules in the previous section. **This means that even the initialization of a matrix may not need a for loop**, which is very useful for tasks such as re-initializing a huge matrix every time the parameters change and then proceeding with the calculation.
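A minimal sketch of initializing a matrix by broadcasting alone (the Gaussian interaction is only a made-up example of a pairwise quantity):

```python
import numpy as np

N = 5
x = np.linspace(-1.0, 1.0, N)

# All pairwise differences x_i - x_j as an N x N matrix:
# (N, 1) op (N,) broadcasts to (N, N), exactly as in the example above.
diff = x.reshape(N, 1) - x

# e.g. an interaction matrix V_ij = exp(-(x_i - x_j)**2)
V = np.exp(-diff ** 2)
```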
Personally, I like this very flexible behaviour that is not tied to types[^5]; I find it more intuitive than other languages.
This is not an operation, but there is no point in using a for loop just to prepare a vector in the first place, either. There are many functions that create an `ndarray`; here are the ones I use most often:
>>> L = 1
>>> N = 10
>>> np.linspace(-L/2, L/2, N)
array([-0.5 , -0.38888889, -0.27777778, -0.16666667, -0.05555556,
0.05555556, 0.16666667, 0.27777778, 0.38888889, 0.5 ])
>>> dL = 0.2
>>> np.arange(-L/2, L/2, dL)
array([-0.5, -0.3, -0.1, 0.1, 0.3])
>>> np.logspace(0, 1, 10, base=np.e)
array([ 1. , 1.11751907, 1.24884887, 1.39561243, 1.5596235 ,
1.742909 , 1.94773404, 2.17662993, 2.43242545, 2.71828183])
Combining these with the scalar and vector operations above, you can get quite far without a for loop.
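For instance, sampling a function on such a grid also needs no loop. A minimal sketch (the Gaussian here is just an example):

```python
import numpy as np

L, N = 10.0, 1000
x = np.linspace(-L / 2, L / 2, N)

sigma = 1.0
psi = np.exp(-x ** 2 / (2.0 * sigma ** 2))  # evaluated on every grid point at once
```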
For matrices the story is almost the same as for vectors. The matrix product has its own dedicated functions, so the following again covers the element-wise operators. You should be able to predict the results.
>>> a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> a
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
>>> a + (2 - 3j)
array([[ 3.-3.j, 4.-3.j, 5.-3.j],
[ 6.-3.j, 7.-3.j, 8.-3.j],
[ 9.-3.j, 10.-3.j, 11.-3.j]])
>>> a * 0.2
array([[ 0.2, 0.4, 0.6],
[ 0.8, 1. , 1.2],
[ 1.4, 1.6, 1.8]])
>>> b = np.array([1, 2, 3])
>>> a + b
array([[ 2, 4, 6],
[ 5, 7, 9],
[ 8, 10, 12]])
>>> a * b
array([[ 1, 4, 9],
[ 4, 10, 18],
[ 7, 16, 27]])
>>> a / b
array([[ 1. , 1. , 1. ],
[ 4. , 2.5, 2. ],
[ 7. , 4. , 3. ]])
>>> b = np.array([[-1, 2, -3], [4, -5, 6], [-7, 8, -9]])
>>> a + b
array([[ 0, 4, 0],
[ 8, 0, 12],
[ 0, 16, 0]])
>>> a * b
array([[ -1, 4, -9],
[ 16, -25, 36],
[-49, 64, -81]])
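For reference, the actual matrix product of the same `a` and `b` is obtained with the dedicated function, not with `*`:
>>> np.dot(a, b)
array([[-14,  16, -18],
       [-26,  31, -36],
       [-38,  46, -54]])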
There are also useful functions for matrices. Below are just a few:
>>> np.identity(4)
array([[ 1., 0., 0., 0.],
[ 0., 1., 0., 0.],
[ 0., 0., 1., 0.],
[ 0., 0., 0., 1.]])
>>> np.eye(4, 3)
array([[ 1., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 1.],
[ 0., 0., 0.]])
>>> c = np.array([1, 2, 3, 4])
>>> np.diag(c)
array([[1, 0, 0, 0],
[0, 2, 0, 0],
[0, 0, 3, 0],
[0, 0, 0, 4]])
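Combining these, something like a tridiagonal matrix (e.g. a discretized second-derivative operator) can also be built without a loop. A minimal sketch (N and the coefficients are just an example):

```python
import numpy as np

N = 5
# -2 on the main diagonal, +1 on the first sub- and super-diagonals
T = (-2.0 * np.identity(N)
     + np.diag(np.ones(N - 1), k=1)
     + np.diag(np.ones(N - 1), k=-1))
```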
Other people have written better articles about NumPy's useful functions and the detailed arithmetic rules. What I hope this article conveys is that, if you write things the NumPy way, a surprising variety of calculations can be done without for loops. With these techniques, the numerical algorithms that science students learn in programming classes run at speeds comparable to C/C++, and many of the standard algorithms are already provided in SciPy and friends.
It has become a little long, so I would like to cover concrete applications to numerical calculation in separate articles. Thank you for reading.
**Addendum (2016/11/25): continued below**
Speeding up numerical calculation using NumPy/SciPy: Application 1
Speeding up numerical calculation using NumPy/SciPy: Application 2
[^1]: LAPACK (Linear Algebra PACKage) is a linear algebra library written in Fortran 90. The C/C++ wrapper is called LAPACKE. MKL provides a compatible interface, so the MKL manual is also a useful reference.
[^2]: For details, refer to the NumPy and [SciPy](https://docs.scipy.org/doc/scipy-0.18.1/reference/) manuals.
[^3]: Besides that, the overhead of function calls is also large. In competitive programming, it often happens that a solution passes when written with iterative dynamic programming but gets TLE with memoized recursion.
[^4]: A library that provides a more basic set of linear algebra operations than LAPACK. LAPACK calls BLAS internally.
[^5]: Some people dislike the absence of types. Unlike programs designed to meet a fixed specification, numerical calculations in physics are not easy to verify just by looking at the output, and debugging becomes even harder if variable types are allowed to change implicitly while the program runs. Since the observable quantities are floats anyway, static typing makes it harder to introduce bugs; I understand that argument, but I still like Python.