Python is a very good language. It is without doubt a rewarding programming platform: it provides many high-performance libraries and makes advanced information processing possible even for people who are not familiar with information engineering. However, it is not good to trust Python blindly or to be overconfident in it. I am therefore writing this article in the hope that it will serve as a reference for understanding Python's strengths and weaknesses and for using it effectively. I think the right way to use Python is to combine its strong points with other language processing systems. My text for learning Python is written from this standpoint. → I made a Python text
● Speed comparison by simulating a circular orbit (C vs Python, Cython vs Python)
C wins overwhelmingly here. The results show that C is more than 25 times faster than Python. However, for a simple numerical program like this one, Cython can reach speeds comparable to C without major changes to the Python program.

● Image generation of the Mandelbrot set (JavaScript vs Python, Cython vs Python)
This sample visualizes the Mandelbrot set as a 1024 x 1024 image. (Execution environment used for the speed comparison: Windows 10, Intel Core i7-5500U 2.39GHz)

→ HTML5 (JavaScript) implementation example
Click the "start" button to run it. The average execution time was 0.231 seconds. (Using Google Chrome)

→ Implementation example with Python + Pillow
After downloading, run it as
  py MandelbrotPython.py m.png -2.0 2.0 -2.0 2.0
(the image is generated in m.png). The average execution time was 53.645 seconds, which is 232.23 times that of the JavaScript version. Commenting out the part that calls Pillow has no noticeable effect on the execution time. JavaScript wins overwhelmingly. Of course the result depends on the browser, but the gap is still about two orders of magnitude.

→ Implementation example with Cython + Pillow
After downloading, compile it with Cython into a Python module and run it, for example, as
  import MandelbrotPyx
  MandelbrotPyx.MakeMandelbrot('m2.png',-2.0,2.0,-2.0,2.0)
The average execution time in the same environment was 7.601 seconds. Processing with Cython made the program about 7 times faster. (There is still room for optimization.)

As an aside, the execution speed of Java and JavaScript is approaching that of C these days. (JavaScript Super Introductory Text, Java Super Introductory Text)

● List / set / dictionary speed comparison
Python is a good environment for processing large amounts of data in memory.
However, the data structure must be chosen appropriately. Otherwise, the processing time becomes too long for practical use. I measured the processing time of Python lists, sets, and dictionaries.
→ Trial on data structure processing time (Excerpt from Python text: text body)
Sample programs: spdTest00.py, spdTest01.py, spdTest02.py
A Python list is fast when it is accessed like an array by integer index, but it is not suited to searching for elements, because a membership test on a list is a linear search. When you need to search for elements, use a set or a dictionary.

● Speed comparison: with / without NumPy, and a C program
One of the main purposes of using Python is data processing, and many libraries for data science and machine learning are built on NumPy. For that reason I wrote programs that compute a matrix product and investigated how fast NumPy is.
→ Comparison of 3 programs for matrix multiplication (PDF report) (Excerpt from Python text: text body)
Programs: matmult01.py, matmult01_np.py, matmult01.c
Compared with computing the matrix product using lists as arrays, NumPy is 10,000 times faster. When I saw the speed ratio, I wondered whether something was wrong. Moreover, it is more than 30 times faster than a straightforward implementation in C. In this case, Python + NumPy wins overwhelmingly.
Note: This result is for NumPy built with Intel MKL. NumPy without MKL is slower than this.
Supplement: NumPy matrix operations are particularly fast, so if the processing you want can be expressed as matrix calculations, the execution speed may improve greatly. (→ PDF report: An example showing the high speed of NumPy matrix multiplication) This is an excerpt from "Python3 Library Book". (to be continued)
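The list / set / dictionary comparison described above can be reproduced in miniature. The following is only a sketch and not the author's spdTest programs: the container size `N`, the probe values, and the helper name `time_membership` are assumptions chosen for illustration, and the absolute timings depend on the machine.

```python
import time


def time_membership(container, probes):
    """Return (elapsed seconds, number of hits) for `probe in container` tests."""
    start = time.perf_counter()
    hits = sum(1 for p in probes if p in container)
    return time.perf_counter() - start, hits


N = 20_000  # hypothetical size, small enough that the list case finishes quickly
as_list = list(range(N))
as_set = set(as_list)
as_dict = dict.fromkeys(as_list, True)

# Half of the probes sit near the end of the list, half are absent,
# so the list has to scan almost all of its elements every time.
probes = list(range(N - 500, N + 500))

results = {}
for name, container in (("list", as_list), ("set", as_set), ("dict", as_dict)):
    seconds, hits = time_membership(container, probes)
    results[name] = (seconds, hits)
    print(f"{name:5s} {seconds:.6f} s  ({hits} hits)")
```

All three containers find the same elements; only the time differs. The list scans its elements one by one, while the set and the dictionary look each value up by hash.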
Python is a beginner-friendly language, and yet it is also a highly functional one. People who learned programming in C or Java at school may remember this keenly: at first those languages make no sense. A program does not run unless you write a lot of "words" that have nothing to do with the essential algorithm (the idea of the procedure). We teachers explain away such incomprehensible "words" with "You don't have to understand this at first" or "Think of it as magic!". Python, on the other hand, has very few of these "unintelligible words". For example, take the program that displays "Hello, World", which is often the first lesson at school: even here there is a big difference between Java and Python. Let's actually write the Java and Python programs and compare them.
→ Java version "Hello, World" (5 lines)
→ Python version "Hello, World" (1 line)
The following samples display "Hello, World" in a GUI.
→ Java version "Hello, World" (33 lines: JavaFX version)
→ Python version "Hello, World" (17 lines: Kivy version)
→ Python version "Hello, World" (8 lines: Tkinter version)
Even in this introductory lesson, the number of lines differs significantly (Python is shorter). Furthermore, Python allows many styles of writing that express directly what you want to do, so programming productivity is extremely high. (Learn Python for more information.) Advanced processing that would require a lot of code in C or Java is prepared in advance in Python and can be called easily. That is something to be grateful for; this is my honest impression from teaching in the classroom.
To avoid misunderstanding, let me add that learning C is still very meaningful. (The standard Python interpreter, CPython, is itself written in C.) The roots of device control, storage resource handling, input / output, communication, and process management ... C is unavoidable when learning the basics of information engineering.
(Furthermore, most operating systems, such as Windows, macOS, and Linux, are largely written in C.) If you learn Python, please learn C as well. Ideally, you would write a module in C and drive it from Python. There are options such as Cython and Numba, but I think calling a function written in C through ctypes is a very good approach.
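As a rough illustration of calling a C function through ctypes, the sketch below calls `strlen` from the C standard library instead of a self-written module, so that no compilation step is needed. It assumes a Unix-like system (Linux or macOS) where the C library can be located; the helper names `load_c_library` and `c_strlen` are hypothetical and exist only for this example.

```python
import ctypes
import ctypes.util


def load_c_library():
    """Load the C standard library (assumes a Unix-like system)."""
    name = ctypes.util.find_library("c")
    # If no library name is found, CDLL(None) exposes the symbols that are
    # already loaded into the current process, which include libc on POSIX.
    return ctypes.CDLL(name) if name else ctypes.CDLL(None)


libc = load_c_library()

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text):
    """Return the byte length of `text` as computed by the C function strlen."""
    return libc.strlen(text.encode("utf-8"))


print(c_strlen("Hello, World"))
```

A module of your own would be loaded the same way, by passing the path of the compiled shared library to `ctypes.CDLL` and declaring `argtypes` and `restype` for each function you call.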
I think the best thing about Python is the abundance of libraries. In my own seminar, I was able to respond quickly to a student (a third-year student at the time) who said, "I would like to try image recognition ...". After that, we quickly worked out the outline of the project and the thesis, and the graduation research bore fruit (winning at a techno competition). Of course, that student was not particularly good at programming either. With C++ or Java, it would have been impossible to meet the student's needs at this speed. Python + OpenCV + Pillow wins. Don't get me wrong: Python itself is slow, but many packages are implemented in C and are very fast. A prominent example is that using Python + NumPy for numerical processing is faster, in both development and execution speed, than a naive implementation in C or FORTRAN. There are also SymPy, scikit-learn, Keras, PyTorch, Chainer ... software of a quality that makes you ask, "Is it really all right to release this for free?" The moment I stepped into the world of Python, I was amazed again and again at how rich it was. The perception that Python is a language for AI also comes from the abundance of related libraries. That perception is misleading, though: Python was not originally designed as a language for AI. Rather, AI researchers find this convenience very useful. Libraries such as scikit-learn for machine learning and Keras, which drives TensorFlow, are de facto standards in the field of machine learning / AI, so there "Python is a prerequisite".
Subscripts (slices) applied to data structures such as Python lists and tuples are highly expressive and make it easy to access the contents of the data structure. Furthermore, once you are used to functions such as map and filter and to lambda expressions, you can write programs in a form that mirrors the data structure itself. Used well, they reduce the number of for loops and let you write the whole program concisely, so please learn them.
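A small sketch of the slices and the map / filter / lambda style described above. The sample list `data` and the variable names are made up for illustration.

```python
data = [3, 8, 1, 9, 4, 7, 2, 6]

# Slices: take parts of the list without writing a loop
first_three = data[:3]        # the first three elements
last_two = data[-2:]          # the last two elements
every_other = data[::2]       # every second element, starting at index 0
reversed_data = data[::-1]    # all elements in reverse order

# Loop version: squares of the even numbers
squares_loop = []
for x in data:
    if x % 2 == 0:
        squares_loop.append(x * x)

# The same processing with filter, map, and lambda expressions
squares = list(map(lambda x: x * x, filter(lambda x: x % 2 == 0, data)))

print(first_three, last_two, every_other, reversed_data)
print(squares_loop, squares)
```

Both versions produce the same list; the second states "filter the even values, then square them" in a single expression.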
This is difficult to appreciate until you have mastered programming, but Python supports an advanced feature: "data can be executed as a program". In other words, expressions and statements built as character string data can be executed. (The ability to execute data as a program is indispensable for AI programming.) On a personal note, I feel that Lisp is no longer needed for my research activities. (→ Correction: Lisp is important after all.) For me, the languages needed to tackle AI-related themes are now the following three:
- Python: system control
- C / C++: development of the parts that require speed
- Prolog: logic programming and constraint satisfaction / resolution
(There is a library that makes it easy to call Prolog functions from Python.)
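In Python, the "data can be executed as a program" feature mentioned above is available through the built-in functions `eval` (for expressions) and `exec` (for statements). The following is a minimal sketch; the strings and the function name `double` are made up for illustration.

```python
# An expression held as string data is evaluated with eval()
expression = "2 * (3 + 4)"
value = eval(expression)

# Statements held as string data are executed with exec().
# Passing a dictionary keeps the names they define out of our own namespace.
source = "def double(x):\n    return 2 * x\n"
namespace = {}
exec(source, namespace)
doubled = namespace["double"](21)

print(value, doubled)
```

Because these functions run whatever code the string contains, they should only be given strings from a trusted source.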
written by Katsunori Nakamura