In recent years, multi-core with multiple cores mounted on one CPU has become common. However, with the current programming language, it is difficult for engineers to create multi-core programs without being aware of it. Therefore, I will explain how to utilize multi-core from various languages.
A process is a running program such as each application, and a thread is a unit of CPU usage. The process has one or more threads as follows, and can process as many threads as the number of CPU cores. (In recent years, [SMT](https://ja.wikipedia.org/wiki/%E5%90%8C%E6%99%82%E3%83%9E%E3%83%AB%E3%83% 81% E3% 82% B9% E3% 83% AC% E3% 83% 83% E3% 83% 87% E3% 82% A3% E3% 83% B3% E3% 82% B0) One physics The core can handle multiple threads, such as 2 threads. It's like 2 cores and 4 threads.) In order to effectively utilize multi-core and execute a program, it is necessary to generate an appropriate number of threads on the program side for the number of cores that the CPU can process. It is possible to create more threads than the number of cores, but the CPU can only process as many threads as there are cores, and there is a problem that processing slows down due to switching threads to execute.
There are similar terms, parallel and concurrent, but they are different. Parallel is a case where multiple threads are processed by multiple cores by performing multiple processes at the same time. (Multiple processes cannot be executed at the same time with a single core, so parallel cannot be achieved.) Concurrent means that multiple processes are switched and executed at the same time, so that one thread can switch multiple processes and execute them. Since it is possible to execute while switching the processing in multiple threads, it is also possible to realize parallel and parallel.
Apache, a web server, uses a method to generate a process for each user request, and when the number of clients reaches about 10,000, the response performance drops significantly even though the hardware performance of the web server has a margin. There was a C10K problem. (The specific cause of the C10K problem was easy to understand in this article.) Therefore, in nginx and Node.js, I tried to solve the C10K problem by processing in parallel by processing asynchronous I / O in a single thread.
Node.js As mentioned above, Node.js operates in a single thread, and the approach is to process in parallel with asynchronous processing such as async / await. As shown in the figure below, the image is such that when you access the external API, other processing is performed until the result is returned, and when the result is obtained, the processing is continued. (For details, this article was easy to understand.) Therefore, when performing standard asynchronous processing, it is not possible to bring out the performance of multi-core. Therefore, in Node.js, use Cluster to create multiple processes (https://postd.cc/setting-up-a-node-js). You need to either -cluster /) or create multiple threads using worker_threads. In order to utilize the multi-core core in this way, it is necessary to create multiple processes or threads from the program side, and variable values can be shared in multi-thread, but the memory space is separated in multi-process. [Advantages and disadvantages] of not being able to share variable values (https://stackoverflow.com/questions/56656498/how-is-cluster-and-worker-threads-work-in-node-js) Exists.
In Node.js, I was able to take advantage of multi-core by creating multiple processes or threads. However, in Ruby and Python, [Global Interpreter Lock (GIL)](https://ja.wikipedia.org/wiki/%E3%82%B0%E3%83%AD%E3%83%BC%E3%83% 90% E3% 83% AB% E3% 82% A4% E3% 83% B3% E3% 82% BF% E3% 83% 97% E3% 83% AA% E3% 82% BF% E3% 83% AD% There is something called E3% 83% 83% E3% 82% AF), and even if you create multiple threads, they cannot be executed in parallel. (To be exact, it is the case of CPython and CRuby implemented in C language, but it is omitted here.) Therefore, if you try to utilize multi-core in these languages, it cannot be realized by multi-threading, and you need to create multiple processes.
In Go language, asynchronous processing is realized in parallel and parallel using something called goroutine, and by default, GOMAXPROCS
with the number of CPU cores is set. As many threads as this value are prepared, and the lightweight thread goroutine is executed in the threads.
The figure below is an image when the number of CPU cores is 4 and GOMAX PROCS = 4
.
By using goroutine in this way, you can execute programs in parallel and in parallel by taking advantage of multi-core.
(It was easy to understand why goroutine is lightweight this article.)
In Rust, asynchronous processing can be performed using async / await. At this time, you can select the execution allocation method for asynchronous processing threads depending on which runtime is used. A popular runtime is tokio. In tokio, threads are created for the number of cores, and asynchronous processing is passed to those threads, which is similar to goroutine's way of utilizing multi-core. (For other allocation methods and asynchronous processing in Rust, this article was easy to understand. Especially [About the execution model here](https://tech-blog.optim.co.jp/entry/2019/11/08/163000#%E5%AE%9F%E8%A1%8C%E3%83% A2% E3% 83% 87% E3% 83% AB) is easy to understand.)
In Ruby and Python, it is difficult to make it multi-threaded due to the mechanism, and the asynchronous processing of Node.js could not make use of multi-core as it is. However, in the Go language and Rust, which are popular in recent years, by calling asynchronous processing, parallel and parallel processing can be performed without the engineer being aware of it, and multi-core can be utilized. It's no wonder that Go and Rust are popular in the modern era when multi-core CPUs are becoming more commonplace.
[Illustration] Differences between CPU cores, threads, and processes, context switching, and multithreading Getting closer to a complete understanding of Unity asynchronous by knowing the difference between processes, threads and tasks I investigated asynchronous I / O of Node.js Node.js I can't hear anymore I searched as much as possible about the GIL that you should know if you want to do parallel processing with Python Why goroutine is lightweight [Master Rust Asynchronous Programming](https://tech-blog.optim.co.jp/entry/2019/11/08/163000#%E3%83%A9%E3%83%B3%E3%82 % BF% E3% 82% A4% E3% 83% A0% E3% 81% A7% E9% 9D% 9E% E5% 90% 8C% E6% 9C% 9F% E3% 82% BF% E3% 82% B9 % E3% 82% AF% E3% 82% 92% E8% B5% B7% E5% 8B% 95% E3% 81% 99% E3% 82% 8B)
Recommended Posts