How to do multi-core parallel processing with Python

A while ago I had to analyze a fairly large dataset, and the processing took a long time. This post summarizes the method I used to speed it up.

The examples below use the multiprocessing module.

import

multi.py


from multiprocessing import Pool
from multiprocessing import Process

How to use

Basic

The basic pattern looks like this.

multi.py


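def function(hoge):
    # Do the work you want to parallelize here.
    x = hoge * 2  # placeholder computation
    return x

def multi(n):
    p = Pool(10)  # maximum number of processes: 10
    result = p.map(function, range(n))  # calls function(0), ..., function(n-1)
    p.close()  # no more tasks will be submitted
    p.join()   # wait for the worker processes to exit
    return result

def main():
    data = multi(20)
    for i in data:
        print(i)

# The guard keeps worker processes from re-running main() on import.
if __name__ == '__main__':
    main()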

In this case, function is executed 20 times, with its argument taking the values 0, 1, 2, ..., 19. The return values of function are collected into result as a list, which main receives and prints to standard output.

My machine has 12 logical cores (6 physical cores with 12 threads, to be exact), so I set the maximum number of processes to 10. If you use every core, even opening a browser becomes sluggish, so it is safer to leave a few free.
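Rather than hard-coding 10, the pool size can be derived from the machine. The following is only a sketch, assuming Python 3: choose_pool_size and square are hypothetical names used for illustration, and reserving 2 cores is an arbitrary choice that mirrors the 12 - 2 = 10 used above.

```python
from multiprocessing import Pool, cpu_count


def choose_pool_size(reserve=2):
    """Return a worker count that leaves `reserve` logical cores free (at least 1)."""
    return max(1, cpu_count() - reserve)


def square(n):
    # Stand-in for the real per-item work.
    return n * n


if __name__ == "__main__":
    workers = choose_pool_size(reserve=2)
    # The with-block shuts the pool down once map() has returned.
    with Pool(workers) as pool:
        results = pool.map(square, range(5))
    print("workers:", workers)
    print("results:", results)
```

Note that cpu_count() reports logical cores, so on a 6-core, 12-thread machine it returns 12 and this sketch would start 10 workers.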

CPU usage

Here is the CPU usage during parallel processing.

[Screenshot: CPU usage during parallel processing]

As you can see, the work really is spread across multiple cores.

Get process id

You can also get the ID of the process that handles each task.

multi.py


import os

def function(hoge):
    # Do the work you want to parallelize here.
    print('process id:' + str(os.getpid()))
    x = hoge * 2  # placeholder computation
    return x

# The rest is omitted

Printing the ID like this makes it clear that the tasks are running in different processes, which is interesting to see.
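Printed output from several processes can interleave, so another option is to return the process ID together with the result and summarize it in the parent. This is a minimal sketch, assuming Python 3; work_with_pid and distinct_pids are hypothetical helper names, and the pool size of 4 is arbitrary.

```python
import os
from multiprocessing import Pool


def work_with_pid(n):
    # Return the result together with the ID of the process that computed it.
    return n * n, os.getpid()


def distinct_pids(results):
    """Collect the set of process IDs found in a list of (value, pid) pairs."""
    return {pid for _, pid in results}


if __name__ == "__main__":
    with Pool(4) as pool:
        results = pool.map(work_with_pid, range(20))
    print("parent pid:", os.getpid())
    print("worker pids:", sorted(distinct_pids(results)))
```

The worker IDs should differ from the parent's ID, and there can be at most as many distinct worker IDs as the pool size.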

How fast was it

A job that used to take about 35 hours finished in just over 4 hours. That is roughly one-eighth of the original time, which is a good enough result.

Of course, each individual task does not run any faster, so the work has to be divided evenly among the processes to get the most out of this. Still, analysis work has plenty of tasks that split up this way, so I think the technique is useful.
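When tasks take very different amounts of time, one way to keep the processes evenly loaded is to hand out items one at a time, so that a worker picks up the next item as soon as it is free. The sketch below assumes Python 3 and uses Pool.imap_unordered with chunksize=1. It is not what was used for the 35-hour job; uneven_task and collect are hypothetical names, and the sleep only simulates an uneven workload.

```python
import time
from multiprocessing import Pool


def uneven_task(n):
    # Hypothetical workload: larger inputs take longer.
    time.sleep(0.01 * n)
    return n, n * n


def collect(pairs):
    """Restore input order from (input, result) pairs that arrived out of order."""
    return [result for _, result in sorted(pairs)]


if __name__ == "__main__":
    with Pool(4) as pool:
        # chunksize=1 hands out one item at a time; results arrive as they finish.
        pairs = list(pool.imap_unordered(uneven_task, range(20), chunksize=1))
    print(collect(pairs))
```

Because imap_unordered yields results in completion order, each task returns its input alongside its result so the original order can be rebuilt afterwards. For many very small tasks a larger chunksize usually reduces overhead, so the right value depends on the workload.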
