Examine Python script bottlenecks with the cProfile module

This article is the 24th day article of Takumi Akashiro Alone Advent Calendar 2020.

At the beginning

Python tools are slow, but I don't know where to fix them ... Have you ever experienced that? This is the cProfile module, which is useful in such cases!

I will try using it for the time being

If you pass a string to cProfile.run,exec (<string>)will be measured and the result will be output.

>>> import time
>>> import cProfile

>>> cProfile.run('[time.sleep(1) for x in range(3)]') 
         7 function calls in 3.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    3.000    3.000 <string>:1(<listcomp>)
        1    0.000    0.000    3.000    3.000 <string>:1(<module>)
        1    0.000    0.000    3.000    3.000 {built-in method builtins.exec}
        3    3.000    1.000    3.000    1.000 {built-in method time.sleep}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

It's just 3 seconds![^ 1] Various information comes out like this.

[^ 1]: Actually, the tot time was off by 3.003, so I remeasured it about 6 times.

Element name meaning
ncall Function call count
tottime Total time of function
(Processing time for that function only. Excludes child functions.)
percall(First time) Time taken for one function
(Processing time for that function only. Excludes child functions.)
cumtime Total time of function(Accumulation)
percall(Second time) Time taken for one function(Accumulation)

Basically, there are strangely many ncalls, and it feels like searching for a function with bulging tottime () to improve efficiency.

Eh, is it troublesome to pass a character string? That's right…… Actually, I don't use this method either, and I attach the decorator to the function and measure it.

So here is the decorated function.

profiler.py



import cProfile
import pstats

def profile(func):
    LENGTH = 10
    def func_(*args, **kwargs):
        profile = cProfile.Profile()
        profile.enable()
        result = func(*args, **kwargs)
        profile.disable()
        pstats.Stats(profile).sort_stats(2).print_stats(LENGTH)
        return result
    return func_

Try using a decorator

I couldn't think of anything to speed it up, so I'll try a suitable regular expression a million times.

sample.py


import profiler
import re

@profiler.profile
def v_1():
    [re.search(r'\d*', str(x)) for x in range(1000000)]

@profiler.profile
def sample_2():
    reg = re.compile(r'\d*')
    [reg.search(str(x)) for x in range(1000000)]

if __name__ == '__main__':
    sample_1()
    sample_2()
python.exe d:/catbox/sample.py
         4000139 function calls (4000135 primitive calls) in 1.365 seconds

   Ordered by: cumulative time
   List reduced from 53 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.032    0.032    1.365    1.365 d:/catbox/sample_1.py:4(sample_1)
        1    0.361    0.361    1.333    1.333 d:/catbox/sample_1.py:6(<listcomp>)
  1000000    0.286    0.000    0.972    0.000 C:\Users\<username>\AppData\Local\Programs\Python\Python38\lib\re.py:196(search)
  1000000    0.272    0.000    0.382    0.000 C:\Users\<username>\AppData\Local\Programs\Python\Python38\lib\re.py:287(_compile)
  1000000    0.303    0.000    0.303    0.000 {method 'search' of 're.Pattern' objects}
  1000016    0.110    0.000    0.110    0.000 {built-in method builtins.isinstance}
        1    0.000    0.000    0.000    0.000 C:\Users\<username>\AppData\Local\Programs\Python\Python38\lib\sre_compile.py:759(compile)
        1    0.000    0.000    0.000    0.000 C:\Users\<username>\AppData\Local\Programs\Python\Python38\lib\sre_parse.py:937(parse)
        1    0.000    0.000    0.000    0.000 C:\Users\<username>\AppData\Local\Programs\Python\Python38\lib\enum.py:828(__and__)
        2    0.000    0.000    0.000    0.000 C:\Users\<username>\AppData\Local\Programs\Python\Python38\lib\enum.py:278(__call__)


         1000006 function calls in 0.582 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.033    0.033    0.582    0.582 d:/catbox/sample_1.py:8(sample_2)
        1    0.276    0.276    0.550    0.550 d:/catbox/sample_1.py:11(<listcomp>)
  1000000    0.273    0.000    0.273    0.000 {method 'search' of 're.Pattern' objects}
        1    0.000    0.000    0.000    0.000 C:\Users\<username>\AppData\Local\Programs\Python\Python38\lib\re.py:248(compile)
        1    0.000    0.000    0.000    0.000 C:\Users\<username>\AppData\Local\Programs\Python\Python38\lib\re.py:287(_compile)
        1    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Sample_2, which was pre-compiled with re.compile, is about 0.7 seconds faster! (I couldn't think of any good material ...)

Tightening

The number of calls tends to increase more than you think, so just measure and reduce unnecessary processing. There are times when you will be able to move quite comfortably. If you think it's too late, let's try it!

~~ I was thinking of giving a better story, but as a result of verification, I couldn't realize it ... This year was a sad Christmas Eve. ~~

Recommended Posts

Examine Python script bottlenecks with the cProfile module
Specifying the module loading destination with GAE python
POST json with Python3 script
About the Python module venv
Call the API with python3.
I replaced the Windows PowerShell cookbook with a python script.
Minimum knowledge to get started with the Python logging module
[Python] Get the script execution directory with an absolute path
Extract the xz file with python
Get the weather with Python requests 2
Find the Levenshtein Distance with python
Hit the Etherpad-lite API with Python
Install the Python plugin with Netbeans 8.0.2
Debug the script with Sakura Editor
I liked the tweet with python. ..
Examine the object's class in python
Master the type with Python [Python 3.9 compatible]
Try using the Python Cmd module
Write a batch script with Python3.5 ~
Master the weakref module in Python
The story of making a module that skips mail with python
Create a compatibility judgment program with the random module of python.
Make the Python console covered with UNKO
[Python] Set the graph range with matplotlib
Suddenly with Python PyInstaller No module named pyinstaller
Behind the flyer: Using Docker with Python
Execute Python script with cron of TS-220
Play around with the pythonista3 ui module
Check the existence of the file with python
[Python] Get the variable name with str
[Python] Round up with just the operator
Display Python 3 in the browser with MAMP
Search the maze with the python A * algorithm
Let's read the RINEX file with Python ①
Working with OpenStack using the Python SDK
Run the Python interpreter in a script
Download files on the web with Python
[Python] Import the module one level higher
Check the path of the Python imported module
I get a Python No module named'encodings' error with the aws command
Install the python module with pip on a server without root privileges
Create a record with attachments in KINTONE using the Python requests module
Learn the design pattern "Singleton" with Python
[Python] Automatically operate the browser with Selenium
[Python] How to deal with module errors
Get the return value of an external shell script (ls) with python3
Install the Python module in any directory
Learn the design pattern "Facade" with Python
The road to compiling to Python 3 with Thrift
Python script that makes UTF-8 files with all BOMs under the folder without BOMs
Error when installing a module with Python pip
[Python] Get the files in a folder with Python
Prepare the execution environment of Python3 with Docker
I tried "differentiating" the image with Python + OpenCV
[Python] Let's execute the module regularly using schedule
Find the mood value with python (Rike Koi)
[Note] Export the html of the site with python.
The easiest way to synthesize speech with python
Try to solve the man-machine chart with Python
Calculate the total number of combinations with python
Specify the Python executable to use with virtualenv