Python3 datetime is faster just by specifying the timezone

Introduction

The standard library for handling time in Python is datetime. Used normally it causes no trouble, and if you only call it a handful of times you never have to think about performance. However, when you need to create datetime objects tens of thousands or tens of millions of times, it becomes a visible bottleneck.

I found that paying attention to one small detail improves performance a little, so I would like to introduce it here.

The conclusion: pass a timezone from the standard library when you create a datetime.

Conclusion

I will start with the conclusion; read on afterwards if you are interested. I think it is better to create datetime objects as follows. The only point is whether you specify a timezone. That's all.

from datetime import datetime, timedelta, timezone

# Create the time zone once and reuse it
JST = timezone(timedelta(hours=+9), 'JST')

# Example UNIX timestamp (placeholder value for illustration)
unix_time = 1500000000

# GOOD: the time zone is specified. Fast
datetime.now(JST)
datetime.fromtimestamp(unix_time, JST)

# NG: the result depends on the environment's local time zone. Slower than specifying a timezone
datetime.now()
datetime.fromtimestamp(unix_time)
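
As a small illustration of what the two styles return (a sketch, separate from the benchmark): a datetime created with a timezone is "aware" and carries its UTC offset, while one created without it is "naive" and is interpreted in whatever local time zone the machine uses. The timestamp 0 is only an example value.

```python
from datetime import datetime, timedelta, timezone

JST = timezone(timedelta(hours=+9), 'JST')

# Aware: the UTC offset travels with the object, independent of machine settings.
aware = datetime.fromtimestamp(0, JST)  # UNIX epoch, chosen only as an example
print(aware.isoformat())                # 1970-01-01T09:00:00+09:00

# Naive: tzinfo is None, so the wall-clock value depends on the local time zone.
naive = datetime.now()
print(naive.tzinfo)                     # None
```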

Performance comparison

Now let's create datetime objects in several ways and measure how long it takes to create 10 million of them, to see how much performance changes depending on whether the time zone is specified. I hope it serves as a reference.

Execution environment:
- OS: Mac
- CPU: Core i5 1.6GHz
- Memory: 4GB DDR3
- Language: Python 3.6.2

If you specify a time zone

This is the fastest pattern (as far as I know).

zikan1.py


from datetime import datetime, timedelta, timezone

JST = timezone(timedelta(hours=+9), 'JST')

for _ in range(10000000):
  datetime.now(JST)

$ time python zikan1.py
real	0m7.581s
user	0m7.167s
sys	0m0.114s

It looped 10 million times and the result was 7 seconds.

If you do not specify a time zone

If you do not specify the time zone, it is slightly slower. Only slightly.

zikan2.py


from datetime import datetime

for _ in range(10000000):
  datetime.now()

$ time python zikan2.py
real	0m9.609s
user	0m9.149s
sys	0m0.111s

It takes about 9 seconds, so it is a little slower.
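
The two measurements above can also be taken inside a single script with the standard timeit module. This is only a sketch: the iteration count N is a hypothetical, much smaller value than the 10 million used above, and the absolute numbers will differ by machine and Python version.

```python
import timeit
from datetime import datetime, timedelta, timezone

JST = timezone(timedelta(hours=+9), 'JST')
N = 100_000  # hypothetical count, far smaller than the 10 million used in the article

# Time N calls of each variant; timeit returns the total elapsed seconds.
with_tz = timeit.timeit(lambda: datetime.now(JST), number=N)
without_tz = timeit.timeit(lambda: datetime.now(), number=N)

print(f"with timezone:    {with_tz:.3f}s")
print(f"without timezone: {without_tz:.3f}s")
```

Measuring both variants in the same process excludes interpreter startup and import time, which `time python ...` includes.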

When the time zone is specified by pytz

pytz is a library that was often used to specify the time zone in the Python 2 series, because Python 2 did not yet implement the timezone class.

zikan3.py


import pytz
from datetime import datetime

# third party
JST = pytz.timezone('Asia/Tokyo')

# performance testing
for _ in range(10000000):
  datetime.now(JST)

$ time python zikan3.py
real	1m9.173s
user	1m6.999s
sys	0m0.584s

It was much slower than I expected. We will discuss this later.

When the time zone is specified in Python 2

While I was at it, I also benchmarked Python 2. The Python 2 series only provides tzinfo, an abstract base class for time zones, so you have to implement the concrete class yourself. Tedious. That hassle may be why pytz became popular.

zikan4.py


from datetime import datetime, timedelta, tzinfo

class JST(tzinfo):
  def utcoffset(self, dt):
    return timedelta(hours=9)


  def dst(self, dt):
    return timedelta(0)


  def tzname(self, dt):
    return 'JST'

for _ in range(10000000):
  datetime.now(JST())

$ time python zikan4.py
real	0m55.416s
user	0m51.131s
sys	0m1.532s

Slow...
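
For reference, in Python 3 a hand-written fixed-offset class like the one above corresponds to the built-in timezone. The sketch below checks that both give the same result for one example timestamp. FixedJST and the timestamp value are hypothetical names and values used only for this illustration.

```python
from datetime import datetime, timedelta, timezone, tzinfo

class FixedJST(tzinfo):
    """Hand-written fixed UTC+9 zone, in the style required by Python 2."""
    def utcoffset(self, dt):
        return timedelta(hours=9)

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return 'JST'

# Built-in equivalent available in Python 3
JST = timezone(timedelta(hours=+9), 'JST')

ts = 1500000000  # example UNIX timestamp
a = datetime.fromtimestamp(ts, FixedJST())
b = datetime.fromtimestamp(ts, JST)

# Same instant, same offset, same zone name
print(a == b, a.utcoffset() == b.utcoffset(), a.tzname() == b.tzname())
```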

Result

Time zone specified (7s) < time zone not specified (9s) < Python 2 (51s) < pytz (66s), comparing the user times. That is how it turned out.
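
To put the ranking in relative terms, the user times measured above can be divided by the fastest one. This is only arithmetic on the numbers already reported, not a new measurement; the dictionary keys are labels chosen for this sketch.

```python
# User times (seconds) copied from the `time` outputs above
user_seconds = {
    "timezone specified": 7.167,
    "timezone not specified": 9.149,
    "Python 2 tzinfo subclass": 51.131,
    "pytz": 66.999,
}

fastest = min(user_seconds.values())
ratios = {name: sec / fastest for name, sec in user_seconds.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
```

On these figures, not specifying a time zone is about 1.3 times slower than the fastest case, the Python 2 version about 7.1 times, and pytz about 9.3 times.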

Detailed story

Why pytz is so slow

Both the timezone class of the standard library and the time zone classes of pytz are implementations of tzinfo. Yet their performance is worlds apart. Why? I have not arrived at a definite answer, but profiling makes the difference between the two obvious.

$ python -m cProfile -s cumtime zikan1.py
         10001072 function calls (10001061 primitive calls) in 9.107 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      2/1    0.000    0.000    9.107    9.107 {built-in method builtins.exec}
        1    3.149    3.149    9.107    9.107 zikan1.py:1(<module>)
 10000000    5.950    0.000    5.950    0.000 {built-in method now}
      3/1    0.000    0.000    0.009    0.009 <frozen importlib._bootstrap>:958(_find_and_load)
      3/1    0.000    0.000    0.009    0.009 <frozen importlib._bootstrap>:931(_find_and_load_unlocked)

$ python -m cProfile -s cumtime zikan3.py
         70022021 function calls (70021903 primitive calls) in 83.138 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     14/1    0.001    0.000   83.138   83.138 {built-in method builtins.exec}
        1    3.185    3.185   83.138   83.138 zikan5.py:1(<module>)
 10000000    9.225    0.000   79.868    0.000 {built-in method now}
 10000000   17.973    0.000   70.643    0.000 tzinfo.py:179(fromutc)
 20000000   43.347    0.000   43.347    0.000 {method 'replace' of 'datetime.datetime' objects}

The part to pay attention to in the zikan3.py run is the following:

10000000   17.973    0.000   70.643    0.000 tzinfo.py:179(fromutc)
20000000   43.347    0.000   43.347    0.000 {method 'replace' of 'datetime.datetime' objects}

With pytz, datetime.replace() is called 20 million times, that is, twice per loop iteration, and on top of that the fromutc function is called as well. In other words, a process equivalent to tz.fromutc(datetime.utcnow().replace(tzinfo=tz)) is running: create a datetime for the current time in UTC => attach the time zone to it => convert it to the time in that time zone. That appears to be what happens.
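
The fromutc call can be observed from Python code without pytz. The sketch below uses a hypothetical tzinfo subclass, CountingJST, that counts how often datetime.now() calls its fromutc method. It illustrates the mechanism only and does not reproduce pytz's implementation.

```python
from datetime import datetime, timedelta, tzinfo

class CountingJST(tzinfo):
    """Hypothetical fixed UTC+9 zone that counts fromutc() calls."""

    def __init__(self):
        self.fromutc_calls = 0

    def utcoffset(self, dt):
        return timedelta(hours=9)

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return 'JST'

    def fromutc(self, dt):
        # datetime.now(tz) hands over the current UTC time with tzinfo=tz attached.
        self.fromutc_calls += 1
        return super().fromutc(dt)

tz = CountingJST()
now = datetime.now(tz)
print(tz.fromutc_calls)  # 1
print(now.utcoffset())   # 9:00:00
```

Any extra work a tzinfo implementation does inside fromutc written in Python is therefore paid on every datetime.now(tz) call.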

On the other hand, when a standard library timezone is passed as the argument, {built-in method now} seems to handle all of that itself. I do not know the internal details, but it appears to be processed efficiently.

In conclusion

In short, let's use the standard library timezone! That's all!

Please do not hesitate to point out any inaccuracies or typos.
