Participated in AtCoder ABC151. Solve Problem D in C ++ / Python3 / PyPy3 with a method that is probably not optimal for this problem ([Warshall-Floyd Method](https://ja.wikipedia.org/wiki/Warshall-Floyd Method)) I measured and compared it.
The result was that C ++ was more stable and 60-70 times faster when the problem size increased to some extent. If you have any concerns about TLE, it seems better to stop submitting ~~ Implement ~~ in Python.
Later, when I received a comment and submitted it in PyPy3 (2.4.0), I found that TLE was resolved. However, the execution time was 1191 msec, which was considerably slower than C ++'s 71 msec.
So, if you are worried about TLE, it seems better to 1. Submit implementation in C ++ 2. Implement in Python PyPy.
Participated in AtCode ABC151. When I saw Problem D, I thought as follows.
"It can be solved by the method on the ant book (Warshall-Floyd method). But the nodes (mass) are connected only to the top, bottom, left and right, and the distance between adjacent nodes is always 1, so there is a more efficient method. so.
But O (N ^ 3) and N = 400, so 400 ^ 3 = 64,000,000, isn't it? It's a hassle to think of other methods, and I know that this will solve the problem, so let's use the Worshall-Floyd method. "
Implemented in Python without thinking too much. I debugged until the time limit (3 minutes ago) and submitted # 9476488 (20-01-12 22:37:01), but the result Is TLE. I was sad.
Later, I implemented it in C ++ and submitted # 9484896, and the execution time was 71msec, which was a margin. I want to refer to it in the future.
In the comment, "I did not get TLE when I put it out with PyPy 3. I got 3 WAs." Certainly, when I implemented it with PyPy, TLE disappeared and WA came out from what was not TLE. WA became AC after changing INF to 10 ** 3. (Submission # 9513468) However, the execution time was 1191msec, which was not TLE, but there was a big difference from 71msec in C ++.
In response to the above, I remeasured the speed with Python / PyPy with INF = 10 ** 3 on my machine.
** Speed ratio **
(Extended vertical axis)
The figure above is
--Run on ubuntu KVM node on your home NUC (CentOS). --Change the question size from 1 to 20. --Repeat one size 5 times each in C ++ / Python / PyPy. --Measure the elapsed execution time with the time command. --Calculate the average of 5 runs. --Plot the vertical axis as (average execution time in Python or PyPy) / (average execution time in C ++) and the horizontal axis as the problem size (H, W (H = W)).
It is the result obtained by.
C ++: In Python, the Python code is INF = 100. (If you submit this code to AtCoder with PyPy3, you will get a WA) In C ++: Python (# 2) and C ++: PyPy, the Python code is INF = 1000.
--It can be seen that there is no big difference in execution time between when INF = 100 and when INF = 1000. --In Python and C ++, when the problem size is large, the ratio of execution speed is 60 times or more. --In PyPy and C ++, when the problem size is large, it can be seen that the ratio of execution speed is about twice. --According to the results submitted by AtCoder, the execution time of C ++ is 71msec and the execution time of PyPy3 is 1191msec, which is more than 10 times the speed difference, but it is strange that the difference is only about 2 times on my server. Perhaps it's related to the PyPy version of AtCoder being 2.4.0, while the home server is 7.3.0.
myoden@ubuntu002:~/AtCoder/ABC151/D$ pypy3 -V
Python 3.6.9 (1608da62bfc7, Dec 23 2019, 10:50:04)
[PyPy 7.3.0 with GCC 7.3.1 20180303 (Red Hat 7.3.1-5)]
myoden@ubuntu002:~/AtCoder/ABC151/D$`
The problem case used will be described later.
** Elapsed execution time ** Elapsed execution time C ++
Elapsed execution time Python (INF = 100)
Elapsed execution time PyPy (INF = 1000)
The graph shows the maximum value (MAX), average value (AV), and minimum value (MIN) of the elapsed time when the test was performed 5 times.
The sample case when the problem size was changed was created by changing H (= W) from 1 to 20 with H = W as follows.
sampleSquare-10.in
myoden@ubuntu002:~/AtCoder/ABC151/D$ cat sampleSquare-10.in
10 10
.#.#.#.#.#
..........
.#.#.#.#.#
..........
.#.#.#.#.#
..........
.#.#.#.#.#
..........
.#.#.#.#.#
..........
sampleSquare-20.in
myoden@ubuntu002:~/AtCoder/ABC151/D$ cat sampleSquare-20.in
20 20
.#.#.#.#.#.#.#.#.#.#
....................
.#.#.#.#.#.#.#.#.#.#
....................
.#.#.#.#.#.#.#.#.#.#
....................
.#.#.#.#.#.#.#.#.#.#
....................
.#.#.#.#.#.#.#.#.#.#
....................
.#.#.#.#.#.#.#.#.#.#
....................
.#.#.#.#.#.#.#.#.#.#
....................
.#.#.#.#.#.#.#.#.#.#
....................
.#.#.#.#.#.#.#.#.#.#
....................
.#.#.#.#.#.#.#.#.#.#
....................
myoden@ubuntu002:~/AtCoder/ABC151/D$
Recommended Posts