In a comment on my post "Parameter tuning with luigi", I was told that data can be passed in memory with luigi.mock, so I tried it out. This is the story of that attempt. Judging from the name "mock", I guessed that it emulates file I/O in memory and would not be especially efficient, and that turned out to be the case.
The code used this time is as follows.
https://github.com/keisuke-yanagisawa/study/blob/20151208/luigi/mock_test.py
To check the mock version, run
python mock_test.py main --use mock
To run the version without mock, run
python mock_test.py main
As you can see there, the code creates a CSV of 10000000 "1"s separated by commas, reads it back, and counts the number of characters, so the final output is 19999999 (10000000 digits plus 9999999 commas). Building the array takes some time, but otherwise that is nearly all the code does. Even this simple workload produced a clear difference in the timings below.
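To make the arithmetic concrete, here is a minimal sketch of the same write, read, and count workload. It is not the author's mock_test.py: it swaps the luigi targets for a plain temporary file, and the function names (write_ones_csv, count_chars, run) are hypothetical names chosen for illustration.

```python
import os
import tempfile


def write_ones_csv(path, n):
    """Write n copies of "1", separated by commas, to path."""
    with open(path, "w") as f:
        f.write(",".join(["1"] * n))


def count_chars(path):
    """Read the file back and return its length in characters."""
    with open(path) as f:
        return len(f.read())


def run(n):
    # Hypothetical helper: a plain temporary file stands in for the
    # luigi target that the original script writes to and reads from.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ones.csv")
        write_ones_csv(path, n)
        return count_chars(path)


if __name__ == "__main__":
    # n digits plus n - 1 commas gives 2 * n - 1 characters.
    print(run(10))
```

With n = 10000000 the same formula gives 2 * 10000000 - 1 = 19999999, which matches the output described above.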
Let's get straight to the results. I ran each version three times and measured it with the time command.
Run | luigi.LocalTarget | luigi.mock.MockTarget |
---|---|---|
First | 10.952 sec. | 29.879 sec. |
Second | 7.829 sec. | 30.883 sec. |
Third | 11.137 sec. | 27.766 sec. |
Yes, the result is beyond dispute. Even for a mock, I did not expect it to be this slow. As the official documentation explains, it really does seem to be a mechanism meant for testing.
So for everyday use, let's just keep writing real files out to disk.