Here are some tips to keep in mind when making your own small tools.

Scripts that you start writing as throwaway code often end up being used for a long time or in several places, so a little care at the start makes things easier later. Large-scale software development probably calls for a different style.

Make the main routine a function

The function name can be anything, but as a personal preference I define a function named main() and put the whole main routine there.

#!/usr/bin/env python

def main():
    print('hello, hello, hello!')

if __name__ == '__main__': main()

Many small Python samples write the executable code directly at the top level, outside any function. That works, of course, but the following two points can cause trouble later.

Global variables pile up and become unmanageable: If you start from "it's only a small script anyway", the code may be easy to follow at first, but you gradually lose track of which variable names are used where. Naming variables carelessly because the script is small makes it easy to reuse a name for a different purpose by accident and cause a bug.
The code runs when the file is imported: When you want to use a class defined in this file from somewhere else and load it with an `import` statement, everything outside a function gets executed. You can wrap those parts in a function afterwards, but then it is easy to also wrap something that has to run at module level (for example, the definition of a global variable used as a constant), and debugging that tends to waste a lot of time.

To prevent this, put all the processing, including its variables, inside main() and guard the call with the condition __name__ == '__main__', so that the main routine is not executed when the file is imported from elsewhere.
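
As a rough sketch of the layout described above: the constant stays at module level, the working variables live inside main(), and the guard keeps main() from running on import. The Greeter class and the GREETING constant are names I made up for illustration.

```python
#!/usr/bin/env python

# Module-level constant: defining it has no side effects, so it is safe on import.
GREETING = 'hello'

class Greeter:
    """Hypothetical class that another script might want to import."""

    def __init__(self, times):
        self.times = times

    def greet(self):
        # Repeat the greeting, separated by commas, and end with '!'.
        return ', '.join([GREETING] * self.times) + '!'

def main():
    # Working variables stay local to main() instead of becoming globals.
    greeter = Greeter(3)
    print(greeter.greet())

# Runs only when the file is executed as a script, not when it is imported.
if __name__ == '__main__': main()
```

Importing this file from another script makes Greeter available without printing anything.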

Manage the execution environment using virtualenv

$ pip install virtualenv
$ virtualenv venv
$ source venv/bin/activate

virtualenv is a well-known tool for isolating Python execution environments. Plenty of articles explain how to use it, so please refer to those. I try to use virtualenv whenever possible, even for small tools. The reasons are as follows.

I (occasionally) go back and forth between python2 and python3, so I want to be sure each project runs with the version chosen for it.
I want to know exactly which packages a project uses. Keeping them separate from the system-wide environment lets me save the list with pip freeze > requirements.txt, which makes it smooth to move to another environment (for example, from my PC to a server that runs an analysis all night).
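
If you want a script to report which interpreter it is running under, something like the following may help. This is only a sketch: the in_virtualenv() helper is my own name, and it assumes that venv and recent virtualenv releases expose sys.base_prefix while older virtualenv releases set sys.real_prefix.

```python
#!/usr/bin/env python

import sys

def in_virtualenv():
    # Assumption: venv and recent virtualenv set sys.base_prefix to the
    # interpreter the environment was created from; older virtualenv
    # releases set sys.real_prefix instead.
    base = getattr(sys, 'base_prefix', sys.prefix)
    return hasattr(sys, 'real_prefix') or base != sys.prefix

def main():
    version = '.'.join(map(str, sys.version_info[:3]))
    print('python %s' % version)
    print('virtualenv active: %s' % in_virtualenv())

if __name__ == '__main__': main()
```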

Manage parameters with argparse

When writing a throwaway tool or script, it is tempting to hard-code fixed values in variables, but where possible use `argparse` to take them from the command line.

#!/usr/bin/env python

import argparse

def main():
    # Create the parser
    psr = argparse.ArgumentParser()
    # Add the -w / --word option. The default is 'hello! '
    psr.add_argument('-w', '--word', default='hello! ')
    # Add the -s / --size option. The default is 5 and the type is int
    psr.add_argument('-s', '--size', default=5, type=int)
    # Parse the command line arguments into args. Exits if there is an error
    args = psr.parse_args()
    print(args.word * args.size)

if __name__ == '__main__': main()

It is certainly a little more work than writing plain variables, but when you need to tweak a parameter or switch the input data file, you can do it with command line arguments and never touch the code itself, which is very convenient for trial and error. If you give an option a default value, you don't have to specify it every time.

> python t3.py -s 6
hello! hello! hello! hello! hello! hello!
> python t3.py -s 2
hello! hello!
> python t3.py -w 'hoge '
hoge hoge hoge hoge hoge
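
One possible variation, sketched here with the same two options: let the parsing function take an optional argument list. parse_args() reads sys.argv[1:] when given None, so the script behaves the same from the command line, but you can also call it with an explicit list from an interactive session. The parse_args(argv) wrapper is my own addition, not part of the script above.

```python
#!/usr/bin/env python

import argparse

def parse_args(argv=None):
    psr = argparse.ArgumentParser()
    psr.add_argument('-w', '--word', default='hello! ')
    psr.add_argument('-s', '--size', default=5, type=int)
    # With argv=None, argparse falls back to sys.argv[1:].
    return psr.parse_args(argv)

def main(argv=None):
    args = parse_args(argv)
    print(args.word * args.size)

if __name__ == '__main__': main()
```

For example, calling main(['-s', '2']) runs the tool as if it had been started with -s 2.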

View progress with tqdm

When a for loop runs tens of thousands of iterations, you want to know how far it has got. I used to write code like `if i % 1000 == 0: print('now =', i)` to check, but using a library built for this is easier and gives richer information. Personally, I use a library called `tqdm`.

#!/usr/bin/env python

from tqdm import tqdm
import time

def main():
    for i in tqdm(range(100)):
        time.sleep(0.02)

if __name__ == '__main__': main()

With this, you can see how many items have been processed so far, how many are processed per second, and, if the total is known, what percentage is done. It is easy to use because you just wrap the iterator, like tqdm(iterator).
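
When the thing you loop over has no length (a generator, for example), tqdm cannot work out the total on its own, but you can pass it with the total argument. A minimal sketch follows. read_items() and process() are hypothetical stand-ins, and the try/except fallback is only there so the sketch still runs where tqdm is not installed.

```python
#!/usr/bin/env python

import time

try:
    from tqdm import tqdm
except ImportError:
    # Fallback for environments without tqdm: iterate with no progress bar.
    def tqdm(iterable, **kwargs):
        return iterable

def read_items(n):
    # Hypothetical generator standing in for a data source without len().
    for i in range(n):
        yield i

def process(n):
    done = 0
    # total= gives tqdm the expected count so it can show a percentage;
    # desc= puts a label in front of the bar.
    for item in tqdm(read_items(n), total=n, desc='processing'):
        time.sleep(0.001)
        done += 1
    return done

def main():
    process(100)

if __name__ == '__main__': main()
```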

Prepare a function to log

In a small script, progress and results usually go out through print, but sooner or later you want to record the time along the way, or write to a file in case the results scroll off standard output. Strictly speaking, using the logging module is probably the right approach, but for a script you write for yourself it can be quicker and good enough to write your own, so I often prepare a function like the following log().

`t5.py`


#!/usr/bin/env python

import datetime

def log(*args):
    msg = ' '.join(map(str, [datetime.datetime.now(), '>'] + list(args)))
    print(msg)
    with open('log.txt', 'at') as fd: fd.write(msg + '\n')

def main():
    log('generated', 5, 'items')

if __name__ == '__main__': main()

$ python t5.py
2017-09-04 12:39:38.894456 > generated 5 items
$ cat log.txt
2017-09-04 12:39:38.894456 > generated 5 items
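
For comparison, here is roughly what the same thing might look like with the standard logging module. This is a sketch, not a recommendation: the make_logger() helper and the logger name are hypothetical, and the timestamp format comes from logging's defaults, so it differs slightly from the log() output above.

```python
#!/usr/bin/env python

import logging
import sys

def make_logger(path='log.txt', name='small_tool'):
    # The helper and the logger name are hypothetical; use any names you like.
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        fmt = logging.Formatter('%(asctime)s > %(message)s')
        # One handler for standard output, one that appends to the file.
        for handler in (logging.StreamHandler(sys.stdout),
                        logging.FileHandler(path)):
            handler.setFormatter(fmt)
            logger.addHandler(handler)
    return logger

def main():
    logger = make_logger()
    logger.info('generated %d items', 5)

if __name__ == '__main__': main()
```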

In conclusion

The tips that take the form of code are collected in a gist.

If you know any other tricks along these lines, please let me know.

Tips for making small tools in python