This article is the day 14 entry of the Docker Advent Calendar 2015.
Summary
This article is an excerpt from, and a more detailed explanation of, the part titled "I made a batch to acquire baseball data with Docker and parse-crontab!" that was shown in the second half of my PyCon JP 2015 talk session, [Baseball Hack! ~ Data analysis and visualization using Python](http://www.slideshare.net/shinyorke/hackpython-pyconjp "Baseball Hack! ~ Data analysis and visualization using Python").
Much of it reflects my own convenience and preferences, so opinions and suggestions are welcome m(_ _)m
In preparing this material, I referred to the following entry:
**Implementing scheduled execution processing in Python (GAUJIN.JP/Gojin)**
Before PyCon JP 2015, I needed baseball data for my [presentation at XP Festival 2015](http://www.slideshare.net/shinyorke/agile-baseball-science-52692504 "Agile Baseball Science - A talk on agile baseball that works well for the brain"), and that is how development began.
I was just preparing my presentation for PyCon JP 2015 at the time, so I asked on Twitter, "Please let me reprint it!", and received kind consent. Thank you.
The baseball code is a little complicated (& I don't want to explain it too much), so I made a sample.
python-crontab-docker-example (GitHub)
The recommended environment is Python 3.4.x or higher.
By the way, it works fine with Python 3.5.1, the latest version at the time of writing (2015/12/14)!
Define JobController, which executes the batch, and JobSettings, which manages the next execution time and the interval until that execution.
The argument of JobController.run() is the familiar crontab setting (* * * * *).
scheduler/job.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import time
import functools
import logging
from crontab import CronTab
from datetime import datetime, timedelta
import math

__author__ = 'Shinichi Nakagawa'


class JobController(object):
    """
    Job execution controller
    """

    @classmethod
    def run(cls, crontab):
        """
        Processing execution
        :param crontab: job schedule
        """
        def receive_func(job):
            @functools.wraps(job)
            def wrapper():
                job_settings = JobSettings(CronTab(crontab))
                logging.info("->- Process Start")
                while True:
                    try:
                        logging.info(
                            "-?- next running\tschedule:%s" %
                            job_settings.schedule().strftime("%Y-%m-%d %H:%M:%S")
                        )
                        time.sleep(job_settings.interval())
                        logging.info("->- Job Start")
                        job()
                        logging.info("-<- Job Done")
                    except KeyboardInterrupt:
                        break
                logging.info("-<- Process Done.")
            return wrapper
        return receive_func


class JobSettings(object):
    """
    Output settings
    """

    def __init__(self, crontab):
        """
        :param crontab: crontab.CronTab
        """
        self._crontab = crontab

    def schedule(self):
        """
        Next run
        :return: datetime
        """
        crontab = self._crontab
        return datetime.now() + timedelta(seconds=math.ceil(crontab.next()))

    def interval(self):
        """
        Time to next execution
        :return: seconds
        """
        crontab = self._crontab
        return math.ceil(crontab.next())
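The control flow of JobController.run() may be easier to follow in isolation: it is a decorator factory whose wrapper waits until the next scheduled time, calls the job, and repeats. The following is a minimal sketch of only that pattern, using the standard library. The names `run_on_schedule`, `seconds_until_next`, `sleep`, and `max_runs` are hypothetical and do not exist in the sample repository; a plain callable stands in for the CronTab object, and `max_runs` is there only so that the demonstration loop ends.

```python
import functools
import math
import time


def run_on_schedule(seconds_until_next, sleep=time.sleep, max_runs=None):
    """Decorator factory: repeatedly wait, then call the decorated job.

    seconds_until_next: hypothetical callable returning the delay (in
        seconds) until the next run, standing in for CronTab.next().
    sleep: injectable sleep function, so the loop can be exercised
        without actually waiting.
    max_runs: stop after this many runs (None means loop forever,
        as the sample does).
    """
    def receive_func(job):
        @functools.wraps(job)
        def wrapper():
            runs = 0
            while max_runs is None or runs < max_runs:
                # Round up so the job never starts before the scheduled time.
                sleep(math.ceil(seconds_until_next()))
                job()
                runs += 1
            return runs
        return wrapper
    return receive_func


# Usage: record the requested sleeps instead of really sleeping.
slept = []
calls = []


@run_on_schedule(lambda: 59.2, sleep=slept.append, max_runs=3)
def hello():
    calls.append("hello")


total_runs = hello()
```

Passing the sleep function in as a parameter is a design choice made for this sketch, so that the loop can be checked without waiting for a real schedule.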
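Import JobController, create a multiprocessing Pool for the jobs to be executed, and run them in parallel.
The sample announces the Tamori Club every Friday night and baseball time every day at 18:00 (Japan time).
One point to note, as the comments in the code repeat: the Docker image's timezone is UTC, so the crontab expressions are written in UTC.
That's about it.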
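Because the comments in batch.py stress that the Docker image runs in UTC, the schedules have to be converted from Japan time by hand. The following is a minimal sketch of that conversion using only the standard library. The helper `to_utc_cron` and the constant `JST` are hypothetical names for illustration, and a fixed +09:00 offset is assumed (Japan has no daylight saving time).

```python
from datetime import datetime, timedelta, timezone

# Assumption: Japan Standard Time as a fixed UTC+9 offset.
JST = timezone(timedelta(hours=9), "JST")


def to_utc_cron(local_dt, weekly=False):
    """Return a crontab expression in UTC for one aware local datetime."""
    utc_dt = local_dt.astimezone(timezone.utc)
    # cron counts days of the week from Sunday = 0; isoweekday() has Sunday = 7.
    day_of_week = str(utc_dt.isoweekday() % 7) if weekly else "*"
    return "{} {} * * {}".format(utc_dt.minute, utc_dt.hour, day_of_week)


# Every day at 18:00 JST (the date itself does not matter for a daily job).
daily = to_utc_cron(datetime(2015, 12, 14, 18, 0, tzinfo=JST))

# 00:20 JST on Saturday 2015-12-19, i.e. late on Friday night in Japan.
weekly = to_utc_cron(datetime(2015, 12, 19, 0, 20, tzinfo=JST), weekly=True)

print(daily)   # 0 9 * * *
print(weekly)  # 20 15 * * 5
```

The results match the schedules used in batch.py ("00 9 * * *" is the same as "0 9 * * *"): the weekly job's day of the week moves back to Friday because 00:20 on Saturday in Japan is still 15:20 on Friday in UTC.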
batch.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import logging
from multiprocessing import Pool
from scheduler.job import JobController

__author__ = 'Shinichi Nakagawa'


# Note that the Docker Image Timezone is UTC!
@JobController.run("20 15 * * 5")
def notice_tmr_club():
    """
    It's time for the Tamori Club(Tokyo)
    :return: None
    """
    logging.info("The Tamori Club will begin! !! !!")


# Note that the Docker Image Timezone is UTC!(I said it twice because it's important)
@JobController.run("00 9 * * *")
def notice_baseball():
    """
    Teach the time of Yakiu
    :return: None
    """
    logging.info("It's time to go! !! !! !!")


def main():
    """
    method to run crontab
    :return: None
    """
    # Log settings(Info level, format, timestamp)
    logging.basicConfig(
        level=logging.INFO,
        format="time:%(asctime)s.%(msecs)03d\tprocess:%(process)d" + "\tmessage:%(message)s",
        datefmt="%Y-%m-%d %H:%M:%S"
    )

    # Register the job you want to run with crontab
    jobs = [notice_tmr_club, notice_baseball]

    # multi process running
    p = Pool(len(jobs))
    try:
        for job in jobs:
            p.apply_async(job)
        p.close()
        p.join()
    except KeyboardInterrupt:
        logging.info("exit")


if __name__ == '__main__':
    main()
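One thing to be aware of with this structure: `apply_async` returns immediately, and if the result is never retrieved, an exception raised inside a job is not shown anywhere. The following is a minimal sketch of one way to surface such failures with the `error_callback` argument of `apply_async`. The names `ok_job`, `failing_job`, and `run_jobs` are hypothetical and used only for illustration; unlike the sample's jobs, these finish immediately.

```python
import logging
from multiprocessing import Pool


def ok_job():
    """A job that finishes normally."""
    return "done"


def failing_job():
    """A job that raises, to show how the failure is reported."""
    raise ValueError("job failed")


def run_jobs(jobs):
    """Run each job in its own worker process and collect any exceptions."""
    errors = []

    def on_error(exc):
        # Called in the parent process when a job raises.
        logging.error("job raised: %r", exc)
        errors.append(exc)

    with Pool(len(jobs)) as pool:
        for job in jobs:
            pool.apply_async(job, error_callback=on_error)
        # No more jobs will be submitted; wait for the workers to finish.
        pool.close()
        pool.join()
    return errors


if __name__ == "__main__":
    for error in run_jobs([ok_job, failing_job]):
        print(type(error).__name__, error)
```

With jobs that loop forever, as in the sample, the callback would fire only when a job actually stops with an exception, which is exactly the case that would otherwise go unnoticed.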
Dockerfile
This one is simple.
Include the code from GitHub, start it up, and you're done.
Dockerfile
# Python crontab sample
FROM python:3.5.1
MAINTAINER Shinichi Nakagawa <[email protected]>
# add to application
RUN mkdir /app
WORKDIR /app
ADD requirements.txt /app/
RUN pip install -r requirements.txt
ADD ./scheduler /app/scheduler/
ADD *.py /app/
docker-compose
The Docker startup settings go in docker-compose.yml.
All it does, though, is run batch.py.
docker-compose.yml
batch:
  build: .
  dockerfile: ./Dockerfile
  command: python batch.py
  container_name: python_crontab_example
If you run docker-compose up (or docker run) and it behaves like this, it's working.
$ docker-compose up
Creating python_crontab_example
Attaching to python_crontab_example
python_crontab_example | time:2015-12-13 13:45:09.463 process:9 message:->- Process Start
python_crontab_example | time:2015-12-13 13:45:09.464 process:8 message:->- Process Start
python_crontab_example | time:2015-12-13 13:45:09.465 process:9 message:-?- next running schedule:2015-12-18 15:20:00
python_crontab_example | time:2015-12-13 13:45:09.465 process:8 message:-?- next running schedule:2015-12-14 09:00:00
Thank you @tamai.