Summary

I wrote a script to download stock price information from Stock Investment Memo without scraping.

How to use

python ./stockDownload.py -c 7203

This downloads the 2019 daily data for 7203 (Toyota Motor Corporation) as a CSV file. If the download succeeds, the script prints `Code: 7203 download finished.`; if it fails, it prints `Code: 7203 not valid.`.

Motivation

Scraping Yahoo! Finance is prohibited. A method for scraping stock price data from Stock Investment Memo has been published [^1], but if the page format changes, the parsing may stop working. The site does have a download button, however, so I looked into whether I could make use of it instead.

How the download is done

After pressing the download button, I inspected the request in the Network tab of the Chrome developer tools. The data appears to be sent as a POST to https://kabuoji3.com/stock/file.php.
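As a rough illustration of what the browser sends, the sketch below builds the same kind of form POST with the standard library and prints it without sending anything. The field names `code`, `year`, and `csv` are taken from the script in this post; the helper name `build_download_request` is hypothetical, and the short User-Agent string is only a placeholder.

```python
from urllib.parse import urlencode
from urllib.request import Request

URL = "https://kabuoji3.com/stock/file.php"


def build_download_request(code, year, user_agent):
    """Build (but do not send) the form POST for one stock code and year."""
    # Field names mirror the form data used by the script in this post.
    form = {"code": code, "year": year, "csv": ""}
    body = urlencode(form).encode("ascii")
    return Request(URL, data=body,
                   headers={"User-Agent": user_agent}, method="POST")


if __name__ == "__main__":
    req = build_download_request("7203", "2019", "Mozilla/5.0")
    # Show the method, target URL, and URL-encoded body for inspection.
    print(req.get_method(), req.full_url)
    print(req.data.decode("ascii"))
```

Comparing this output with the request shown in the Network tab is a quick way to check that the form fields match before running the real download.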

Parts that need adjustment

- `fullName` in the script is the save destination, so change it to suit your environment.
- Without the header, the request fails with a 403 error, so look up your User-Agent in the developer tools. [^2]
- `sleep(3)` is included to avoid putting excessive load on the server.
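One possible way to make the save destination easier to change is to build the path in a small helper. This is only a sketch: `build_save_path` is a hypothetical name that does not appear in the original script, and creating the directory automatically is an assumption about what is convenient.

```python
from pathlib import Path


def build_save_path(directory, file_name):
    """Return the full path for a downloaded file, creating the directory."""
    # expanduser() turns a leading "~" into the home directory;
    # open() on its own does not do this.
    target_dir = Path(directory).expanduser()
    target_dir.mkdir(parents=True, exist_ok=True)
    # Keep only the final path component so a server-supplied name
    # cannot point outside the target directory.
    return target_dir / Path(file_name).name
```

The returned path can be passed straight to `open(path, "wb")` in place of a hard-coded string.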

script

`stockDownload.py`

```python
#!/usr/bin/env python

import os
import re
from time import sleep

import click
import requests


@click.command()
@click.option("--code", "-c", "code", required=True,
              help="Stock code to download.")
def main(code):
    year = "2019"
    session = requests.Session()
    # Without a browser-like User-Agent the server responds with 403.
    headers = {
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.79 Safari/537.36"
    }
    # Form fields sent by the site's download button.
    data = {
        "code": code,
        "year": year,
        "csv": "",
    }
    url = "https://kabuoji3.com/stock/file.php"
    res = session.post(url, data=data, headers=headers)
    try:
        # A missing Content-Disposition header means no file was returned.
        contentDisposition = res.headers['Content-Disposition']
        fileName = re.findall(r'\"(.+?)\"', contentDisposition)[0]
        # Save destination; change as appropriate. expanduser() resolves "~".
        fullName = os.path.expanduser(
            "~/Documents/projects/ipo/data/stock/{}".format(fileName))
        with open(fullName, "wb") as saveFile:
            saveFile.write(res.content)
        print("Code: {} download finished.".format(code))
    except KeyError:
        print("Code: {} not valid.".format(code))
    # Pause to avoid putting excessive load on the server.
    sleep(3)


if __name__ == '__main__':
    main()
```
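The filename extraction step can be tried on its own. The sketch below is a slightly stricter variant than the script's regex: it looks specifically for a quoted `filename="..."` parameter and returns `None` when there is none. The function name and the sample header value are made up for illustration; the actual header returned by the site may look different.

```python
import re


def filename_from_content_disposition(header_value):
    """Return the quoted filename in a Content-Disposition value, or None."""
    # Non-greedy match so only the text between the first pair of quotes
    # after "filename=" is captured.
    match = re.search(r'filename="(.+?)"', header_value)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical header value, not copied from a real response.
    sample = 'attachment; filename="7203_2019.csv"'
    print(filename_from_content_disposition(sample))
```

Returning `None` instead of raising lets the caller decide how to report a response that carries no file.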

Thoughts

This was my first time building a CLI with click. I find it easier to read than sys.argv. To download many codes, all you need is a shell loop such as `cat code | while read line; do python ./stockDownload.py -c $line; done`. The downloaded files are cp932-encoded, so they need to be converted with a tool such as nkf.
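If nkf is not available, the encoding conversion can also be done in Python. The following is a minimal sketch, assuming the downloaded file is plain cp932 text; the function name `convert_cp932_to_utf8` is hypothetical.

```python
def convert_cp932_to_utf8(src, dst):
    """Re-encode a cp932 text file as UTF-8."""
    # newline="" leaves the original line endings untouched on read and write.
    with open(src, encoding="cp932", newline="") as fin:
        text = fin.read()
    with open(dst, "w", encoding="utf-8", newline="") as fout:
        fout.write(text)
```

Writing to a separate destination file keeps the original download intact in case the source turns out not to be cp932 after all.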

Reference material

- [Python] Pseudo-click the button with requests
- How to Write Python Command-Line Interfaces like a Pro

[^1]: [Python] Get stock price data by scraping
[^2]: [Python] What to do when scraping 403 Forbidden: You don't have permission to access on this server

Download Japanese stock price data with python