As of today (October 31, 2014), a Web API for accessing government statistics appears to have been released. http://www.e-stat.go.jp/api/
A list of the available data can be found at the URL below. It covers a lot, from the census to labor statistics. http://www.e-stat.go.jp/api/api-info/api-data/
It looked interesting, so here is a record of how I used it.
First, open this page and register as a user by entering your email address and name. http://www.e-stat.go.jp/api/regist-login/
A notification will arrive at your email address; click the link in it to activate your account.
Next, log in and get an application ID. It seems you can have up to 3 IDs per person. From here on, the application ID is written as xxx.
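As a small sketch of where the application ID ends up: the requests in this post pass it as the `appId` query parameter. The base URL and parameter names below are the ones used in this post; `build_url` is a helper name I made up for illustration, not part of the API, and the snippet is written for Python 3.

```python
from urllib.parse import urlencode

# Same base URL as used elsewhere in this post
BASE_URL = "http://api.e-stat.go.jp/rest/1.0/app"


def build_url(method, app_id, **params):
    """Build a request URL for one API method.

    build_url is a hypothetical helper, not something the API provides.
    """
    # appId goes first, then the remaining parameters in sorted order
    query = urlencode([("appId", app_id)] + sorted(params.items()))
    return "%s/%s?%s" % (BASE_URL, method, query)


# "xxx" stands in for your own application ID
url = build_url("getStatsList", "xxx", statsCode="00200521")
print(url)
```

Using `urlencode` rather than plain string formatting means values containing special characters get escaped properly.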
The script I used is as follows.
#!/usr/bin/env python
#-*- coding: utf-8 -*-
import httplib2
import lxml.etree
import pylab
import matplotlib.font_manager as fm
# Initial settings
h = httplib2.Http('.cache')
key = "xxx"
baseUrl = "http://api.e-stat.go.jp/rest/1.0/app"
statsCode = "00200521"
# Fetch the data ID of the first dataset (1980 Census)
# for government statistics code 00200521
print("getStatsList...")
cmd = "%s/getStatsList?appId=%s&statsCode=%s"
response, content = h.request(cmd % (baseUrl, key, statsCode))
xml = lxml.etree.fromstring(content)
dataid = xml.xpath('//LIST_INF')[0].attrib["id"]
# Fetch the actual data, using the data ID as the key
print("getStatsData...")
cmd = "%s/getStatsData?appId=%s&statsDataId=%s"
response, content = h.request(cmd % (baseUrl, key, dataid))
xml = lxml.etree.fromstring(content)
# Extract the category names
categories = {}
for c in xml.xpath("//CLASS_OBJ"):
    categories[c.attrib["id"]] = {"name": c.attrib["name"],
                                  "labels": {}}
    print(c.attrib["id"])
    for label in c.xpath("CLASS"):
        print("%s %s" % (label.attrib["name"], label.attrib["code"]))
        categories[c.attrib["id"]]["labels"][label.attrib["code"]] = label.attrib["name"]
# Extract the values
values = [{"cat01": v.attrib["cat01"],
           "cat02": v.attrib["cat02"],
           "cat03": v.attrib["cat03"],
           "area": v.attrib["area"],
           "value": int(v.text)}
          for v in xml.xpath('//VALUE')]
# Aggregate by age group (cat03)
c = categories["cat03"]
data = []
labels = []
# [1:] skips the first code (presumably the overall total)
for code in sorted(c["labels"].keys())[1:]:
    labels.append(c["labels"][code])
    data.append(sum([v["value"] for v in values if v["cat03"] == code]))
print(data)
# Plot
width = 0.5
x = pylab.arange(len(data))
prop = fm.FontProperties(fname='/Library/Fonts/Osaka.ttf')  # font path for Mac
pylab.barh(x, data, width)
# Pass the font so the Japanese labels render correctly
pylab.yticks(x + width / 2, labels, fontproperties=prop)
pylab.show()
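To try the category extraction and aggregation steps without calling the API, here is a rough offline sketch in Python 3 using only the standard library. The XML is a tiny sample I wrote by hand, not real e-Stat output; only the element and attribute names (`CLASS_OBJ`, `CLASS`, `VALUE`, `cat03`) follow the script above, and the `ROOT` wrapper, the codes and the numbers are made up.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hand-written sample, NOT real API output. Only the element and
# attribute names follow the script above; everything else is invented.
SAMPLE = """
<ROOT>
  <CLASS_OBJ id="cat03" name="age group">
    <CLASS code="000" name="total"/>
    <CLASS code="001" name="0-4"/>
    <CLASS code="002" name="5-9"/>
  </CLASS_OBJ>
  <VALUE cat03="000" area="A">30</VALUE>
  <VALUE cat03="001" area="A">10</VALUE>
  <VALUE cat03="001" area="B">5</VALUE>
  <VALUE cat03="002" area="A">15</VALUE>
</ROOT>
""".strip()

root = ET.fromstring(SAMPLE)

# Map each cat03 code to its label
labels = {c.get("code"): c.get("name")
          for c in root.findall(".//CLASS_OBJ[@id='cat03']/CLASS")}

# Sum the values per cat03 code in a single pass over the VALUE elements
totals = defaultdict(int)
for v in root.iter("VALUE"):
    totals[v.get("cat03")] += int(v.text)

# Skip the first code, as the script above does
for code in sorted(labels)[1:]:
    print(labels[code], totals[code])
```

The script above rescans the whole value list once per code; accumulating into a dictionary gives the same sums in one pass, which may matter for larger tables.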
Official manual http://www.e-stat.go.jp/api/wp/wp-content/uploads/2014/10/API-spec.pdf
Web form for trying out the API http://www.e-stat.go.jp/api/sample/testform/