These are my study notes from working through O'Reilly Japan's "Data Visualization Beginning with Python and JavaScript".
Requests is a Python library that makes it easy to send HTTP requests and work with the responses.
pip install requests
pip install --upgrade ndg-httpsclient
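A quick way to confirm the installation is to import the library and print its version. This is only a minimal check, and it assumes the pip commands above installed into the same Python interpreter you run it with.

```python
import requests

# __version__ is a plain string; the exact value depends on your installation.
print(requests.__version__)
```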
Downloading a Wikipedia page (HTML and inline JavaScript)
>>> import requests
>>> response = requests.get("https://ja.wikipedia.org/wiki/Python")
>>>
>>> # List the attributes of the response object
>>> dir(response)
['__attrs__', '__bool__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__nonzero__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_content', '_content_consumed', '_next', 'apparent_encoding', 'close', 'connection', 'content', 'cookies', 'elapsed', 'encoding', 'headers', 'history', 'is_permanent_redirect', 'is_redirect', 'iter_content', 'iter_lines', 'json', 'links', 'next', 'ok', 'raise_for_status', 'raw', 'reason', 'request', 'status_code', 'text', 'url']
>>>
>>> # Get the HTTP status code from the response object
>>> response.status_code
200
>>>
>>> # The text attribute of the response object holds the HTML page, including inline JavaScript
>>> response.text
'<!DOCTYPE html>\n<html class="client-nojs" lang="ja" dir="ltr">\n<head>\n<meta charset="UTF-8"/>\n<title>Python - Wikipedia</title>\n<script>document.documentElement.className = document.documentElement.className.replace( /(^|\\s)client-nojs(\\s|$)/, "$1client-js$2" );</script>\n<script>(window.RLQ=window.RLQ||[]).push(function(){mw.config.set({"wgCanonicalNamespace":"","wgCanonicalSpecialPageName":false,"wgNamespaceNumber":0,"wgPageName":"Python","wgTitle":"Python","wgCurRevisionId":65321720,"wgRevisionId":65321720,"wgArticleId":993,"wgIsArticle":true,"wgIsRedirect":false,"wgAction":"view","wgUserName":null,"wgUserGroups":["*"],"wgCategories":["Programming language","Object-oriented language","Scripting language","Open Source","Python"],"wgBreakFrames
...
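The session above assumes the request succeeds. As a minimal sketch, the same download could be wrapped in a function that sets a timeout and turns HTTP error statuses into exceptions. The function name fetch_html and the 10-second default are my own illustrative choices, not something from the book; raise_for_status is one of the attributes listed by dir(response) above.

```python
import requests


def fetch_html(url, timeout=10.0):
    """Return the decoded body of `url`, raising on network or HTTP errors.

    fetch_html and the 10-second default are illustrative assumptions.
    """
    # Without a timeout, requests.get can wait indefinitely on a stalled server.
    response = requests.get(url, timeout=timeout)
    # Raises requests.exceptions.HTTPError for 4xx and 5xx status codes.
    response.raise_for_status()
    # .text decodes the raw bytes in .content using response.encoding.
    return response.text
```

Under these assumptions, fetch_html("https://ja.wikipedia.org/wiki/Python") should return the same kind of string as response.text above, while a missing page raises an exception instead of silently returning an error page.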
Fetching data in JSON format
>>> import requests
>>> response = requests.get("https://www.oreilly.co.jp/books/9784873118086/biblio.json")
>>>
>>> # Parse the response body as JSON
>>> data = response.json()
>>> data
{'title': 'Data visualization starting with Python and JavaScript', 'picture_large': 'http://www.oreilly.co.jp/books/images/picture_large978-4-87311-808-6.jpeg', 'picture': 'http://www.oreilly.co.jp/books/images/picture978-4-87311-808-6.gif', 'picture_small': 'http://www.oreilly.co.jp/books/images/picture_small978-4-87311-808-6.gif', 'authors': ['Kyran Dale\u3000 by', 'Takeshi Shimada\u3000 translated by', 'Tetsuya Kinoshita\u3000 translation'], 'released': '2017-08-25', 'pages': 500, 'price': 4104, 'ebook_price': 3283, 'original': 'Data Visulalization with Python and JavaScript', 'original_url': 'http://shop.oreilly.com/product/0636920037057.do', 'isbn': '978-4-87311-808-6'}
>>>
>>> # List the keys of the dictionary
>>> data.keys()
dict_keys(['title', 'picture_large', 'picture', 'picture_small', 'authors', 'released', 'pages', 'price', 'ebook_price', 'original', 'original_url', 'isbn'])
>>>
>>> # Get the title
>>> data["title"]
'Data visualization starting with Python and JavaScript'
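Because response.json() returns an ordinary dict, the result can be processed with plain Python. In the output above, the last authors entry appears to pack a name and a role into one string separated by U+3000 (an ideographic space). The sketch below splits such entries and builds a one-line summary. The helper names fetch_json, split_author, and summarize_book are hypothetical, and the assumption that every entry uses the same separator is mine.

```python
import requests


def fetch_json(url, timeout=10.0):
    """Fetch `url` and return its parsed JSON body (illustrative helper)."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    # response.json() raises a ValueError subclass if the body is not valid JSON.
    return response.json()


def split_author(entry):
    """Split a 'name<U+3000>role' string into (name, role).

    Assumes U+3000 is the separator; role is '' when it is absent.
    """
    name, _, role = entry.partition("\u3000")
    return name.strip(), role.strip()


def summarize_book(data):
    """Build a one-line summary from keys shown in the output above."""
    authors = ", ".join(split_author(a)[0] for a in data["authors"])
    return f'{data["title"]} ({data["released"]}, {data["pages"]} pages) by {authors}'
```

With these helpers, summarize_book(fetch_json(url)) would combine both steps for the biblio.json URL used above, assuming the endpoint still returns the same keys.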
Data visualization starting with Python and JavaScript: https://www.oreilly.co.jp/books/9784873118086/
Requests: HTTP for humans: http://requests-docs-ja.readthedocs.io/en/latest/user/quickstart/
Next time, I will study how to use data from a Web API.