Getting a web page and a JSON file with Python's Requests library

Overview

I am working through O'Reilly Japan's "Data visualization starting with Python and JavaScript" and writing down what I learn along the way.

Retrieving web data using the requests library

Requests is a Python library that makes it easy to send HTTP requests and work with the responses.

Preparation

Install requests

pip install requests
pip install --upgrade ndg-httpsclient
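
To confirm that the installation worked, one quick check is to import the library and print its version. This is only a small sketch; the version string you see depends on your environment.

```python
import requests

# requests exposes its version as a plain string.
# If the import fails, the pip install above did not succeed.
print(requests.__version__)
```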

Examples of using the Requests library

Downloading a Wikipedia page (getting the HTML page and its inline JavaScript)

>>> import requests
>>> response = requests.get("https://ja.wikipedia.org/wiki/Python")
>>>
>>> # Get a list of the response object's attributes
>>> dir(response)
['__attrs__', '__bool__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__nonzero__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_content', '_content_consumed', '_next', 'apparent_encoding', 'close', 'connection', 'content', 'cookies', 'elapsed', 'encoding', 'headers', 'history', 'is_permanent_redirect', 'is_redirect', 'iter_content', 'iter_lines', 'json', 'links', 'next', 'ok', 'raise_for_status', 'raw', 'reason', 'request', 'status_code', 'text', 'url']
>>>
>>> # Get the HTTP status code from the response object
>>> response.status_code
200
>>>
>>> # The text attribute of the response object holds the HTML page, including its inline JavaScript
>>> response.text
'<!DOCTYPE html>\n<html class="client-nojs" lang="ja" dir="ltr">\n<head>\n<meta charset="UTF-8"/>\n<title>Python - Wikipedia</title>\n<script>document.documentElement.className = document.documentElement.className.replace( /(^|\\s)client-nojs(\\s|$)/, "$1client-js$2" );</script>\n<script>(window.RLQ=window.RLQ||[]).push(function(){mw.config.set({"wgCanonicalNamespace":"","wgCanonicalSpecialPageName":false,"wgNamespaceNumber":0,"wgPageName":"Python","wgTitle":"Python","wgCurRevisionId":65321720,"wgRevisionId":65321720,"wgArticleId":993,"wgIsArticle":true,"wgIsRedirect":false,"wgAction":"view","wgUserName":null,"wgUserGroups":["*"],"wgCategories":["Programming language","Object-oriented language","Scripting language","Open Source","Python"],"wgBreakFrames
...
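
The session above checks status_code by eye and reads response.text directly. As a minimal sketch, the same steps could be wrapped in a function that sets a timeout and lets raise_for_status() report HTTP errors. The names fetch_html, extract_title, and TitleParser, and the 10-second timeout, are assumptions made for this illustration and do not come from the book. The title is pulled out with the standard library's html.parser so that no extra package is needed.

```python
from html.parser import HTMLParser

import requests


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Return the <title> text of an HTML string, or '' if there is none."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def fetch_html(url, timeout=10.0):
    """Download a page and return its HTML as a string.

    raise_for_status() turns 4xx/5xx responses into requests.HTTPError,
    so the caller does not have to compare status_code by hand.
    The timeout value is an arbitrary choice for this sketch.
    """
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


# Offline demonstration on a shortened copy of the HTML shown above.
sample = "<!DOCTYPE html>\n<html>\n<head>\n<title>Python - Wikipedia</title>\n</head>\n</html>"
print(extract_title(sample))  # Python - Wikipedia
```

With network access, extract_title(fetch_html("https://ja.wikipedia.org/wiki/Python")) should return the same title that appears in the response.text output above.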

Getting JSON data

>>> import requests
>>> response = requests.get("https://www.oreilly.co.jp/books/9784873118086/biblio.json")
>>>
>>> # Parse the response body as JSON
>>> data = response.json()
>>> data
{'title': 'Data visualization starting with Python and JavaScript', 'picture_large': 'http://www.oreilly.co.jp/books/images/picture_large978-4-87311-808-6.jpeg', 'picture': 'http://www.oreilly.co.jp/books/images/picture978-4-87311-808-6.gif', 'picture_small': 'http://www.oreilly.co.jp/books/images/picture_small978-4-87311-808-6.gif', 'authors': ['Kyran Dale\u3000 by', 'Takeshi Shimada\u3000 translated by', 'Tetsuya Kinoshita\u3000 translation'], 'released': '2017-08-25', 'pages': 500, 'price': 4104, 'ebook_price': 3283, 'original': 'Data Visulalization with Python and JavaScript', 'original_url': 'http://shop.oreilly.com/product/0636920037057.do', 'isbn': '978-4-87311-808-6'}
>>> 
>>> # Get the keys
>>> data.keys()
dict_keys(['title', 'picture_large', 'picture', 'picture_small', 'authors', 'released', 'pages', 'price', 'ebook_price', 'original', 'original_url', 'isbn'])
>>> 
>>> # Get the title
>>> data["title"]
'Data visualization starting with Python and JavaScript'
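
The session above assumes that the request succeeds and that the body really is JSON. A hedged sketch of the same call with basic error handling follows. The function name fetch_json, the optional session parameter, and the 10-second timeout are assumptions introduced for this example. Network problems and HTTP errors are handled separately from JSON decoding errors, because response.json() raises an exception that is a ValueError when the body cannot be parsed.

```python
import requests


def fetch_json(url, timeout=10.0, session=None):
    """Fetch a URL and return the decoded JSON body, or None on failure.

    `session` is optional: pass a requests.Session to reuse one connection
    across several calls. Without it, the module-level requests.get is used.
    """
    http = session if session is not None else requests

    try:
        response = http.get(url, timeout=timeout)
        response.raise_for_status()  # 4xx/5xx -> requests.HTTPError
    except requests.RequestException as exc:
        print(f"request failed: {exc}")
        return None

    try:
        return response.json()
    except ValueError as exc:  # raised when the body is not valid JSON
        print(f"response was not valid JSON: {exc}")
        return None
```

With network access, data = fetch_json("https://www.oreilly.co.jp/books/9784873118086/biblio.json") should give the same dictionary as above, so data["title"] can be read once you have checked that data is not None.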

References

Data visualization starting with Python and JavaScript https://www.oreilly.co.jp/books/9784873118086/

Requests: HTTP for Humans http://requests-docs-ja.readthedocs.io/en/latest/user/quickstart/

Next time, I will study how to use data from a Web API.
