I think there are various methods, but it is the basic method in the basics.
Use Requests from a third party library. You can also get it with the urllib.request module of the standard library. With the standard library, only basic GET and POST requests. Adding HTTP headers and basic authentication is troublesome. So I use the library "Requests".
import requests
# GET
r = requests.get(URL)
#GET parameter
r = requests.get(URL, params={'key': 'val'})
# POST
p = requests.post(URL, data={'key': 'val'})
#Basic authentication
b = requests.get(URL, auth=('User ID', 'password'))
import sys
import requests
url = sys.argv[1]
r = requests.get(url)
#r.json()You can also get it with json
r.encoding = r.apparent_encoding
print(r.text)
import sys
import re
import requests
url = sys.argv[1]
r = requests.get(url)
#Assuming the charset is at the beginning, only the beginning is decoded as an ASCII string
scanned_text = r.content[:1024].decode('ascii', errors='replace')
#Extract the charset value with a regular expression from the decoded string
match = re.search(r'charset=["\']?([\w-]+)', scanned_text)
if match:
r.encoding = match.group(1)
else:
r.encoding = 'utf-8'
print(r.text)
Recommended Posts