On my second day of teaching myself Python, I tried some simple web scraping. I put this together with reference to a few sites and am posting it as a personal note. This time I extract two elements: **title** and **h2**.
- Install the requests module
command
pip install requests
- Install the Beautiful Soup module
command
pip install beautifulsoup4
code
import requests
from bs4 import BeautifulSoup
# URL of the page you want to fetch
url = "*********"
# Send an HTTP GET request to the URL and receive the response
response = requests.get(url)
# Decode the body using the encoding detected from its content
response.encoding = response.apparent_encoding
# Parse the HTML
bs = BeautifulSoup(response.text, 'html.parser')
# Extract the title
title_tag = bs.find('title')
print(title_tag.text)
# Extract all h2 elements
h2_tags = bs.select('h2')
for h2_tag in h2_tags:
    print(h2_tag.text)
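As a follow-up note, here is a minimal sketch of the same steps split into two functions, so the parsing part can be tried on an inline HTML string without hitting a real site. The names `fetch_html`, `extract_title_and_h2`, and `sample_html`, as well as the 10-second timeout, are my own assumptions for illustration and are not part of the script above. It also guards against a page that has no title, since `find` returns `None` in that case.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download a page and return its decoded HTML text (hypothetical helper)."""
    response = requests.get(url, timeout=timeout)
    # Raise an exception on 4xx/5xx instead of parsing an error page
    response.raise_for_status()
    # Use the encoding detected from the body, as in the script above
    response.encoding = response.apparent_encoding
    return response.text


def extract_title_and_h2(html):
    """Return (title text or None, list of h2 texts) for an HTML string."""
    bs = BeautifulSoup(html, "html.parser")
    title_tag = bs.find("title")
    # find() returns None when the page has no <title>
    title = title_tag.get_text(strip=True) if title_tag else None
    h2_texts = [h2.get_text(strip=True) for h2 in bs.select("h2")]
    return title, h2_texts


# Inline sample (made up for this note) so the sketch runs without network access
sample_html = """
<html><head><title>Sample page</title></head>
<body><h2>First</h2><p>text</p><h2> Second </h2></body></html>
"""
title, h2_texts = extract_title_and_h2(sample_html)
print(title)
for text in h2_texts:
    print(text)
```

To use it on a real page, pass the result of `fetch_html(url)` to `extract_title_and_h2`. Keeping the download and the parsing separate makes it easier to check the extraction logic on its own.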