I'm studying Python on my own. I don't understand all the details yet, but here is a note on a problem I stumbled over. For reference, I'm using Python 3.8.5.
I tried to extract the title tag.
code
# html_parser.py
import requests
from bs4 import BeautifulSoup
# URL to fetch
url = "http://example.com"
# Send an HTTP GET request to the URL and receive the HTML
response = requests.get(url)
# Decode the body using the encoding detected from its content
response.encoding = response.apparent_encoding
# Parse the HTML
bs = BeautifulSoup(response.text, 'html.parser')
title_tag = bs.find('title')
# Print the text inside the extracted tag
print(title_tag.text)
I got an ImportError from Beautiful Soup.
result
Traceback (most recent call last):
File "c:/python/html.py", line 3, in <module>
from bs4 import BeautifulSoup
File "C:\Users\*****\AppData\Local\Programs\Python\Python38\lib\site-packages\bs4\__init__.py", line 31, in <module>
from .builder import builder_registry, ParserRejectedMarkup
File "C:\Users\*****\AppData\Local\Programs\Python\Python38\lib\site-packages\bs4\builder\__init__.py", line 7, in <module>
from bs4.element import (
File "C:\Users\*****\AppData\Local\Programs\Python\Python38\lib\site-packages\bs4\element.py", line 19, in <module>
from bs4.formatter import (
File "C:\Users\*****\AppData\Local\Programs\Python\Python38\lib\site-packages\bs4\formatter.py", line 1, in <module>
from bs4.dammit import EntitySubstitution
File "C:\Users\*****\AppData\Local\Programs\Python\Python38\lib\site-packages\bs4\dammit.py", line 13, in <module>
from html.entities import codepoint2name
File "c:\python\html.py", line 3, in <module>
from bs4 import BeautifulSoup
ImportError: cannot import name 'BeautifulSoup' from partially initialized module 'bs4' (most likely due to a circular import) (C:\Users\*****\AppData\Local\Programs\Python\Python38\lib\site-packages\bs4\__init__.py)
Needless to say, I had already run the following:
python
pip install beautifulsoup4
pip list also confirms that beautifulsoup4 4.9.1 is installed.
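pip can belong to a different Python installation than the one that runs the script, so it may help to check the version from inside the interpreter as well. This is a minimal sketch, assuming Python 3.8 or later (where importlib.metadata is available). installed_version is a hypothetical helper name of my own, not part of any library.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if beautifulsoup4 is installed for this interpreter,
# or None if it is not.
print(installed_version("beautifulsoup4"))
```

If this prints a version, the package itself is installed for this interpreter, and the cause of the error has to be somewhere else.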
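So why does the import fail?
It seems that Python's standard library already has a package called "html", and Beautiful Soup imports it (from html.entities import codepoint2name in bs4\dammit.py). The traceback shows that my script was saved as c:\python\html.py at the time. Because the directory of the running script comes first on the module search path, that import loaded my own html.py instead of the standard library package. My script then ran from bs4 import BeautifulSoup a second time while bs4 was still only partially initialized, which is what the error message reports. Renaming the script so that it no longer matches a standard library module, for example to html_parser.py, avoids the problem.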
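One way to check for this kind of name clash is to ask Python where it would load a module from, and whether a script's file name matches a standard library module. The sketch below is only an illustration. module_origin, stdlib_module_names and shadows_stdlib are hypothetical helper names of my own, and the check assumes the standard library is installed as ordinary files on disk.

```python
import importlib.util
import pkgutil
import sys
import sysconfig
from pathlib import Path


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec else None


def stdlib_module_names():
    """Return the top-level module names found in the standard library directory."""
    stdlib_dir = sysconfig.get_paths()["stdlib"]
    names = {module.name for module in pkgutil.iter_modules([stdlib_dir])}
    # Built-in modules such as sys have no file on disk, so add them explicitly.
    names.update(sys.builtin_module_names)
    return names


def shadows_stdlib(script_path):
    """Return True if the script's file name matches a standard library module."""
    return Path(script_path).stem in stdlib_module_names()


# If this prints a path inside your own project, your file is hiding the real module.
print(module_origin("html"))

print(shadows_stdlib("c:/python/html.py"))         # → True
print(shadows_stdlib("c:/python/html_parser.py"))  # → False
```

Run from the same directory as the script, module_origin("html") shows at a glance whether the name resolves to the standard library or to a local file.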