BeautifulSoup4 memo

# list.html
<html>
  <head><title></title></head>
  <body>
    <a href="http://www.example.com/index.html" title="link title a">Example A</a>
    <a href="http://wwww.example.org/" title="link title b" target="_blank">Example B</a>
    <a href="http://www.example.net/" title="link title c">Example C</a>
  </body>
</html>
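
# Python
from bs4 import BeautifulSoup

# Name the parser explicitly (avoids the "no parser was explicitly specified" warning)
# and close the file once parsing is done.
with open("list.html") as f:
    soup = BeautifulSoup(f, "html.parser")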

link = soup.find("a")
print(link["title"])
# link title a
print(link["href"])
# http://www.example.com/index.html
print(link.string)
# Example A

link = soup.find("a", target="_blank")
print(link.string)
# Example B
print(link["title"])
# link title b
print(link["href"])
# http://wwww.example.org/

# Collect the title, URL, and text of every <a> element.
links = [
    {"title": x["title"], "url": x["href"], "content": x.string}
    for x in soup.find_all("a")
]
print(links)
# [{'title': 'link title a', 'url': 'http://www.example.com/index.html', 'content': 'Example A'},
#  {'title': 'link title b', 'url': 'http://wwww.example.org/', 'content': 'Example B'},
#  {'title': 'link title c', 'url': 'http://www.example.net/', 'content': 'Example C'}]
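
The list built above assumes every link has a title attribute and contains only plain text. A minimal sketch of a more forgiving version follows, assuming that missing attributes should come back as None. The helper name extract_links and the inline HTML string, including the second link with no title and a nested <b> tag, are made up for illustration and are not part of list.html.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second link has no title attribute and nested markup.
HTML = """
<html>
  <body>
    <a href="http://www.example.com/index.html" title="link title a">Example A</a>
    <a href="http://www.example.net/">Example <b>C</b></a>
  </body>
</html>
"""


def extract_links(markup):
    """Return one dict per <a> element, tolerating missing attributes."""
    soup = BeautifulSoup(markup, "html.parser")
    return [
        {
            # Tag.get() returns None instead of raising KeyError.
            "title": a.get("title"),
            "url": a.get("href"),
            # get_text() joins nested text; .string would be None for the second link.
            "content": a.get_text().strip(),
        }
        for a in soup.find_all("a")
    ]


links = extract_links(HTML)
for link in links:
    print(link)
```

Indexing with x["title"] is still a reasonable choice when a missing attribute should be treated as an error, since it raises KeyError immediately.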
