Use BeautifulSoup to extract a link containing a string from an HTML file

-Extract the link including "mp4" from the HTML file. -Uses Beautiful Soup.


from BeautifulSoup import BeautifulSoup

open_name = raw_input('Open html file: ')
save_name = raw_input('Save file name: ')

f = open(open_name)
html = f.read()
f.close()

f2 = open(save_name, 'w')

soup = BeautifulSoup(html)

for link in soup.findAll("a"):
    if "mp4" in link.get("href"): # "mp4"Extract links containing
        f2.writelines(link.get('href') + '¥n')

f2.close()

I referred to Stack Overflow linked below. Please point out if there is another good way.

reference: python - how can I get href links from html code - Stack Overflow https://stackoverflow.com/questions/3075550/how-can-i-get-href-links-from-html-code/3075568#3075568

Recommended Posts

Use BeautifulSoup to extract a link containing a string from an HTML file
Python> Read from a multi-line string instead of a file> io.StringIO ()
How to get a list of links from a page from wikipedia
Read and use Python files from Python
Use BeautifulSoup to extract a link containing a string from an HTML file
Try to extract a character string from an image with Python3
How to use NUITKA-Utilities hinted-compilation to easily create an executable file from a Python script
Convert a string to an image
Outputs a line containing the specified character string from a text file
How to extract the desired character string from a line 4 commands
I want to extract an arbitrary URL from the character string of the html source with python
[Memo] How to use BeautifulSoup4 (1) Display html
# 5 [python3] Extract characters from a character string
How to create a function object from a string
[Python] Concatenate a List containing numbers and write it to an output file.
How to extract coefficients from a fractional formula
I tried to extract a line art from an image with Deep Learning
Extract lines containing a specific "string" in Pandas
I made a package to create an executable file from Hy source code
Convert a path string that uses a symbolic link in the middle to an absolute path
Parse a JSON string written to a file in Python
Steps from installing Django to displaying an html page
Backtrader How to import an indicator from another file
How to turn a .py file into an .exe file
Python script to create a JSON file from a CSV file
An introduction to machine learning from a simple perceptron
Use Boto3 to retrieve over 1000 Prefixes from S3's file list
How to generate a public key from an SSH private key
Garbled characters when printing an array containing a string containing utf-8
Python> Read from a multi-line string instead of a file> io.StringIO ()
How to use a file other than .fabricrc as a configuration file
[Python] How to output a pandas table to an excel file
Extract the value closest to a value from a Python list element
Convert HTML to text file
[Python] Use a string sequence
Upload a file to Dropbox
How to paste a CSV file into an Excel file using Pandas
Create an instance of a predefined class from a string in Python
How to make a string into an array or an array into a string in Python
How to get a string from a command line argument in python
How to use Visual Recognition to get LINE ID from a girl
I wrote a script to extract a web page link in Python
How to create a CSV dummy file containing Japanese using Faker
How to get a job as an engineer from your 30s
Find all patterns to extract a specific number from the set
DataFrame of pandas From creating a DataFrame from two lists to writing a file