Out of the blue: I came across a bit of scraping that is about the next easiest thing after commands like ```ls```, and that can be done in about 3 minutes like a quick recipe, so I am writing it up as an article.
### Introduction
Unlike on a cooking show, no assistant is going to bring out "a PC with the environment already set up", so let's start from the environment setup.
By the way, my environment is macOS Catalina 10.15, which has a lot of bugs, and various apps have been crashing often lately.
Oh, and I will assume that Python 3 is already installed.
The modules to prepare this time are ```beautifulsoup``` and ```requests```.
If you don't have either of them, install it with ```pip3 install 〇〇``` (on PyPI, the Beautiful Soup package is named ```beautifulsoup4```).
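If you are not sure whether the two modules are already there, a small check like the following can tell you before you start. This is only a sketch: the helper name `missing_modules` is made up for this article, and it only checks whether the import names `bs4` and `requests` can be found.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be imported in this Python."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# 'bs4' is the import name of Beautiful Soup; 'requests' is imported as-is.
missing = missing_modules(['bs4', 'requests'])
if missing:
    print('Not installed yet:', ', '.join(missing))
else:
    print('All ingredients are ready.')
```

Anything listed as not installed is what you pass to ```pip3 install```.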
With that, the ingredients are more or less ready.
The 3 minutes start from here, so let's do our best.
### Next
Open a terminal and run ``` cd desktop``` (or wherever you want to save the file) to decide where to save it. My recommendation is to create a folder called ``` Python``` on your desktop.
Then create the Python file with ``` vim news.py```. I am only using vim here out of personal preference, so Atom or VS Code is fine too.
##### vim is good.
This time I'm scraping Yahoo! News, using `https://news.yahoo.co.jp/` as the URL.
We are going to print what is in the access ranking shown here. It's getting fun.
We will scrape the ``` yjnSub_list``` element selected in this screenshot.
Open the developer tools in Chrome and take a look.
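If you want to see what "picking the links out of the ```yjnSub_list``` list" means before touching the real site, here is an offline sketch. Note the assumptions: it uses only the standard library's `html.parser` instead of Beautiful Soup, the HTML fragment is made up to imitate the shape of the ranking, and the class name `RankingLinkParser` is my own invention.

```python
from html.parser import HTMLParser

# Made-up fragment that imitates the shape of the ranking list (an assumption).
SAMPLE_HTML = """
<ol class="yjnSub_list">
  <li><a href="https://example.com/articles/1">First</a></li>
  <li><a href="https://example.com/articles/2">Second</a></li>
</ol>
<ol class="other"><li><a href="https://example.com/ignored">x</a></li></ol>
"""


class RankingLinkParser(HTMLParser):
    """Collect href values of <a> tags inside <ol class="yjnSub_list">."""

    def __init__(self):
        super().__init__()
        self.inside_ranking = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get('class') or '').split()
        if tag == 'ol' and 'yjnSub_list' in classes:
            self.inside_ranking = True
        elif tag == 'a' and self.inside_ranking and attrs.get('href'):
            self.links.append(attrs['href'])

    def handle_endtag(self, tag):
        # Simplification: any closing </ol> ends the ranking (no nested lists).
        if tag == 'ol':
            self.inside_ranking = False


parser = RankingLinkParser()
parser.feed(SAMPLE_HTML)
print(parser.links)
```

Only the two links inside the ranking list are collected; the link in the other list is ignored.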
First of all, nothing works unless the modules are imported, so let's put the imports in.
news.py

```python
import requests
import bs4
```
Then assign things to sensibly named variables such as `url` and `soup`, write a function, and print the result. It's easy.
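news.py

```python
url = 'https://news.yahoo.co.jp'
html = requests.get(url)
soup = bs4.BeautifulSoup(html.text, 'html.parser')

# The access ranking is the <ol class="yjnSub_list"> element.
rank = soup.find('ol', class_='yjnSub_list')
# Take the link from each <li>; looping over the tag itself would also
# yield any text nodes between the items, which have no <a> to look up.
urls = [li.find('a').get('href') for li in rank.find_all('li')]

def get_title(url):
    # Fetch one article page and return the text of its heading block.
    html = requests.get(url)
    soup = bs4.BeautifulSoup(html.text, 'html.parser')
    return soup.find('div', class_='hd').text

titles = list(map(get_title, urls))
print('title'.join(titles))
```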
The `'title'` string in the last `print` line is only the separator placed between the results,
so you can rewrite it to whatever you like.
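To make the role of that string concrete, here is a tiny sketch with made-up titles (the two sample strings are not real output from the site). It shows the separator as written in the script, a plain newline, and a numbered variant.

```python
titles = ['First headline', 'Second headline']  # made-up sample values

# The separator from the script: the word 'title' is put between the items.
print('title'.join(titles))

# A newline separator puts each result on its own line.
print('\n'.join(titles))

# A numbered ranking, one entry per line.
numbered = '\n'.join(f'{i}. {t}' for i, t in enumerate(titles, start=1))
print(numbered)
```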
Once the code is written, run ```python3 news.py``` in the terminal.
You should be able to get the news and their delivery dates and times without any errors.
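One error that could show up here: if the `href` values in the ranking turn out to be relative paths, `requests.get` cannot fetch them directly. That is only an assumption on my part, since the page may well use absolute URLs already. A hedged sketch of a fix is to normalize every link with the standard library's `urljoin`; the helper name `to_absolute` and the example path are hypothetical.

```python
from urllib.parse import urljoin

BASE_URL = 'https://news.yahoo.co.jp'


def to_absolute(href, base=BASE_URL):
    """Turn a possibly relative href into an absolute URL."""
    return urljoin(base, href)


# '/articles/example' is a hypothetical path, used only for illustration.
print(to_absolute('/articles/example'))
# An href that is already absolute comes back unchanged.
print(to_absolute('https://news.yahoo.co.jp/articles/example'))
```

Both calls print the same absolute URL, so it is safe to apply the helper to every link before fetching it.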
If you get a module error, go back to the environment setup at the beginning, run ```pip3 uninstall 〇〇```, and then try installing the module again.
Thank you for your hard work.