I tried fetching RSS feeds with the feedparser library in Python 3. I'm a complete beginner at Python.
feedparser
I used a convenient library called feedparser. It's great: you just pass it a URL and it parses the feed into a nice dictionary-like structure.
I haven't read the latest documentation (it can't be viewed without building it locally), but older information says it supports all the major RSS/Atom formats.
Install it with pip:
pip3 install feedparser
It's easy to use:
import feedparser
url = 'https://gigazine.net/index.php?/news/rss_2.0/'
for entry in feedparser.parse(url).entries:
    print(entry.title)
Which star is the "largest star in the universe"?
Heavier blankets on the body may improve insomnia
I tried to eat Lotteria's popular menu together with W Patty, 4 kinds of cheese sauce, and a gorgeous custom "Lotteria Autumn 3 Big Fair" with a soft-boiled moon viewing
Former CEO of a global game maker establishes a new game company, and development of new games is in progress
Why did the conspiracy-theoretic group "QAnon" leave Reddit on the giant bulletin board?
What are Facebook CEOs thinking when faced with a number of issues, including employee strikes, advertiser boycotts, and antitrust investigations? Voice leaked
Headline News on September 24, 2020
"Forged profile creation manual" on Facebook and LinkedIn leaked from SNS monitoring company
Intel announces 10nm process 11th generation Core processor "Tiger Lake UP3" etc. for IoT edge
Released a solution implementation "AWS Perspective" that allows AWS to automatically create an architecture diagram
AMD announces Ryzen 3000C series of processors for Chromebooks jointly designed with Google
A movie that captures the phenomenon of a meteorite bouncing off the Earth's atmosphere like a skipping stone
Tesla sues U.S. government to eliminate tariffs on imported parts from China
Plans are underway for clinical trials of "artificial eyes" to restore lost vision
You can check the epidemic status of the new coronavirus on Google Maps
I tried Lawson "L Chiki Honey Maple Flavor" where the sweetness of shining honey & maple syrup stands out among the spices of L Chiki
Adobe announces "Liquid Mode", a function that makes it easy to read PDFs on smartphones and automatically adjusts them
California decides to ban new sales of gasoline cars
Wikipedia redesigns for the first time in 10 years
The release of the latest MCU movie "Black Widow" has been postponed, and 2020 will be "the first year without an MCU movie since 2009"
Google phased out paid extension distribution on Chrome Web Store
Twitter tests "voice message" feature
A type of woodpecker wages a "large-scale war" over several days, and some individuals watch the course of the war.
What is the cause of the mysterious phenomenon that the Internet suddenly disappears at 7 am for a year and a half?
Summary of what was found from the "Finsen Documents" that revealed the participation of major banks in money laundering and problems
Google's Android executive talks about "Android 11"
I bought and tried Matsuya's "shrimp chili sauce set meal", whose shrimp-laden chili sauce pairs perfectly with white rice.
179 dark web illicit traders arrested and officials say "the golden age of the dark web is over"
A fierce man who turned Minecraft into a NES emulator appears
How can a completely unknown artist earn views by abusing Spotify?
It's amazing.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

# URL definitions
urls = [
    'https://gigazine.net/index.php?/news/rss_2.0/',
    'https://japanese.engadget.com/rss.xml',
    'https://jp.techcrunch.com/feed/',
    'https://www.gizmodo.jp/index.xml',
]

from feedparser import parse
from datetime import datetime as dt
from webbrowser import open as browserOpen

# Date-parsing function: convert a time.struct_time into a datetime
def parseDate(dateData):
    return dt(
        dateData.tm_year,
        dateData.tm_mon,
        dateData.tm_mday,
        dateData.tm_hour,
        dateData.tm_min,
        dateData.tm_sec
    )

# Fetch & format everything with a list comprehension
# (.get() avoids a KeyError on feeds that lack one of the date fields)
entries = [
    {
        'title': entry['title'],
        'link': entry['link'],
        'date': parseDate(entry.get('updated_parsed') or entry.get('published_parsed'))
    }
    for url in urls
    for entry in parse(url).entries
]

# Sort by date, newest first
entries.sort(key=lambda x: x['date'], reverse=True)

savedEntries = []
for entry in entries:
    # Display the title,
    print(entry['title'])
    # then ask the user for input
    userAction = input()
    if userAction == 'q':
        # 'q' quits
        break
    elif userAction == 's':
        # 's' stores the entry for later
        savedEntries.append(entry)
        print('saved!')

# Open every saved entry in the browser
for savedEntry in savedEntries:
    browserOpen(savedEntry['link'])
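As an aside, since a time.struct_time is index-accessible, the date-parsing function can be written more compactly by unpacking its first six fields (year through second) straight into the datetime constructor:

```python
from datetime import datetime as dt
import time

def parseDate(dateData):
    # struct_time's first six fields are (year, month, day, hour, minute, second)
    return dt(*dateData[:6])

t = time.strptime('2020-09-24 12:34:56', '%Y-%m-%d %H:%M:%S')
print(parseDate(t))  # 2020-09-24 12:34:56
```

Both versions produce the same datetime; the six explicit keyword-free fields are just spelled out positionally.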
Since the script fetches every feed itself, it takes a while to run. Displaying progress with tqdm would be a nice touch. I haven't registered that many URLs this time, though, so I'll save that for another occasion.
If you use the Pocket API or similar, you could save entries to Pocket automatically.
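A minimal sketch of that idea, assuming you have Pocket API credentials (a consumer key and access token from Pocket's developer site; the values and the save_to_pocket name below are placeholders of mine, not part of the original script):

```python
import json
from urllib.request import Request, urlopen

# Placeholders -- obtain real values from Pocket's developer site
CONSUMER_KEY = 'your-consumer-key'
ACCESS_TOKEN = 'your-access-token'

def save_to_pocket(link, title=''):
    # Pocket's v3 "add" endpoint accepts a JSON body with the item URL
    payload = {
        'url': link,
        'title': title,
        'consumer_key': CONSUMER_KEY,
        'access_token': ACCESS_TOKEN,
    }
    req = Request(
        'https://getpocket.com/v3/add',
        data=json.dumps(payload).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
    )
    with urlopen(req) as res:
        return json.load(res)
```

Calling save_to_pocket(savedEntry['link'], savedEntry['title']) in the final loop would then replace the browser-opening step.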
Even so, Python is tricky, isn't it? The task itself was easy, but it can be quite a pain for someone used to TypeScript ...