If you are not interested in the code and just want to add the calendar, you can get it here. If you have a Google Account, you can add it immediately.
In this article, we aim to obtain information from the official Hinatazaka46 website and automatically generate the following calendar.
This has the following benefits.
- **Turn on Google Calendar notifications and you will never miss their activities**
- **Because you know the activity schedule in advance, you reduce the risk of missing an appearance because of a conflicting appointment**
Hinatazaka46 is one of the Sakamichi groups, and its motto is "Happy Aura". Many people are drawn to their **visuals**, **brightness**, and **attitude of working hard at anything**, and I am one of them.
The most reliable way to follow their activities is to check the "Schedule" page on the official website. I check it often myself.
However, I was personally dissatisfied with two things.
- **It is inefficient to see which activity falls on which day** (I cannot tell at a glance because I have to scroll to that day)
- **Entries are not necessarily listed in chronological order** (an "18:00~" entry may appear after a "22:00~" entry)
Also, adding the calendar provided by this site covers the major events, but it did not seem to cover the smaller ones (irregular activities without a fixed schedule).
So, to resolve these frustrations, I decided to **reflect their schedule in my own Google Calendar**.
You need to enable the Google Calendar API and obtain credentials. For the procedure, please refer to this article, which explains it clearly.
Also, if you want to run the script periodically, use something like cron or Heroku. I personally use Heroku because the script does not have to run on my local PC. I explained how to use Heroku on my Hatena blog before, so please refer to that if you like.
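The following four pieces of information are acquired.
- Category
- Time
- Event name
- Appearing members
Since there may be multiple events on the same day, the information is acquired in the following flow: fetch every date block for the month, collect all events for one date at once, then extract the details of each event.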
import time

import requests
from bs4 import BeautifulSoup


def search_event_each_date(year, month):
    # Fetch the schedule page for one month and return one element per date.
    url = (
        f"https://www.hinatazaka46.com/s/official/media/list?ima=0000&dy={year}{month}"
    )
    result = requests.get(url)
    soup = BeautifulSoup(result.content, features="lxml")
    events_each_date = soup.find_all("div", {"class": "p-schedule__list-group"})
    time.sleep(3)  # NOTE: Reduce the load on the server
    return events_each_date


def search_event_info(event_each_date):
    # Collect every event listed under a single date.
    event_date_text = remove_blank(event_each_date.contents[1].text)[
        :-1
    ]  # NOTE: Drop the trailing day-of-week character
    events_time = event_each_date.find_all("div", {"class": "c-schedule__time--list"})
    events_name = event_each_date.find_all("p", {"class": "c-schedule__text"})
    events_category = event_each_date.find_all("div", {"class": "p-schedule__head"},)
    events_link = event_each_date.find_all("li", {"class": "p-schedule__item"})
    return event_date_text, events_time, events_name, events_category, events_link


def search_detail_info(event_name, event_category, event_time, event_link):
    # Extract the text of one event and look up the members on its detail page.
    event_name_text = remove_blank(event_name.text)
    event_category_text = remove_blank(event_category.contents[1].text)
    event_time_text = remove_blank(event_time.text)
    event_link = event_link.find("a")["href"]
    active_members = search_active_member(event_link)
    return event_name_text, event_category_text, event_time_text, active_members


def search_active_member(link):
    try:
        url = f"https://www.hinatazaka46.com{link}"
        result = requests.get(url)
        soup = BeautifulSoup(result.content, features="lxml")
        active_members = soup.find("div", {"class": "c-article__tag"}).text
        time.sleep(3)  # NOTE: Reduce the load on the server
    except AttributeError:
        # soup.find() returned None: the detail page has no member tags.
        active_members = ""
    return active_members


def remove_blank(text):
    # Remove newlines and spaces from the scraped text.
    text = text.replace("\n", "")
    text = text.replace(" ", "")
    return text
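`remove_blank` only deletes newlines and ASCII spaces. As a minimal sketch, assuming the scraped text may also contain tabs or full-width spaces (U+3000), a hypothetical variant `remove_all_whitespace` (not part of the original script) could drop every whitespace character:

```python
def remove_all_whitespace(text: str) -> str:
    """Hypothetical alternative to remove_blank: drop every Unicode whitespace character."""
    # str.split() with no arguments splits on any run of whitespace,
    # including tabs, newlines, and the full-width space U+3000.
    return "".join(text.split())


sample = "\n  18:00\u3000~\t19:00 \n"
print(remove_all_whitespace(sample))  # 18:00~19:00
```

Whether this is needed depends on what the page actually contains; if only ASCII spaces and newlines appear, `remove_blank` is sufficient.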
**[Addition]** The version of 2020/10/14 could not correctly acquire events other than media-related ones, so I modified the code as follows. (The code above already reflects this change.)
(Before correction)
events_category = event_each_date.find_all(
    "div", {"class": "c-schedule__category category_media"}
)
event_category_text = remove_blank(event_category.text)
(After correction)
events_category = event_each_date.find_all("div", {"class": "p-schedule__head"},)
event_category_text = remove_blank(event_category.contents[1].text)
Now events like "Birthday" and "LIVE" can be correctly reflected in the calendar.
Time needs special care. Depending on the notation, there are cases such as the following, so I prepared functions to handle them.
- The time runs into the next day, as in "24:20~25:00"
- Only the date is given, with no time at all
import datetime


def over24Hdatetime(year, month, day, times):
    """
    Convert time over 24H to datetime
    """
    # times looks like "HH:MM:00"; the trailing seconds field is dropped.
    hour, minute = times.split(":")[:-1]
    # to minute
    minutes = int(hour) * 60 + int(minute)
    dt = datetime.datetime(year=int(year), month=int(month), day=int(day))
    dt += datetime.timedelta(minutes=minutes)
    return dt.strftime("%Y-%m-%dT%H:%M:%S")


def prepare_info_for_calendar(
    event_name_text, event_category_text, event_time_text, active_members
):
    # NOTE: year, month and event_date_text are globals set in the main block,
    # and search_start_and_end_time is defined in the full script below.
    event_title = f"({event_category_text}){event_name_text}"
    if event_time_text == "":
        # Only the date is known: register an all-day event.
        event_start = f"{year}-{month}-{event_date_text}"
        event_end = f"{year}-{month}-{event_date_text}"
        is_date = True
    else:
        start, end = search_start_and_end_time(event_time_text)
        event_start = over24Hdatetime(year, month, event_date_text, start)
        event_end = over24Hdatetime(year, month, event_date_text, end)
        is_date = False
    return event_title, event_start, event_end, is_date
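To make the over-24-hour handling concrete, here is a small self-contained sketch. It assumes the time text looks like `24:20~25:00` or `22:00~` with an ASCII tilde (the separator on the actual page may differ), and the helper names `split_range` and `to_iso` are illustrative, not the functions used in the script.

```python
import datetime
from typing import Tuple


def split_range(time_text: str) -> Tuple[str, str]:
    """Split 'HH:MM~HH:MM' or 'HH:MM~' into (start, end); an open end reuses the start."""
    start, _, end = time_text.partition("~")
    return start, end or start


def to_iso(year: int, month: int, day: int, hhmm: str) -> str:
    """Convert a broadcast-style time such as '25:00' into a local ISO 8601 timestamp."""
    hour, minute = (int(part) for part in hhmm.split(":"))
    # Adding a timedelta to midnight lets hours >= 24 roll over into the next day.
    midnight = datetime.datetime(year, month, day)
    moment = midnight + datetime.timedelta(hours=hour, minutes=minute)
    return moment.strftime("%Y-%m-%dT%H:%M:%S")


start, end = split_range("24:20~25:00")
print(to_iso(2020, 10, 14, start))  # 2020-10-15T00:20:00
print(to_iso(2020, 10, 14, end))    # 2020-10-15T01:00:00
```

Because the rollover is done with `timedelta`, month and year boundaries are handled as well: "24:20" on October 31 becomes 00:20 on November 1.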
The general procedure is as follows.
import os.path
import pickle

from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request


def build_calendar_api():
    SCOPES = ["https://www.googleapis.com/auth/calendar"]
    creds = None
    # Reuse the token saved by a previous run, if any.
    if os.path.exists("token.pickle"):
        with open("token.pickle", "rb") as token:
            creds = pickle.load(token)
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            # Refresh an expired token without asking the user again.
            creds.refresh(Request())
        else:
            # First run: open the browser-based consent flow.
            flow = InstalledAppFlow.from_client_secrets_file("credentials.json", SCOPES)
            creds = flow.run_local_server(port=0)
        with open("token.pickle", "wb") as token:
            pickle.dump(creds, token)
    service = build("calendar", "v3", credentials=creds)
    return service
Before adding an event, the script checks an "event name-start time" key to determine whether the event was already added. The search_events function builds the list of existing keys.
import datetime

from dateutil.relativedelta import relativedelta


def search_events(service, calendar_id, start):
    end_datetime = datetime.datetime.strptime(start, "%Y-%m-%d") + relativedelta(
        months=1
    )
    end = end_datetime.strftime("%Y-%m-%d")
    events_result = (
        service.events()
        .list(
            calendarId=calendar_id,
            timeMin=start + "T00:00:00+09:00",  # NOTE: The +09:00 offset is important (JST, not UTC)
            timeMax=end + "T23:59:00+09:00",  # NOTE: Search up to one month ahead
        )
        .execute()
    )
    events = events_result.get("items", [])
    if not events:
        return []
    else:
        events_starttime = change_event_starttime_to_jst(events)
        return [
            event["summary"] + "-" + event_starttime
            for event, event_starttime in zip(events, events_starttime)
        ]


def change_event_starttime_to_jst(events):
    events_starttime = []
    for event in events:
        if "date" in event["start"].keys():
            # All-day event: only a date is available.
            events_starttime.append(event["start"]["date"])
        else:
            # Timed event: strip the +09:00 offset to match the scraped format.
            str_event_uct_time = event["start"]["dateTime"]
            event_jst_time = datetime.datetime.strptime(
                str_event_uct_time, "%Y-%m-%dT%H:%M:%S+09:00"
            )
            str_event_jst_time = event_jst_time.strftime("%Y-%m-%dT%H:%M:%S")
            events_starttime.append(str_event_jst_time)
    return events_starttime


def add_date_schedule(
    event_name, event_category, event_time, event_link, previous_add_event_lists
):
    (
        event_name_text,
        event_category_text,
        event_time_text,
        active_members,
    ) = search_detail_info(event_name, event_category, event_time, event_link)
    # Preparation of information to be reflected in the calendar
    (event_title, event_start, event_end, is_date,) = prepare_info_for_calendar(
        event_name_text, event_category_text, event_time_text, active_members,
    )
    if (
        f"{event_title}-{event_start}" in previous_add_event_lists
    ):  # NOTE: Skip if the same event already exists
        pass
    else:
        # NOTE: calendarId is a global set in the main block.
        add_info_to_calendar(
            calendarId, event_title, event_start, event_end, active_members, is_date,
        )


def add_info_to_calendar(calendarId, summary, start, end, active_members, is_date):
    if is_date:
        event = {
            "summary": summary,
            "description": active_members,
            "start": {"date": start, "timeZone": "Japan",},
            "end": {"date": end, "timeZone": "Japan",},
        }
    else:
        event = {
            "summary": summary,
            "description": active_members,
            "start": {"dateTime": start, "timeZone": "Japan",},
            "end": {"dateTime": end, "timeZone": "Japan",},
        }
    # NOTE: service is a global created by build_calendar_api() in the main block.
    event = service.events().insert(calendarId=calendarId, body=event,).execute()
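The duplicate check boils down to comparing "summary-start" keys. Here is a minimal sketch of that idea, using plain dictionaries shaped like the `summary` and `start` fields of Calendar API events and a set for fast lookups. The sample events and the helper names `event_key` and `is_new` are made up for illustration.

```python
def event_key(summary: str, start: dict) -> str:
    """Build the 'summary-start' key from an event's summary and its 'start' field."""
    if "date" in start:
        # All-day events carry only a date.
        return f"{summary}-{start['date']}"
    # Timed events: keep the first 19 characters (YYYY-MM-DDTHH:MM:SS), dropping the UTC offset.
    return f"{summary}-{start['dateTime'][:19]}"


existing = [
    {"summary": "(TV)Sample show", "start": {"dateTime": "2020-10-15T00:20:00+09:00"}},
    {"summary": "(Birthday)Sample member", "start": {"date": "2020-10-20"}},
]
known_keys = {event_key(e["summary"], e["start"]) for e in existing}


def is_new(summary: str, start_text: str) -> bool:
    """Return True when no event with the same title and start is already known."""
    return f"{summary}-{start_text}" not in known_keys


print(is_new("(TV)Sample show", "2020-10-15T00:20:00"))  # False
print(is_new("(LIVE)Sample concert", "2020-10-24"))      # True
```

Dropping the offset by slicing assumes the calendar returns times in the same time zone the scraped times are written in (JST here), which is the same assumption the script makes.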
This time, the script reflects three months of schedule in Google Calendar, starting with the current month. The only thing you need to set is calendarId, the ID of your own calendar.
import time
import pickle
import os.path
import requests
from bs4 import BeautifulSoup
import datetime
from dateutil.relativedelta import relativedelta
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request


def build_calendar_api():
    SCOPES = ["https://www.googleapis.com/auth/calendar"]
    creds = None
    if os.path.exists("token.pickle"):
        with open("token.pickle", "rb") as token:
            creds = pickle.load(token)
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file("credentials.json", SCOPES)
            creds = flow.run_local_server(port=0)
        with open("token.pickle", "wb") as token:
            pickle.dump(creds, token)
    service = build("calendar", "v3", credentials=creds)
    return service


def remove_blank(text):
    text = text.replace("\n", "")
    text = text.replace(" ", "")
    return text


def search_event_each_date(year, month):
    url = (
        f"https://www.hinatazaka46.com/s/official/media/list?ima=0000&dy={year}{month}"
    )
    result = requests.get(url)
    soup = BeautifulSoup(result.content, features="lxml")
    events_each_date = soup.find_all("div", {"class": "p-schedule__list-group"})
    time.sleep(3)  # NOTE: Reduce the load on the server
    return events_each_date


def search_start_and_end_time(event_time_text):
    # "18:00~19:00" has an end time; "22:00~" does not.
    has_end = event_time_text[-1] != "~"
    if has_end:
        start, end = event_time_text.split("~")
    else:
        start = event_time_text.split("~")[0]
        end = start
    start += ":00"
    end += ":00"
    return start, end


def search_event_info(event_each_date):
    event_date_text = remove_blank(event_each_date.contents[1].text)[
        :-1
    ]  # NOTE: Drop the trailing day-of-week character
    events_time = event_each_date.find_all("div", {"class": "c-schedule__time--list"})
    events_name = event_each_date.find_all("p", {"class": "c-schedule__text"})
    events_category = event_each_date.find_all("div", {"class": "p-schedule__head"},)
    events_link = event_each_date.find_all("li", {"class": "p-schedule__item"})
    return event_date_text, events_time, events_name, events_category, events_link


def search_detail_info(event_name, event_category, event_time, event_link):
    event_name_text = remove_blank(event_name.text)
    event_category_text = remove_blank(event_category.contents[1].text)
    event_time_text = remove_blank(event_time.text)
    event_link = event_link.find("a")["href"]
    active_members = search_active_member(event_link)
    return event_name_text, event_category_text, event_time_text, active_members


def search_active_member(link):
    try:
        url = f"https://www.hinatazaka46.com{link}"
        result = requests.get(url)
        soup = BeautifulSoup(result.content, features="lxml")
        active_members = soup.find("div", {"class": "c-article__tag"}).text
        time.sleep(3)  # NOTE: Reduce the load on the server
    except AttributeError:
        active_members = ""
    return active_members


def over24Hdatetime(year, month, day, times):
    """
    Convert time over 24H to datetime
    """
    hour, minute = times.split(":")[:-1]
    # to minute
    minutes = int(hour) * 60 + int(minute)
    dt = datetime.datetime(year=int(year), month=int(month), day=int(day))
    dt += datetime.timedelta(minutes=minutes)
    return dt.strftime("%Y-%m-%dT%H:%M:%S")


def prepare_info_for_calendar(
    event_name_text, event_category_text, event_time_text, active_members
):
    event_title = f"({event_category_text}){event_name_text}"
    if event_time_text == "":
        event_start = f"{year}-{month}-{event_date_text}"
        event_end = f"{year}-{month}-{event_date_text}"
        is_date = True
    else:
        start, end = search_start_and_end_time(event_time_text)
        event_start = over24Hdatetime(year, month, event_date_text, start)
        event_end = over24Hdatetime(year, month, event_date_text, end)
        is_date = False
    return event_title, event_start, event_end, is_date


def change_event_starttime_to_jst(events):
    events_starttime = []
    for event in events:
        if "date" in event["start"].keys():
            events_starttime.append(event["start"]["date"])
        else:
            str_event_uct_time = event["start"]["dateTime"]
            event_jst_time = datetime.datetime.strptime(
                str_event_uct_time, "%Y-%m-%dT%H:%M:%S+09:00"
            )
            str_event_jst_time = event_jst_time.strftime("%Y-%m-%dT%H:%M:%S")
            events_starttime.append(str_event_jst_time)
    return events_starttime


def search_events(service, calendar_id, start):
    end_datetime = datetime.datetime.strptime(start, "%Y-%m-%d") + relativedelta(
        months=1
    )
    end = end_datetime.strftime("%Y-%m-%d")
    events_result = (
        service.events()
        .list(
            calendarId=calendar_id,
            timeMin=start + "T00:00:00+09:00",  # NOTE: The +09:00 offset is important (JST, not UTC)
            timeMax=end + "T23:59:00+09:00",  # NOTE: Search up to one month ahead
        )
        .execute()
    )
    events = events_result.get("items", [])
    if not events:
        return []
    else:
        events_starttime = change_event_starttime_to_jst(events)
        return [
            event["summary"] + "-" + event_starttime
            for event, event_starttime in zip(events, events_starttime)
        ]


def add_date_schedule(
    event_name, event_category, event_time, event_link, previous_add_event_lists
):
    (
        event_name_text,
        event_category_text,
        event_time_text,
        active_members,
    ) = search_detail_info(event_name, event_category, event_time, event_link)
    # Preparation of information to be reflected in the calendar
    (event_title, event_start, event_end, is_date,) = prepare_info_for_calendar(
        event_name_text, event_category_text, event_time_text, active_members,
    )
    if (
        f"{event_title}-{event_start}" in previous_add_event_lists
    ):  # NOTE: Skip if the same event already exists
        pass
    else:
        add_info_to_calendar(
            calendarId, event_title, event_start, event_end, active_members, is_date,
        )


def add_info_to_calendar(calendarId, summary, start, end, active_members, is_date):
    if is_date:
        event = {
            "summary": summary,
            "description": active_members,
            "start": {"date": start, "timeZone": "Japan",},
            "end": {"date": end, "timeZone": "Japan",},
        }
    else:
        event = {
            "summary": summary,
            "description": active_members,
            "start": {"dateTime": start, "timeZone": "Japan",},
            "end": {"dateTime": end, "timeZone": "Japan",},
        }
    event = service.events().insert(calendarId=calendarId, body=event,).execute()


if __name__ == "__main__":
    # -------------------------step1: various settings-------------------------
    # API settings
    calendarId = (
        "〜〜〜〜〜〜〜〜〜〜〜〜〜〜〜〜〜〜〜〜"  # NOTE: Your calendar ID
    )
    service = build_calendar_api()
    # Search range
    num_search_month = 3  # NOTE: Number of months to reflect, starting with the current month
    current_search_date = datetime.datetime.now()
    year = current_search_date.year
    # NOTE: Zero-pad the month (1 -> "01") so that date strings such as
    # "2021-01-05" match the format returned by the Calendar API.
    month = f"{current_search_date.month:02}"
    # -------------------------step2: Get information for each date-------------------------
    for _ in range(num_search_month):
        events_each_date = search_event_each_date(year, month)
        for event_each_date in events_each_date:
            # step3: Get all schedules for a specific day at once
            (
                event_date_text,
                events_time,
                events_name,
                events_category,
                events_link,
            ) = search_event_info(event_each_date)
            event_date_text = "{:0=2}".format(
                int(event_date_text)
            )  # NOTE: Zero-pad the day to 2 digits (e.g. 1 -> 01)
            start = f"{year}-{month}-{event_date_text}"
            previous_add_event_lists = search_events(service, calendarId, start)
            # step4: Add information to the calendar
            for event_name, event_category, event_time, event_link in zip(
                events_name, events_category, events_time, events_link
            ):
                add_date_schedule(
                    event_name,
                    event_category,
                    event_time,
                    event_link,
                    previous_add_event_lists,
                )
        # step5: Move on to the next month
        current_search_date = current_search_date + relativedelta(months=1)
        year = current_search_date.year
        month = f"{current_search_date.month:02}"
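The month loop can also be written without `dateutil`. Below is a sketch using only the standard library, where `iter_months` is a hypothetical helper that is not part of the script. It yields zero-padded month strings so that values such as `202101` or `2021-01` are formed consistently.

```python
import datetime
from typing import Iterator, Tuple


def iter_months(first: datetime.date, count: int) -> Iterator[Tuple[str, str]]:
    """Yield (year, month) strings for `count` consecutive months starting at `first`."""
    year, month = first.year, first.month
    for _ in range(count):
        yield f"{year:04}", f"{month:02}"
        # Advance one month, rolling over into January of the next year.
        month += 1
        if month > 12:
            year, month = year + 1, 1


for year, month in iter_months(datetime.date(2020, 11, 20), 3):
    print(f"{year}{month}")  # prints 202011, then 202012, then 202101
```

Tracking only the year and month also sidesteps end-of-month questions, such as what "one month after January 31" should mean.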
In this article, I introduced how to reflect the schedule of Hinatazaka46 in Google Calendar. This has the following benefits.
- **Turn on Google Calendar notifications and you will never miss their activities**
- **Because you know the activity schedule in advance, you reduce the risk of missing an appearance because of a conflicting appointment**
This time I focused on Hinatazaka46, but if you replace "(1) Scraping the necessary information from the official website", you can reuse "(2)" (adding the events to Google Calendar) and reflect anyone's schedule in Google Calendar.
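As a rough illustration of that separation, the calendar side only needs a title, a date, optional times, and a description, whatever site they were scraped from. The sketch below is an assumption about how one might structure this, not the script's actual code: `ScrapedEvent` and `to_calendar_body` are hypothetical names, and the time zone `Asia/Tokyo` is assumed. Because the Calendar API documents an event's end as exclusive, the all-day branch ends on the following day.

```python
import datetime
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScrapedEvent:
    """Site-independent event record that any scraper could produce."""

    title: str
    day: datetime.date
    start: Optional[datetime.time] = None  # None means an all-day event
    end: Optional[datetime.time] = None  # None means "same as start"
    description: str = ""


def to_calendar_body(event: ScrapedEvent, tz: str = "Asia/Tokyo") -> dict:
    """Build a request body in the shape used with events().insert()."""
    body = {"summary": event.title, "description": event.description}
    if event.start is None:
        # All-day event: dates only, with the exclusive end on the next day.
        next_day = event.day + datetime.timedelta(days=1)
        body["start"] = {"date": event.day.isoformat()}
        body["end"] = {"date": next_day.isoformat()}
    else:
        begin = datetime.datetime.combine(event.day, event.start)
        finish = datetime.datetime.combine(event.day, event.end or event.start)
        body["start"] = {"dateTime": begin.isoformat(), "timeZone": tz}
        body["end"] = {"dateTime": finish.isoformat(), "timeZone": tz}
    return body


print(to_calendar_body(ScrapedEvent("(LIVE)Sample concert", datetime.date(2020, 10, 24))))
```

With this split, supporting another site means writing a new scraper that returns `ScrapedEvent` objects, while the calendar code stays unchanged.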
━━━━━━━━━━
If you don't know Hinatazaka46, why not take this opportunity to get interested? Personally, I recommend **"Let's meet at Hinatazaka", broadcast on TV TOKYO from 25:05 every Sunday**. You will be amazed by, and drawn to, a talent for variety shows that you would not expect from idols. I also think the songs on the Hinatazaka46 OFFICIAL YouTube CHANNEL are a good place to start.
Also, as a complete digression, my recent favorite is Konoka Matsuda, who has a very nice smile. Isn't she great?
How to extract arbitrary events in Google Calendar with Python
Adding an event to Google Calendar in Python
[Python] Get / add Google Calendar appointments using Google Calendar API
━━━━━━━━━━
Hinatazaka46 Home Page
Hinatazaka46 OFFICIAL YouTube CHANNEL