I made a program that scrapes Holodule (ホロジュール), Hololive's streaming schedule, and displays its contents in a simple CLI.

Source code

GitHub

Precautions for use

This tool uses requests as an external library. If you already have pip installed, you can install it with pip install requests.

Also, this tool is unofficial and has no connection with Hololive. Do not overload the server by running it more often than necessary.

How to use

Running main.py in the repository displays the schedule in the terminal.

You can also add options at runtime; run with --help to see what each one does. For example:

--all       Show all schedules except Bilibili streams, including Holostars
--eng       Display member names in English
--tomorrow  Show tomorrow's schedule

These options can be combined in a single run.
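As a rough illustration of how such options could be wired up, here is a minimal sketch using the standard library argparse. It is not taken from the repository: build_parser is a hypothetical name, and it assumes the options are spelled --all, --eng and --tomorrow and are simple on/off flags.

```python
import argparse


def build_parser():
    # Hypothetical parser: the real main.py may define its options differently.
    parser = argparse.ArgumentParser(
        description="Show the Hololive streaming schedule")
    parser.add_argument(
        "--all", action="store_true",
        help="show all schedules except Bilibili streams, including Holostars")
    parser.add_argument(
        "--eng", action="store_true",
        help="display member names in English")
    parser.add_argument(
        "--tomorrow", action="store_true",
        help="show tomorrow's schedule")
    return parser


# Flags can be combined; here the argument list is passed explicitly
# instead of being read from the command line.
args = build_parser().parse_args(["--eng", "--tomorrow"])
print(args.all, args.eng, args.tomorrow)
```

Because every option is a store_true flag, any combination can be given in one run, and argparse generates the --help text automatically.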

Notes

Here are some notes on the points where I got stuck while making it.

First, I struggled with how to separate Hololive members from everyone else in the scraped data. I handled it as follows.

scraping.py

def delete_exception(time_list, stream_members_list, stream_url_list, is_all):
    EXCEPTION_LIST = {'Yogiri', 'Civia', 'SpadeEcho', 'Doris', 'Artia', 'Rosalyn'}
    if not is_all:
        # Slice to get only non-hololive members (e.g. holostars hololive-ID)
        EXCEPTION_LIST = EXCEPTION_LIST | set(get_member_list()[29:])
    for i in range(len(time_list)):
        if stream_members_list[i] in EXCEPTION_LIST:
            time_list[i] = None
            stream_members_list[i] = None
            stream_url_list[i] = None
    time_list = [i for i in time_list if i is not None]
    stream_members_list = [i for i in stream_members_list if i is not None]
    stream_url_list = [i for i in stream_url_list if i is not None]
    return time_list, stream_members_list, stream_url_list

I prepared sets of the Hololive members who stream on YouTube and of those who stream on Bilibili in advance. Depending on the options, the function builds the set of members to exclude, replaces every scraped element that belongs to it with None, and finally removes all the None entries at once with list comprehensions.
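The same filtering can also be written without the None placeholders. The following is a minimal sketch, not the repository's code: filter_streams is a hypothetical helper and the sample times, names and URLs are made up. It walks the three lists together with zip and keeps only the rows whose member is not in the excluded set.

```python
def filter_streams(times, members, urls, excluded):
    """Drop every entry whose member is in the excluded set."""
    kept = [
        (t, m, u)
        for t, m, u in zip(times, members, urls)
        if m not in excluded
    ]
    if not kept:
        return [], [], []
    # Transpose the kept rows back into three parallel lists.
    t_out, m_out, u_out = (list(col) for col in zip(*kept))
    return t_out, m_out, u_out


# Made-up sample data for illustration only.
times = ["19:00", "20:00", "21:00"]
members = ["Alice", "Civia", "Bob"]
urls = ["https://example.com/1", "https://example.com/2", "https://example.com/3"]
print(filter_streams(times, members, urls, {"Civia"}))
```

Keeping each time, member and URL together as one row means the three lists cannot drift out of step, and an entry that is legitimately None would not be dropped by accident.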

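By the way, you could implement this with a list as well, but if you don't need indices or ordering, a set is much faster for membership checks: in on a set is a hash lookup that takes roughly constant time on average, while in on a list scans the elements one by one.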
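To get a feel for the difference, here is a small timing sketch using the standard library timeit. The names are made up, and the absolute numbers depend on the machine and the collection size, so treat the output as a rough indication only.

```python
import timeit

# Made-up names; the real member list is much shorter.
names = [f"member{i}" for i in range(1000)]
as_list = names
as_set = set(names)
target = "member999"  # worst case for the list: the last element

# Time the same membership check against both containers.
list_time = timeit.timeit(lambda: target in as_list, number=2000)
set_time = timeit.timeit(lambda: target in as_set, number=2000)
print(f"list: {list_time:.4f}s  set: {set_time:.4f}s")
```

With only a few dozen members the gap is small in absolute terms, so the set is mainly a matter of choosing the data structure that matches the operation.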

Another problem was that some member names are in English and others in Japanese, and because half-width and full-width characters take up different widths, the columns did not line up neatly. To solve this, I used the standard library module unicodedata.
unicode

if unicodedata.east_asian_width(stream_members_list[i][0]) == 'W':
    m_space = ' ' * ((-2 * len(stream_members_list[i]) + 18))
else:
    m_space = ' ' * ((-1 * len(stream_members_list[i])) + 18)

unicodedata.east_asian_width takes a single character and returns 'W' when that character is a wide (full-width) character such as Japanese kana or kanji. The code checks the first character of each name, counts every character as two columns if it is wide and one column otherwise, and pads the name with spaces up to 18 columns so that the columns line up.
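A variation on this idea is to measure every character instead of only the first one, which also copes with names that mix half-width and full-width characters. This is a sketch under assumptions rather than the repository's code: display_width and pad_to are hypothetical helpers, and it assumes the terminal draws 'W' (wide) and 'F' (fullwidth) characters two columns wide.

```python
import unicodedata


def display_width(text):
    # Count wide ('W') and fullwidth ('F') characters as two columns,
    # everything else as one.
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


def pad_to(text, width=18):
    # Append spaces until the text occupies `width` terminal columns.
    return text + " " * max(0, width - display_width(text))


for name in ["ときのそら", "Sora", "Aあ"]:
    print(pad_to(name) + "|")
```

The max(0, ...) guard keeps the padding from going negative when a name is wider than the column.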

Finally

I'm glad that some people have already cloned it. I will continue to improve this repository.

