[Ruby] Scraping Twitter to find the oldest Pien Tweet


Screenshot 2020-07-14 3.47.33.png

We have created the twitterscraper-ruby gem, which lets you scrape Twitter to get past tweets.

With the twitterscraper-ruby gem, you can easily find the “oldest tweet” containing a word, or “the first person to tweet a word”.

Announcement

Please feel free to contact @ts_3156 with requests and inquiries about analysis of social media data or web development with Ruby on Rails.

Why scrape now?

There are three main ways to get a large number of tweets: Twitter Search API (free version), Twitter Search API (paid version), and Twitter scraping.

Twitter Search API (free version)

This is probably the method most people use to get tweets. Since it is an API provided by Twitter, you can use it with confidence, but it has strict rate limits, and its biggest disadvantage is that it only returns tweets from the last 7 days. As a result, all you can get is a small number of recent tweets.
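To make the 7-day restriction concrete, here is a minimal Ruby sketch that checks whether a tweet date would still be reachable through the free search. The method name `reachable_by_free_search?` and the constant are hypothetical names made up for this illustration, and the exact boundary of the real API window may differ slightly.

```ruby
require 'date'

# Assumption for illustration: the free search covers roughly the last 7 days.
FREE_SEARCH_WINDOW_DAYS = 7

# Returns true when tweet_date is no more than 7 days before today.
# (Hypothetical helper, not part of any Twitter library.)
def reachable_by_free_search?(tweet_date, today: Date.today)
  age_in_days = (today - tweet_date).to_i
  age_in_days.between?(0, FREE_SEARCH_WINDOW_DAYS)
end

# A tweet from 2008 is far outside the window.
puts reachable_by_free_search?(Date.new(2008, 5, 22))
```

Any tweet from 2008 fails this check, which is why the free API cannot be used for the searches in this article.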

Twitter Search API (paid version)

With this API, you can search all past tweets, but it costs millions of yen per month. It also has rate limits, but they are relatively loose and will not get in your way when collecting tweets.

  • There are several kinds of paid API, and their exact names differ.

Twitter scraping

This is the method used by the twitterscraper-ruby gem. You can get a large number of tweets at high speed without worrying about the rate limits or the restricted search period, which are the disadvantages of the Twitter Search API. However, scraping is expressly prohibited by the Terms of Use, so do it at your own risk.

In this article, we do the research in a way that puts as little load on Twitter as possible.

Find out the first person to tweet “Pien”

I looked for the first person to tweet the popular word “Pien”. Surprisingly, the word has a long history: I found a tweet from May 22, 2008 that uses it with almost the same meaning as the current Pien.

Click here for the URL of the first “Pien” tweet

Screenshot 2020-07-14 3.15.52.png

You can get the first “Pien” tweet by executing the following command after installing twitterscraper-ruby.

$ twitterscraper --query 'Pien' --start_date 2008-03-21 --end_date 2009-03-21 --lang ja --limit 10 --proxy --threads 10
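If you do not yet know which year to look in, one option is to search one date window at a time, starting from the earliest date and moving forward. The following is a minimal Ruby sketch that only builds the command lines for consecutive one-year windows. The method name `window_commands` is hypothetical, and only flags that already appear in this article are used. It does not run the commands, so you can review them and execute them one by one.

```ruby
require 'date'
require 'shellwords'

# Build twitterscraper command lines for consecutive date windows.
# Hypothetical helper for illustration; it only generates strings.
def window_commands(query, first_day, last_day, months_per_window: 12, limit: 10)
  commands = []
  start = first_day
  while start < last_day
    # Date#>> moves forward by whole months; clamp to the final day.
    stop = [start >> months_per_window, last_day].min
    commands << Shellwords.join([
      'twitterscraper',
      '--query', query,
      '--start_date', start.iso8601,
      '--end_date', stop.iso8601,
      '--limit', limit.to_s
    ])
    start = stop
  end
  commands
end

puts window_commands('Pien', Date.new(2006, 3, 21), Date.new(2009, 3, 21))
```

Running the windows in order and stopping at the first one that returns results keeps the number of requests small.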

By the way, if you include uses with a different meaning from the current Pien, the oldest tweet containing Pien was posted on January 24, 2008. There are dishes such as Pien Lo (Pien Lo Nabe) and Pien Porridge (Yapienjo), and the word was tweeted in that sense.

Click here for the original tweet URL of Pien Lo

Screenshot 2020-07-14 3.24.35.png

Click here for the original tweet URL of Pien Porridge

Screenshot 2020-07-14 3.25.31.png

Find out the first person to tweet “Reiwa” as an era name

I think many of you know this prophetic tweet, because it went viral in the spring of 2019.

“Reiwa” was first tweeted as the name of the era following Heisei on July 13, 2016. You can easily find this tweet using the twitterscraper-ruby gem.

Click here for the URL of the Reiwa Prophecy Tweet

Screenshot of the Reiwa prophecy tweet

If you execute the following command after installing twitterscraper-ruby, you can get the Reiwa prophecy tweet.

$ twitterscraper --query 'Reiwa' --start_date 2016-07-13 --end_date 2016-07-14 --limit 10

By the way, if you look for “Reiwa as a character string” instead of “Reiwa as an era name”, many people tweeted it earlier. It seems that the same characters just happen to appear in that order in Chinese.

Click here for the URL of the tweet that happened to be written as Reiwa

Screenshot 2020-07-14 1.52.52.png

Find the oldest tweets you can get

With the official Twitter Search, you can get tweets going back as far as “2006-03-21”.

As a test, I tried to get the oldest tweet. The oldest tweet turned out to be “just setting up my twttr”, posted on March 22, 2006.

Click here for the URL of the oldest tweet

Screenshot 2020-07-14 2.41.08.png

You can get the oldest tweet by executing the following command after installing twitterscraper-ruby.

$ twitterscraper --query 'just' --start_date 2006-03-21 --end_date 2006-03-22 --limit 10
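When a search returns several tweets, you still have to pick the earliest one. The following is a minimal Ruby sketch, assuming the results have been saved as a JSON array in which each tweet has `created_at` and `text` fields. Those field names, the method name `oldest_tweet`, and the sample data are assumptions made for this illustration, so adjust them to the output you actually get.

```ruby
require 'json'
require 'time'

# Return the tweet with the earliest timestamp from a JSON array.
# Assumes each element has a parseable 'created_at' string (hypothetical format).
def oldest_tweet(json_text)
  tweets = JSON.parse(json_text)
  tweets.min_by { |tweet| Time.parse(tweet.fetch('created_at')) }
end

# Made-up sample data, not real search results.
sample = <<~JSON
  [
    {"created_at": "2020-01-02 10:00:00 UTC", "text": "newer example"},
    {"created_at": "2020-01-01 09:30:00 UTC", "text": "older example"}
  ]
JSON

puts oldest_tweet(sample).fetch('text')
```

Using `fetch` instead of `[]` makes the script fail loudly if a tweet lacks the expected field, which helps when the real output format differs from the assumption.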

https://github.com/ts-3156/twitterscraper-ruby