[Ruby] Use twitterscraper-ruby to search for tweets from several years ago by date


You can search for tweets with the Twitter Search API, but it only covers the last 7 days.

To get around this limitation, try the twitterscraper-ruby gem. twitterscraper-ruby scrapes Twitter directly, so you can search any date going back to the day Twitter launched.

Note 1

twitterscraper-ruby scrapes Twitter directly, but the Twitter Terms of Use prohibit scraping. Use it only to collect a small number of tweets for personal use.

Note 2

Twitter's search function does not cover every tweet, and Japanese tweets in particular sometimes seem to be missed. It is therefore not suitable for collecting a complete set of tweets.

Search tweets by date

Let’s compare the results when using twitterscraper-ruby and when using Twitter Search API.

When using twitterscraper-ruby

Let’s try to get up to 1000 tweets from January 1, 2020.

Gemfile


gem 'twitterscraper-ruby'

require 'twitterscraper'

# The gem can route requests through proxies to reduce the risk of an IP ban,
# but proxies make requests slower, so they are turned off here
client = Twitterscraper::Client.new(proxy: false)

tweets = client.query_tweets('Twitter', start_date: '2020-01-01', end_date: '2020-01-02', lang: 'ja', limit: 1000)

puts tweets.size
# => 1000

tweets.take(3).each { |t| puts t.created_at }
# => 2020-01-01 23:59:59 +0000
# => 2020-01-01 23:59:59 +0000
# => 2020-01-01 23:59:59 +0000

We got exactly 1000 tweets. With other limits I tried, the number of tweets returned seemed to be slightly over the limit.
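The single-day query above sets end_date to the day after start_date. To cover several days one at a time, the same pattern can be generated with Ruby's standard Date class. This is only a minimal sketch: daily_windows is a hypothetical helper, not part of twitterscraper-ruby, and it assumes that a one-day window is always written as start_date plus the following day, as in the example above.

```ruby
require 'date'

# Hypothetical helper (not part of twitterscraper-ruby): builds one
# { start_date:, end_date: } pair per day, mirroring the single-day
# query above, where end_date is the day after start_date.
def daily_windows(first_day, last_day)
  (Date.parse(first_day)..Date.parse(last_day)).map do |day|
    { start_date: day.to_s, end_date: (day + 1).to_s }
  end
end

windows = daily_windows('2020-01-01', '2020-01-03')
windows.each { |w| puts "#{w[:start_date]} -> #{w[:end_date]}" }
# => 2020-01-01 -> 2020-01-02
# => 2020-01-02 -> 2020-01-03
# => 2020-01-03 -> 2020-01-04

# Each window could then be passed to the client from the example above, e.g.
#   client.query_tweets('Twitter', **window, lang: 'ja', limit: 1000).first(1000)
# where .first(1000) trims any tweets returned beyond the limit.
```

Querying one day at a time keeps each request small, which fits the advice in Note 1 to collect only a small number of tweets.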

When using Twitter Search API (twitter gem)

I also tried the Twitter Search API for the same date and, as expected, it returned no tweets. This is because the Twitter Search API can only retrieve tweets from the last 7 days.

# https://github.com/sferik/twitter
require 'twitter'

# API credentials are omitted here; a real client must be configured with them
client = Twitter::REST::Client.new

tweets = client.search('Twitter since:2020-01-01 until:2020-01-02', count: 100)

puts tweets.size
# => 0
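Since the Search API only reaches back 7 days, a script could check the target date first and fall back to twitterscraper-ruby for anything older. The following is a rough sketch under that assumption: within_search_window? and SEARCH_API_WINDOW_DAYS are hypothetical names, and treating a date exactly 7 days old as still searchable is a guess, not documented behavior.

```ruby
require 'date'

# Assumed from the text above: the Search API only reaches back 7 days.
SEARCH_API_WINDOW_DAYS = 7

# Hypothetical helper: true when `date` is recent enough for the Search API.
# Whether a date exactly 7 days old still counts is an assumption.
def within_search_window?(date, today: Date.today)
  day = Date.parse(date.to_s)
  day <= today && (today - day) <= SEARCH_API_WINDOW_DAYS
end

today = Date.new(2020, 9, 1) # fixed date so the output is reproducible
puts within_search_window?('2020-01-01', today: today)
# => false
puts within_search_window?('2020-08-28', today: today)
# => true
```

For the January 1, 2020 query above, such a check would return false on any day after early January 2020, which matches the empty result from the Search API.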