While building a website for a weekend hackathon, I wanted to collect tweets that include the #Weekend Hackathon hashtag, so I fetched them with Python's tweepy module.
Register an application on Twitter Developers and obtain the consumer API keys (API key and API secret key) along with the access token and access token secret.
Tweepy makes the Twitter API convenient to work with, so install it.
$ pip install tweepy
The Twitter API requires OAuth authentication. Authenticate with the API key and tokens obtained from the developer site.
import tweepy

# Enter the obtained API KEY and TOKEN
API_KEY = ""
API_SECRET_KEY = ""
ACCESS_TOKEN = ""
ACCESS_TOKEN_SECRET = ""


def twitter_api() -> tweepy.API:
    auth = tweepy.OAuthHandler(API_KEY, API_SECRET_KEY)
    auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
    return tweepy.API(auth)
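The four credentials are written directly into the script above. As a minimal sketch of one alternative, the values could be read from environment variables so they stay out of the source file. The helper name `load_credentials` and the environment variable names are assumptions made for illustration; they are not part of tweepy or of the original script.

```python
import os

# Hypothetical environment variable names, chosen to match the constants above.
ENV_NAMES = ("API_KEY", "API_SECRET_KEY", "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")


def load_credentials(env=None) -> dict:
    """Return the four credentials, raising if any is missing or empty."""
    # Fall back to the real process environment when no mapping is passed in.
    env = os.environ if env is None else env
    missing = [name for name in ENV_NAMES if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing credentials: {', '.join(missing)}")
    return {name: env[name] for name in ENV_NAMES}
```

The returned values could then be passed to `tweepy.OAuthHandler` and `set_access_token` in place of the module-level constants.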
I wrote a get_search function to enable keyword search.
**Argument description**

| Argument | Description |
|---|---|
| api | OAuth-authenticated API client |
| q | Keywords to search for |
| start_date | Start of the search period |
| end_date | End of the search period |
| count | Number of tweets to fetch |
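To make the role of `q`, `start_date`, and `end_date` concrete, here is a small sketch of how they might be combined into a single search string. It mirrors the f-string used inside get_search; the helper name `build_query` is hypothetical and exists only for this illustration.

```python
def build_query(q: str, start_date: str, end_date: str) -> str:
    """Combine the keyword with a date range and exclude retweets."""
    return f"{q} since:{start_date} until:{end_date} -filter:retweets"


print(build_query("#Weekend hackathon", "2021-01-15", "2021-01-18"))
# #Weekend hackathon since:2021-01-15 until:2021-01-18 -filter:retweets
```

Dates are passed as `YYYY-MM-DD` strings, the same form used in the call to get_search later in this post.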
Because the data is easier to process that way, I put the results in a pandas DataFrame. tweet_created_at is returned in UTC, so 9 hours are added to convert it to Japan time.
from datetime import timedelta

import pandas as pd
import tweepy

# Enter the obtained API KEY and TOKEN
API_KEY = ""
API_SECRET_KEY = ""
ACCESS_TOKEN = ""
ACCESS_TOKEN_SECRET = ""


def twitter_api() -> tweepy.API:
    auth = tweepy.OAuthHandler(API_KEY, API_SECRET_KEY)
    auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
    return tweepy.API(auth)
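def get_search(
    api: tweepy.API, q: str, start_date: str, end_date: str, count: int = 1000
) -> pd.DataFrame:
    # Restrict the search to the given period and exclude retweets
    q = f"{q} since:{start_date} until:{end_date} -filter:retweets"
    # Note: Tweepy 4.x renamed API.search to API.search_tweets.
    # The standard search API returns at most 100 tweets per request,
    # so a larger count does not yield more results from a single call.
    tweets = api.search(
        q=q,
        count=count,
        tweet_mode="extended",
        locale="ja",
        lang="ja",
        include_entities=False,
    )
    columns = [
        "user_id",
        "user_name",
        "user_screen_name",
        "user_profile_image_url",
        "tweet_id",
        "tweet_full_text",
        "tweet_favorite_count",
        "tweet_created_at",
    ]
    # Collect the rows first and build the DataFrame once;
    # DataFrame.append was removed in pandas 2.0
    rows = [
        {
            "user_id": tweet.user.id,
            "user_name": tweet.user.name,
            "user_screen_name": tweet.user.screen_name,
            # Drop the "_normal" suffix to get the original-size image URL
            "user_profile_image_url": tweet.user.profile_image_url.replace(
                "_normal", ""
            ),
            "tweet_id": tweet.id,
            "tweet_full_text": tweet.full_text,
            "tweet_favorite_count": tweet.favorite_count,
            # created_at is in UTC; shift it to Japan time (UTC+9)
            "tweet_created_at": tweet.created_at + timedelta(hours=+9),
        }
        for tweet in tweets
    ]
    return pd.DataFrame(rows, columns=columns)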
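The script converts to Japan time by adding 9 hours to the timestamp. As an alternative sketch, the conversion can be done with timezone-aware datetimes from the standard library, so the result carries its own UTC offset. The helper name `to_jst` is hypothetical, and treating a naive input as UTC is an assumption of this sketch.

```python
from datetime import datetime, timedelta, timezone

# Japan Standard Time is a fixed offset of UTC+9 with no daylight saving.
JST = timezone(timedelta(hours=9), name="JST")


def to_jst(created_at: datetime) -> datetime:
    """Convert a UTC timestamp to Japan Standard Time."""
    # Assumption: a naive datetime is interpreted as UTC.
    if created_at.tzinfo is None:
        created_at = created_at.replace(tzinfo=timezone.utc)
    return created_at.astimezone(JST)


utc_time = datetime(2021, 1, 15, 15, 0, tzinfo=timezone.utc)
print(to_jst(utc_time).isoformat())  # 2021-01-16T00:00:00+09:00
```

Unlike a plain `timedelta` shift, the converted value still knows which zone it is in, which helps when it is later compared with other timestamps.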
With this, OAuth authentication and search are simple. The result is a DataFrame, so process it however you need; here, only the latest tweet from each user is kept.
api = twitter_api()
search = get_search(api, "#Weekend hackathon", "2021-01-15", "2021-01-18")
idxmax = search.groupby("user_id").tweet_created_at.idxmax()
tweets = search.loc[idxmax]
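To show what the `groupby` and `idxmax` step does, here is a small self-contained sketch on made-up data (the user IDs, tweet IDs, and timestamps are invented for illustration). `idxmax` returns, for each user, the index label of the row with the latest `tweet_created_at`; the sketch selects those rows with `.loc`, which looks rows up by index label.

```python
import pandas as pd

# Invented sample data: user 1 tweeted twice, user 2 once.
search = pd.DataFrame(
    {
        "user_id": [1, 1, 2],
        "tweet_id": [10, 11, 20],
        "tweet_created_at": pd.to_datetime(
            ["2021-01-15 10:00", "2021-01-16 09:00", "2021-01-15 12:00"]
        ),
    }
)

# Index label of the most recent tweet for each user
idxmax = search.groupby("user_id").tweet_created_at.idxmax()

# Keep only those rows
latest = search.loc[idxmax]
print(latest["tweet_id"].tolist())  # [11, 20]
```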