Windows 10 / Anaconda3 (Jupyter Notebook)
A memorandum for my university graduation thesis. The theme is to build a classifier that separates news tweets that spread from those that do not. This post covers collecting the tweets for it.
Prerequisites:
・Twitter developer account approved
・Tweepy installed
https://qiita.com/i_am_miko/items/a2e5168e619ed37afeb9
The account to collect from is @livedoornews. I chose it because it stands out both in follower count and in how responsive those followers are (i.e., whether tweets actually get retweeted).
get_newstweet.ipynb
#Import the required libraries
import tweepy
import pandas as pd
get_newstweet.ipynb
#Consumer key and access token settings for using Twitter API
Consumer_key = "API key"
Consumer_secret = "API secret Key"
Access_token = "Access token"
Access_secret = "Access token secret"
#Authentication
auth = tweepy.OAuthHandler(Consumer_key,Consumer_secret)
auth.set_access_token(Access_token, Access_secret)
api = tweepy.API(auth)
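Before fetching anything, it is worth confirming that authentication actually succeeded. A minimal check, assuming Tweepy 3.x (where the exception class is tweepy.TweepError; 4.x renamed it):
#Optional sanity check: confirm the credentials work
try:
    me = api.verify_credentials()
    print("Authenticated as:", me.screen_name)
except tweepy.TweepError as e:
    print("Authentication failed:", e)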
get_newstweet.ipynb
#Specify account name
acount = "@livedoornews"
"""
Retrieved fields: tweet ID, time, tweet text, number of likes, number of RTs
"""
def get_tweets(acount):
    tweet_data = []  #Empty list to store the collected data
    for tweet in tweepy.Cursor(api.user_timeline, screen_name=acount, exclude_replies=True).items():
        tweet_data.append([tweet.id, tweet.created_at, tweet.text.replace('\n', ''), tweet.favorite_count, tweet.retweet_count])
    df = pd.DataFrame(tweet_data, columns=['tweet_no', 'time', 'text', 'favorite_count', 'RT_count'])  #Store in a pandas DataFrame
    return df

df = get_tweets(acount)
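Two caveats worth knowing here: the standard user_timeline endpoint only reaches back about 3,200 tweets per account, and iterating with Cursor over that many tweets can hit rate limits. Tweepy can sleep through rate limits automatically if the API object is built with wait_on_rate_limit (a minimal tweak to the authentication cell above):
#Have Tweepy wait automatically when a rate limit is hit,
#instead of raising an error in the middle of the Cursor loop
api = tweepy.API(auth, wait_on_rate_limit=True)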
If you want to keep collecting tweets with the above function over time, the results need to be saved and appended to. I therefore wrote two saving routines: one for saving a new file and one for appending to an existing file.
get_newstweet.ipynb
#Save as a new file
file_name = "../data/tweet_{}.csv".format(acount)
df.to_csv(file_name, index=False) #index is often not needed
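As a quick sanity check (my own addition, not part of the original notebook), the file can be read back to confirm that the rows and columns were written as expected:
#Optional: read the saved CSV back and inspect it
check_df = pd.read_csv(file_name)
print(check_df.shape)
print(check_df.columns.tolist())  #['tweet_no', 'time', 'text', 'favorite_count', 'RT_count']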
get_newstweet.ipynb
#Append save (merge with the previously saved tweets)
file_name = "../data/tweet_{}.csv".format(acount)
pre_df = pd.read_csv(file_name) #Load the previous csv
df = pd.concat([df, pre_df])
df = df.drop_duplicates(subset=['tweet_no']) #Drop duplicates by tweet ID (keeps the newer data, which comes first)
df.to_csv(file_name, index=False)
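Rather than keeping two separate cells, the two cases could be merged by branching on whether the CSV already exists. This is a sketch of my own variation using os.path.exists, not the notebook's actual code:
import os

file_name = "../data/tweet_{}.csv".format(acount)
if os.path.exists(file_name):
    #File already exists: merge with the previously saved tweets
    pre_df = pd.read_csv(file_name)
    df = pd.concat([df, pre_df])
    df = df.drop_duplicates(subset=['tweet_no'])  #Keeps the first occurrence, i.e. the newer rows
df.to_csv(file_name, index=False)  #Either way, write the result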
That's all for collecting and saving the tweets. The split between new-save and append-save could probably be handled more cleanly (one idea is sketched above). Next time, I plan to strip RTs and URLs from the tweet text.