Here's how to use Tweepy, a Python library, to collect a large number of your tweets.
- Python 3 is used (if you are on Python 2, please adapt the code as best you can).
- Tweepy is assumed to be already installed.
- Registration for the Twitter API is assumed to be already complete.
Tweet_data.py
# -*- coding: utf-8 -*-
import tweepy

# Tweepy settings: replace the placeholders with your own credentials
CONSUMER_KEY = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
CONSUMER_SECRET = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
ACCESS_TOKEN = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
ACCESS_SECRET = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)
api = tweepy.API(auth)

i = 1  # Counter for the tweets written so far
with open("Tweet_data.txt", "a+") as tf:
    for status in tweepy.Cursor(api.user_timeline).items():
        try:
            status = str(status.text).replace("\n", "")  # Remove line breaks in the tweet
            if "RT" in status:  # Do not write retweets
                pass
            elif "https" in status:  # Do not write tweets with images or URLs
                pass
            elif "@" in status:  # For a reply, remove the ID and write the rest
                status = status[status.find(" ") + 1:]  # Keep only the part after the first space following "@ID"
                tf.write(status + "\n")
                print("Step%d: " % (i) + status)  # Show the tweet written to the txt file
                i += 1
            else:
                tf.write(status + "\n")
                print("Step%d: " % (i) + status)  # Show the tweet written to the txt file
                i += 1
        except UnicodeEncodeError:  # A UnicodeEncodeError sometimes appears while running; skip the tweet and continue
            pass
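The filtering rules in the loop can also be pulled out into a small function, which makes them easy to try without calling the API. The following is only a sketch under some assumptions: the name `clean_tweet` is hypothetical, and it is slightly stricter than the script, treating a tweet as a retweet only when it starts with "RT @" and as a reply only when it starts with "@".

```python
from typing import Optional


def clean_tweet(text: str) -> Optional[str]:
    """Return the text to save, or None if the tweet should be skipped.

    Hypothetical helper, not part of Tweepy.
    """
    text = text.replace("\n", "")  # remove line breaks, as in the script
    if text.startswith("RT @"):  # assumption: retweets start with "RT @"
        return None
    if "https" in text:  # skip tweets with images or URLs
        return None
    if text.startswith("@"):  # reply: drop everything up to the first space
        return text[text.find(" ") + 1:]
    return text


print(clean_tweet("@JUN_NETWORKS thanks a lot"))
```

In Tweet_data.py, the body of the loop would then reduce to calling this helper on `status.text` and writing the result whenever it is not None.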
When Tweet_data.py is executed, it keeps writing to the txt file until it reaches your very first tweet (or until the API stops returning older tweets). Once enough tweets have been collected, interrupt it with Ctrl + C; the tweets written so far will be saved in the txt file.
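Instead of interrupting by hand, the number of tweets can be capped up front. Tweepy's Cursor accepts a limit directly, as in `tweepy.Cursor(api.user_timeline).items(500)`. The sketch below shows the same idea with `itertools.islice` on stand-in objects so that it runs without network access; `take_texts` and `fake_timeline` are hypothetical names used only for illustration.

```python
from itertools import islice
from types import SimpleNamespace


def take_texts(statuses, limit):
    """Yield the text of at most `limit` status objects."""
    for status in islice(statuses, limit):
        yield status.text


# Stand-in objects with a .text attribute, in place of real API results
fake_timeline = (SimpleNamespace(text="tweet %d" % n) for n in range(1000))

collected = list(take_texts(fake_timeline, 3))
print(collected)
```

With a real timeline, `statuses` would be the iterator returned by `tweepy.Cursor(api.user_timeline).items()`.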
Succeeded in securing a large amount of my tweets
I want to know the language model network
Ero
After all, Masgomi
How much tweet data should I collect?
Maybe this will give you a Unicode Error
Take care ... If you feel something is wrong, go to the hospital and see it.
Should buy
Is it a little better if you wash it?
The source is my grandma
Well, maybe it swells and it only hurts for a while
If this was a stray dog, it was dead
If you don't disinfect it, you won't die, but it will swell ...
e…! ?? !! ?? !! ?? I haven't disinfected it! ?? !! ?? !! ?? !! ??
It was good
Are you okay…
UnicodeEncodeError: 'cp932' codec can't encode character '\U0001f4a2' in position 28: illegal multibyte sequence
I'm really worried ...
I'm worried about what I'm worried about even if I'm told it's okay
I'm really worried
worry
Is it really okay
No no no no no
variable.find[x:y] seems to work
I want to extract only the part after the whitespace
For example, when there is a reply to @JUN_NETWORKS, I want to erase only the ID part and take out only the text, like this reply.
Is there a way to remove only the part from @ up to the whitespace from the string?
I said that it was going pretty well, but the ID of a reply got in somewhere, so I have to erase it
It doesn't look okay at all
All right…?
Be bitten too much by a northern dog ...
Well, you can create a file like this.
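The UnicodeEncodeError in the output above mentions cp932, which suggests the script was run on Windows with a Japanese locale, where `open()` and `print()` can default to that codec. As a hedged sketch of one way around it (not necessarily how the script was eventually fixed), the file can be opened with an explicit UTF-8 encoding, and text meant for the console can have unencodable characters replaced. The temporary path and variable names here are assumptions for illustration.

```python
import os
import tempfile

tweet = "worry \U0001f4a2"  # contains the character from the error message

# cp932 cannot represent this character, so a plain encode raises
try:
    tweet.encode("cp932")
    cp932_ok = True
except UnicodeEncodeError:
    cp932_ok = False

# Writing with an explicit UTF-8 encoding avoids the error for the file
path = os.path.join(tempfile.mkdtemp(), "Tweet_data.txt")
with open(path, "a+", encoding="utf-8") as tf:
    tf.write(tweet + "\n")

with open(path, encoding="utf-8") as tf:
    saved = tf.read()

# For console output, replace unencodable characters instead of raising
printable = tweet.encode("cp932", errors="replace").decode("cp932")
print(printable)
```

This way the tweet is kept intact in the txt file, and only the copy shown on the console loses the emoji.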
If you have any questions, please leave a comment or send me a reply on Twitter and I will answer.