Power up!!
If you're the kind of app developer who scrolls through Twitter and starts wondering — which friends do I talk to most? what words do I keep muttering in my tweets? — how often does that happen to you?
Really? Isn't there something you'd like to find out?
To be clear, this has nothing to do with work efficiency — I'm writing it up anyway. This is Rookie. If what you actually want is tips on streamlining your work, please look here instead.
Haven't you ever wondered how to fetch a Twitter timeline and what you could do with it? I have. That kind of curiosity is important, isn't it?
Something like this is what we're aiming for.
I came up with the idea at noon on a Wednesday and built it over the following two nights, so don't expect too much from the quality.
Do you have a MacBook? If you happen not to have one, head to the Apple Store first. Then install Emacs and Python on it for the time being.
--Twitter developer account
To use the API you need a Consumer Key, Consumer Key Secret, OAuth Token, and OAuth Token Secret. You can obtain these by registering for a Twitter developer account, which in turn requires a regular Twitter user account. I forget which site I followed, but look here and sort it out.
The Elastic Stack is also required. Basically, you download the zip from the official site and run the binaries.
--If you prefer Windows...
It can't be helped — go here.
--If you're on Ubuntu
Then go here.
--Install MeCab
Download and unpack the source and the IPA dictionary from here. In each directory, run:
$ ./configure --with-charset=utf8
$ make
$ sudo make install
and you're done.
Get a specific user's timeline with the Twitter API.
user.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from requests_oauthlib import OAuth1Session
import json
import MeCab

CK = 'AAAAAAAAAAAAAAAAAAA'  # Consumer Key
CS = 'BBBBBBBBBBBBBBBBBBB'  # Consumer Key Secret
AT = 'CCCCCCCCCCCCCCCCCCC'  # OAuth Token
AS = 'DDDDDDDDDDDDDDDDDDD'  # OAuth Token Secret

# Endpoint for a specific user's timeline
url = "https://api.twitter.com/1.1/statuses/user_timeline.json"
# 'count' is the number of tweets (default 20, max 200);
# 'screen_name' is the ACCOUNT part shown as @ACCOUNT when someone mentions you
params = {'count': 200, 'screen_name': 'ACCOUNT'}

# GET request
twitter = OAuth1Session(CK, CS, AT, AS)
req = twitter.get(url, params=params)

f = open("user.json", "w")
if req.status_code == 200:
    timeline = json.loads(req.text)
    tagger = MeCab.Tagger()
    for tweet in timeline:
        # Naive whitespace tokenization
        tweet['words'] = tweet["text"].split(" ")
        # Morphological analysis with MeCab: keep nouns and verbs.
        # With the IPA dictionary, POS names come back in Japanese.
        node = tagger.parseToNode(tweet["text"])
        mecab = []
        while node:
            pos = node.feature.split(",")[0]
            if pos in ("名詞", "動詞"):  # noun or verb
                mecab.append(node.surface)
            node = node.next
        tweet['mecab'] = mecab
        # One JSON object per line (JSON Lines)
        json.dump(tweet, f)
        f.write('\n')
else:
    print("Error: %d" % req.status_code)
f.close()
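The script writes one JSON object per line (the JSON Lines format), which is exactly what Logstash's `json` codec expects per event. As a quick sketch of how to read that file back — the sample lines here are invented, not real output:

```python
import json

# Hypothetical sample of what user.py writes: one JSON object per line
sample_lines = [
    '{"text": "hello world", "words": ["hello", "world"], "mecab": []}',
    '{"text": "elastic stack", "words": ["elastic", "stack"], "mecab": []}',
]

# Each line is an independent JSON document, so parse line by line
tweets = [json.loads(line) for line in sample_lines]
print(len(tweets))          # → 2
print(tweets[0]["words"])   # → ['hello', 'world']
```

In practice you'd iterate over `open("user.json")` instead of the sample list.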
If you want your own home timeline instead:
timeline.py
...
url = "https://api.twitter.com/1.1/statuses/home_timeline.json"
...
params = {'count':200}
...
For those who want the results of a tweet search:
search.py
...
url = "https://api.twitter.com/1.1/search/tweets.json"
...
params = {'q':'Megu Aoshima', 'count':'200'}
...
and that's it.
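The three scripts differ only in the endpoint URL and the request parameters. As a sketch (the helper's name and structure are my own, not from the original scripts), the variants could be folded into one function:

```python
# Sketch: one helper covering the three variants above.
BASE = "https://api.twitter.com/1.1"

def endpoint_for(mode, screen_name=None, query=None, count=200):
    """Return (url, params) for a user timeline, home timeline, or search."""
    if mode == "user":
        return (BASE + "/statuses/user_timeline.json",
                {"count": count, "screen_name": screen_name})
    if mode == "home":
        return (BASE + "/statuses/home_timeline.json", {"count": count})
    if mode == "search":
        return (BASE + "/search/tweets.json", {"q": query, "count": count})
    raise ValueError("unknown mode: %s" % mode)

url, params = endpoint_for("search", query="Megu Aoshima")
print(url)     # → https://api.twitter.com/1.1/search/tweets.json
print(params)  # → {'q': 'Megu Aoshima', 'count': 200}
```

You would then pass the returned pair straight to `OAuth1Session.get(url, params=params)` as in user.py.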
Logstash needs a config file, which you specify at startup. (If you start Logstash as a service, put it under /etc/logstash/conf.d/.)
logstash.conf
input {
  file {
    path => "/Users/you/py/timeline.json"
    start_position => "beginning"
    type => "timeline"
    codec => "json"
  }
  file {
    path => "/Users/you/py/user.json"
    start_position => "beginning"
    type => "user"
    codec => "json"
  }
}
filter {
  date {
    match => [ "created_at", "EEE MMM dd HH:mm:ss Z yyyy" ]
    target => "created_at"
  }
  grok {
    match => { "created_at" => "%{YEAR}-%{MONTHNUM}-%{MONTHDAY}T%{HOUR:tweet_hour:int}:%{MINUTE}:%{SECOND}.000Z" }
  }
  # Shift UTC to JST (+9 hours), wrapping past midnight
  ruby {
    code => "event.set('[tweet_hour]', event.get('tweet_hour') + 9)"
  }
  if [tweet_hour] > 23 {
    ruby {
      code => "event.set('[tweet_hour]', event.get('tweet_hour') - 24)"
    }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "twitter-%{type}-%{+YYYY.MM.dd}"
  }
}
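The two `ruby` blocks in the filter are just a UTC-to-JST conversion of the extracted hour field. In plain Python (the function name here is my own, for illustration), the same logic is modular arithmetic:

```python
def to_jst_hour(utc_hour):
    """Shift a UTC hour to JST (UTC+9), wrapping around midnight.

    Equivalent to the filter's "+9, then -24 if > 23" pair of ruby blocks.
    """
    return (utc_hour + 9) % 24

print(to_jst_hour(3))   # → 12
print(to_jst_hour(20))  # → 5 (wraps past midnight)
```

Logstash conditionals can't wrap arithmetic like `%`, which is why the config expresses it as an add followed by a conditional subtract.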
The rest of the Elastic Stack takes care of itself.
Kibana's settings go deep, but for now just copy the settings file from here and import it: [Management] -> [Saved Objects] -> [Import] in the left-hand menu of the Kibana screen (http://localhost:5601).
After that, play around with it and get a feel for it hands-on.
Start the Elastic Stack first.
If you installed from zip, run (elastic-stack-version)/bin/(elastic-stack);
if you set it up as a service, run service (elastic-stack) start.
Then run the tweet-fetching code from earlier:
$ python user.py
Put the resulting file where Logstash reads from (/Users/you/py in the example above) and Logstash will slurp it up.
All that's left is to open the Kibana screen.
http://localhost:5601
Now let's look at the results.
You can see which hashtags I use often and who I reply to most. You can tell I'm an RT demon. "Amazon" seems to be my top word for some reason. I have a hunch... probably because I've been retweeting about the Amazon Dash Button. The hours I tweet most are also visualized, but... well, my eyes are tired and I can't really make it out.
By the way, I went to the trouble of adding MeCab, but the text got chopped into very small pieces — Japanese is a tough opponent. I'm no MeCab expert, but I think if you filter down to words of three or more characters on the Kibana side, you'll see more reasonable results.
It was fun! For the past two days I've been hanging out with my Mac even after glaring at Windows at work all day, but fun doesn't wear you out.
Kibana offers many other visualization methods. Try them with your own hands, or google around a little and you'll find some intriguing examples.
Oh right, why did I start writing this... Because it was fun, I guess.
No abusing this, OK? That's a promise, from your big brother.