Power up!!
If you're the kind of app developer who scrolls through Twitter and starts wondering — which friends do I talk to most? what words do I keep muttering in my tweets? — how often does that happen to you?
Really? Isn't there something you'd like to find out?
To be clear, this has nothing to do with work efficiency — I'm writing it up anyway. This is Rookie. If what you actually want is tips on streamlining your work, please look here instead.
Haven't you ever wondered how to fetch a Twitter timeline and what you could do with it? I have. That kind of curiosity is important, isn't it?
Something like this is what we're aiming for.
I came up with the idea at noon on a Wednesday and built it over the following two nights, so don't expect too much from the quality.
Do you have a MacBook? If you happen not to have one, head to the Apple Store first. Then install Emacs and Python on it for the time being.
--Twitter developer account
To use the API you need a Consumer Key, Consumer Key Secret, OAuth Token, and OAuth Token Secret. You can obtain these by registering for a Twitter developer account, which in turn requires a regular Twitter user account. I forget which site I followed, but look here and sort it out.
The Elastic Stack is also required. Basically, you download the zip from the official site and run the binaries.
--If you prefer Windows...
It can't be helped — go here.
--If you're on Ubuntu
Then go here.
--Install MeCab
Download and unpack the source and the IPA dictionary from here. In each directory, run:
$ ./configure --with-charset=utf8
$ make
$ sudo make install
and you're done.
Get a specific user's timeline with the Twitter API.
user.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from requests_oauthlib import OAuth1Session
import json
import MeCab

CK = 'AAAAAAAAAAAAAAAAAAA'  # Consumer Key
CS = 'BBBBBBBBBBBBBBBBBBB'  # Consumer Key Secret
AT = 'CCCCCCCCCCCCCCCCCCC'  # OAuth Token
AS = 'DDDDDDDDDDDDDDDDDDD'  # OAuth Token Secret

# Endpoint for a specific user's timeline
url = "https://api.twitter.com/1.1/statuses/user_timeline.json"
# 'count' is the number of tweets (default 20, max 200);
# 'screen_name' is the ACCOUNT part shown as @ACCOUNT when someone mentions you
params = {'count': 200, 'screen_name': 'ACCOUNT'}

# GET request
twitter = OAuth1Session(CK, CS, AT, AS)
req = twitter.get(url, params=params)

f = open("user.json", "w")
if req.status_code == 200:
    timeline = json.loads(req.text)
    tagger = MeCab.Tagger()
    for tweet in timeline:
        # Naive whitespace tokenization
        tweet['words'] = tweet["text"].split(" ")
        # Morphological analysis with MeCab: keep nouns and verbs.
        # With the IPA dictionary, POS names come back in Japanese.
        node = tagger.parseToNode(tweet["text"])
        mecab = []
        while node:
            pos = node.feature.split(",")[0]
            if pos in ("名詞", "動詞"):  # noun or verb
                mecab.append(node.surface)
            node = node.next
        tweet['mecab'] = mecab
        # One JSON object per line (JSON Lines)
        json.dump(tweet, f)
        f.write('\n')
else:
    print("Error: %d" % req.status_code)
f.close()
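The script writes one JSON object per line (the JSON Lines format), which is exactly what Logstash's `json` codec expects per event. As a quick sketch of how to read that file back — the sample lines here are invented, not real output:

```python
import json

# Hypothetical sample of what user.py writes: one JSON object per line
sample_lines = [
    '{"text": "hello world", "words": ["hello", "world"], "mecab": []}',
    '{"text": "elastic stack", "words": ["elastic", "stack"], "mecab": []}',
]

# Each line is an independent JSON document, so parse line by line
tweets = [json.loads(line) for line in sample_lines]
print(len(tweets))          # → 2
print(tweets[0]["words"])   # → ['hello', 'world']
```

In practice you'd iterate over `open("user.json")` instead of the sample list.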
If you want your own home timeline instead:
timeline.py
...
url = "https://api.twitter.com/1.1/statuses/home_timeline.json"
...
params = {'count':200}
...
For those who want the results of a tweet search:
search.py
...
url = "https://api.twitter.com/1.1/search/tweets.json"
...
params = {'q':'Megu Aoshima', 'count':'200'}
...
and that's it.
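The three scripts differ only in the endpoint URL and the request parameters. As a sketch (the helper's name and structure are my own, not from the original scripts), the variants could be folded into one function:

```python
# Sketch: one helper covering the three variants above.
BASE = "https://api.twitter.com/1.1"

def endpoint_for(mode, screen_name=None, query=None, count=200):
    """Return (url, params) for a user timeline, home timeline, or search."""
    if mode == "user":
        return (BASE + "/statuses/user_timeline.json",
                {"count": count, "screen_name": screen_name})
    if mode == "home":
        return (BASE + "/statuses/home_timeline.json", {"count": count})
    if mode == "search":
        return (BASE + "/search/tweets.json", {"q": query, "count": count})
    raise ValueError("unknown mode: %s" % mode)

url, params = endpoint_for("search", query="Megu Aoshima")
print(url)     # → https://api.twitter.com/1.1/search/tweets.json
print(params)  # → {'q': 'Megu Aoshima', 'count': 200}
```

You would then pass the returned pair straight to `OAuth1Session.get(url, params=params)` as in user.py.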
Logstash needs a config file, which you specify at startup. (If you start Logstash as a service, put it under /etc/logstash/conf.d/.)
logstash.conf
input {
  file {
    path => "/Users/you/py/timeline.json"
    start_position => "beginning"
    type => "timeline"
    codec => "json"
  }
  file {
    path => "/Users/you/py/user.json"
    start_position => "beginning"
    type => "user"
    codec => "json"
  }
}
filter {
  date {
    match => [ "created_at", "EEE MMM dd HH:mm:ss Z yyyy" ]
    target => "created_at"
  }
  grok {
    match => { "created_at" => "%{YEAR}-%{MONTHNUM}-%{MONTHDAY}T%{HOUR:tweet_hour:int}:%{MINUTE}:%{SECOND}.000Z" }
  }
  # Shift UTC to JST (+9 hours), wrapping past midnight
  ruby {
    code => "event.set('[tweet_hour]', event.get('tweet_hour') + 9)"
  }
  if [tweet_hour] > 23 {
    ruby {
      code => "event.set('[tweet_hour]', event.get('tweet_hour') - 24)"
    }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "twitter-%{type}-%{+YYYY.MM.dd}"
  }
}
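The two `ruby` blocks in the filter are just a UTC-to-JST conversion of the extracted hour field. In plain Python (the function name here is my own, for illustration), the same logic is modular arithmetic:

```python
def to_jst_hour(utc_hour):
    """Shift a UTC hour to JST (UTC+9), wrapping around midnight.

    Equivalent to the filter's "+9, then -24 if > 23" pair of ruby blocks.
    """
    return (utc_hour + 9) % 24

print(to_jst_hour(3))   # → 12
print(to_jst_hour(20))  # → 5 (wraps past midnight)
```

Logstash conditionals can't wrap arithmetic like `%`, which is why the config expresses it as an add followed by a conditional subtract.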
The rest of the Elastic Stack takes care of itself.
Kibana's settings go deep, but for now just copy the settings file from here and import it: [Management] -> [Saved Objects] -> [Import] in the left-hand menu of the Kibana screen (http://localhost:5601).
After that, play around with it and get a feel for it hands-on.
Start the Elastic Stack first.
If you installed from zip, run (elastic-stack-version)/bin/(elastic-stack);
if you set it up as a service, run service (elastic-stack) start.
Then run the tweet-fetching code from earlier:
$ python user.py
Put the resulting file where Logstash reads from (/Users/you/py in the example above) and Logstash will slurp it up.
All that's left is to open the Kibana screen.
http://localhost:5601
Now let's look at the results.
You can see which hashtags I use often and who I reply to most. You can tell I'm an RT demon. "Amazon" seems to be my top word for some reason. I have a hunch... probably because I've been retweeting about the Amazon Dash Button. The hours I tweet most are also visualized, but... well, my eyes are tired and I can't really make it out.
By the way, I went to the trouble of adding MeCab, but the text got chopped into very small pieces — Japanese is a tough opponent. I'm no MeCab expert, but I think if you filter down to words of three or more characters on the Kibana side, you'll see more reasonable results.
It was fun! For the past two days I've been hanging out with my Mac even after glaring at Windows at work all day, but fun doesn't wear you out.
Kibana offers many other visualization methods. Try them with your own hands, or google around a little and you'll find some intriguing examples.
Oh right, why did I start writing this... Because it was fun, I guess.
No abusing this, OK? That's a promise, from your big brother.