An AI beginner tries to make a Pronama-chan bot

This article is the day 10 entry of the Pronama-chan Advent Calendar 2016 :laughing:

Introduction

To begin with, it is debatable whether this counts as AI at all, so the title may be wrong... I tried to make a Pronama-chan ChatBot by collecting Pronama-chan's tweets from Twitter. ↓ If you say something on Slack like this, Pronama-chan replies.

[Image: Screenshot 2016-12-08 16.55.42.png]

Background

Our in-house Slack had a bot channel imitating a certain blogger, and when I asked about it, I learned the ChatBot had been built from Twitter posts. In that case, there is no choice but to make a chatbot from Pronama-chan's Twitter! That's why I started making it.

Building it

Roughly speaking, it can be built with the following flow.

  1. Use a Twitter crawler to collect the tweets into a DB
  2. Feed the DB data into ElasticSearch
  3. Use a Python SlackBot to pass each incoming message to ElasticSearch and post the text it returns

I simply reused source code my colleague wrote, so I will skip the detailed explanation this time... :sweat_drops:

Preparation

I don't want to run ElasticSearch on my machine, so this time I will start CentOS with Vagrant and run it there.

Preparing Vagrant (CentOS)

cd <Appropriate directory>
mkdir pronama-chan-bot && cd $_
vagrant init <CentOS Box file name>
vagrant up
vagrant ssh
sudo yum update -y

From here on, work inside the CentOS VM started above.

Java installation

Java seems to be required to install the `analysis-kuromoji` plugin described later, so install it.

sudo yum install -y wget
wget --no-check-certificate --no-cookies --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u65-b17/jdk-8u65-linux-x64.rpm
sudo rpm -ivh jdk-8u65-linux-x64.rpm
java -version

Install ElasticSearch

* What is ElasticSearch?

A full-text search engine provided by Elastic (a mechanism for finding the documents that contain a target word within a large collection of documents).

I installed it by following a reference guide:

sudo rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch
sudo vi /etc/yum.repos.d/elasticsearch.repo

/etc/yum.repos.d/elasticsearch.repo


[elasticsearch-2.x]
name=Elasticsearch repository for 2.x packages
baseurl=http://packages.elastic.co/elasticsearch/2.x/centos
gpgcheck=1
gpgkey=http://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1

sudo yum install -y elasticsearch

ElasticSearch plugin installation

# Install kuromoji for full-text search in Japanese
sudo /usr/share/elasticsearch/bin/plugin install analysis-kuromoji

# Install this because the setup uses neologd, apparently an extended version of the IPA dictionary
sudo /usr/share/elasticsearch/bin/plugin install org.codelibs/elasticsearch-analysis-kuromoji-neologd/2.4.1

# A plugin that lets you view the results in a web browser
sudo /usr/share/elasticsearch/bin/plugin install polyfractal/elasticsearch-inquisitor

# A plugin for monitoring ElasticSearch
sudo /usr/share/elasticsearch/bin/plugin install royrusso/elasticsearch-HQ
  • The explanation of each plugin is based on what I found by googling, so it may be wrong m(_ _)m

ElasticSearch settings

sudo vi /etc/elasticsearch/elasticsearch.yml

Change it as follows:

http.compression: true
network.publish_host: "0.0.0.0"
network.host: "0.0.0.0"
network.bind_host: "0.0.0.0"
transport.tcp.port: 9300
transport.tcp.compress: true
http.port: 9200

Startup settings

sudo chkconfig elasticsearch on
sudo service elasticsearch start
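
Once the service is up, a quick way to confirm it is reachable is to read the JSON document served at the root URL. The following is a minimal sketch, assuming the node listens on localhost:9200 as configured above; the function names are made up for illustration. Since the neologd plugin above is pinned to 2.4.1, it may also be worth confirming that the reported version matches.

```python
import json
from urllib.request import urlopen


def parse_es_version(root_body: str) -> str:
    """Return the version string from the JSON served at the ElasticSearch root URL."""
    info = json.loads(root_body)
    return info["version"]["number"]


def fetch_es_version(base_url: str = "http://localhost:9200") -> str:
    """Fetch the root document from a running node and return its version."""
    # base_url is an assumption based on the http.port setting above
    with urlopen(base_url, timeout=5) as resp:
        return parse_es_version(resp.read().decode("utf-8"))
```

Calling fetch_es_version() right after starting the service may fail for a few seconds while the node boots, so retrying once or twice is reasonable.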

Python installation (use pyenv)

sudo yum install -y git
git clone https://github.com/yyuu/pyenv.git ~/.pyenv
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bash_profile
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bash_profile
echo 'eval "$(pyenv init -)"' >> ~/.bash_profile
source ~/.bash_profile
pyenv install anaconda3-4.1.1
pyenv rehash
pyenv global anaconda3-4.1.1
python --version

Run Twitter Crawler

Use tweepy, a Python library for the Twitter API, to fetch the past timeline and save it in the DB.

Rough sketch.py

import tweepy

# Twitter API settings (the four credentials come from your Twitter app)
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)
api = tweepy.API(auth)

# Fetch the timeline (repeat the call below, moving max_id back, until the whole timeline is fetched)
statuses = api.user_timeline('pronama', max_id = None, count = 200)

# Process the fetched data for DB storage
# (omitted)

# Save the fetched timeline in the DB
# (omitted)
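
The "repeat until the whole timeline is fetched" step can be sketched as a loop that walks max_id backwards. This is a minimal sketch, not the colleague's actual crawler: fetch_all_statuses and its fetch_page argument are hypothetical names, and the only assumption about the API is that each returned status has an integer id attribute, as tweepy statuses do.

```python
from typing import Callable, Optional


def fetch_all_statuses(fetch_page: Callable[[Optional[int]], list]) -> list:
    """Collect every page by walking max_id backwards until a page comes back empty."""
    collected = []
    max_id = None  # None means "start from the newest tweet"
    while True:
        page = fetch_page(max_id)
        if not page:
            break
        collected.extend(page)
        # max_id is inclusive, so ask next for tweets strictly older than the oldest one seen
        max_id = min(status.id for status in page) - 1
    return collected
```

With the api object from the sketch above, it could be called as fetch_all_statuses(lambda max_id: api.user_timeline(screen_name='pronama', max_id=max_id, count=200)). Note that the Twitter API only serves roughly the most recent 3,200 tweets of a user timeline, so "all" means "all that the API will return".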

Feed the crawled DB data to ElasticSearch

Template settings?

PUT the settings JSON to localhost:9200/_template/<template_name>. In the template, set the index name and specify kuromoji as the tokenizer. I don't fully understand this part yet, so I'll look into it later (which probably means I never will).
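
The article does not show the template itself, so here is a hedged sketch of what one might look like for ElasticSearch 2.x. It assumes a text field with two sub-fields, kuromoji and 2gram, because those are the field names the search query later in this article refers to; the analyzer names, the speech type default, and build_template are all illustrative assumptions rather than the actual settings used.

```python
import json


def build_template(index_pattern: str, doc_type: str = "speech") -> dict:
    """Build an index template body (ElasticSearch 2.x style) with kuromoji and bigram sub-fields."""
    return {
        "template": index_pattern,  # index name (or pattern) the template applies to
        "settings": {
            "analysis": {
                "tokenizer": {
                    # 2-character n-grams, used by the text.2gram sub-field
                    "bigram_tokenizer": {"type": "nGram", "min_gram": 2, "max_gram": 2}
                },
                "analyzer": {
                    # kuromoji_tokenizer is provided by the analysis-kuromoji plugin
                    "kuromoji_analyzer": {"type": "custom", "tokenizer": "kuromoji_tokenizer"},
                    "bigram_analyzer": {"type": "custom", "tokenizer": "bigram_tokenizer"},
                },
            }
        },
        "mappings": {
            doc_type: {
                "properties": {
                    "text": {
                        "type": "string",
                        "fields": {
                            "kuromoji": {"type": "string", "analyzer": "kuromoji_analyzer"},
                            "2gram": {"type": "string", "analyzer": "bigram_analyzer"},
                        },
                    }
                }
            }
        },
    }
```

The result of json.dumps(build_template('<index_name>')) would then be the body of the PUT request described above.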

Delete the index once

Send DELETE to localhost:9200/<index_name>.

Dump the data pulled from the DB to JSON

Convert the DB data to JSON using Python.

Bulk insert the JSON data

POST to localhost:9200/<index_name>/speech/_bulk with --data-binary <json data>.
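
The bulk endpoint expects newline-delimited JSON: one action line followed by one document line per record, with a trailing newline at the end. A minimal sketch of the conversion step, assuming each DB row has already been turned into a dict such as {"text": "..."} (the row shape and the function name to_bulk_body are assumptions):

```python
import json
from typing import Iterable


def to_bulk_body(rows: Iterable[dict]) -> str:
    """Turn rows into the newline-delimited body expected by the _bulk endpoint."""
    lines = []
    for row in rows:
        # The index and type come from the URL, so the action line can stay empty
        lines.append(json.dumps({"index": {}}))
        # ensure_ascii=False keeps Japanese text readable in the dumped file
        lines.append(json.dumps(row, ensure_ascii=False))
    if not lines:
        return ""
    # The bulk API requires the body to end with a newline
    return "\n".join(lines) + "\n"
```

Writing the returned string to a file encoded as UTF-8 gives the <json data> that the POST above sends with --data-binary.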

At this point ElasticSearch itself should be working, so run the following command and check that a result comes back.

curl -XGET 'http://localhost:9200/<index_name>/_search?pretty' -d '
{
  "query": {
      "function_score": {
          "functions": [
              {
                  "random_score": {
                    "seed" : "999999999"
                  }
              }
          ], 
          "query": {
              "query_string": {
                  "query": "text.kuromoji:<Text>^100 OR text.2gram:$<Text>^10"
              }
          }, 
          "score_mode": "multiply"
      }
  }, 
  "size": 1, 
  "sort": {
      "_score": {
          "order": "desc"
      }
  }, 
  "track_scores": true
}
'

[Image: Screenshot 2016-12-08 19.41.22.png]

Yay!! For some reason the reply to "Yahho" is "I did it!", but it seems to be working!
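
To send the same search from Python, the request body can be generated from the incoming text. This is a sketch under assumptions: build_query is a hypothetical helper, the stray `$` before the second placeholder in the curl example is left out on the assumption that it is a typo, and no escaping of query_string reserved characters (such as `:` or parentheses) is done, which real chat input would probably need.

```python
import json


def build_query(text: str, seed: str = "999999999") -> str:
    """Build the function_score search body shown in the curl example for the given text."""
    body = {
        "query": {
            "function_score": {
                # random_score shuffles hits so the bot does not always give the same reply
                "functions": [{"random_score": {"seed": seed}}],
                "query": {
                    "query_string": {
                        # Morphological matches are weighted more heavily than bigram matches
                        "query": "text.kuromoji:{0}^100 OR text.2gram:{0}^10".format(text)
                    }
                },
                "score_mode": "multiply",
            }
        },
        "size": 1,
        "sort": {"_score": {"order": "desc"}},
        "track_scores": True,
    }
    return json.dumps(body, ensure_ascii=False)
```

Passing a different seed per request (for example, one derived from the current time) would be one way to vary the replies; the fixed seed here simply mirrors the curl example.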

SlackBot settings

Now let's set this up as a SlackBot.

Register Bot users in Slack

Register the bot user via [Add Configuration] on the Slack bot settings page. Fill in each field as appropriate. This time, I registered it under the name "@pronama_chan".

[Image: Screenshot 2016-12-08 13.49.30.png]

Don't forget to make a note of the "API Token" displayed on the next screen.

[Image: Screenshot 2016-12-08 13.50.09.png]
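
The slackbot library used later in this article reads this token from a module named slackbot_settings.py placed next to the bot script. A minimal sketch follows; reading the token from an environment variable, and the variable name SLACK_API_TOKEN, are assumptions made here to keep the token out of the source, not something the original setup is known to do.

```python
# slackbot_settings.py
import os

# API token noted from the bot registration screen.
# SLACK_API_TOKEN is a hypothetical environment variable name.
API_TOKEN = os.environ.get("SLACK_API_TOKEN", "")
```

Writing the token directly as a string literal also works, as long as the file is kept out of version control.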

Create a bot channel in Slack

Create a channel from Slack.

[Image: Screenshot 2016-12-08 13.48.20.png]

Don't forget to invite the @pronama_chan bot created above.

[Image: Screenshot 2016-12-08 13.52.25.png]

Create SlackBot

Create the Slack bot using Python's slackbot library.

This is also a rough sketch.py

import requests
from slackbot.bot import Bot
from slackbot.bot import respond_to, default_reply

def bot_response(userid, word):
    # POST the query to ElasticSearch (hostname and index_name are defined elsewhere)
    response = requests.post(
        'http://{}/{}/_search'.format(hostname, index_name),
        <JSON string to send to ElasticSearch, generated from word>.encode('utf-8'))

    # Pull the reply text out of the search response
    return <String extracted from the hit in response>

# Reply to every message addressed to the bot
@respond_to('(.*)')
def chat(message, word):
    response = bot_response(message._get_user_id(), word)
    message.reply(response)

def main():
    bot = Bot()
    bot.run()

if __name__ == "__main__":
    main()
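
The "extract the hit from the response" step is left as a placeholder above. One way it might look, assuming the indexed documents store the tweet body in a field named text (inferred from the text.kuromoji query field, not confirmed by the article) and using a hypothetical helper name:

```python
from typing import Optional


def extract_reply(search_result: dict, field: str = "text") -> Optional[str]:
    """Return the chosen field of the top hit, or None when nothing matched."""
    hits = search_result.get("hits", {}).get("hits", [])
    if not hits:
        return None
    # Each hit carries the original document under _source
    return hits[0].get("_source", {}).get(field)
```

Inside bot_response this could be called as extract_reply(response.json()); when it returns None, the bot needs some fallback text, since there is no hit to reply with.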

Trying it out

Try sending a message from Slack.

[Image: Screenshot 2016-12-08 16.55.42.png]

Kita━━━ヽ(∀゚ )人(゚∀゚)人( ゚∀)ノ━━━!!

In closing

I'm a web developer and an AI beginner, but I managed to make a Pronama-chan chatbot. Using technologies you don't normally use is fun but tiring...

[Image: Screenshot 2016-12-08 20.20.36.png]

Good!
