This article is the day-10 entry of the Pronama-chan Advent Calendar 2016 :laughing:
Whether this really counts as AI is debatable, so the title may be a bit of a stretch... I made a Pronama-chan ChatBot by collecting Pronama-chan's tweets from Twitter. ↓ When you say something to it on Slack like this, Pronama-chan replies.
Our in-house Slack has a bot channel, and when I asked around, it turned out someone was making ChatBots from people's Twitter accounts. In that case, there was no choice but to make a ChatBot from Pronama-chan's Twitter! That's how I started building it.
Roughly speaking, it seems you can build it with the following flow.
I mostly reused the source code a colleague wrote, so I'll skip the detailed explanation this time... :sweat_drops:
I don't want to run ElasticSearch directly on my machine, so this time I'll start CentOS with Vagrant and run it there.
cd <Appropriate directory>
mkdir pronama-chan-bot && cd $_
vagrant init <CentOS Box file name>
vagrant up
vagrant ssh
sudo yum update -y
From here on, work inside the CentOS VM above.
Java is apparently required to install the `analysis-kuromoji` plugin described later, so install it first.
sudo yum install -y wget
wget --no-check-certificate --no-cookies --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u65-b17/jdk-8u65-linux-x64.rpm
sudo rpm -ivh jdk-8u65-linux-x64.rpm
java -version
A full-text search engine provided by Elastic (a system for finding documents that contain a given word within a large collection of documents).
Install it by referring to the steps here.
sudo rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch
sudo vi /etc/yum.repos.d/elasticsearch.repo
/etc/yum.repos.d/elasticsearch.repo
[elasticsearch-2.x]
name=Elasticsearch repository for 2.x packages
baseurl=http://packages.elastic.co/elasticsearch/2.x/centos
gpgcheck=1
gpgkey=http://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1
sudo yum install -y elasticsearch
#Install kuromoji for full-text search in Japanese
sudo /usr/share/elasticsearch/bin/plugin install analysis-kuromoji
#Install this because it apparently uses something called neologd, an extended version of the IPA dictionary
sudo /usr/share/elasticsearch/bin/plugin install org.codelibs/elasticsearch-analysis-kuromoji-neologd/2.4.1
#A plugin that allows you to view the results in a web browser
sudo /usr/share/elasticsearch/bin/plugin install polyfractal/elasticsearch-inquisitor
#Plugin that can monitor ElasticSearch
sudo /usr/share/elasticsearch/bin/plugin install royrusso/elasticsearch-HQ
- The explanation of each plugin is based on a quick Google search, so it may be wrong m(_ _)m
sudo vi /etc/elasticsearch/elasticsearch.yml
Changed as follows
http.compression: true
network.publish_host: "0.0.0.0"
network.host: "0.0.0.0"
network.bind_host: "0.0.0.0"
transport.tcp.port: 9300
transport.tcp.compress: true
http.port: 9200
Startup settings
sudo chkconfig elasticsearch on
sudo service elasticsearch start
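To confirm it's actually up, `curl http://localhost:9200` should return cluster and version info. Once Python is installed in the next step, the same check from Python looks like this (a sketch assuming the `requests` package):

```python
import requests

# If ElasticSearch is up, port 9200 answers with cluster and version info
res = requests.get('http://localhost:9200')
print(res.json())  # expect a "cluster_name" and a "version" block
```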
sudo yum install -y git
git clone https://github.com/yyuu/pyenv.git ~/.pyenv
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bash_profile
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bash_profile
echo 'eval "$(pyenv init -)"' >> ~/.bash_profile
source ~/.bash_profile
pyenv install anaconda3-4.1.1
pyenv rehash
pyenv global anaconda3-4.1.1
python --version
Use tweepy, a Python library for the Twitter API, to fetch the past timeline and save it to the DB.
A rough sketch (image.py):
import tweepy

# Twitter API settings
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)
api = tweepy.API(auth)

# Fetch the timeline (repeat this, paging with max_id, until the whole TL
# is fetched; see the sketch below)
statuses = api.user_timeline('pronama', max_id=None, count=200)

# Shape the fetched data for DB storage
# (omitted)

# Save the fetched TL to the DB
# (omitted)
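For reference, a minimal sketch of the "repeat until the whole TL is fetched" loop; `save_status` is a hypothetical helper standing in for the omitted shaping and DB-storage steps:

```python
import tweepy

auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)
api = tweepy.API(auth)

max_id = None
while True:
    statuses = api.user_timeline('pronama', max_id=max_id, count=200)
    if not statuses:
        break  # nothing older left to fetch
    for status in statuses:
        save_status(status)  # hypothetical: shape the tweet and save it to the DB
    # continue from just below the oldest tweet seen so far
    max_id = statuses[-1].id - 1
```

Note that the Twitter API only lets you page back through roughly the most recent 3,200 tweets of a user timeline.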
PUT the settings JSON to `localhost:9200/_template/<template_name>` and you're done. In the template you set the index name and set `kuromoji` as the tokenizer. I don't fully understand this part yet, so I'll investigate later (a flag that I probably won't).
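For what it's worth, here is a minimal sketch of such a template from Python. This is an assumption based on the `text.kuromoji` / `text.2gram` fields that the search query below uses; `pronama` and `pronama_template` are hypothetical names:

```python
import json
import requests

# A hypothetical template: kuromoji and 2-gram multi-fields on "text"
template = {
    'template': 'pronama*',  # indices this template applies to
    'settings': {
        'analysis': {
            'tokenizer': {
                'bigram_tokenizer': {'type': 'nGram', 'min_gram': 2, 'max_gram': 2}
            },
            'analyzer': {
                'kuromoji_analyzer': {'type': 'custom', 'tokenizer': 'kuromoji_tokenizer'},
                'bigram_analyzer': {'type': 'custom', 'tokenizer': 'bigram_tokenizer'}
            }
        }
    },
    'mappings': {
        'speech': {
            'properties': {
                'text': {
                    'type': 'string',
                    'fields': {
                        'kuromoji': {'type': 'string', 'analyzer': 'kuromoji_analyzer'},
                        '2gram': {'type': 'string', 'analyzer': 'bigram_analyzer'}
                    }
                }
            }
        }
    }
}

res = requests.put('http://localhost:9200/_template/pronama_template',
                   data=json.dumps(template))
print(res.json())
```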
Send a DELETE to `localhost:9200/<index_name>` and you're done.
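The same thing from Python, in case curl isn't handy (`pronama` is again a hypothetical index name):

```python
import requests

# Deleting an index is destructive: all documents in it are gone
res = requests.delete('http://localhost:9200/pronama')
print(res.status_code)  # 200 if the index existed and was removed
```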
Convert the DB data to JSON using Python, then POST it to `localhost:9200/<index_name>/speech/_bulk` with `--data-binary <json data>`.
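A minimal sketch of what that bulk payload could look like from Python, assuming each DB row becomes a document with a single `text` field (`pronama` is a hypothetical index name):

```python
import json
import requests

# One action line + one document line per tweet, newline-delimited,
# with a trailing newline — the format the _bulk endpoint expects
docs = [{'text': 'やっほー'}, {'text': 'こんにちは'}]  # would come from the DB

lines = []
for doc in docs:
    lines.append(json.dumps({'index': {}}))           # let ES auto-assign IDs
    lines.append(json.dumps(doc, ensure_ascii=False))
body = '\n'.join(lines) + '\n'

res = requests.post('http://localhost:9200/pronama/speech/_bulk',
                    data=body.encode('utf-8'))
print(res.json())
```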
ElasticSearch itself should be working at this point, so run the following command and check that a result comes back. The query matches the input text against the kuromoji-analyzed field (weighted ^100) and the 2-gram field (weighted ^10), then multiplies in a random score so the reply varies.
curl -XGET 'http://localhost:9200/<index_name>/_search?pretty' -d '
{
"query": {
"function_score": {
"functions": [
{
"random_score": {
"seed" : "999999999"
}
}
],
"query": {
"query_string": {
"query": "text.kuromoji:<Text>^100 OR text.2gram:$<Text>^10"
}
},
"score_mode": "multiply"
}
},
"size": 1,
"sort": {
"_score": {
"order": "desc"
}
},
"track_scores": true
}
'
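For reference, here's a sketch of the same search from Python, which is roughly what the bot below ends up sending. The weights and the seed come from the curl example above; `pronama` is a hypothetical index name, and each document is assumed to have a `text` field:

```python
import json
import requests

def build_query(text, seed=999999999):
    # kuromoji matches are weighted 100x, 2-gram matches 10x, and a
    # seeded random score is multiplied in so the best hit varies
    return {
        'query': {
            'function_score': {
                'functions': [{'random_score': {'seed': str(seed)}}],
                'query': {
                    'query_string': {
                        'query': 'text.kuromoji:{0}^100 OR text.2gram:{0}^10'.format(text)
                    }
                },
                'score_mode': 'multiply'
            }
        },
        'size': 1,
        'sort': {'_score': {'order': 'desc'}},
        'track_scores': True
    }

body = json.dumps(build_query('やっほー'), ensure_ascii=False).encode('utf-8')
res = requests.post('http://localhost:9200/pronama/_search', data=body)
hits = res.json()['hits']['hits']
if hits:
    print(hits[0]['_source']['text'])  # the tweet chosen as the reply
```

In the actual bot you would presumably vary the seed (randomly or per user) so the same input doesn't always return the same tweet.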
Yay!! For some reason the reply to "Yahho" (hi!) is "I did it!", but it seems to be working!
Now let's set this up as a SlackBot.
Register a bot user from [Add Configuration] here. Fill in each item appropriately. This time, register it under the name "@pronama_chan".
Don't forget to make a note of the "API Token" displayed on the next screen.
Create a channel in Slack. Don't forget to invite the @pronama_chan created above.
Create a Slack bot using the Python `slackbot` library.
This is also a rough sketch (image.py):
import requests

from slackbot.bot import Bot
from slackbot.bot import respond_to, default_reply


def bot_response(userid, word):
    # POST the search query to ElasticSearch
    response = requests.post(
        'http://{}/{}/_search'.format(hostname, index_name),
        <JSON string to be thrown to ElasticSearch generated from word>.encode('utf-8'))
    # Pull the reply text out of the search result
    return <Extracted character string hit from response>


@respond_to('(.*)')
def chat(message, word):
    response = bot_response(message._get_user_id(), word)
    message.reply(response)


def main():
    bot = Bot()
    bot.run()


if __name__ == "__main__":
    main()
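One thing the sketch glosses over: the `slackbot` library reads its settings from a `slackbot_settings.py` module importable from where you run the bot, and that's where the API Token noted earlier goes. A minimal example:

```python
# slackbot_settings.py -- picked up automatically by the slackbot library
API_TOKEN = '<the API Token noted when registering @pronama_chan>'

# Optional: what the bot replies when no handler matches
DEFAULT_REPLY = 'Hmm?'
```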
Kita━━━ヽ(゚∀゚)人(゚∀゚)人(゚∀゚)ノ━━━!!
I'm an AI beginner working at a web company, but I somehow managed to make a Pronama-chan ChatBot. Using technologies you don't normally touch is fun but tiring...
Good!