This article is day 7 of the Mynavi Advent Calendar 2019!
I usually work on natural language processing in the department that handles big data and AI. This time, I'll write about unleashing a chatbot on our in-house Slack and playing around with it.
The bot, which has quite a talent for provocation, runs on AWS. A senior colleague occasionally hits it with injection attacks, but it's doing fine. In hindsight, I regret not going with a single cloud instead of a multi-cloud setup.
Below is a screenshot of actually playing with the chatbot.
Hmm, how would I even explain this to my parents...
All human beings live with the desire to build a chatbot. At least I do, so I'm sure others do too. Meanwhile, a paper on Microsoft's high-school-girl AI "Rinna" was released to the world (2016).
cute
I have to make this ... (2 years have passed)
We use Slack as our main communication tool. There were 677 public channels alone. I'm not sure whether that's a lot or a little :thinking:
Since "times" channels (personal feed channels) are rooted in our culture, the purpose of this chatbot is to run it in my own times channel and "make everyone smile". It carries the hope that it will grow into that kind of dialogue bot. By the way, it's named "Natsu-chan" because it was released in the summer; every time the version goes up, I name it after the season at that time.
The Slack API used is the Events API (https://api.slack.com/events-api). I made sure to narrow the permissions so that the API is called only when the bot is mentioned.
I made it quickly.
The configuration diagram of the chatbot is shown below.
Actually, I did my best.
At first, I gave a single EC2 instance an Elastic IP, kept the session alive with the screen command, and hooked it up to Slack. But then came a divine pronouncement that "there is no merit in using AWS that way," and since I had to agree, I reworked the setup properly.
I will list the problems with the current configuration as I see them.
This time, I implemented it with Seq2Seq + Attention + SentencePiece. The specific set of technologies is shown below.
Item | Contents
---|---
Learning algorithm | Seq2Seq (4-layer LSTM) + Global Attention
Tokenizer | SentencePiece
Pre-training | Word2Vec
Optimizer | Adam
Training data | Dialogue breakdown corpus + Meidai (Nagoya University) conversation corpus + Slack conversation logs
Library | Chainer
As pre-training, Word2Vec learns word vectors from the training corpus tokenized by SentencePiece. The learned word vectors are used as the initial values of the word embeddings of the Encoder and Decoder.
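As a sketch of how pretrained vectors can be packed into the `initialW` matrix that `L.EmbedID` accepts (the helper function and its names are hypothetical, not the article's actual code):

```python
import numpy as np

def build_initial_embeddings(vocab, vectors, embed_size, seed=0):
    """Build an initialW matrix for an embedding layer from pretrained vectors.

    vocab   : list of subword strings, where the index is the token ID
    vectors : dict mapping subword -> pretrained vector (e.g. from Word2Vec)
    Tokens without a pretrained vector get a small random initialization.
    """
    rng = np.random.default_rng(seed)
    w = np.empty((len(vocab), embed_size), dtype=np.float32)
    for i, token in enumerate(vocab):
        if token in vectors:
            w[i] = vectors[token]  # copy the pretrained vector
        else:
            w[i] = rng.normal(scale=0.1, size=embed_size)  # fallback init
    return w

# Toy usage: two of three tokens have pretrained vectors
vocab = ['▁hello', '▁world', '▁natsu']
vectors = {'▁hello': np.ones(4, dtype=np.float32),
           '▁world': np.full(4, 2.0, dtype=np.float32)}
w = build_initial_embeddings(vocab, vectors, embed_size=4)
```

The resulting matrix can then be passed as `initialW=w` when constructing the embedding layers.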
Below is the `__init__` part.
```python
def __init__(self, vocab_size, embed_size, hidden_size, eos, w=None, ignore_label=-1):
    super(Seq2Seq, self).__init__()
    self.unk = ignore_label
    self.eos = eos
    with self.init_scope():
        # Embedding layers (initialized from pretrained Word2Vec vectors via w)
        self.x_embed = L.EmbedID(vocab_size, embed_size, initialW=w, ignore_label=ignore_label)
        self.y_embed = L.EmbedID(vocab_size, embed_size, initialW=w, ignore_label=ignore_label)
        # 4-layer LSTM encoder / decoder
        self.encoder = L.NStepLSTM(n_layers=4, in_size=embed_size, out_size=hidden_size, dropout=0.1)
        self.decoder = L.NStepLSTM(n_layers=4, in_size=embed_size, out_size=hidden_size, dropout=0.1)
        # Attention layer
        self.attention = L.Linear(2 * hidden_size, hidden_size)
        # Output layer
        self.y = L.Linear(hidden_size, vocab_size)
```
Seq2Seq is a sequence-to-sequence model built on the Encoder-Decoder architecture. The original paper presented it for machine translation (an English-French translation task), but since it converts an input sequence into an output sequence, it can just as well be used for dialogue. That is the understanding under which it is also used in dialogue bots.
The training data is a set of input-output sentence pairs like the following.

Input sentence: It's really fun talking with you. Would you like to come up to the living room and talk?
Output sentence: I have things to do today, so I'll pass.
The responses obtained from the trained model are question-and-answer style and do not consider the flow of the conversation at all. Sometimes a conversation appears to continue, but that is pure coincidence; the model does not take context into account.
The following is the `__call__` part.
```python
def __call__(self, x, y):
    """
    :param x: mini-batch of input sequences
    :param y: mini-batch of output sequences corresponding to x
    :return: loss and accuracy
    """
    batch_size = len(x)
    eos = self.xp.array([self.eos], dtype='int32')
    # Prepend EOS to the decoder input, append it to the target
    y_in = [F.concat((eos, tmp), axis=0) for tmp in y]
    y_out = [F.concat((tmp, eos), axis=0) for tmp in y]
    # Embedding layers
    emb_x = sequence_embed(self.x_embed, x)
    emb_y = sequence_embed(self.y_embed, y_in)
    # Feed the Encoder and the Decoder
    h, c, a = self.encoder(None, None, emb_x)  # h => hidden, c => cell, a => outputs (for attention)
    _, _, dec_hs = self.decoder(h, c, emb_y)   # dec_hs => decoder outputs
    # Concatenate the decoder outputs over the batch
    dec_h = F.concat(dec_hs, axis=0)
    # Attention calculation
    attention = F.concat(a, axis=0)
    o = self.global_attention_layer(dec_h, attention)
    t = F.concat(y_out, axis=0)
    loss = F.softmax_cross_entropy(o, t)  # loss calculation
    accuracy = F.accuracy(o, t)           # accuracy calculation
    return loss, accuracy
```
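`sequence_embed` and `global_attention_layer` are helpers from the referenced implementation and are not shown above. As a rough NumPy illustration of what a Luong-style global attention layer computes for a single decoder state (the shapes and weight layout here are my assumptions, not the article's code):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_attention(dec_h, enc_hs, w):
    """Luong-style global attention for one decoder step, as a NumPy sketch.

    dec_h  : (hidden,)         one decoder hidden state
    enc_hs : (src_len, hidden) encoder outputs ('a' in the Chainer code)
    w      : (hidden, 2*hidden) weight of the attention Linear layer
    """
    scores = enc_hs @ dec_h                    # dot-product scores, shape (src_len,)
    alpha = softmax(scores)                    # attention weights over source positions
    context = alpha @ enc_hs                   # weighted sum of encoder outputs
    concat = np.concatenate([context, dec_h])  # shape (2*hidden,)
    return np.tanh(w @ concat)                 # attentional hidden state

# Toy check: with identical encoder states, the context equals each state
enc = np.ones((5, 4))
dec = np.ones(4)
w = np.eye(4, 8)
o = global_attention(dec, enc, w)
```

This mirrors why the attention layer above is declared as `L.Linear(2 * hidden_size, hidden_size)`: it mixes the concatenated context and decoder state back down to `hidden_size`.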
Beam search is used for inference, with a beam width of 3 and a maximum output length of 50 tokens.
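A minimal beam-search sketch under those settings (the `step` interface and the toy model are hypothetical; the real decoder would score continuations with the trained Seq2Seq model):

```python
import math

def beam_search(step, bos, eos, beam_width=3, max_len=50):
    """Minimal beam search: keep the beam_width best partial hypotheses
    by total log probability until they emit EOS or hit max_len.

    step(seq) -> list of (token_id, log_prob) continuations for a sequence.
    """
    beams = [([bos], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, logp in step(seq):
                new = (seq + [tok], score + logp)
                # Hypotheses that emit EOS are finished; others stay candidates
                (finished if tok == eos else candidates).append(new)
        if not candidates:
            break
        candidates.sort(key=lambda b: b[1], reverse=True)
        beams = candidates[:beam_width]
    finished.extend(beams)  # fall back to unfinished beams if nothing emitted EOS
    return max(finished, key=lambda b: b[1])[0]

# Toy model: prefers token 1, then strongly prefers EOS (=2) after 3 tokens
def toy_step(seq):
    if len(seq) >= 3:
        return [(2, math.log(0.9)), (1, math.log(0.1))]
    return [(1, math.log(0.6)), (0, math.log(0.4))]

best = beam_search(toy_step, bos=0, eos=2, beam_width=3, max_len=10)
```

With a beam width of 1 this degenerates to greedy decoding; widening the beam trades compute for better overall sequences.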
For these implementations, I referred to @nojima's blog below. It was a great help. Thank you! https://nojima.hatenablog.com/entry/2017/10/10/023147
Training was done on Google Colaboratory. https://colab.research.google.com/notebooks/welcome.ipynb?hl=ja
The server side is implemented with Flask + uWSGI. When you hook a server up to the Events API, Slack first sends a challenge request.
https://api.slack.com/events-api#subscriptions
Specifically, the following POST request is sent to the endpoint. If you return the value of this `challenge` field as-is, the challenge succeeds, and Slack will then deliver events to the endpoint according to your Slack API settings.
Note that if the challenge fails, Slack will not POST events to the API endpoint.
```json
{
    "token": "Slack API token",
    "challenge": "3eZbrw1aBm2rZgRNFdxV2595E9CY3gmdALWMmHkvFXO7tYXAYM8P",
    "type": "url_verification"
}
```
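A minimal Flask sketch of such an endpoint (the route name and handler are my assumptions, this is not the article's actual code, and request-signature verification is omitted):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route('/slack/events', methods=['POST'])
def slack_events():
    payload = request.get_json()
    # Slack's one-time URL verification: echo the challenge value back
    if payload.get('type') == 'url_verification':
        return jsonify({'challenge': payload['challenge']})
    # Route mention events to a hypothetical reply function
    event = payload.get('event', {})
    if event.get('type') == 'app_mention':
        reply_to_mention(event)
    return '', 200

def reply_to_mention(event):
    # Placeholder: run Seq2Seq inference on event['text'] and post the reply
    pass
```

Once the challenge succeeds, the same endpoint receives the subscribed events as ordinary POSTs.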
Event permissions can be configured under "Subscribe to bot events" in "Event Subscriptions". I added `app_mention` as an event. This event POSTs the event data to the configured endpoint whenever the bot is mentioned in a channel it has been invited to.
Below "Subscribe to bot events" there is also "Subscribe to workspace events", which subscribes to events for the entire workspace. Be careful not to pick the wrong one, or you will have a hard time.
The issues on the API side are as follows.
I was glad that the bot was received favorably within the times channels. For some reason, people (myself included) interact with the bot more than once a week.
Sometimes I hear complaints, sometimes it receives injection attacks, and sometimes it gets hit with "strong" words. Now that I think about the engineers behind chatbots, I can no longer be harsh with them. I'm sorry, Siri. I'm sorry, Cortana. I'm sorry, Kyle.
It was a year in which I realized that there are many things you only come to understand once you have actually built something.
:innocent: