Learn Zundokokiyoshi with LSTM

Overview

I trained an LSTM to learn Zundokokiyoshi, implemented with Chainer. The code I posted a while ago was wrong, so this is a corrected repost.
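
For readers who have not seen the task before, the rule the network is asked to learn can be sketched in plain Python. This is only a reference sketch, not part of the original post: the names `zundoko_reference`, `KIYOSHI`, and `max_steps` are made up for illustration, and the 0/1 token encoding is assumed to match `zun` and `doko` in zundoko.py.

```python
import random

ZUN, DOKO = 0, 1                      # assumed to match zun / doko in zundoko.py
KIYOSHI = [ZUN, ZUN, ZUN, ZUN, DOKO]  # the sequence that triggers "Kiyoshi"

def zundoko_reference(rng, max_steps=10000):
    """Emit random tokens until the last five equal KIYOSHI; return all of them."""
    history = []
    for _ in range(max_steps):
        history.append(rng.randint(0, 1))    # ZUN or DOKO with equal probability
        if history[-len(KIYOSHI):] == KIYOSHI:
            return history                   # "Kiyoshi!" would be shouted here
    return history                           # gave up (extremely unlikely)

tokens = zundoko_reference(random.Random(0))
print(' '.join('zun' if t == ZUN else 'doko' for t in tokens) + ' -> Kiyoshi!')
```

The LSTM version has no explicit history list; it has to carry the same information in its recurrent state.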

For a detailed explanation of LSTM, see the following post.

Understanding LSTM (with recent trends)

Model

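Build a model like the following: the input token passes through an embedding layer (EmbedID), an LSTM layer, a fully connected layer with ReLU, and a final fully connected layer with two outputs (None or Kiyoshi).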
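
To make the role of the LSTM layer concrete, here is a minimal sketch of one step of a textbook LSTM cell with a scalar input and a scalar state. It is a generic formulation written for illustration, not Chainer's implementation; the function name `lstm_step`, the gate names, and the hand-picked weights are all hypothetical.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def lstm_step(x, h_prev, c_prev, w):
    """One step of a textbook LSTM cell with scalar input and scalar state.

    w maps each gate name to a hypothetical (w_x, w_h, bias) triple.
    """
    def pre(name):
        w_x, w_h, b = w[name]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre('input'))     # how much new information to write
    f = sigmoid(pre('forget'))    # how much of the old cell state to keep
    o = sigmoid(pre('output'))    # how much of the cell state to expose
    g = math.tanh(pre('cell'))    # candidate value to write
    c = f * c_prev + i * g        # updated cell state (the long-term memory)
    h = o * math.tanh(c)          # updated hidden state (the layer's output)
    return h, c

# Hypothetical weights, chosen by hand only to show the state carrying over.
weights = {
    'input': (1.0, 0.0, 0.0),
    'forget': (0.0, 0.0, 2.0),
    'output': (0.0, 0.0, 2.0),
    'cell': (1.0, 0.0, 0.0),
}
h, c = 0.0, 0.0
for x in [1.0, 0.0, 0.0]:
    h, c = lstm_step(x, h, c, weights)
    print('x={} h={:.3f} c={:.3f}'.format(x, h, c))
```

With these weights the cell state stays positive after the input returns to zero, which is the kind of memory the model needs in order to count consecutive tokens.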

Code

zundoko.py


#!/usr/bin/env python
# -*- coding: utf-8 -*-

import numpy as np
import chainer
from chainer import Variable, optimizers, functions as F, links as L

np.random.seed()
zun = 0
doko = 1
input_num = 2
input_words = ['Dung', 'Doco']
none = 0
kiyoshi = 1
output_num = 2
output_words = [None, '\\ Ki yo shi! /']
hidden_num = 8
update_iteration = 20

class Zundoko(chainer.Chain):
    def __init__(self):
        super(Zundoko, self).__init__(
            word=L.EmbedID(input_num, hidden_num),
            lstm=L.LSTM(hidden_num, hidden_num),
            linear=L.Linear(hidden_num, hidden_num),
            out=L.Linear(hidden_num, output_num),
        )

    def __call__(self, x, train=True):
        h1 = self.word(x)
        h2 = self.lstm(h1)
        h3 = F.relu(self.linear(h2))
        return self.out(h3)

    def reset_state(self):
        self.lstm.reset_state()

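# Encode the target sequence as a bit pattern (zun = 0, doko = 1), newest token in the lowest bit.
kiyoshi_list = [zun, zun, zun, zun, doko]
kiyoshi_pattern = 0
kiyoshi_mask = (1 << len(kiyoshi_list)) - 1  # keeps only the last five tokens
for token in kiyoshi_list:
    kiyoshi_pattern = (kiyoshi_pattern << 1) | token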

zundoko = Zundoko()
for param in zundoko.params():
    data = param.data
    data[:] = np.random.uniform(-1, 1, data.shape)
optimizer = optimizers.Adam(alpha=0.01)
optimizer.setup(zundoko)

def forward(train=True):
    loss = 0
    acc = 0
    if train:
        batch_size = 20
    else:
        batch_size = 1
    recent_pattern = np.zeros((batch_size,), dtype=np.int32)
    zundoko.reset_state()
    for i in range(200):
        x = np.random.randint(0, input_num, batch_size).astype(np.int32)
        y_var = zundoko(Variable(x, volatile=not train), train=train)
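        # Shift the newest token into the low bit and keep only the last five.
        recent_pattern = ((recent_pattern << 1) | x) & kiyoshi_mask
        if i < len(kiyoshi_list):
            # Warm-up: the zero-filled history would read as leading zun tokens, so label none.
            t = np.full((batch_size,), none, dtype=np.int32)
        else:
            t = np.where(recent_pattern == kiyoshi_pattern, kiyoshi, none).astype(np.int32)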
        loss += F.softmax_cross_entropy(y_var, Variable(t, volatile=not train))
        acc += float(F.accuracy(y_var, Variable(t, volatile=not train)).data)
        if not train:
            print(input_words[x[0]])
            y = np.argmax(y_var.data[0])
            if output_words[y] is not None:
                print(output_words[y])
                break
        if train and (i + 1) % update_iteration == 0:
            # Truncated BPTT: update every update_iteration steps, then cut the graph.
            optimizer.zero_grads()
            loss.backward()
            loss.unchain_backward()
            optimizer.update()
            print('train loss: {} accuracy: {}'.format(loss.data, acc / update_iteration))
            loss = 0
            acc = 0

for iteration in range(20):
    forward()

forward(train=False)
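
The teacher labels in forward() come from a rolling bit pattern rather than a list comparison. The following standalone sketch reproduces that labeling without NumPy or Chainer so it can be traced by hand. The helper name `label_stream` and the sample token list are made up for illustration; the pattern, mask, and warm-up guard follow zundoko.py.

```python
zun, doko = 0, 1
kiyoshi_list = [zun, zun, zun, zun, doko]

# Pack the target sequence into an integer, most recent token in the lowest bit.
kiyoshi_pattern = 0
for token in kiyoshi_list:
    kiyoshi_pattern = (kiyoshi_pattern << 1) | token
kiyoshi_mask = (1 << len(kiyoshi_list)) - 1   # keeps only the last five tokens

def label_stream(tokens):
    """Return 1 at each position where the last five tokens match, else 0."""
    recent = 0
    labels = []
    for i, x in enumerate(tokens):
        recent = ((recent << 1) | x) & kiyoshi_mask
        # Same warm-up guard as forward(): the first five steps are always labeled 0.
        warm_up = i < len(kiyoshi_list)
        labels.append(1 if (not warm_up and recent == kiyoshi_pattern) else 0)
    return labels

tokens = [doko, zun, zun, zun, zun, zun, doko, doko]
print(label_stream(tokens))   # -> [0, 0, 0, 0, 0, 0, 1, 0]
```

Without the warm-up guard, the very first doko would be labeled as Kiyoshi, because the zero-initialized history is indistinguishable from four preceding zun tokens.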

Output example

train loss: 18.4753189087 accuracy: 0.020000000298
train loss: 16.216506958 accuracy: 0.0325000006706
train loss: 15.0742883682 accuracy: 0.0350000008941
train loss: 13.9205350876 accuracy: 0.385000001639
train loss: 12.5977449417 accuracy: 0.96249999404
(omitted)
train loss: 0.00433994689956 accuracy: 1.0
train loss: 0.00596862798557 accuracy: 1.0
train loss: 0.0027643663343 accuracy: 1.0
train loss: 0.011038181372 accuracy: 1.0
train loss: 0.00512072304264 accuracy: 1.0
Dung
Dung
Dung
Doco
Doco
Doco
Dung
Dung
Dung
Doco
Doco
Doco
Doco
Doco
Doco
Dung
Doco
Doco
Dung
Doco
Doco
Dung
Doco
Dung
Dung
Dung
Dung
Dung
Dung
Dung
Dung
Doco
\ Ki yo shi! /
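
Each "train loss" line in the log corresponds to one parameter update. The short calculation below, using the constants from zundoko.py, shows how many such lines a full run prints. The variable names `steps_per_call`, `training_calls`, and `update_steps` are introduced here for illustration only.

```python
steps_per_call = 200      # range(200) in forward()
update_iteration = 20     # steps accumulated before each parameter update
training_calls = 20       # range(20) in the training loop

# Steps (0-based) after which forward() runs backward() and update().
update_steps = [i for i in range(steps_per_call)
                if (i + 1) % update_iteration == 0]

updates_per_call = len(update_steps)
total_updates = updates_per_call * training_calls
print(update_steps)
print('{} updates per call, {} in total'.format(updates_per_call, total_updates))
```

Because loss.unchain_backward() is called at each of these steps, gradients never flow back further than 20 time steps, while the LSTM state itself is carried across the whole 200-step sequence.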

A small pitfall

At first I used dropout, but training did not work and the output was almost always None.
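
When reading "almost always None", it helps to remember that None is by far the majority label. The sketch below enumerates all five-token windows to estimate how often Kiyoshi is the correct answer for uniformly random input, ignoring the warm-up steps. It is only an observation about the label distribution, not an explanation of why dropout failed; the variable names are chosen for illustration.

```python
from itertools import product

zun, doko = 0, 1
kiyoshi_list = [zun, zun, zun, zun, doko]

# Enumerate every possible window of the last five tokens.
windows = list(product([zun, doko], repeat=len(kiyoshi_list)))
matches = sum(1 for w in windows if list(w) == kiyoshi_list)

p_kiyoshi = matches / float(len(windows))   # chance that a given step is labeled kiyoshi
always_none_accuracy = 1.0 - p_kiyoshi      # accuracy of a model that never says kiyoshi
print('P(kiyoshi) = {:.5f}'.format(p_kiyoshi))
print('always-None accuracy = {:.5f}'.format(always_none_accuracy))
```

A model that never outputs Kiyoshi therefore already scores close to 97% accuracy, so accuracy alone does not show whether the pattern has been learned; the loss and the inference run are more informative.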
