The other day I gave a talk at the TensorFlow Study Group. People around me asked whether I was going to write something about the session, but I doubt that a report on the headcount and the presentations would make anyone happy, so instead I will use up the shelved material I had originally planned to present.

Goal

What I want to do is very simple, as the title suggests: you enter a Pokemon name and get back plausible-looking base stats (race values) and types. I'm aiming for the kind of thing you often see in Twitter's diagnosis makers.

Model design

Input details

I split each Pokemon name into characters and used the number of occurrences of each single character and each 2-gram as features. For example, Dedenne (デデンネ in katakana, i.e. the four characters De, De, N, Ne) becomes:

{
De: 2, N: 1, Ne: 1,
Dede: 1, Den: 1, Nne: 1
}
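As a rough illustration of the counting described above, the following Python 3 sketch builds the same kind of feature dictionary by hand. The helper name `char_ngram_counts` is made up for this example, and the input is assumed to be the katakana spelling デデンネ of Dedenne, since the training code vectorizes the Japanese names.

```python
from collections import Counter


def char_ngram_counts(name, n_max=2):
    """Count every character n-gram of length 1..n_max in a name."""
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(name) - n + 1):
            counts[name[i:i + n]] += 1
    return dict(counts)


features = char_ngram_counts("デデンネ")
print(features)
# {'デ': 2, 'ン': 1, 'ネ': 1, 'デデ': 1, 'デン': 1, 'ンネ': 1}
```

This is only meant to show what the features are; the actual pipeline leaves the counting to scikit-learn.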

Building n-gram features by hand is a hassle, but with scikit-learn's Vectorizer you can create them in a few lines while still controlling the detailed settings. The fitted vectorizer, which holds all the necessary information, is also easy to save, so I recommend it in the strongest possible terms. Unless you need to accumulate virtue for the afterlife or have some other special circumstances, you should definitely use it.
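A minimal sketch of that workflow, assuming a reasonably recent scikit-learn together with the standalone `joblib` package (the listing in this post imports joblib from `sklearn.externals`, which newer scikit-learn releases no longer provide). The two names and the temporary path are toy stand-ins, not the real data set.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import CountVectorizer

names = ["デデンネ", "ピカチュウ"]  # toy stand-in for the real name column

# Character 1-grams and 2-grams, the same settings as in the training script.
vectorizer = CountVectorizer(analyzer="char", ngram_range=(1, 2))
x = vectorizer.fit_transform(names).toarray()

# Save the fitted vectorizer and load it back, as you would at prediction time.
path = os.path.join(tempfile.mkdtemp(), "pokemon-name-vectorizer")
joblib.dump(vectorizer, path)
restored = joblib.load(path)

# The restored vectorizer maps a new name into the same feature space;
# n-grams it never saw during fitting are simply ignored.
v = restored.transform(["デデデ"]).toarray()
print(x.shape, v.shape)
```

The point of saving the vectorizer is that the column order of the features is fixed at fit time, so the prediction script has to reuse exactly the same object.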

Output details

The output is a bit more involved and falls roughly into the following three parts.

Base stats (race values): six continuous values for HP, Attack, Defense, Special Attack, Special Defense, and Speed (regression problem)
Type 1: 18 categories (classification problem)
Type 2: 19 categories, including "None" (classification problem)

You could build a separate model for each and predict them independently, but I personally think the ability to bundle everything together is the strength of neural networks, so I put them all into a single network. In other words, the structure is as follows.

The last layer has 6 + 18 + 19 units. The part corresponding to the base stats is output as-is, and its loss is the squared error. The outputs corresponding to the 18 and 19 type categories are each passed through a softmax, and their losses are the respective cross entropies. The final objective function is a weighted sum of these losses.
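To make the objective concrete, here is a small NumPy sketch of how such a 43-unit output could be split into three heads and scored. It is an illustration, not the training code: the function names are made up for this example, the squared error is written as half the sum of squares (the quantity `tf.nn.l2_loss` computes), and the weight `1e-4` on the base-stat loss is the one used in the listing further down.

```python
import numpy as np

N_BS, N_TYPE1, N_TYPE2 = 6, 18, 19


def softmax_cross_entropy(logits, onehot):
    """Mean softmax cross entropy over the batch, computed stably from logits."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(onehot * log_probs).sum(axis=1).mean())


def combined_loss(out, t_bs, t_type1, t_type2, w_bs=1e-4):
    """Split a (batch, 6 + 18 + 19) output into three heads and add up their losses."""
    y_bs = out[:, :N_BS]                    # base stats, used as-is (regression)
    y_type1 = out[:, N_BS:N_BS + N_TYPE1]   # logits for type 1
    y_type2 = out[:, N_BS + N_TYPE1:]       # logits for type 2
    loss_bs = float(0.5 * ((t_bs - y_bs) ** 2).sum())
    loss_type1 = softmax_cross_entropy(y_type1, t_type1)
    loss_type2 = softmax_cross_entropy(y_type2, t_type2)
    return w_bs * loss_bs + loss_type1 + loss_type2


# Toy batch of two examples with an all-zero network output.
out = np.zeros((2, N_BS + N_TYPE1 + N_TYPE2))
t_bs = np.full((2, N_BS), 10.0)
t_type1 = np.eye(N_TYPE1)[[0, 3]]
t_type2 = np.eye(N_TYPE2)[[18, 5]]
total = combined_loss(out, t_bs, t_type1, t_type2)
print(total)
```

The small weight keeps the base-stat term, which lives on a much larger numeric scale than the cross entropies, from dominating the sum.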

Data collection

~~It is not secret data, so you can gather it yourself if you try hard enough. Good luck.~~ I uploaded just the data I used to GitHub, together with the code.

The finished code

# -*- coding: utf-8 -*-

import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.externals import joblib
from sklearn.feature_extraction import DictVectorizer


def inference(x_placeholder, n_in, n_hidden1, n_hidden2):
    """
    Description
    -----------
    Forward step which build graph.

    Parameters
    ----------
    x_placeholder: Placeholder for feature vectors
    n_in: Number of units in input layer which is dimension of feature
    n_hidden1: Number of units in hidden layer 1
    n_hidden2: Number of units in hidden layer 2

    Returns
    -------
    y_bs: Output tensor of predicted values for base stats
    y_type1: Output tensor of predicted values for type 1
    y_type2: Output tensor of predicted values for type 2
    """
    # Hidden1
    with tf.name_scope('hidden1') as scope:
        weights = tf.Variable(
            tf.truncated_normal([n_in, n_hidden1]),
            name='weights'
        )
        biases = tf.Variable(tf.zeros([n_hidden1]))
        hidden1 = tf.nn.sigmoid(tf.matmul(x_placeholder, weights) + biases)

    # Hidden2
    with tf.name_scope('hidden2') as scope:
        weights = tf.Variable(
            tf.truncated_normal([n_hidden1, n_hidden2]),
            name='weights'
        )
        biases = tf.Variable(tf.zeros([n_hidden2]))
        hidden2 = tf.nn.sigmoid(tf.matmul(hidden1, weights) + biases)

    # Output layer for base stats
    with tf.name_scope('output_base_stats') as scope:
        weights = tf.Variable(
            tf.truncated_normal([n_hidden2, 6]),
            name='weights'
        )
        biases = tf.Variable(tf.zeros([6]))
        y_bs = tf.matmul(hidden2, weights) + biases

    # Output layer for type1
    with tf.name_scope('output_type1') as scope:
        weights = tf.Variable(
            tf.truncated_normal([n_hidden2, 18]),
            name='weights'
        )
        biases = tf.Variable(tf.zeros([18]))
        # y_type1 = tf.nn.softmax(tf.matmul(hidden2, weights) + biases)
        y_type1 = tf.matmul(hidden2, weights) + biases

    # Output layer for type2
    with tf.name_scope('output_type2') as scope:
        weights = tf.Variable(
            tf.truncated_normal([n_hidden2, 19]),
            name='weights'
        )
        biases = tf.Variable(tf.zeros([19]))
        y_type2 = tf.matmul(hidden2, weights) + biases
        # y_type2 = tf.nn.softmax(tf.matmul(hidden2, weights) + biases)

    return [y_bs, y_type1, y_type2]


def build_loss_bs(y_bs, t_ph_bs):
    """
    Parameters
    ----------
    y_bs: Output tensor of predicted values for base stats
    t_ph_bs: Placeholder for base stats

    Returns
    -------
    Loss tensor which includes placeholder of features and labels
    """
    loss_bs = tf.reduce_mean(tf.nn.l2_loss(t_ph_bs - y_bs), name='LossBaseStats')
    return loss_bs


def build_loss_type1(y_type1, t_ph_type1):
    """
    Parameters
    ----------
    y_type1: Output tensor of predicted logits for type 1
    t_ph_type1: Placeholder for one-hot labels of type 1

    Returns
    -------
    Loss tensor which includes placeholder of features and labels
    """
    loss_type1 = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(y_type1, t_ph_type1),
        name='LossType1'
    )
    return loss_type1


def build_loss_type2(y_type2, t_ph_type2):
    """
    Parameters
    ----------
    y_type2: Output tensor of predicted logits for type 2
    t_ph_type2: Placeholder for one-hot labels of type 2

    Returns
    -------
    Loss tensor which includes placeholder of features and labels
    """
    loss_type2 = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(y_type2, t_ph_type2),
        name='LossType2'
    )
    return loss_type2


def build_optimizer(loss, step_size):
    """
    Parameters
    ----------
    loss: Tensor of objective value to be minimized
    step_size: Step size for gradient descent

    Returns
    -------
    Operation of optimization
    """
    optimizer = tf.train.GradientDescentOptimizer(step_size)
    global_step = tf.Variable(0, name='global_step', trainable=False)
    train_op = optimizer.minimize(loss, global_step=global_step)
    return train_op


if __name__ == '__main__':
    # Set seed
    tf.set_random_seed(0)

    # Load data set and extract features
    df = pd.read_csv('data/poke_selected.csv')

    # Fill nulls in type2
    df.loc[df.type2.isnull(), 'type2'] = 'Nothing'

    # Vectorize pokemon name
    pokename_vectorizer = CountVectorizer(analyzer='char', min_df=1, ngram_range=(1, 2))
    x = pokename_vectorizer.fit_transform(list(df['name_jp'])).toarray()
    t_bs = np.array(df[['hp', 'attack', 'block', 'contact', 'defense', 'speed']])

    # Vectorize pokemon type1
    poketype1_vectorizer = DictVectorizer(sparse=False)
    d = df[['type1']].to_dict('record')
    t_type1 = poketype1_vectorizer.fit_transform(d)

    # Vectorize pokemon type2
    poketype2_vectorizer = DictVectorizer(sparse=False)
    d = df[['type2']].to_dict('record')
    t_type2 = poketype2_vectorizer.fit_transform(d)

    # Placeholders
    x_ph = tf.placeholder(dtype=tf.float32)
    t_ph_bs = tf.placeholder(dtype=tf.float32)
    t_ph_type1 = tf.placeholder(dtype=tf.float32)
    t_ph_type2 = tf.placeholder(dtype=tf.float32)

    # build graph, loss, and optimizer
    y_bs, y_type1, y_type2 = inference(x_ph, n_in=1403, n_hidden1=512, n_hidden2=256)
    loss_bs = build_loss_bs(y_bs, t_ph_bs)
    loss_type1 = build_loss_type1(y_type1, t_ph_type1)
    loss_type2 = build_loss_type2(y_type2, t_ph_type2)
    loss = tf.add_n([1e-4 * loss_bs, loss_type1, loss_type2], name='ObjectiveFunction')
    optim = build_optimizer(loss, 1e-1)

    # Create session
    sess = tf.Session()

    # Initialize variables
    init = tf.initialize_all_variables()
    sess.run(init)

    # Create summary writer and saver
    summary_writer = tf.train.SummaryWriter('log', graph_def=sess.graph_def)
    tf.scalar_summary(loss.op.name, loss)
    tf.scalar_summary(loss_bs.op.name, loss_bs)
    tf.scalar_summary(loss_type1.op.name, loss_type1)
    tf.scalar_summary(loss_type2.op.name, loss_type2)
    summary_op = tf.merge_all_summaries()
    saver = tf.train.Saver()

    # Run optimization
    for i in range(1500):
        # Choose indices for mini batch update
        ind = np.random.choice(802, 802)
        batch_xs = x[ind]
        batch_ts_bs = t_bs[ind]
        batch_ts_type1 = t_type1[ind]
        batch_ts_type2 = t_type2[ind]
        # Create feed dict
        fd = {
            x_ph: batch_xs,
            t_ph_bs: batch_ts_bs,
            t_ph_type1: batch_ts_type1,
            t_ph_type2: batch_ts_type2
        }
        # Run optimizer and update variables
        sess.run(optim, feed_dict=fd)
        # Show information and write summary in every n steps
        if i % 100 == 99:
            # Show num of epoch
            print 'Epoch:', i + 1, 'Mini-Batch Loss:', sess.run(loss, feed_dict=fd)
            # Write summary and save checkpoint
            summary_str = sess.run(summary_op, feed_dict=fd)
            summary_writer.add_summary(summary_str, i)
            name_model_file = 'model_lmd1e-4_epoch_' + str(i+1) + '.ckpt'
            save_path = saver.save(sess, 'model/tensorflow/'+name_model_file)
    else:
        name_model_file = 'model_lmd1e-4_epoch_' + str(i+1) + '.ckpt'
        save_path = saver.save(sess, 'model/tensorflow/'+name_model_file)

    # Show example
    poke_name = 'Thunder'
    v = pokename_vectorizer.transform([poke_name]).toarray()
    pred_bs = sess.run(y_bs, feed_dict={x_ph: v})
    pred_type1 = np.argmax(sess.run(y_type1, feed_dict={x_ph: v}))
    pred_type2 = np.argmax(sess.run(y_type2, feed_dict={x_ph: v}))
    print poke_name
    print pred_bs
    print pred_type1, pred_type2
    print poketype1_vectorizer.get_feature_names()[pred_type1]
    print poketype2_vectorizer.get_feature_names()[pred_type2]

    # Save vectorizer of scikit-learn
    joblib.dump(pokename_vectorizer, 'model/sklearn/pokemon-name-vectorizer')
    joblib.dump(poketype1_vectorizer, 'model/sklearn/pokemon-type1-vectorizer')
    joblib.dump(poketype2_vectorizer, 'model/sklearn/pokemon-type2-vectorizer')

Training

This is joke material, so tuning it seriously would be pointless. I only did the bare minimum: watching TensorBoard to check that the type losses and the base-stat loss were decreasing in a balanced way.

Playing with it

Let's load the model and play with it like this.

# -*- coding: utf-8 -*-

import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.externals import joblib
import pn2bs


# Placeholder
x_ph = tf.placeholder(dtype=tf.float32)
t_ph = tf.placeholder(dtype=tf.float32)

y_bs, y_type1, y_type2 = pn2bs.inference(x_ph, n_in=1403, n_hidden1=512, n_hidden2=256)

# Create session
sess = tf.Session()

# Load TensorFlow model
saver = tf.train.Saver()
saver.restore(sess, "model/tensorflow/model_lmd1e-4_epoch_1500.ckpt")

# Load vectorizer of scikit-learn
pokename_vectorizer = joblib.load("model/sklearn/pokemon-name-vectorizer")
poketype1_vectorizer = joblib.load("model/sklearn/pokemon-type1-vectorizer")
poketype2_vectorizer = joblib.load("model/sklearn/pokemon-type2-vectorizer")

poke_name = 'Gonzales'
v = pokename_vectorizer.transform([poke_name]).toarray()
pred_bs = sess.run(y_bs, feed_dict={x_ph: v})
pred_type1 = np.argmax(sess.run(y_type1, feed_dict={x_ph: v}))
pred_type2 = np.argmax(sess.run(y_type2, feed_dict={x_ph: v}))
result = {
    'name'   : poke_name,
    'hp'     : pred_bs[0][0],
    'attack' : pred_bs[0][1],
    'block'  : pred_bs[0][2],
    'contact': pred_bs[0][3],
    'defense': pred_bs[0][4],
    'speed'  : pred_bs[0][5],
    'type1'  : poketype1_vectorizer.get_feature_names()[pred_type1],
    'type2'  : poketype2_vectorizer.get_feature_names()[pred_type2],
}
print result['name']
print result['hp']
print result['attack']
print result['block']
print result['contact']
print result['defense']
print result['speed']
print result['type1']
print result['type2']

Here are some example results. I have nothing in particular to say about them, so please make of each what you will.

Name	H	A	B	C	D	S	Type 1	Type 2
Tensor flow	92	102	84	85	65	73	Fairy	Dark
Mega Pikachu	74	80	50	97	85	80	Electric	Nothing
Gonzales	81	103	107	86	103	65	Dragon	Electric
Mega Gonzales	100	137	131	118	117	103	Dragon	Dark

Future work (not that I'm saying I will do it)

With the current setup, nonsensical Pokemon such as type 1: Water, type 2: Water can be produced.
When counting 2-grams, it may be better to treat BOS and EOS as characters as well.
I forgot that Dropout exists.
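The first two items could be approached along the following lines. This is only a sketch of one possible fix, not something the model in this post does: the marker characters `^` and `$`, the helper names, and the fallback label `"Nothing"` (the fill value the training script uses for a missing type 2) are all choices made for this example.

```python
from collections import Counter

import numpy as np

BOS, EOS = "^", "$"  # hypothetical markers; any symbols absent from names would do


def padded_bigram_counts(name):
    """Count 2-grams after wrapping the name in BOS/EOS markers."""
    padded = BOS + name + EOS
    return dict(Counter(padded[i:i + 2] for i in range(len(padded) - 1)))


def pick_types(logits1, logits2, names1, names2):
    """Pick type 1 by argmax, then the best type 2 whose name differs from type 1."""
    type1 = names1[int(np.argmax(logits1))]
    for j in np.argsort(logits2)[::-1]:  # type 2 candidates, best first
        if names2[int(j)] != type1:
            return type1, names2[int(j)]
    return type1, "Nothing"


print(padded_bigram_counts("デデ"))
print(pick_types(np.array([0.1, 2.0]), np.array([0.0, 3.0, 1.0]),
                 ["Fire", "Water"], ["Fire", "Water", "Nothing"]))
```

With the padding, the 2-grams also record which characters start and end a name. The same effect could presumably be had by padding the names before handing them to the vectorizer.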

Summary

Neural networks throw convexity and the other nice properties of the objective function out the window with full force, and in exchange I think their modeling flexibility is impressive. For example, if you try to do the same thing with an SVM, you first get stuck on how to produce a multidimensional output such as base stats, and if you were then told to solve the type classification problems jointly as well, you would want to answer "save your sleep-talking for when you're asleep." That said, whether I would send a neural network in at the bottom of the ninth with two outs is, well, hmm. Something like that.

Postscript

2016-01-26

I felt a little bad that, although people still stock this post from time to time, I had never released it in a form you can actually try, so I've posted something that works [on GitHub](https://github.com/sfujiwara/pn2bs). I thought it would be cool to draw a radar chart with Google Charts or the like for the entered name, but I wanted bots to be able to call it and play with it, so I made it a Web API.

This will probably become obvious once you use it, so let me confess up front: it would be a letdown if entering an existing Pokemon returned stats completely different from the real ones, so I deliberately chose a model that overfits.

Play to predict race value and type from Pokemon name in TensorFlow