Introduction

This article is a sequel to "Tic-tac-toe AI with Pylearn 2". The previous article covers the MLP model used here and the format of the game records, so I recommend reading it before this one.

I closed the previous article with the following:

"In the future, based on the results of a single training run, I will try to build this into a tic-tac-toe game so that I can easily obtain the next move for any input. First of all, I have to write a tic-tac-toe game program."

So in this article I will do just that. To do so, I needed to save and load the trained model. I'll show how to do that and then actually build a tic-tac-toe game with wxPython. The necessary files, including the source code, are posted on GitHub, so please get them from there.

How to save and load a model in Pylearn 2

Once I knew how, it was easy.

from pylearn2.utils import serial
...
...
ann = mlp.MLP([h0,out], nvis=9)
path = "./hoge.pkl"

# save model
serial.save(path, ann, on_overwrite='backup')

# load model
ann = serial.load(path)

You can set on_overwrite to 'ignore' or 'backup'.

ignore: Overwrite the existing file without making a backup.
backup: Back up the existing file as .bak before saving, and delete the .bak file if the save succeeds. You can recover if the save fails.

Unless you have a specific reason not to, I think 'backup' is fine.
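
To make the 'backup' behavior described above concrete, here is a minimal sketch that uses only Python's standard pickle module. It is not Pylearn 2's actual implementation, only an illustration of the idea. The names save_with_backup and load_pickle are hypothetical, and the code is written for Python 3 rather than the Python 2.7 used in this article.

```python
import os
import pickle
import tempfile


def save_with_backup(path, obj):
    """Pickle obj to path, keeping the old file as path + '.bak' until the write succeeds."""
    backup = path + ".bak"
    had_old = os.path.exists(path)
    if had_old:
        os.replace(path, backup)  # move the previous file out of the way
    with open(path, "wb") as f:
        pickle.dump(obj, f)  # if this raises, the .bak file is left for recovery
    if had_old:
        os.remove(backup)  # the write succeeded, so the backup is no longer needed


def load_pickle(path):
    """Load a pickled object from path."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Usage: save twice to the same (temporary) path, then load the latest version.
demo_path = os.path.join(tempfile.mkdtemp(), "hoge.pkl")
save_with_backup(demo_path, {"weights": [0.1, 0.2]})
save_with_backup(demo_path, {"weights": [0.3, 0.4]})
restored = load_pickle(demo_path)
```

If the second save had failed partway, the previous model would still be available as hoge.pkl.bak, which is the reason to prefer 'backup' over 'ignore'.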

Tic-tac-toe with wxPython

Tested environment

Linux Mint 17.3 (Python 2.7.6 + wxPython 3.0.3)

I think it will also work with wxPython 2.8. For wxPython installation instructions, see http://qiita.com/kanlkan/items/5e6f2e63de406f46b3b1#wxpython%E3%81%AE%E3%82%A4%E3%83%B3%E3%82%B9%E3%83%88%E3%83%BC%E3%83%AB and similar guides. Installing Pylearn 2 on Windows is difficult, so run it on Linux. As for Mac ... I'm sorry, I don't know.

How to use

Set the model parameters and save the trained model

Set the model parameters in the right-hand panel.

h0 : Sigmoid
    irange: Initial weight range. Weights are set randomly within ±irange.
    init bias: Initial bias value.
out : Softmax
    irange: Initial weight range. Weights are set randomly within ±irange.
open .csv
    The game record file to read. It is read from the same folder as tic_tac_toe.py.
save .pkl
    The file name of the model to save. It is created in the same folder as tic_tac_toe.py.
term criterion
    termination_criterion. Set the EpochCounter value at which training ends.
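
As a rough illustration of what irange and init bias mean, the sketch below draws every weight uniformly from the range ±irange and sets every bias to the same starting value. It uses only the standard library and is not how Pylearn 2 is implemented internally. The function name init_layer, the hidden layer size of 20, and the values 0.05 and 1.0 are assumptions chosen for the example; only the 9 inputs come from the model above (nvis=9).

```python
import random


def init_layer(n_in, n_out, irange, init_bias=0.0, seed=None):
    """Return (weights, biases); each weight is drawn uniformly from [-irange, irange]."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-irange, irange) for _ in range(n_out)]
               for _ in range(n_in)]
    biases = [init_bias] * n_out
    return weights, biases


# 9 inputs (one per square); 20 hidden units is an arbitrary illustrative size.
h0_weights, h0_biases = init_layer(n_in=9, n_out=20, irange=0.05,
                                   init_bias=1.0, seed=0)
```

A larger irange spreads the starting weights more widely, so it is one of the parameters worth changing when you compare how different models play.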

Select Game Mode

Man vs Comp: The human moves first, the computer second
Comp vs Man: The computer moves first, the human second

Specify the model file to load and press Game Start

Left-click to place your 〇 or ×. The computer places its mark automatically based on what the model has learned.

Once you save a model, you can simply specify it and play games with that model.

Things to try

The behavior changes depending on which game record file you load: games the first player won (records_1st_win.csv), games the first player lost (records_1st_lose.csv), or all games combined (records.csv). For example:

With a model trained on games the first player won, if you make the computer play second ...
If you set termination_criterion to a very small value ...
If you train on game records from games you played as well as you could ...

Try changing parameters like these and enjoy watching how the AI behaves. I also recommend playing while watching the confidence level (as I call it) of each move, which the computer prints to the console as shown below.

[ 0.1223754   0.07839377  0.1005455   0.09967972  0.0958171   0.05355689
  0.13877278  0.08772236  0.22313648]
[  1.69255291e-01   1.79474672e-01   6.59611187e-02   8.35728072e-02
   1.76704145e-01   5.69182580e-05   1.74977445e-01   1.48576416e-01
   1.42118607e-03]
[  3.94020768e-02   3.56583963e-03   9.39233627e-05   1.20089713e-01
   4.85647829e-01   2.05857441e-04   2.00150417e-01   1.50013023e-01
   8.31320404e-04]
[  3.55036488e-01   8.74969597e-03   2.24572898e-04   7.35919590e-04
   1.89100732e-02   3.48102279e-04   2.63566398e-01   3.48737495e-01
   3.69125555e-03]

The nine numbers enclosed in [] add up to 1 and correspond to the nine squares. The AI plays the square with the highest value. If you think of these numbers as probabilities:

No number stands out from the rest → Not very confident
Only one value is overwhelmingly large → Full of confidence

When it plays a strange square with full confidence, it's a little adorable.
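
The reading of the output described above can be sketched in a few lines of plain Python. This is only an illustration, not code from tic_tac_toe.py. The helper names best_move and confidence_margin are hypothetical, and measuring confidence as the gap between the two largest values is just one possible choice. The two lists are the first and third console outputs quoted above.

```python
def best_move(probs):
    """Index (0-8) of the largest of the nine output values."""
    return max(range(len(probs)), key=lambda i: probs[i])


def confidence_margin(probs):
    """Gap between the largest and second-largest values; a small gap means low confidence."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


flat = [0.1223754, 0.07839377, 0.1005455, 0.09967972, 0.0958171,
        0.05355689, 0.13877278, 0.08772236, 0.22313648]
peaked = [3.94020768e-02, 3.56583963e-03, 9.39233627e-05, 1.20089713e-01,
          4.85647829e-01, 2.05857441e-04, 2.00150417e-01, 1.50013023e-01,
          8.31320404e-04]

for name, probs in (("flat", flat), ("peaked", peaked)):
    print(name, best_move(probs), round(confidence_margin(probs), 3))
```

For the first output the largest value is at index 8 with a margin of about 0.084, while for the third it is at index 4 with a margin of about 0.285, so the third move is the more confident one.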

Please try various things and play with them.

References

https://github.com/lisa-lab/pylearn2/blob/master/pylearn2/train.py
http://fastml.com/how-to-get-predictions-from-pylearn2/
http://deeplearning.net/software/pylearn2/library/utils.html

Let's create a tic-tac-toe AI with Pylearn 2: Save and load models