One of the buzzwords of 2015-2016 is "artificial intelligence", and I don't want to be in the position of saying "I can't use it because I don't really know it", so I'm going to start studying Chainer, regardless of whether it's for work.
I've been using SVM or RandomForest whenever work called for a classifier, but from now on I can expect a certain question every time I write a classifier with SVM as usual ("What would happen with deep learning?" → "Please don't ask"), so I want to show off my attitude of staying on top of buzzwords (for mysterious reasons).
References:
Installation is so easy that if you can't figure it out by reading http://docs.chainer.org/en/stable/install.html, you might as well give up on the spot.
For now, my Mac at home doesn't have a CUDA-capable environment, so I'll install Chainer without CUDA support.
Before that, since the Python environment on my home Mac is built with Anaconda, first check whether it can be installed with conda.
% anaconda search -t conda chainer
Using Anaconda API: https://api.anaconda.org
Run 'anaconda show <USER/PACKAGE>' to get more details:
Packages:
Name | Version | Package Types | Platforms
------------------------- | ------ | --------------- | ---------------
steerapi/chainer | 0 | conda | win-64
: A flexible framework of neural networks
Found 1 packages
It seems that only a win-64 package is available, so install Chainer with pip as described in the documentation.
% pip install chainer
(Omission)
Installing collected packages: chainer
Successfully installed chainer-1.19.0
At this point, I didn't get an error when I tried import chainer on ipython, so I think it's probably OK.
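As an extra sanity check (my own addition, not part of the installation docs), the installed version can be printed:
# quick check in ipython that the module loads and reports the version installed above
import chainer
print(chainer.__version__)   # expecting 1.19.0 here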
After installation, work through the tutorial. This is important for knowing what Chainer can do, and also as a hint for which keywords to search for in the documentation when I want to achieve something specific.
What I've come to realize here is that you need to understand neural-network jargon at least well enough to read English documentation. If you can't picture, even in Japanese, the directed graph of circles and arrows and the weights on its edges, this tutorial is hard to read in the first place (that's me).
Well, but the keyword is "Define-by-Run".
In the code introduced in the tutorial, the following imports are assumed and omitted from the snippets.
import numpy as np
import chainer
from chainer import cuda, Function, gradient_check, report, training, utils, Variable
from chainer import datasets, iterators, optimizers, serializers
from chainer import Link, Chain, ChainList
import chainer.functions as F
import chainer.links as L
from chainer.training import extensions
Since I'm not using CUDA on my home Mac, I assume the cuda import can simply be left out.
MNIST
The tutorial also walks through an MNIST example. First, prepare the data.
train, test = datasets.get_mnist()
When this is executed, the MNIST handwritten digit data is downloaded, as shown below.
% ls -A ~/.chainer/dataset/pfnet/chainer/mnist/
test.npz train.npz
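Just to see what came back (my own aside, not part of the tutorial), each dataset element is a (pixels, label) pair, with the image flattened to 784 float32 values:
# peek at what get_mnist() returned: (image, label) pairs
print(len(train), len(test))   # 60000 training samples, 10000 test samples
x, t = train[0]
print(x.shape, x.dtype, t)     # (784,) float32, and an integer label in 0..9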
The training dataset should be reshuffled for each epoch, while the test dataset doesn't need shuffling, so the tutorial says to set things up as follows. In other words, the training and test datasets need different iterator options.
train_iter = iterators.SerialIterator(train, batch_size=100, shuffle=True)
test_iter = iterators.SerialIterator(test, batch_size=100, repeat=False, shuffle=False)
Now that the dataset is ready, follow the tutorial and define a three-layer network structure.
class MLP(Chain):
    def __init__(self, n_units, n_out):
        super(MLP, self).__init__(
            l1=L.Linear(None, n_units),
            l2=L.Linear(None, n_units),
            l3=L.Linear(None, n_out)
        )

    def __call__(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        y = self.l3(h2)
        return y
So it's called Chainer because l1, l2, and l3 are called in a chain. Each of l1, l2, and l3 looks like a function that takes an input and produces an output, but in Chainer's terminology this is called a link, and the point of the framework, I suppose, is to optimize these links.
In a three-layer network like this, the second layer is usually treated as the hidden layer, but nothing in this network definition explicitly declares that l2 is a hidden layer. However, when __call__ is executed, h1 is computed from the input x by l1, that h1 is fed into l2 without ever being output, the resulting h2 is likewise passed straight to l3, and only the final y is returned. So is it correct to understand that, in effect, the link l2 corresponds to the hidden layer?
This means that when extending class MLP from 3 layers to 4 or 5, it should be enough to add more intermediate links in __init__ (and call them in __call__), as in the sketch below.
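For example, a four-layer version might look like this (my own sketch following the same pattern, not code from the tutorial):
# a guess at a 4-layer MLP: one extra intermediate (hidden) link l3, the output moves to l4
class MLP4(Chain):
    def __init__(self, n_units, n_out):
        super(MLP4, self).__init__(
            l1=L.Linear(None, n_units),
            l2=L.Linear(None, n_units),
            l3=L.Linear(None, n_units),   # the added intermediate layer
            l4=L.Linear(None, n_out)
        )

    def __call__(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        h3 = F.relu(self.l3(h2))
        return self.l4(h3)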
A link that evaluates the accuracy and loss of this network is already provided as chainer.links.Classifier, so we use that.
model = L.Classifier(MLP(100, 10))
optimizer = optimizers.SGD()
optimizer.setup(model)
This SGD() that appears out of nowhere is [stochastic gradient descent](https://ja.wikipedia.org/wiki/%E7%A2%BA%E7%8E%87%E7%9A%84%E5%8B%BE%E9%85%8D%E9%99%8D%E4%B8%8B%E6%B3%95).
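Presumably any of the other optimizers in chainer.optimizers can be plugged in the same way; for instance (my own sketch, not from the tutorial):
# swapping in Adam instead of plain SGD; setup() attaches the optimizer to the model
optimizer = optimizers.Adam()
optimizer.setup(model)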
At this point, we can finally train using the training set.
updater = training.StandardUpdater(train_iter, optimizer)
trainer = training.Trainer(updater, (20, 'epoch'), out='result')
Now, to train, you just call trainer.run(), but if you want to see the training progress (or rather, I want to confirm that the Python script I've written so far is actually working), it seems you should set up extensions.
trainer.extend(extensions.Evaluator(test_iter, model))
trainer.extend(extensions.LogReport())
trainer.extend(extensions.PrintReport(['epoch', 'main/accuracy', 'validation/main/accuracy']))
trainer.extend(extensions.ProgressBar())
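With the extensions in place, training is started with run(); the per-epoch console output it produces is what's omitted below.
trainer.run()   # runs the 20 epochs defined above, printing epoch / main/accuracy / validation/main/accuracy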
(Omitted)
So, after 20 epochs of training, a classifier was created. For details, the result folder stores the running log in a text file named log (because I specified extensions.LogReport and set out='result' on the Trainer).
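As far as I can tell, the log file written by LogReport is a JSON array of per-epoch records, so it can be inspected with something like this (my own sketch, not from the tutorial):
import json

# each record holds the values reported for one epoch (epoch, main/accuracy, etc.)
with open('result/log') as f:
    records = json.load(f)
for r in records:
    print(r['epoch'], r.get('main/accuracy'), r.get('validation/main/accuracy'))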
Summary so far.
- A Link is a unit (function? class?) that receives an input and produces an output
- Links can be described so that they are called in a chain
- The output produced from the input to link1 becomes the input to link2
- The output from link2 becomes the input to link3
- So what was "Define-by-Run" again?
- It's hard to grasp without comparing it to other, older neural network implementations
My impression so far: I do need a bit more knowledge about neural networks, but there's no "complicated, library-specific boilerplate" required; you can write and think about it in a very Pythonic way, which felt rather nice. I see, no wonder it's popular.
By the way, at http://qiita.com/fukuit/items/d69d8ca1ad558c4de014 I tried classifying digits with the k-nearest-neighbor implementation that comes with OpenCV; how does that result compare with this one? The accuracy here is about 0.95 after 20 epochs, so as a classifier this arguably performs better than KNN, which was about 0.91.
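To actually try the trained classifier on a single sample, something like the following should work (my own sketch; model.predictor is the MLP wrapped by L.Classifier):
import numpy as np

# classify the first test image with the trained network and compare with its label
x, t = test[0]                       # 784-dim float32 image and its integer label
y = model.predictor(x[None, ...])    # add a batch dimension before calling the MLP
print('predicted:', int(np.argmax(y.data)), 'actual:', int(t))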
That's about as far as the tutorial goes.