As a continuation of the previous blog post, I'd like to summarize a few more details about Theano that I couldn't fit on the slides.
First of all, the Theano overview describes its features concisely, so I recommend reading that first.
As described there, Theano's features include things like tight integration with NumPy, symbolic expression of computations, automatic differentiation, and fast compiled execution (optionally on a GPU).
What follows is a rough summary of http://deeplearning.net/software/theano/tutorial/index.html#tutorial
import numpy
import theano
import theano.tensor as T
These three imports are the standard boilerplate.
If you have a general understanding of the following, you should be able to read the Deep Learning Tutorial implementations and modify them yourself.
Variables in Theano are handled under the concept of a "tensor". The types and operations around tensors are defined under theano.tensor (imported as T).
I don't fully understand tensors myself, but for now it's enough to remember that **under T.* live the variable types and the major general-purpose math functions (exp, log, sin, etc.)**.
Here, "** variable type **" is
there is. These are combined to represent the variable type (tensor type).
As a name,
And
And so on (there are others).
Combining these,
And so on. For example:
x = T.lscalar("x")
m = T.dmatrix()
These x and m are "symbols" and do not hold actual values yet. This makes them a little different from ordinary Python variables.
See below for more information on variable generation. http://deeplearning.net/software/theano/library/tensor/basic.html#libdoc-basic-tensor
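As a quick check (a small sketch), each symbol is a typed Python object, and you can inspect its tensor type; the output should look roughly like this:

>>> x = T.lscalar("x")
>>> x.type
TensorType(int64, scalar)
>>> m = T.dmatrix()
>>> m.type
TensorType(float64, matrix)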
For example
x = T.lscalar("x")
y = x*2
z = T.exp(x)
Executing this, since x is a symbol with no value, y has no value either: y is itself a symbol meaning "x * 2", and z is likewise a symbol for exp(x). (In reality each is a Python object.)
The computations that make up a neural network are likewise treated as a mass of operations between these symbols (in short, an expression) until actual values are supplied. Because everything remains an expression, it stays readable for humans, and this is what makes possible the automatic differentiation described later, as well as optimization at compile time.
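Incidentally, Theano ships a pretty-printer, theano.pp, that lets you see the expression a symbol holds (the exact output string may vary slightly by version):

>>> from theano import pp
>>> pp(y)
'(x * TensorConstant{2})'
>>> pp(z)
'exp(x)'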
In order to actually perform the calculation, it is necessary to define a "function".
For example, if you want to build **f(x) = x*2**, you can write
f = theano.function([x], x*2)
or, equivalently,
y = x*2
f = theano.function([x], y)
Either way, f becomes a callable function:
>>> f(3)
array(6)
is returned.
theano.function(inputs, outputs) takes the list of input symbols as its first argument and the output expression as its second. The function appears to be compiled at this point, which is why even complex functions execute quickly.
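As a side note, a function can also take several inputs and return several outputs at once; a minimal sketch:

a = T.dscalar("a")
b = T.dscalar("b")
g = theano.function([a, b], [a + b, a * b])
>>> g(2, 3)
[array(5.0), array(6.0)]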
function also has the keyword argument **givens**. As the name implies, givens works like "replace a symbol in the expression with another symbol or value".
For example
>>> x = T.dscalar()
>>> y = T.dscalar()
>>> c = T.dscalar()
>>> ff = theano.function([c], x*2+y, givens=[(x, c*10), (y,5)])
>>> ff(2)
array(45)
Here the value to be computed is "x*2 + y", but the function itself takes the symbol "c" as its argument. The expression cannot actually be evaluated unless x and y are given, so the givens part supplies them (as an expression in c and as a constant). This mechanism is used in later tutorials to feed portions of the data in machine learning; see the sketch below.
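Jumping ahead a little, here is a sketch of that pattern: the whole dataset sits in a shared variable (explained in a later section), and givens substitutes a slice of it for the input symbol. The names data, i and batch_size are mine, just for illustration:

data = theano.shared(numpy.arange(10.))   # whole dataset held on the Theano side
v = T.dvector("v")                        # symbol standing in for one minibatch
i = T.lscalar("i")                        # minibatch index
batch_size = 2
fb = theano.function([i], T.sum(v),
                     givens=[(v, data[i*batch_size:(i+1)*batch_size])])
>>> fb(0)   # sum of data[0:2] = 0 + 1
array(1.0)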
One of Theano's main features is this differentiation capability: so-called automatic differentiation, which analyzes an expression and derives the differentiated expression.
For example
x, y = T.dscalars("x", "y")  # how to declare several at once
z = (x + 2*y)**2
Differentiating this expression with respect to x gives dz/dx = 2(x + 2y). You can obtain this differentiated expression with:
gx = T.grad(z, x)
Similarly, the derivative with respect to y is dz/dy = 4(x + 2y), obtained with:
gy = T.grad(z, y)
To actually evaluate it, it still needs to be made into a function:
fgy = theano.function([x,y], gy)
>>> fgy(1,2)
array(20.0)
and so on (here dz/dy = 4(1 + 2*2) = 20).
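Incidentally, T.grad can also take a list of variables and return the corresponding partial derivatives in one call:

gx, gy = T.grad(z, [x, y])
fg = theano.function([x, y], [gx, gy])
>>> fg(1, 2)
[array(10.0), array(20.0)]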
variable = theano.shared(obj)

In this form you can declare shared data that can be referenced inside the **function**s described above. For example:
>>> x = T.dscalar("x")
>>> b = theano.shared(numpy.array([1,2,3,4,5]))
>>> f = theano.function([x], b * x)
>>> f(2)
array([ 2., 4., 6., 8., 10.])
as shown. To read and write the value of a shared variable, use get_value / set_value:
>>> b.get_value()
array([1,2,3,4,5])
>>> b.set_value([10,11,12])
and so on. The change is immediately reflected in the function defined earlier; running **f(2)** again shows that the result has changed.
>>> f(2)
array([ 20., 22., 24.])
function has a keyword argument called **updates**, which lets you update shared variables.
For example, to set c as a shared variable and increment it by 1 each time the function f is executed, write as follows.
c = theano.shared(0)
f = theano.function([], c, updates={c: c+1})
The **updates={c: c+1}** part expresses the update **c = c + 1** that is familiar from ordinary programming languages. Running it gives:
>>> f()
array(0)
>>> f()
array(1)
>>> f()
array(2)
These pieces can be used to implement gradient descent. For example, for the data **x = [1,2,3,4,5]**, suppose we want to find the **c** that minimizes **y = sum((x-c)^2)**. The code looks like this:
x = T.dvector("x")  # input data
c = theano.shared(0.)  # the parameter to update; initial value 0 for now
y = T.sum((x - c)**2)  # the value we want to minimize
gc = T.grad(y, c)  # partial derivative of y with respect to c
d2 = theano.function([x], y, updates={c: c - 0.05*gc})  # updates c on every call and returns the current y
Now, feeding **[1,2,3,4,5]** to **d2()** several times gives:
>>> d2([1,2,3,4,5])
array(55.0)
>>> c.get_value()
1.5
>>> d2([1,2,3,4,5])
array(21.25)
>>> c.get_value()
2.25
>>> d2([1,2,3,4,5])
array(12.8125)
>>> c.get_value()
2.625
You can see that y gradually decreases and c gradually approaches 3 (the mean of the data). A consolidated sketch follows below.
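Putting the pieces together, looping the update drives c all the way to the minimum. Here is a minimal self-contained sketch of the same code (the 20-iteration count is arbitrary):

import numpy
import theano
import theano.tensor as T

x = T.dvector("x")
c = theano.shared(0.)                                     # parameter to learn, initial value 0
y = T.sum((x - c)**2)                                     # objective to minimize
gc = T.grad(y, c)                                         # dy/dc
step = theano.function([x], y, updates={c: c - 0.05*gc})  # one gradient-descent step

data = numpy.array([1., 2., 3., 4., 5.])
for _ in range(20):
    step(data)
print(c.get_value())                                      # approximately 3.0, the mean of the data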
If you can follow everything up to this point, you should be able to understand how the logistic regression in the following tutorial works.
http://deeplearning.net/software/theano/tutorial/examples.html#a-real-example-logistic-regression
(Well, maybe you understood it all from the start? ^^;)