Keras was originally a library used on top of the deep learning framework Theano, but with a recent update it can now also use Google's TensorFlow as a backend.
Keras: Deep Learning library for Theano and TensorFlow
You have just found Keras.
Keras is a minimalist, highly modular neural networks library, written in Python and capable of running on top of either TensorFlow or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.
Since Theano itself mainly provides basic tensor operations, it is natural that more abstract libraries have been built on top of it, and among them Keras supports a notably wide range of functionality. Personally, I was a little surprised that Keras now supports TensorFlow as well. Below, I examine the situation while writing some simple code.
(The program environment is Linux Ubuntu 14.04, Python 2.7.11, Theano ver.0.7.0, TensorFlow ver.0.6.0, Keras ver.0.3.0.)
Theano and TensorFlow had already been installed in a GPU-enabled environment; this time, Keras was added on top.
> git clone https://github.com/fchollet/keras.git
> cd keras
> python setup.py install
In the examples directory, sample code is available for typical problems.
> ls examples
addition_rnn.py imdb_bidirectional_lstm.py kaggle_otto_nn.py mnist_mlp.py
babi_memnn.py imdb_cnn.py lstm_text_generation.py mnist_transfer_cnn.py
babi_rnn.py imdb_cnn_lstm.py mnist_cnn.py reuters_mlp.py
cifar10_cnn.py imdb_lstm.py mnist_irnn.py
Among them, 'mnist_mlp.py' is an example for MNIST (the handwritten-digit classification problem), but when I first ran it, an error occurred.
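Running it from the examples directory:
> python mnist_mlp.py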
Traceback (most recent call last):
File "mnist_mlp.py", line 50, in <module>
model.compile(loss='categorical_crossentropy', optimizer=rms)
...
(Omitted)
...
File "build/bdist.linux-x86_64/egg/keras/activations.py", line 25, in relu
File "build/bdist.linux-x86_64/egg/keras/backend/theano_backend.py", line 463, in relu
AttributeError: 'module' object has no attribute 'relu'
This is because 'relu' is specified as the activation function in 'mnist_mlp.py', but it is not supported by Theano ver.0.7.0, the version originally installed via Anaconda. (The Theano documentation says that 'relu' is supported from ver.0.7.1.) However, the Theano code on GitHub already includes 'relu' (even though it is still labeled ver.0.7.0), so after upgrading to it, the above error disappeared.
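For reference, one way to upgrade to the GitHub version is to install it directly with pip (a sketch; the exact procedure may differ in your environment):
> pip install --upgrade git+https://github.com/Theano/Theano.git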
(Omitted)
...
Epoch 20/20
1s - loss: 0.0092 - acc: 0.9974 - val_loss: 0.0555 - val_acc: 0.9842
Test score: 0.0554862711126
Test accuracy: 0.9842
The test classification accuracy is about 98.4%.
There are two ways to change the backend. One is to edit the Keras configuration file ~/.keras/keras.json, whose default content is
{"epsilon": 1e-07, "floatx": "float32", "backend": "theano"}
and change the last entry, "theano", to "tensorflow". The other is to set an environment variable:
> export KERAS_BACKEND=tensorflow
With this alone, the same Keras code runs using the TensorFlow library. (Very easy!) Comparing the two methods, setting the environment variable seems easier. (Of course, to go back to Theano, you can either `unset` the environment variable or set it to 'theano'.)
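As a quick check, Keras prints the active backend when it is imported, so a one-liner like the following (a minimal sketch) should confirm that the switch took effect:
> KERAS_BACKEND=tensorflow python -c "import keras"
Using TensorFlow backend.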
By the way, benchmark results on performance are posted here, which I quote below. https://github.com/fchollet/keras/wiki/Keras,-now-running-on-TensorFlow
| Task | TensorFlow | Theano |
|---|---|---|
| mnist_mlp.py: compilation (s) | 0.6 | 5.9 |
| mnist_mlp.py: runtime/epoch (s) | 7.5 | 6.3 |
| imdb_lstm.py: compilation (s) | 39.3 | 38.3 |
| imdb_lstm.py: runtime/epoch (s) | 283 | 123 |
| mnist_cnn.py: compilation (s) | 0.8 | 11.4 |
| mnist_cnn.py: runtime/epoch (s) | 190 | 3230 |
Even when running the same code on the same problem, you can see that the required time varies considerably depending on the backend. (These figures appear to be from CPU computation.)
I converted the wine-classification MLP code from the TensorFlow and Theano versions covered in my recent articles into a Keras version. It was already short code written for a tutorial, but the Keras version is even more compact. (For the original code, refer to the TensorFlow and Theano versions.)
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
np.random.seed(1337)  # for reproducibility

from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.regularizers import l2
from keras.optimizers import SGD, Adagrad

from input_winedata import load_data

if __name__ == '__main__':
    WineData = '../../Data/Wine/wine.data'
    train_x, train_y, test_x, test_y = load_data(WineData)
    print(train_x.shape[0], 'train samples')
    print(test_x.shape[0], 'test samples')

    model = Sequential()
    model.add(Dense(20, input_shape=(13,), W_regularizer=l2(0.001)))
    model.add(Activation('sigmoid'))
    model.add(Dropout(0.05))
    model.add(Dense(20, W_regularizer=l2(0.001)))
    model.add(Activation('sigmoid'))
    model.add(Dropout(0.05))
    model.add(Dense(3, W_regularizer=l2(0.001)))
    model.add(Activation('softmax'))

    adagrad = Adagrad(lr=0.01, epsilon=1e-08)
    model.compile(loss='categorical_crossentropy', optimizer=adagrad)

    batch_size = 16
    nb_epoch = 1000
    print('Train...')
    model.fit(train_x, train_y,
              batch_size=batch_size, nb_epoch=nb_epoch,
              show_accuracy=False, verbose=2,
              validation_data=(test_x, test_y))

    score = model.evaluate(test_x, test_y,
                           show_accuracy=True, verbose=0)
    print('Test score:', score[0])
    print('Test accuracy:', score[1])
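Note that input_winedata.py is a small local helper from those earlier articles and is not shown in this post. For readers who want to run the code, a hypothetical minimal sketch of load_data (assuming the UCI wine.data layout: a class label 1-3 in the first column followed by 13 features, returning one-hot labels for categorical_crossentropy) might look like this:

import numpy as np

def load_data(filename, test_ratio=0.2):
    # hypothetical loader: column 0 is the class label (1-3), columns 1-13 are features
    data = np.loadtxt(filename, delimiter=',')
    np.random.shuffle(data)
    x = data[:, 1:].astype('float32')
    x = (x - x.mean(axis=0)) / x.std(axis=0)   # standardize each feature
    labels = data[:, 0].astype(int) - 1        # map labels 1-3 to 0-2
    y = np.zeros((len(labels), 3), dtype='float32')
    y[np.arange(len(labels)), labels] = 1.0    # one-hot encode
    n_test = int(len(x) * test_ratio)          # hold out a test split
    return x[n_test:], y[n_test:], x[:n_test], y[:n_test]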
Let us check the details below. After the data is loaded at the beginning, the network model (MLP) is defined. In Theano or TensorFlow, once you write classes for the hidden layer and the output layer, the main MLP definition can also be kept fairly concise, but in Keras you do not have to create such classes yourself; the network can be defined with the following code.
model = Sequential()
model.add(Dense(20, input_shape=(13,), W_regularizer=l2(0.001)))
model.add(Activation('sigmoid'))
model.add(Dropout(0.05))
model.add(Dense(20, W_regularizer=l2(0.001)))
model.add(Activation('sigmoid'))
model.add(Dropout(0.05))
model.add(Dense(3, W_regularizer=l2(0.001)))
model.add(Activation('softmax'))
The first line, model = Sequential(), selects a feed-forward network model. After that, the layers are added in order from the input side: (hidden layer 1), (hidden layer 2), then (output layer). Choices such as the activation function, the regularizer specification, and the Dropout specification can all be written very clearly.
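Incidentally, an activation can also be passed directly to the layer constructor instead of adding a separate Activation layer; the following one-liner (a sketch written against the Keras 0.3 API) should define the same first hidden layer:

model.add(Dense(20, input_shape=(13,), activation='sigmoid', W_regularizer=l2(0.001)))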
By the way, the feed-forward model model = Sequential() covers networks up to CNNs (Convolutional Neural Networks); for networks with a more complicated structure, such as RNNs (Recurrent Neural Networks), Keras provides a Graph model, declared with model = Graph(), in which the structure can be defined. (I would like to investigate this in the future.)
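As a preview, the Graph model connects explicitly named nodes; a rough sketch based on the Graph examples in the Keras documentation of the time (treat the exact argument names as approximate for ver.0.3.0):

from keras.models import Graph
from keras.layers.core import Dense

graph = Graph()
graph.add_input(name='input', input_shape=(13,))
graph.add_node(Dense(20, activation='sigmoid'), name='hidden', input='input')
graph.add_node(Dense(3, activation='softmax'), name='top', input='hidden')
graph.add_output(name='output', input='top')
graph.compile(optimizer='adagrad', loss={'output': 'categorical_crossentropy'})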
In the second half of the MLP code, we specify the optimizer, compile the model, and call the model.fit() method.
adagrad = Adagrad(lr=0.01, epsilon=1e-08)
model.compile(loss='categorical_crossentropy', optimizer=adagrad)

batch_size = 16
nb_epoch = 1000
print('Train...')
model.fit(train_x, train_y,
          batch_size=batch_size, nb_epoch=nb_epoch,
          show_accuracy=False, verbose=2,
          validation_data=(test_x, test_y))
With the above, the program runs through the training stage.
Above all, it is great to be able to define a network and run the computation without a separate script in something like YAML or Lua. As for the two backends, Theano and TensorFlow, I cannot think of a particularly interesting way to exploit the choice at this point, but there is no doubt that it widens the library's appeal to users.
Also, as is often said, the trade-off that flexibility is lost as abstraction increases naturally applies to Keras as well. Still, for prototyping purposes like "I want to try my model right away" or "I want to run repeated numerical experiments over hyperparameters," it seems to be a very useful library. (It instantly made me a Keras supporter.)