Keras as wrapper of Theano & TensorFlow

Keras was originally a library used in combination with the Deep Learning framework Theano, but with a recent update, Google's TensorFlow can now also be used as a backend.

Keras: Deep Learning library for Theano and TensorFlow

You have just found Keras.

Keras is a minimalist, highly modular neural networks library, written in Python and capable of running on top of either TensorFlow or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.

Since Theano mainly provides low-level tensor operations, it is natural that more abstract libraries have been built on top of it; Keras is one of these, and it supports a wide range of functionality. Personally, I was a little surprised that Keras now supports TensorFlow as well. Below, I examine the situation while writing some simple code.

(The program environment is Linux Ubuntu 14.04, Python 2.7.11, Theano ver.0.7.0, TensorFlow ver.0.6.0, Keras ver.0.3.0.)

From installation to running 'mnist_mlp.py'

Theano and TensorFlow were already installed in a GPU-enabled environment. This time, Keras was added.

> git clone https://github.com/fchollet/keras.git
> cd keras
> python setup.py install
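To check that the installation worked, simply importing keras reports which backend was loaded (a quick sanity check; the exact message may vary by version):

> python -c "import keras"
Using Theano backend.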

In the examples directory, sample code is available for typical problems.

> ls examples
addition_rnn.py  imdb_bidirectional_lstm.py  kaggle_otto_nn.py        mnist_mlp.py
babi_memnn.py    imdb_cnn.py                 lstm_text_generation.py  mnist_transfer_cnn.py
babi_rnn.py      imdb_cnn_lstm.py            mnist_cnn.py             reuters_mlp.py
cifar10_cnn.py   imdb_lstm.py                mnist_irnn.py

Among them, 'mnist_mlp.py' is an MLP example for the MNIST handwritten-digit classification problem, so I ran it first; initially, an error occurred.

Traceback (most recent call last):
  File "mnist_mlp.py", line 50, in <module>
    model.compile(loss='categorical_crossentropy', optimizer=rms)
...
(Omitted)
...
  File "build/bdist.linux-x86_64/egg/keras/activations.py", line 25, in relu
  File "build/bdist.linux-x86_64/egg/keras/backend/theano_backend.py", line 463, in relu
AttributeError: 'module' object has no attribute 'relu'

This is because 'relu' is specified as the activation function in 'mnist_mlp.py', but it is not supported by Theano ver.0.7.0, which had originally been installed via Anaconda. (The Theano documentation says that 'relu' is supported from ver.0.7.1.) However, the Theano code on GitHub already has 'relu' (although still labeled ver.0.7.0), so after upgrading to it, the above error disappeared.
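For reference, a typical way to pull the development version from GitHub is via pip (a sketch; the exact command depends on your environment):

> pip install --upgrade --no-deps git+https://github.com/Theano/Theano.git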

(Omitted)
...
Epoch 20/20
1s - loss: 0.0092 - acc: 0.9974 - val_loss: 0.0555 - val_acc: 0.9842
Test score: 0.0554862711126
Test accuracy: 0.9842

The test classification accuracy is about 98.4%.

Changing the backend from 'Theano' to 'TensorFlow'

There are two ways to change the backend.

  1. Edit the configuration file "$HOME/.keras/keras.json" under your home directory. "keras.json" is a one-line file, {"epsilon": 1e-07, "floatx": "float32", "backend": "theano"}, so change the last entry, "theano", to "tensorflow".
  2. Set the environment variable "KERAS_BACKEND". If this variable is not defined, the value in keras.json above is used; if it is set, the environment variable takes precedence.
export KERAS_BACKEND=tensorflow

With this alone, the same Keras code can be executed using the TensorFlow library. (Very easy!) Comparing the two methods, setting the environment variable seems easier. (Of course, to return to Theano, you can either `unset` the above environment variable or set it to 'theano'.)
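For example, the variable can also be set for a single run; on import, Keras reports which backend it loaded:

> KERAS_BACKEND=tensorflow python mnist_mlp.py
Using TensorFlow backend.
...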

By the way, benchmark results on performance are posted here, so I will quote them. https://github.com/fchollet/keras/wiki/Keras,-now-running-on-TensorFlow

Task                               TensorFlow   Theano
mnist_mlp.py: compilation (s)             0.6      5.9
mnist_mlp.py: runtime/epoch (s)           7.5      6.3
imdb_lstm.py: compilation (s)            39.3     38.3
imdb_lstm.py: runtime/epoch (s)         283       123
mnist_cnn.py: compilation (s)             0.8     11.4
mnist_cnn.py: runtime/epoch (s)         190      3230

Even when running the same code on the same problem, you can see that the required time varies considerably depending on the backend. (These numbers appear to be from CPU computation.)

Try it with your own code

I converted the Wine-classification MLP code (the TensorFlow and Theano versions covered in a recently posted article) to a Keras version. It was already short code written for a tutorial, but the Keras version is even more compact. (For the original code, refer to the TensorFlow version and the Theano version.)

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
np.random.seed(1337)  # for reproducibility

from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.regularizers import l2
from keras.optimizers import SGD, Adagrad

from input_winedata import load_data

if __name__ == '__main__':
    WineData = '../../Data/Wine/wine.data'
    train_x, train_y, test_x, test_y = load_data(WineData)
    print(train_x.shape[0], 'train samples')
    print(test_x.shape[0], 'test samples')
     
    model = Sequential()
    model.add(Dense(20, input_shape=(13,), W_regularizer=l2(0.001)))
    model.add(Activation('sigmoid'))
    model.add(Dropout(0.05))
    model.add(Dense(20, W_regularizer=l2(0.001)))
    model.add(Activation('sigmoid'))
    model.add(Dropout(0.05))
    model.add(Dense(3, W_regularizer=l2(0.001)))
    model.add(Activation('softmax'))
    
    adagrad = Adagrad(lr=0.01, epsilon=1e-08)
    model.compile(loss='categorical_crossentropy', optimizer=adagrad)
    
    batch_size = 16
    nb_epoch = 1000

    print('Train...')
    model.fit(train_x, train_y,
              batch_size=batch_size, nb_epoch=nb_epoch,
              show_accuracy=False, verbose=2,
              validation_data=(test_x, test_y))
    score = model.evaluate(test_x, test_y,
                           show_accuracy=True, verbose=0)
    print('Test score:', score[0])
    print('Test accuracy:', score[1])

Below, let's look at the details.

After loading the data at the beginning, the network model (MLP) is defined. In Theano and TensorFlow, the main MLP definition can also be kept fairly concise if you first write your own classes for the hidden and output layers, but in Keras you do not have to create such classes yourself; the network can be defined with the following code.

    model = Sequential()
    model.add(Dense(20, input_shape=(13,), W_regularizer=l2(0.001)))
    model.add(Activation('sigmoid'))
    model.add(Dropout(0.05))
    model.add(Dense(20, W_regularizer=l2(0.001)))
    model.add(Activation('sigmoid'))
    model.add(Dropout(0.05))
    model.add(Dense(3, W_regularizer=l2(0.001)))
    model.add(Activation('softmax'))

The first line, model = Sequential(), selects a feed-forward network. After that, (hidden layer 1), (hidden layer 2), and (output layer) are defined in this order from the input side. Choices such as the activation function, regularization (Regularizer), and Dropout can be written very clearly.
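Incidentally, in this Keras generation Dense also accepts an activation argument directly, so each (Dense, Activation) pair above could be collapsed into a single line. A sketch of the first hidden layer written that way (same network, just more compact):

    model.add(Dense(20, input_shape=(13,), activation='sigmoid', W_regularizer=l2(0.001)))
    model.add(Dropout(0.05))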

By the way, the feed-forward model = Sequential() can handle networks up to CNNs (Convolutional Neural Networks); for more complicated structures such as RNNs (Recurrent Neural Networks), Keras provides the Graph model, whose definition starts with the declaration model = Graph(). (I would like to investigate this in the future.)
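For reference, a minimal sketch of a one-input, one-output Graph model based on the Keras 0.3 documentation (the node names and layer sizes here are made up for illustration, and details may differ between versions):

from keras.models import Graph
from keras.layers.core import Dense

# a tiny one-input / one-output graph, equivalent to a small MLP
graph = Graph()
graph.add_input(name='input', input_shape=(13,))
graph.add_node(Dense(20, activation='sigmoid'), name='hidden', input='input')
graph.add_node(Dense(3, activation='softmax'), name='top', input='hidden')
graph.add_output(name='output', input='top')
# losses are specified per named output
graph.compile(optimizer='adagrad', loss={'output': 'categorical_crossentropy'})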

In the second half of the MLP code, the optimizer is specified, the model is compiled, and the model.fit() method is called.

    adagrad = Adagrad(lr=0.01, epsilon=1e-08)
    model.compile(loss='categorical_crossentropy', optimizer=adagrad)
    
    batch_size = 16
    nb_epoch = 1000

    print('Train...')
    model.fit(train_x, train_y,
              batch_size=batch_size, nb_epoch=nb_epoch,
              show_accuracy=False, verbose=2,
              validation_data=(test_x, test_y))

With the above, the program runs through to training.
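After training, predictions can also be obtained from the fitted model; a brief sketch using the Sequential-model helpers of this Keras generation:

    # class probabilities, shape (n_samples, 3)
    proba = model.predict(test_x, batch_size=batch_size)
    # hard class labels, via the Sequential helper predict_classes
    classes = model.predict_classes(test_x, batch_size=batch_size)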

Impressions from using it

Best of all, it's great to be able to define a network and run the computation without a separate script in something like YAML or Lua. As for the choice between the two backends, 'Theano' and 'TensorFlow', I can't think of a particularly interesting use for it at this point, but there is no doubt that it widens the entry point for users.

Also, as is often said, the trade-off that flexibility is lost as abstraction increases naturally applies to Keras as well. Still, it looks like a very useful library for prototyping, when you want to try a model immediately or repeat numerical experiments over hyperparameters. (I became a Keras supporter at once.)

