Keras code is simple and modular, so it is easy to write, easy to understand, and easy to use. However, when you try to build Layers or training procedures beyond what is provided as standard, there are not many samples and it is often unclear how to write them.
As a memorandum, I will share the tips I learned while writing some unusual Models recently.
There are two ways to write a Model in Keras: the Sequential Model and the Functional API Model (see the Keras Functional API guide).
A Sequential Model is written like this:
from keras.models import Sequential
from keras.layers import Dense, Activation

model = Sequential()
model.add(Dense(32, input_dim=784))
model.add(Activation('relu'))
I think many people saw this first and thought, "Keras is really easy to understand!"
Apart from that, there is also this style:
from keras.layers import Input, Dense
from keras.models import Model

inputs = Input(shape=(784,))
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)
model = Model(input=inputs, output=predictions)
This style follows the rhythm of `LayerInstance(InputTensor) -> OutputTensor`.
The expression `Dense(64, activation='relu')(x)` may look strange to those who are new to Python-like languages, but it simply creates an instance of the `Dense` class in the `Dense(64, activation='relu')` part and then calls that `Dense` instance with `(x)`.
dense = Dense(64, activation='relu')
x = dense(x)
And the meaning is the same.
The flow is to decide the **input Layer** and the **output Layer** and pass them to the `Model` class.
If the input layer takes real data, specify it with the `Input` class (which works like a Placeholder).
What you should be aware of here is that **each Layer instance holds its own Weights**. In other words, **using the same Layer instance means sharing its Weights**. Be careful about unintended sharing as well as intentional sharing.
With this style you can easily feed the same output Tensor into several different Layers. The amount of code does not grow much, so as you get used to it, I recommend practicing with the Functional API to prepare for the more difficult Models to come.
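For example, here is a minimal sketch of intentional sharing (the variable names are my own): reusing a single `Dense` instance applies the same Weights to two different inputs, whereas creating two instances would give two independent sets of Weights.
from keras.layers import Input, Dense

shared_dense = Dense(64, activation='relu')  # one instance -> one set of Weights
a = Input(shape=(784,))
b = Input(shape=(784,))
ya = shared_dense(a)  # uses shared_dense's Weights
yb = shared_dense(b)  # reuses exactly the same Weights
# Dense(64, activation='relu')(a) and Dense(64, activation='relu')(b) would instead
# create two separate instances, each with its own Weights.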
Container is convenient if you want to share the Weight
Sometimes you have different input layers and different output layers but want to share the underlying Network and Weights. In that case, it is easier to handle if you bundle them together with the `Container` class.
Since `Container` is a subclass of `Layer`, just like a Layer, **using the same Container instance means sharing its Weights**.
For example:
from keras.engine.topology import Container

inputs = Input(shape=(784,))
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)
shared_layers = Container(inputs, predictions, name="shared_layers")
This `shared_layers` can then be treated as if it were a single Layer.
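As a minimal sketch of what that means (the variable names other than `shared_layers` are my own), applying the same `Container` instance to two different inputs routes both of them through the same internal Weights:
input_a = Input(shape=(784,))
input_b = Input(shape=(784,))
output_a = shared_layers(input_a)  # both calls go through the same internal Weights
output_b = shared_layers(input_b)
two_headed = Model(input=[input_a, input_b], output=[output_a, output_b])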
The `Container` itself basically has no Weights of its own; it only serves as a bundle of other Layers.
Conversely, if you do not want to share Weights, you have to connect separate Layer instances individually instead of reusing the `Container`.
When writing your own calculations or Tensor transformations, you will often run into this error:
TypeError: ('Not a Keras tensor:', Elemwise{add,no_inplace}.0)
This usually happens when you feed a "raw Tensor" into a Layer instance instead of the output of another Layer.
For example:
from keras import backend as K
inputs = Input((10, ))
x = K.relu(inputs * 2 + 1)             # raw backend tensor, not a Keras tensor
x = Dense(64, activation='relu')(x)    # feeding it to a Layer triggers the error above
Doing something like this triggers it.
I am not completely sure, but a Layer's output is a "Keras tensor", an object that carries internal shape information, which seems to be different from the result of raw backend calculations such as the `K.*` functions.
In that case, use `Lambda` as described below (it is better not to forcibly fill in `_keras_shape` yourself ^^;).
Lambda is convenient
For example, suppose you want to split a 10-element vector into its first 5 and last 5 elements. Writing it naively like this:
inputs = Input((10, ))
x0_4 = inputs[:5]
x5_9 = inputs[5:]
d1 = Dense(10)(x0_4)
d2 = Dense(10)(x5_9)
will produce the error described above.
Therefore, write it as follows:
inputs = Input((10, ))
x0_4 = Lambda(lambda x: x[:, :5], output_shape=(5, ))(inputs)
x5_9 = Lambda(lambda x: x[:, 5:], output_shape=lambda input_shape: (None, int(input_shape[1]/2), ))(inputs)
d1 = Dense(10)(x0_4)
d2 = Dense(10)(x5_9)
If you wrap the operations in the `Lambda` class like this, it works.
There are a few points here.
- In Keras, the first dimension is always the Sample dimension (the batch_size dimension).
- When implementing a layer such as `Lambda`, write the calculation so that it includes the Sample dimension. That is why you need `lambda x: x[:, :5]` rather than `lambda x: x[:5]`.
- `output_shape` can be omitted if the input and output shapes are the same, but it must be specified when they differ. Either a tuple or a function can be passed as `output_shape`, but **a tuple does not include the Sample dimension**, while **a function does**. When using a function, it is fine to set the Sample dimension to `None`. Also note that the `input_shape` argument the function receives does include the Sample dimension.
# OK: a tuple without the Sample dimension
Lambda(lambda x: x[:, :5], output_shape=(5, ))(inputs)
# NG: it happens to work when the input is 1D but fails when it is 2D, so remember not to include the Sample dimension in a tuple
Lambda(lambda x: x[:, :5], output_shape=(None, 5))(inputs)
# NG: a function must include the Sample dimension
Lambda(lambda x: x[:, 5:], output_shape=lambda input_shape: (int(input_shape[1]/2), ))(inputs)
# OK: a function that includes the Sample dimension (as None)
Lambda(lambda x: x[:, 5:], output_shape=lambda input_shape: (None, int(input_shape[1]/2)))(inputs)
You can specify the Loss function in the `Model`'s `compile` method, and you can also pass a custom Loss function of your own.
As for its signature, it takes the two arguments `y_true` and `y_pred` and returns **as many Loss values as there are Samples**.
For example:
def generator_loss(y_true, y_pred):  # y_true's shape=(batch_size, row, col, ch)
    return K.mean(K.abs(y_pred - y_true), axis=[1, 2, 3])
In the article "Write LSGAN in Keras" it was pointed out that the per-Sample form exists only so that sample weights and masks can be applied, and that conversely, if you do not use those features, it is fine to compute across Samples as was done there. I think that is true. So if you do not plan to reuse the Loss elsewhere and do not need `sample_weight` and the like, it is fine to return a single Loss value.
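For reference, if you really do not need `sample_weight` and the like, a variant that returns a single scalar Loss also works, and either form is passed to `compile` in the usual way (the `model` variable here is assumed):
def generator_loss_scalar(y_true, y_pred):
    # mean over every axis, including the Sample axis -> one scalar Loss value
    return K.mean(K.abs(y_pred - y_true))

model.compile(optimizer='adam', loss=generator_loss_scalar)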
If you want to pass a Loss from a Layer, you can simply call `Layer#add_loss`; however, passing a Loss from something that is not a Layer is a little tricky (or at least I do not know the correct way).
Loss terms other than the Loss Function (those coming from regularizers and the like) are collected from each Layer via `Model#losses` at the time `compile` is executed on the Model instance. In other words, if you can somehow get your Loss passed in there, it will be included.
For example, you can inherit from `Container` or `Model` and override the `losses` property.
When I made VATModel, I passed it this way.
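As a rough sketch of that idea (this is not the actual VATModel code; the class name and the `extra_losses` attribute are my own), you can subclass `Model` and append extra Loss tensors in an overridden `losses` property so that `compile` picks them up together with the regularizer Losses:
class ModelWithExtraLoss(Model):
    """Model whose `losses` property also returns manually registered Loss tensors."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.extra_losses = []  # Tensors to be added as extra Loss terms

    @property
    def losses(self):
        return super().losses + self.extra_losses


model = ModelWithExtraLoss(inputs, predictions)
model.extra_losses.append(0.01 * K.sum(K.square(predictions)))  # hypothetical extra term
model.compile(optimizer='adam', loss='categorical_crossentropy')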
You may want the Loss calculation during training to reflect the results of previous Loss calculations. This is the case, for example, for BEGAN's `DiscriminatorLoss`, calculated as shown below.
https://github.com/mokemokechicken/keras_BEGAN/blob/master/src/began/training.py#L104
Parameter updates during training
The parameter-update information in `Model#updates` is likewise used when `compile` is executed on the Model instance. There is normally no way to pass anything from the Loss Function into `Model#updates` (probably), so I use a small trick.
The points of the trick are:

- `__name__` seems to be required
- the values to be carried over between updates are held as `K.variable`s
- in the `__call__()` part, `K.update()` is used to generate the "parameter update" objects, which are appended to the `self.updates` array

With that in mind, something like the following is possible.
class DiscriminatorLoss:
    __name__ = 'discriminator_loss'

    def __init__(self, lambda_k=0.001, gamma=0.5):
        self.lambda_k = lambda_k
        self.gamma = gamma
        self.k_var = K.variable(0, dtype=K.floatx(), name="discriminator_k")
        self.m_global_var = K.variable(0, dtype=K.floatx(), name="m_global")
        self.loss_real_x_var = K.variable(0, name="loss_real_x")  # for observation
        self.loss_gen_x_var = K.variable(0, name="loss_gen_x")    # for observation
        self.updates = []

    def __call__(self, y_true, y_pred):  # y_true, y_pred shape: (BS, row, col, ch * 2)
        data_true, generator_true = y_true[:, :, :, 0:3], y_true[:, :, :, 3:6]
        data_pred, generator_pred = y_pred[:, :, :, 0:3], y_pred[:, :, :, 3:6]
        loss_data = K.mean(K.abs(data_true - data_pred), axis=[1, 2, 3])
        loss_generator = K.mean(K.abs(generator_true - generator_pred), axis=[1, 2, 3])
        ret = loss_data - self.k_var * loss_generator

        # for updating values in each epoch, use `updates` mechanism
        # DiscriminatorModel collects Loss Function's updates attributes
        mean_loss_data = K.mean(loss_data)
        mean_loss_gen = K.mean(loss_generator)

        # update K
        new_k = self.k_var + self.lambda_k * (self.gamma * mean_loss_data - mean_loss_gen)
        new_k = K.clip(new_k, 0, 1)
        self.updates.append(K.update(self.k_var, new_k))

        # calculate M-Global
        m_global = mean_loss_data + K.abs(self.gamma * mean_loss_data - mean_loss_gen)
        self.updates.append(K.update(self.m_global_var, m_global))

        # let loss_real_x = mean_loss_data
        self.updates.append(K.update(self.loss_real_x_var, mean_loss_data))
        # let loss_gen_x = mean_loss_gen
        self.updates.append(K.update(self.loss_gen_x_var, mean_loss_gen))

        return ret


class DiscriminatorModel(Model):
    """Model which collects updates from loss_func.updates"""

    @property
    def updates(self):
        updates = super().updates
        if hasattr(self, 'loss_functions'):
            for loss_func in self.loss_functions:
                if hasattr(loss_func, 'updates'):
                    updates += loss_func.updates
        return updates


discriminator = DiscriminatorModel(all_input, all_output, name="discriminator")
discriminator.compile(optimizer=Adam(), loss=DiscriminatorLoss())
The last two tips may well become unnecessary before long. Keras's source code is surprisingly clean and easy to follow. Also, with Keras 2 the error messages are easier to understand, which helps when debugging.