Reusing layers from a trained Keras model

Introduction

This article is a memo from my own learning. If you notice any mistakes, I would appreciate it if you could point them out.

The question explored here is this: can you take a layer from a model created and trained with Keras and reuse it in a newly created model?

Trained model

I use the model from this page (https://qiita.com/yoho/items/e9a65b10ca510ab50a36). Its variable name is model.

001.py


model.summary()

#Layer (type)                 Output Shape              Param #   
#=================================================================
#dense_1 (Dense)              (None, 100)               1100      
#_________________________________________________________________
#dense_2 (Dense)              (None, 100)               10100     
#_________________________________________________________________
#dense_3 (Dense)              (None, 40)                4040      
#_________________________________________________________________
#dense_4 (Dense)              (None, 20)                820       
#_________________________________________________________________
#dense_5 (Dense)              (None, 2)                 42        
#=================================================================
#Total params: 16,102
#Trainable params: 16,102
#Non-trainable params: 0
#_________________________________________________________________
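For reference, the model can be rebuilt from the summary alone: the 1,100 parameters of dense_1 imply 10 inputs (10 x 100 weights + 100 biases). The following is a minimal sketch consistent with the summary above; the hidden-layer activations are assumptions, not taken from the linked page.

001a.py


import keras

inputs = keras.layers.Input(shape=(10,))                 #10 inputs inferred from dense_1's 1,100 params
x = keras.layers.Dense(100, activation='relu')(inputs)   #dense_1; activation is an assumption
x = keras.layers.Dense(100, activation='relu')(x)        #dense_2
x = keras.layers.Dense(40, activation='relu')(x)         #dense_3
x = keras.layers.Dense(20, activation='relu')(x)         #dense_4
outputs = keras.layers.Dense(2)(x)                       #dense_5; linear, matching the vars() output below
model = keras.Model(inputs=inputs, outputs=outputs)      #total params: 16,102, matching the summary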

Use vars() to find out what attributes the model has.

002.py


vars(model)

Although the output is not shown here, you can see the attributes that model has. Among them, the attribute _layers appears to hold a list of layer objects.
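If the full vars() dump is too noisy, listing just the attribute names is enough to spot _layers. A minimal check:

002a.py


sorted(vars(model).keys())   #attribute names only; '_layers' appears among them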

I will try to access it.

003.py


model._layers
#The output is separated by line breaks for readability.
#[<keras.engine.input_layer.InputLayer object at 0x138144208>, 
#<keras.layers.core.Dense object at 0x13811af98>, 
#<keras.layers.core.Dense object at 0x138144320>, 
#<keras.layers.core.Dense object at 0x1381443c8>, 
#<keras.layers.core.Dense object at 0x138144518>, 
#<keras.layers.core.Dense object at 0x1381741d0>]

You can also access it as model.layers; presumably a getter property is defined. Compared with model._layers, this list does not include the InputLayer at the beginning.

004.py


model.layers
#The output is separated by line breaks for readability.
#[<keras.layers.core.Dense object at 0x13811af98>, 
#<keras.layers.core.Dense object at 0x138144320>, 
#<keras.layers.core.Dense object at 0x1381443c8>, 
#<keras.layers.core.Dense object at 0x138144518>, 
#<keras.layers.core.Dense object at 0x1381741d0>]
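As an aside, indexing is not the only way to reach a layer: the standard get_layer() method looks one up by name.

004a.py


model.get_layer('dense_5')   #same Dense object as model.layers[4]
#<keras.layers.core.Dense object at 0x1381741d0>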

Examine the last layer

What about the last layer, which has the fewest parameters? Use pprint for readability.

005.py



import pprint
pprint.pprint(vars(model.layers[4]))
#The following is the output.
"""
{'_built': True,
 '_inbound_nodes': [<keras.engine.base_layer.Node object at 0x138174e10>],
 '_initial_weights': None,
 '_losses': [],
 '_metrics': [],
 '_non_trainable_weights': [],
 '_outbound_nodes': [],
 '_per_input_losses': {},
 '_per_input_updates': {},
 '_trainable_weights': [<tf.Variable 'dense_5/kernel:0' shape=(20, 2) dtype=float32, numpy=
array([[-0.07533632,  0.20118327],
       [-0.17458896, -0.44313124],
       [ 0.4008763 ,  0.3295961 ],
       [-0.40597808, -0.02159814],
       [ 0.59269255, -0.15129048],
       [-0.14078082, -0.44002545],
       [ 0.18300773,  0.17778364],
       [ 0.3685053 , -0.36274177],
       [-0.28516215, -0.0659026 ],
       [ 0.45126018, -0.2892398 ],
       [ 0.19851999, -0.39362603],
       [ 0.2631754 ,  0.40239784],
       [ 0.08184562, -0.08194606],
       [-0.43493706,  0.18896711],
       [ 0.36158973,  0.20016526],
       [-0.05036243, -0.20633343],
       [-0.41589907,  0.57210416],
       [-0.10199612, -0.37373352],
       [ 0.30416492, -0.19923651],
       [ 0.02667725, -0.5090254 ]], dtype=float32)>,
                        <tf.Variable 'dense_5/bias:0' shape=(2,) dtype=float32, numpy=array([0.05854932, 0.07379959], dtype=float32)>],
 '_updates': [],
 'activation': <function linear at 0x1380400d0>,
 'activity_regularizer': None,
 'bias': <tf.Variable 'dense_5/bias:0' shape=(2,) dtype=float32, numpy=array([0.05854932, 0.07379959], dtype=float32)>,
 'bias_constraint': None,
 'bias_initializer': <keras.initializers.Zeros object at 0x138180400>,
 'bias_regularizer': None,
 'dtype': 'float32',
 'input_spec': InputSpec(min_ndim=2, axes={-1: 20}),
 'kernel': <tf.Variable 'dense_5/kernel:0' shape=(20, 2) dtype=float32, numpy=
array([[-0.07533632,  0.20118327],
       [-0.17458896, -0.44313124],
       [ 0.4008763 ,  0.3295961 ],
       [-0.40597808, -0.02159814],
       [ 0.59269255, -0.15129048],
       [-0.14078082, -0.44002545],
       [ 0.18300773,  0.17778364],
       [ 0.3685053 , -0.36274177],
       [-0.28516215, -0.0659026 ],
       [ 0.45126018, -0.2892398 ],
       [ 0.19851999, -0.39362603],
       [ 0.2631754 ,  0.40239784],
       [ 0.08184562, -0.08194606],
       [-0.43493706,  0.18896711],
       [ 0.36158973,  0.20016526],
       [-0.05036243, -0.20633343],
       [-0.41589907,  0.57210416],
       [-0.10199612, -0.37373352],
       [ 0.30416492, -0.19923651],
       [ 0.02667725, -0.5090254 ]], dtype=float32)>,
 'kernel_constraint': None,
 'kernel_initializer': <keras.initializers.VarianceScaling object at 0x138174b70>,
 'kernel_regularizer': None,
 'name': 'dense_5',
 'stateful': False,
 'supports_masking': True,
 'trainable': True,                  #######Important here#######
 'units': 2,
 'use_bias': True}
"""

The last layer receives 20 outputs from the previous layer and produces 2 outputs, so the number of weight parameters is 20 x 2 = 40 and the bias parameters are 2.
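This arithmetic can be verified with count_params(), a standard layer method:

005a.py


model.layers[4].count_params()   #20 x 2 weights + 2 biases
#42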

Interestingly, the same weight parameters seem to appear twice (_trainable_weights[0] and kernel).
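Whether these are really one and the same object can be checked with the is operator; given the attributes seen above, the following should hold:

005b.py


model.layers[4]._trainable_weights[0] is model.layers[4].kernel
#True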

What are the weight parameters concretely? Let's check _trainable_weights[0] with type().

006.py


type( model.layers[4]._trainable_weights[0] )
#<class 'tensorflow.python.ops.resource_variable_ops.ResourceVariable'>

It is a TensorFlow ResourceVariable object, but I won't dig any deeper here.

It seems that get_weights() retrieves a list of the weights and biases. The list has two elements.

007.py


type(model.layers[4].get_weights())
#<class 'list'>
len(model.layers[4].get_weights())
#2

The elements of this list are ndarrays. Let's look at them.

008.py


type(model.layers[4].get_weights()[0])  #check the type of element 0 of the list
#<class 'numpy.ndarray'>
model.layers[4].get_weights()[0]  #element 0 of the list
"""
array([[-0.07533632,  0.20118327],
       [-0.17458896, -0.44313124],
       [ 0.4008763 ,  0.3295961 ],
       [-0.40597808, -0.02159814],
       [ 0.59269255, -0.15129048],
       [-0.14078082, -0.44002545],
       [ 0.18300773,  0.17778364],
       [ 0.3685053 , -0.36274177],
       [-0.28516215, -0.0659026 ],
       [ 0.45126018, -0.2892398 ],
       [ 0.19851999, -0.39362603],
       [ 0.2631754 ,  0.40239784],
       [ 0.08184562, -0.08194606],
       [-0.43493706,  0.18896711],
       [ 0.36158973,  0.20016526],
       [-0.05036243, -0.20633343],
       [-0.41589907,  0.57210416],
       [-0.10199612, -0.37373352],
       [ 0.30416492, -0.19923651],
       [ 0.02667725, -0.5090254 ]], dtype=float32)
"""

type(model.layers[4].get_weights()[1])  #check the type of element 1 of the list
#<class 'numpy.ndarray'>
model.layers[4].get_weights()[1]  #element 1 of the list
#array([0.05854932, 0.07379959], dtype=float32)
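So get_weights() returns [kernel, bias] as ndarrays. Checking just the shapes makes the structure obvious:

008a.py


[w.shape for w in model.layers[4].get_weights()]
#[(20, 2), (2,)]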

Extract the layer and build a new model

I'm not sure at what point parameters are initialized when a model is created, so let's check whether they get reinitialized when the layer is embedded in a new model.

Here, we will use the last layer of the five layers of the model as the second layer of the new model.

Since the layer being reused receives 20 inputs, the input layer of the new model must accept 20 inputs.

009.py


import keras   #assumes the trained model from above is still in scope

inputs = keras.layers.Input(shape=(20,))
x = model.layers[4](inputs)
model_new = keras.Model(inputs=inputs, outputs=x)

It connected without errors.
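As a sanity check, model_new.summary() should show the reused dense_5 mapping (None, 20) to (None, 2) with its 42 parameters. The output below is a sketch of the expected result; exact formatting varies by Keras version.

009a.py


model_new.summary()
#dense_5 (Dense)              (None, 2)                 42
#=================================================================
#Total params: 42
#Trainable params: 42
#Non-trainable params: 0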

Let's check whether this layer is the same object as the fifth layer of the previous model, and whether its parameters have changed.

010.py


model_new.layers[1]
#<keras.layers.core.Dense object at 0x1381741d0>  #The object address is the same.
model_new.layers[1].get_weights()[0]#Check if the parameters have changed-->Not changed.
"""
array([[-0.07533632,  0.20118327],
       [-0.17458896, -0.44313124],
       [ 0.4008763 ,  0.3295961 ],
       [-0.40597808, -0.02159814],
       [ 0.59269255, -0.15129048],
       [-0.14078082, -0.44002545],
       [ 0.18300773,  0.17778364],
       [ 0.3685053 , -0.36274177],
       [-0.28516215, -0.0659026 ],
       [ 0.45126018, -0.2892398 ],
       [ 0.19851999, -0.39362603],
       [ 0.2631754 ,  0.40239784],
       [ 0.08184562, -0.08194606],
       [-0.43493706,  0.18896711],
       [ 0.36158973,  0.20016526],
       [-0.05036243, -0.20633343],
       [-0.41589907,  0.57210416],
       [-0.10199612, -0.37373352],
       [ 0.30416492, -0.19923651],
       [ 0.02667725, -0.5090254 ]], dtype=float32)
"""

When I examined this layer again with vars(), the only difference from the earlier output in 005.py was the list stored under the _inbound_nodes key.

011.py


pprint.pprint(model.layers[1]._inbound_nodes)
#[<keras.engine.base_layer.Node object at 0x138144be0>]
pprint.pprint(model_new.layers[1]._inbound_nodes)
#[<keras.engine.base_layer.Node object at 0x138174e10>,
# <keras.engine.base_layer.Node object at 0x110cfe198>]  #One node has been added.

I'll leave keras.engine.base_layer.Node as a topic for another time.

To keep this layer's parameters from changing during training (that is, to freeze the layer), set the trainable property to False.

012.py


model_new.layers[1].trainable=False

Note that the model needs to be compiled for the trainable setting to take effect.
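As a rough sketch (the optimizer and loss below are arbitrary placeholders, not taken from the original page), compiling and then calling summary() should report the 42 parameters as non-trainable:

012a.py


model_new.compile(optimizer='adam', loss='mse')   #placeholder settings for illustration
model_new.summary()
#Total params: 42
#Trainable params: 0
#Non-trainable params: 42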

Summary

  1. You can extract any layer from a trained Keras NN model and reuse it in another model.
  2. You can freeze a layer's parameters so that they don't change during training.
