This article explains the Maxout function and implements it in Python so that it can be used as a Keras layer. The code and a description of it are at the end.
The Maxout function is used as an activation function in deep learning models such as CNNs and DNNs. Its main advantage is that data can be passed to the next layer without changing the spatial size of the data received from the previous layer.
To elaborate: CNNs and DNNs often use a pooling layer to reduce the spatial size of the data, whereas the Maxout function instead reduces the number of dimensions corresponding to the number of channels, leaving the spatial size untouched. As a result, a pooling layer is not strictly necessary, and Maxout is useful when you want to preserve the spatial size of the data as much as possible. (In practice, it is applied after convolutional layers and is often used in combination with a pooling layer.)
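To make the size comparison concrete, here is a small NumPy sketch (with hypothetical shapes) contrasting 2x2 max pooling, which shrinks the spatial size, with Maxout, which shrinks the channel count:

```python
import numpy as np

x = np.random.rand(1, 4, 4, 8)  # (batch, height, width, channels)

# 2x2 max pooling: halves height and width, keeps all 8 channels.
pooled = x.reshape(1, 2, 2, 2, 2, 8).max(axis=(2, 4))
print(pooled.shape)  # (1, 2, 2, 8)

# Maxout with num_units=4: keeps height and width, reduces channels 8 -> 4.
num_units = 4
k = x.shape[-1] // num_units  # channels per group: 8 / 4 = 2
maxout = x.reshape(1, 4, 4, k, num_units).max(axis=3)
print(maxout.shape)  # (1, 4, 4, 4)
```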
When written as a mathematical formula, the Maxout function can be expressed as follows.
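Using the notation of the Maxout Networks paper (referenced at the end), with k the number of inputs per group, output unit i is:

```latex
h_i(x) = \max_{j \in [1, k]} z_{ij}, \qquad z_{ij} = x^{\top} W_{\cdot ij} + b_{ij}
```

In the convolutional setting used in this article, this amounts to taking, for each output channel, the maximum over its group of k input channels at every spatial position.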
What the Maxout function actually does is take the max of the pixels located at the same position across the dimensions (channels, feature maps) in each group and use that as the corresponding pixel of the output data. The figure in the reference below illustrates this.
Reference: https://link.springer.com/article/10.1186/s40537-019-0233-0
In the actual implementation, the number of output dimensions (channels) can be specified. For example, suppose the output has 2 channels and the input has N. In that case, the input channels are divided into two groups of N/2 channels each, and Maxout is performed within each group.
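The grouping boils down to a reshape followed by a max. Here is a minimal NumPy sketch of that trick, using a tiny made-up 1x2x2x4 input with num_units=2 (note the reshape to (..., k, num_units), which means input channel c contributes to output channel c % num_units):

```python
import numpy as np

# Hypothetical example: a 1x2x2x4 input (batch, height, width, channels).
x = np.arange(16, dtype=np.float32).reshape(1, 2, 2, 4)

num_units = 2                 # desired number of output channels
k = x.shape[-1] // num_units  # channels per group: 4 / 2 = 2

# Split the channel axis into (k, num_units) and take the max over the k axis.
y = x.reshape(1, 2, 2, k, num_units).max(axis=3)

print(y.shape)       # (1, 2, 2, 2)
print(y[0, 0, 0])    # channels [0, 1, 2, 3] -> [max(0, 2), max(1, 3)] = [2, 3]
```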
The implementation is as follows. It has been confirmed to work with both TensorFlow 1 and 2.
Maxout.py
import tensorflow as tf
from typeguard import typechecked
import keras


class Maxout(keras.layers.Layer):
    # num_units: the number of dimensions (channels) after the Maxout operation
    # axis: the axis to take the max over (usually the default value;
    #       for channels-first data, specify 1)
    @typechecked
    def __init__(self, num_units: int, axis: int = -1, **kwargs):
        super().__init__(**kwargs)
        self.num_units = num_units
        self.axis = axis

    def call(self, inputs):
        inputs = tf.convert_to_tensor(inputs)
        shape = inputs.get_shape().as_list()
        # Deal with batches of arbitrary size
        for i in range(len(shape)):
            if shape[i] is None:
                shape[i] = tf.shape(inputs)[i]
        num_channels = shape[self.axis]
        if not isinstance(num_channels, tf.Tensor) and num_channels % self.num_units:
            raise ValueError(
                "number of features({}) is not "
                "a multiple of num_units({})".format(num_channels, self.num_units)
            )
        if self.axis < 0:
            axis = self.axis + len(shape)
        else:
            axis = self.axis
        assert axis >= 0, "Found invalid axis: {}".format(self.axis)
        expand_shape = shape[:]
        expand_shape[axis] = self.num_units
        k = num_channels // self.num_units
        expand_shape.insert(axis, k)
        outputs = tf.math.reduce_max(
            tf.reshape(inputs, expand_shape), axis, keepdims=False
        )
        return outputs

    def compute_output_shape(self, input_shape):
        input_shape = tf.TensorShape(input_shape).as_list()
        input_shape[self.axis] = self.num_units
        return tf.TensorShape(input_shape)

    def get_config(self):
        config = {"num_units": self.num_units, "axis": self.axis}
        base_config = super().get_config()
        return {**base_config, **config}
A usage example is shown below.
example.py
from keras.layers import Conv2D
from Maxout import Maxout

# kernel_size, strides, padding, n_units, and input are placeholders
conv2d = Conv2D(64, kernel_size, strides, padding)(input)
maxout = Maxout(n_units)(conv2d)
In this article I explained the Maxout function. In recent studies, Maxout is often used as the activation function in models such as LCNN. I hope you find this article useful.
References: Maxout Networks (https://arxiv.org/pdf/1302.4389.pdf), A Light CNN for Deep Face Representation with Noisy Labels (https://arxiv.org/pdf/1511.02683.pdf)