"Learn from Mosaic Removal: State-of-the-art Deep Learning" by koshian2 takes up the convolution function as an exercise. In this post, I summarize what I understood about it. https://qiita.com/koshian2/items/aefbe4b26a7a235b5a5e
This book is accessible even to someone like me who has only just started machine learning. It explains things in an easy-to-follow order, starting from the basics. I especially liked that it reviews recent papers; books found in general bookstores tend to cover papers that are a year or two old. It's a book that conveys real passion for deep learning, and I'm glad I bought it.
Convolution is the operation behind the convolutional neural network, one of the best-known deep learning models. The word has a somewhat dramatic ring to it, and I find myself wanting to say it out loud.
The convolution is calculated as shown in the image. Take a 3x3 patch from the input matrix, multiply it element-wise with a matrix called the convolution kernel, and output the sum of the products. Then slide the patch by one cell and repeat. This differs from a so-called feedforward network, in which the units of adjacent layers are fully connected (see the figure below).
X=\left(
\begin{array}{ccccc}
0 & 1 & 2 & 3 & 4 \\
5 & 6 & 7 & 8 & 9 \\
10 & 11 & 12 & 13 & 14 \\
15 & 16 & 17 & 18 & 19 \\
20 & 21 & 22 & 23 & 24 \\
\end{array}
\right)
\\
kernel=\left(
\begin{array}{ccc}
0 & 1 & 2 \\
3 & 4 & 5 \\
6 & 7 & 8 \\
\end{array}
\right)
Here, we define the input X as a 5x5 matrix and the kernel as a 3x3 matrix.
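As a small worked example of the rule described above, the output can be written as a double sum over the patch. The symbols Y (output) and K (kernel) and the zero-based indices are my own notation, not the book's. The first output element is expanded by hand from the X and kernel defined above.

```latex
Y_{i,j} = \sum_{m=0}^{2}\sum_{n=0}^{2} X_{i+m,\,j+n}\,K_{m,n},
\qquad i, j \in \{0, 1, 2\}
\\
Y_{0,0} = 0\cdot 0 + 1\cdot 1 + 2\cdot 2
        + 5\cdot 3 + 6\cdot 4 + 7\cdot 5
        + 10\cdot 6 + 11\cdot 7 + 12\cdot 8 = 312
```

Note that the kernel is not flipped here. Strictly speaking this is a cross-correlation, which is what deep learning libraries usually call convolution.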
c.ipynp
import numpy as np

def conv(inputs, kernel):
    outputs = np.zeros((3, 3), inputs.dtype)
    for i in range(3):  # Slide the window down three rows.
        for j in range(3):  # Slide the window across three columns.
            patch = inputs[i:i+3, j:j+3]  # Cut a 3x3 patch out of the input.
            prod = patch * kernel  # Multiply the patch and the kernel element-wise.
            total = np.sum(prod)  # Sum the products.
            outputs[i, j] = total  # Store the value in the output.
    return outputs

# Define the input X: the values 0..24 reshaped into a 5x5 matrix.
X = np.arange(25, dtype=np.float32).reshape(5, 5)
# Define the kernel: the values 0..8 reshaped into a 3x3 matrix.
kernel = np.arange(9, dtype=np.float32).reshape(3, 3)
conv(X, kernel)
Outputs=\left(
\begin{array}{ccc}
312 & 348 & 384 \\
492 & 528 & 564 \\
672 & 708 & 744 \\
\end{array}
\right)
The convolution was computed as expected.
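As a cross-check, the same result can be reproduced without explicit loops. The following is a minimal sketch, assuming NumPy 1.20 or later for `sliding_window_view`; the function name `conv_vectorized` is my own and does not appear in the book.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv_vectorized(inputs, kernel):
    # View every kernel-sized patch of the input: shape (out_h, out_w, kh, kw).
    patches = sliding_window_view(inputs, kernel.shape)
    # Multiply each patch by the kernel element-wise and sum over the patch axes.
    return np.einsum("ijkl,kl->ij", patches, kernel)


X = np.arange(25, dtype=np.float32).reshape(5, 5)
kernel = np.arange(9, dtype=np.float32).reshape(3, 3)
result = conv_vectorized(X, kernel)
print(result)
```

Unlike the loop version, the output size is not hard-coded to 3x3 here; it follows from the shapes of the input and the kernel.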
Image processing can blur an image, convert it to black and white, or enhance its edges. All of this is done with the convolution operation we just summarized.
Here is the original photo. This is a picture I took when I went to the aquarium last year.
Next, process the image with an edge enhancement filter. When using TensorFlow, the order of the tensor axes matters. For images, the order is basically batch, vertical resolution (height), horizontal resolution (width), and channel, so you need to keep this order in mind when adding axes (dimensions). In the program, a single channel is selected with a slice like the following:
c.ipynp
float_img[ :, :, :, i:i+1]
A color image has 3 channels (R, G, B), so i has to be looped over the 4th axis (the channel axis).
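The axis handling can be tried on a small dummy array before touching a real photo. This is a sketch in plain NumPy; the array `image` and its size are hypothetical stand-ins for a loaded picture. It shows why the slice is written as `i:i+1` rather than `i`: the slice keeps the channel axis, so the result still has the four axes (batch, height, width, channel).

```python
import numpy as np

# Hypothetical stand-in for a loaded photo: height 4, width 6, 3 channels (R, G, B).
image = np.zeros((4, 6, 3), dtype=np.uint8)

# Add the batch axis at the front: (batch, height, width, channel).
batched = image[np.newaxis, ...]

# Slicing with 0:1 keeps the channel axis (with size 1).
red_kept = batched[:, :, :, 0:1]
# Indexing with a plain integer drops the channel axis.
red_dropped = batched[:, :, :, 0]

print(batched.shape)      # (1, 4, 6, 3)
print(red_kept.shape)     # (1, 4, 6, 1)
print(red_dropped.shape)  # (1, 4, 6)
```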
c.ipynp
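import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# img is assumed to be a uint8 tensor of shape (batch, height, width, 3) loaded earlier.
# Edge enhancement kernel, shaped (height, width, in_channels, out_channels).
kernel = np.array([-1, -1, -1, -1, 10, -1, -1, -1, -1]).reshape(3, 3, 1, 1)
kernel = 0.5 * kernel.astype(np.float32)
outputs = []
float_img = tf.cast(img, tf.float32) / 255.0
for i in range(3):  # Filter the R, G, and B channels one at a time.
    conv_result = tf.nn.conv2d(float_img[:, :, :, i:i+1], kernel, 1, 'SAME')
    outputs.append(conv_result)
outputs = tf.concat(outputs, axis=-1)  # Join the filtered channels back together.
fig = plt.figure(figsize=(14, 14))
ax = fig.gca()
ax.imshow(outputs[0].numpy())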
The kernel of the edge enhancement filter used is as follows.
kernel=\frac{1}{2}\left(
\begin{array}{ccc}
-1 & -1 & -1 \\
-1 & 10 & -1 \\
-1 & -1 & -1 \\
\end{array}
\right)
In this way, we obtained the same effect as ImageFilter.EDGE_ENHANCE in PIL, a library often used for image processing. This confirms that the convolution operation and image filtering do the same thing. By changing the kernel, you can blur the image, convert it to black and white, or otherwise shape the output as you like. Changing the kernel values depending on the input position would give yet another effect. I would like to learn more about kernels.
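To see how the choice of kernel changes the result, here is a minimal NumPy-only sketch that applies a blur kernel and the edge enhancement kernel to a tiny test image. The helper `filter_same`, the box blur kernel, and the 5x6 test image are my own illustrative assumptions. The zero padding is meant to mimic the 'SAME' setting used above, not to reproduce TensorFlow exactly.

```python
import numpy as np


def filter_same(channel, kernel):
    """Apply a kernel to one 2-D channel, zero-padding so the size is kept."""
    kh, kw = kernel.shape
    padded = np.pad(channel, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(channel, dtype=np.float32)
    for i in range(channel.shape[0]):
        for j in range(channel.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out


# Box blur: every pixel becomes the mean of its 3x3 neighbourhood.
blur = np.full((3, 3), 1.0 / 9.0, dtype=np.float32)
# Edge enhancement kernel from this article.
edge = 0.5 * np.array([[-1, -1, -1], [-1, 10, -1], [-1, -1, -1]], dtype=np.float32)

# Hypothetical test image: dark left half, bright right half.
channel = np.zeros((5, 6), dtype=np.float32)
channel[:, 3:] = 1.0

blurred = filter_same(channel, blur)
enhanced = filter_same(channel, edge)
print(blurred[2])
print(enhanced[2])
```

In the middle row, the blur smooths the step between the two halves, while the edge kernel undershoots on the dark side and overshoots on the bright side of the boundary. Because the edge kernel's weights sum to 1, flat regions away from the border are left unchanged. Values that fall outside the range 0 to 1 would need clipping before display.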
The full program can be found here. https://github.com/Fumio-eisan/convol_20200307