<Course> Deep Learning: Day2 CNN

study-ai


Deep learning

Table of contents

- [Deep Learning: Day1 NN](https://qiita.com/matsukura04583/items/6317c57bc21de646da8e)
- [Deep Learning: Day2 CNN](https://qiita.com/matsukura04583/items/29f0dcc3ddeca4bf69a2)
- [Deep Learning: Day3 RNN](https://qiita.com/matsukura04583/items/9b77a238da4441e0f973)
- [Deep Learning: Day4 Reinforcement Learning / TensorFlow](https://qiita.com/matsukura04583/items/50806b750c8d77f2305d)

Deep Learning: Day2 CNN (Lecture Summary)

Reviewing the Big Picture of Deep Learning – Learning Concepts

  1. Enter values into the input layer.
  2. The values are propagated forward while being transformed by the weights, biases, and activation functions.
  3. The values are emitted from the output layer.
  4. Compute the error between the values from the output layer and the correct (target) values using the error function.
  5. Update the weights and biases so as to reduce the error. (Especially important)
  6. By repeating steps 1 to 5, the output values come closer to the correct values.
(Figures: learning-flow diagram and an image of error backpropagation)

The merit of backpropagation is that derivatives are computed by propagating the error backward from the output, which avoids redundant recursive calculation and reduces computational cost.
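As a minimal sketch of this idea (arbitrary example values, not the course code), the forward results are cached and then reused when the derivative is propagated backward:

```python
# Tiny backpropagation sketch for z = (w*x + b)^2.
w, b, x = 2.0, 1.0, 3.0

# Forward pass: cache the intermediate value u.
u = w * x + b   # u = 7.0
z = u ** 2      # z = 49.0

# Backward pass: the chain rule reuses the cached u
# instead of recomputing the forward pass.
dz_du = 2 * u      # derivative of the squaring step
dz_dw = dz_du * x  # du/dw = x, so dz/dw = 2u * x
print(z, dz_dw)    # 49.0 42.0
```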

Learning techniques for deep models

**Initial weight setting: He initialization** The activation function used with He initial values is the ReLU function. (Figure: He initialization slide) How to set the initial values: divide the weight elements by the square root of the number of nodes in the previous layer and multiply by √2.
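A minimal NumPy sketch of He initialization (the layer sizes are arbitrary assumptions):

```python
import numpy as np

n_prev, n_curr = 784, 100  # nodes in the previous / current layer (example values)

# He initialization: divide by the square root of the previous layer's
# node count and multiply by sqrt(2). Suited to ReLU activations.
W = np.random.randn(n_prev, n_curr) / np.sqrt(n_prev) * np.sqrt(2)
print(W.std())  # roughly sqrt(2 / n_prev)
```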

Batch normalization is a method for suppressing bias in the distribution of input values in units of mini-batches. Where is batch normalization used? ⇒ A layer containing batch-normalization processing is added before or after passing values to the activation function; it is applied to

 u^{(l)}=w^{(l)}z^{(l-1)}+b^{(l)} \quad \text{or} \quad z
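A minimal sketch of the batch-normalization forward computation over one mini-batch (ε, the batch shape, and the scale/shift parameters γ and β are illustrative assumptions, not the course implementation):

```python
import numpy as np

def batch_norm_forward(u, gamma=1.0, beta=0.0, eps=1e-7):
    """Normalize each feature over the mini-batch, then scale and shift."""
    mu = u.mean(axis=0)                    # per-feature mean over the batch
    var = u.var(axis=0)                    # per-feature variance over the batch
    u_hat = (u - mu) / np.sqrt(var + eps)  # zero mean, unit variance
    return gamma * u_hat + beta

u = np.random.randn(8, 4) * 5 + 3          # mini-batch: 8 samples, 4 features
print(batch_norm_forward(u).mean(axis=0))  # approximately 0 for every feature
```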
(Figures: batch-normalization processing and learning-rate optimization methods)

 + 2-3 RMSProp
 + 2-4 Adam
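Minimal sketches of the RMSProp and Adam update rules (the hyperparameter values are common defaults, assumed here rather than taken from the slides):

```python
import numpy as np

def rmsprop_update(param, grad, h, lr=0.01, decay=0.99, eps=1e-7):
    # A moving average of squared gradients scales the step size per parameter.
    h = decay * h + (1 - decay) * grad * grad
    return param - lr * grad / (np.sqrt(h) + eps), h

def adam_update(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    # Adam combines momentum (m) with RMSProp-style scaling (v),
    # plus bias correction for the early steps (t starts at 1).
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    return param - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
```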

About convolutional neural networks

Exercise

DN06_Jupyter Exercise

Consideration of confirmation test

[P12] Find dz/dx using the chain rule.

     z = t^2, \quad t = x + y

⇒ [Discussion] It can be computed as follows.

 \frac{dz}{dx}=\frac{dz}{dt}\frac{dt}{dx}

Since z = t^2, differentiating with respect to t gives \frac{dz}{dt}=2t.

Since t = x + y, differentiating with respect to x gives \frac{dt}{dx}=1.

Therefore

 \frac{dz}{dx}=2t \cdot 1=2t=2(x+y)
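The result can be double-checked with SymPy (a quick verification, not part of the original exercise):

```python
import sympy as sp

x, y = sp.symbols('x y')
z = (x + y) ** 2      # z = t^2 with t = x + y substituted
print(sp.diff(z, x))  # 2*x + 2*y, i.e. 2(x + y)
```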

[P20] The derivative of the sigmoid function takes its maximum value when the input is 0. Select the correct value from the options. (1) 0.15 (2) 0.25 (3) 0.35 (4) 0.45

⇒ [Discussion] Derivative of the sigmoid:

     (sigmoid)'=(1-sigmoid)(sigmoid)

Since the sigmoid function takes the value 0.5 at input 0,

     (sigmoid)'=(1-0.5)(0.5)=0.25
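A quick numeric check of this result (a sketch, not part of the course materials):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

x = np.linspace(-5, 5, 1001)
d = sigmoid(x) * (1 - sigmoid(x))  # derivative of the sigmoid
print(x[d.argmax()], d.max())      # ~0.0 0.25: maximum 0.25 at x = 0
```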

[P28] What kind of problem occurs when the initial values of the weights are set to 0? Explain briefly. ⇒ [Discussion] All weights receive identical updates, so the symmetry between nodes is never broken and a useful gradient cannot be obtained; learning does not proceed correctly. This is why the weight-initialization formulas mentioned above are used.

[P31] List two commonly cited effects of batch normalization. ⇒ [Discussion] (1) The distribution of values in the middle layers becomes appropriate. (2) Learning in the middle layers stabilizes. Although proposed as recently as 2015, this method is now widely used.

[P36] Example challenge (Figure: code-completion question on mini-batch extraction)

Correct answer: data_x[i:i_end], data_t[i:i_end] [Explanation] This slice retrieves one batch-size chunk of data. ⇒ [Discussion] The answer options look similar, so it is easy to pick the wrong one; read them carefully.
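A minimal sketch of the pattern being tested, with arbitrary array shapes (the loop body is a placeholder):

```python
import numpy as np

data_x = np.random.rand(100, 3)  # 100 samples, 3 features (example values)
data_t = np.random.rand(100, 1)  # matching targets
batch_size = 10

for i in range(0, len(data_x), batch_size):
    i_end = i + batch_size
    # The answer pattern: slice one batch from both inputs and targets.
    batch_x, batch_t = data_x[i:i_end], data_t[i:i_end]
    # ... forward pass, loss, and parameter update would go here ...
```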

[P63] Confirmation test

(Figure: confirmation-test question)

⇒ [Discussion] The answer is "a" スクリーンショット 2020-01-02 12.50.27.png It is good to remember it with the figure.

[P68] Answer which of the two graphs shows L1 regularization. ⇒ [Discussion] The answer is the right-hand graph. (Figure: L1/L2 constraint regions)

It is best to remember this with the figure: Lasso (L1) has a characteristic rhombus-shaped constraint region, while Ridge (L2) is circular.

[P69] Example Challenge

(Figure: example-challenge question)

⇒ [Discussion] The answer is (4) param. (Figure: answer explanation) It is best to remember this together with the calculation formula, and to understand L1 and L2 correctly.

[P71] Example challenge

(Figure: example-challenge question)

⇒ [Discussion] The answer is "sign (param)" [Explanation] The L1 norm is|param|So that gradient is added to the gradient of the error. That is, sign(param)Is. sign is a sign function. It is also necessary to understand the sign sign function that appears for the first time.

[P78] Example challenge (Figure: image-cropping question) ⇒ [Discussion] Correct answer: image[top:bottom, left:right, :] [Explanation] Note that the format of the image is (height, width, channel).
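A minimal sketch of the answer, assuming a (height, width, channel) array and arbitrary crop bounds:

```python
import numpy as np

image = np.random.rand(32, 32, 3)        # (height, width, channel), example size
top, bottom, left, right = 4, 28, 4, 28  # crop bounds (example values)

cropped = image[top:bottom, left:right, :]  # slice height and width, keep channels
print(cropped.shape)                        # (24, 24, 3)
```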

[P100] Confirmation test: Answer the size of the output image when a 6×6 input image is convolved with a 2×2 filter, with stride 1 and padding 1. ⇒ [Discussion] Answer: 7×7. With input height (H), input width (W), output height (OH), output width (OW), filter height (FH), filter width (FW), stride (S), and padding (P):

   OH =\frac{H+2P-FH}{S}+1 =\frac{6+2 \cdot 1-2}{1}+1=7
   OW =\frac{W+2P-FW}{S}+1 =\frac{6+2 \cdot 1-2}{1}+1=7

Since this calculation always takes the same form, it is convenient to memorize it as a formula.
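The formula written as a small helper function (a sketch for checking such questions):

```python
def conv_output_size(size, filter_size, stride=1, pad=0):
    """Output size of a convolution along one dimension."""
    return (size + 2 * pad - filter_size) // stride + 1

print(conv_output_size(6, 2, stride=1, pad=1))  # 7
```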

Exercise

DN23_Jupyter Exercise

(Figure: DN23 exercise results)

Result of changing to the ReLU-Xavier combination: (Figure: training results)
Result of changing to the Sigmoid-He combination: (Figure: training results)

DN32_Jupyter Exercise (Dropout)

(Figures: dropout exercise results)

DN35_Jupyter Exercise (im2col)

**[try] Check the processing of im2col**
・Comment out the line doing the transpose inside the function and run the code below.
・Change the size of each dimension of input_data, as well as the filter size, stride, and padding.

⇒ [Discussion] The results of the exercise are as follows.

```python
# Check the processing of im2col
input_data = np.random.rand(2, 1, 4, 4) * 100 // 1  # (batch size, channels, height, width)
print('========== input_data ===========\n', input_data)
print('==============================')
filter_h = 3
filter_w = 3
stride = 1
pad = 0
col = im2col(input_data, filter_h=filter_h, filter_w=filter_w, stride=stride, pad=pad)
print('============= col ==============\n', col)
print('==============================')
```

(Figure: im2col output)

Try changing the size of each dimension of input_data and the filter size, stride, and padding as follows.

```python
# Change the filter size, stride, and padding
filter_h = 6
filter_w = 6
stride = 2
pad = 1
```

(Figure: im2col output with the new parameters)

・It is important to understand that im2col and col2im are not exact inverses: converting to col and back does not reproduce exactly the same array.
・They are also used in different situations in the first place: im2col is used for the convolution itself, while col2im is used to produce the final image-shaped output. (A naive sketch of im2col follows.)
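For reference, a naive loop-based sketch of what im2col computes (illustrative only; the course repository's im2col is vectorized and its row ordering may differ):

```python
import numpy as np

def im2col_naive(x, filter_h, filter_w, stride=1, pad=0):
    """Rearrange image patches into rows so convolution becomes a matrix product."""
    N, C, H, W = x.shape
    out_h = (H + 2 * pad - filter_h) // stride + 1
    out_w = (W + 2 * pad - filter_w) // stride + 1
    x = np.pad(x, [(0, 0), (0, 0), (pad, pad), (pad, pad)])
    cols = []
    for n in range(N):
        for i in range(out_h):
            for j in range(out_w):
                patch = x[n, :, i*stride:i*stride+filter_h, j*stride:j*stride+filter_w]
                cols.append(patch.ravel())  # one row per filter position
    return np.array(cols)

col = im2col_naive(np.random.rand(2, 1, 4, 4), 3, 3)
print(col.shape)  # (8, 9): 2 images x 4 positions, 9 values per 3x3 patch
```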

**[try] Check the processing of col2im**
・Convert the col produced in the im2col check back to an image and inspect it.

⇒ [Discussion]

```python
# Convert the col back to image form with col2im
img = col2im(col, input_shape=input_data.shape, filter_h=filter_h, filter_w=filter_w, stride=stride, pad=pad)
print(img)
```

(Figure: col2im output)

DN37_Jupyter Exercise (3)

(Figure: DN37 exercise results)

・Note that convolution processing takes a long time to train. To work without stress, it is recommended to use a higher-spec PC or a machine equipped with a GPU.
