This time, I would like to introduce a Deep Learning implementation I wrote for study, ***without using any library***. The language is Python.
That said, there is a book I referred to. In this entry, I will introduce that book and explain the code I reimplemented in Python based on the Java code it describes.
I probably do not need to point out here that deep learning is drawing attention as the catalyst of the recent AI boom. Considering how many books of this kind have been published lately, it is easy to see how much attention it is getting.

However, not everyone in the world specializes in AI; if anything, such people are limited to a rather small group. I suspect many people who are not working directly on deep learning right now still feel it cannot be ignored, given that AI will become part of the infrastructure in the near future. That is why so many people think they should study it for the time being, and I am one of them.
When I picked up books on the subject, I felt they tended to be polarized into the following two types (it is only a tendency, and it is quite possible I have simply missed the good ones):

- the type that wears you down with a barrage of formulas
- the type that explains the outline and then leaves you with indigestion by jumping straight to how to use a library
I was (perhaps unnecessarily) reluctant to use a library without knowing anything about its internals, so I kept wondering whether there was a book that went one step further (while neglecting to implement anything myself, orz. Well, deep learning is not my specialty ^^;).

Then a book with just the right feel came out, so I jumped on it and read it in one sitting. It was a big hit.
- [Sugomori, Deep Learning Java Programming: Theory and Implementation of Deep Learning, Impress, 2016.](https://px.a8.net/svt/ejp?a8mat=2NZCQW+6MQUCY+249K+BWGDT&a8ejpredirect=https%3A%2F%2Fwww.amazon.co.jp%2Fdp%2F4844381288%2F%3Ftag%3Da8-affi-271202-22)
The book's concept of implementing everything with simple, even deliberately minimal, models fits my own belief that this is what helps understanding most. Rather than proving everything analytically and ending on a lofty note, the explanations let you grasp the differences in character between methods intuitively, and I think that is worth noting.

After reading it, I wanted to fill in the places where the mathematical development takes a leap, and to dig a little deeper. Reading the following standard books next cleared my head considerably.
- [Okatani, Deep Learning, Kodansha, 2015.](https://px.a8.net/svt/ejp?a8mat=2NZCQW+6MQUCY+249K+BWGDT&a8ejpredirect=https%3A%2F%2Fwww.amazon.co.jp%2Fdp%2F4061529021%2F%3Ftag%3Da8-affi-271202-22)
So, [Sugomori, Deep Learning Java Programming: Theory and Implementation of Deep Learning, Impress, 2016.](https://px.a8.net/svt/ejp?a8mat=2NZCQW+6MQUCY+249K+BWGDT&a8ejpredirect=https%3A%2F%2Fwww.amazon.co.jp%2Fdp%2F4844381288%2F%3Ftag%3Da8-affi-271202-22) goes beyond merely using a library, and I think it is a suitable first read for beginners who find it too high a hurdle to dive straight into a specialized book.
If your goal is specifically to understand Convolutional Neural Networks (CNNs), the following book is by far the best. Although it focuses on CNNs, it definitely contributes to understanding the implementation presented here as well.
- [Saito, Deep Learning from Scratch: Theory and Implementation of Deep Learning with Python, O'Reilly Japan, 2016.](https://px.a8.net/svt/ejp?a8mat=2NZCQW+6MQUCY+249K+BWGDT&a8ejpredirect=https%3A%2F%2Fwww.amazon.co.jp%2Fdp%2F4873117585%2F%3Ftag%3Da8-affi-271202-22)
[Deep Learning Java Programming: Theory and Implementation of Deep Learning](https://px.a8.net/svt/ejp?a8mat=2NZCQW+6MQUCY+249K+BWGDT&a8ejpredirect=https%3A%2F%2Fwww.amazon.co.jp%2Fdp%2F4844381288%2F%3Ftag%3Da8-affi-271202-22) also covers CNNs, so of course you can learn a good deal about them from it as well. I think that book excels at letting you compare the typical methods from a bird's-eye view in a short time, while Deep Learning from Scratch carefully builds up from the basics of neural networks and, before you know it, has arrived at CNNs; it reads almost like a story.

I was able to run the Java sample code described in [Deep Learning Java Programming: Theory and Implementation of Deep Learning](https://px.a8.net/svt/ejp?a8mat=2NZCQW+6MQUCY+249K+BWGDT&a8ejpredirect=https%3A%2F%2Fwww.amazon.co.jp%2Fdp%2F4844381288%2F%3Ftag%3Da8-affi-271202-22), but just running it seemed boring. So this time I decided to port it to Python myself.
The targets are ***Deep Belief Nets (DBN)*** and ***Stacked Denoising Autoencoders (SDA)***. All the credit goes to the author, Mr. Sugomori; I am merely riding on his coattails. Still, porting the code without thinking seemed a rather mindless exercise, so I set myself the following rule:

- **No copy and paste**

In other words, it amounted to a slightly stricter form of copying the code out by hand, lol.
The code is published below.

- GitHub repository
Each algorithm can be run as follows.

DBN:

```
cd <cloned path>/DeepLearningWithPython/DeepNeuralNetworks
python DeepBeliefNets.py
```

SDA:

```
cd <cloned path>/DeepLearningWithPython/DeepNeuralNetworks
python StackedDenoisingAutoencoders.py
```
The software configuration follows that of [Deep Learning Java Programming: Theory and Implementation of Deep Learning](https://px.a8.net/svt/ejp?a8mat=2NZCQW+6MQUCY+249K+BWGDT&a8ejpredirect=https%3A%2F%2Fwww.amazon.co.jp%2Fdp%2F4844381288%2F%3Ftag%3Da8-affi-271202-22), so please refer to the book for details.
The following three types of results are output.
```
-------------------------------
DBN(or SDA) Regression model evaluation
-------------------------------
Accuracy: 100.0 %
Precision:
class 1: 100.0 %
class 2: 100.0 %
class 3: 100.0 %
Recall:
class 1: 100.0 %
class 2: 100.0 %
class 3: 100.0 %
```
The meaning of each result is as follows.
- Accuracy: the percentage of all data that is classified correctly
- Precision: of the data predicted to be positive, the percentage that actually is positive
- Recall: of the data that actually is positive, the percentage that is predicted to be positive
Expressed as formulas:

```math
Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \\
Precision = \frac{TP}{TP + FP} \\
Recall = \frac{TP}{TP + FN}
```
The breakdown of TP, TN, FP and FN is shown in the table below.
| | Predicted positive | Predicted negative |
|---|---|---|
| Actually positive | True Positive (TP) | False Negative (FN) |
| Actually negative | False Positive (FP) | True Negative (TN) |
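As a minimal sketch of how these numbers could be computed (illustrative NumPy only, not the evaluation code from the repository), starting from a confusion matrix whose rows are the true classes and whose columns are the predicted classes:

```python
import numpy as np

def evaluate(confusion):
    """Accuracy, per-class precision and recall from a confusion matrix
    (rows = true classes, columns = predicted classes)."""
    c = np.asarray(confusion, dtype=float)
    accuracy = np.trace(c) / c.sum()        # correct predictions / all data
    precision = np.diag(c) / c.sum(axis=0)  # TP / (TP + FP), per class
    recall = np.diag(c) / c.sum(axis=1)     # TP / (TP + FN), per class
    return accuracy, precision, recall

# Three classes, all predictions correct -> 100 % everywhere, as above.
acc, prec, rec = evaluate([[10, 0, 0],
                           [0, 10, 0],
                           [0, 0, 10]])
print(f"Accuracy: {acc:.1%}")
for k, (p, r) in enumerate(zip(prec, rec), start=1):
    print(f"class {k}: precision {p:.1%}, recall {r:.1%}")
```

With three perfectly predicted classes, this prints 100.0% across the board, matching the output shown earlier.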
The network model used is identical for DBN and SDA.

- Input layer neurons: 60
Both DBN and SDA are trained in the following two stages, as sketched below.

- Pre-training
- Fine-tuning
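The real training code is in the repository above; purely as an illustration of the flow shared by DBN and SDA, the two stages could be skeletonized like this (all names here are hypothetical, not the ones from the book or from my port):

```python
# Hypothetical skeleton of the two-stage training flow (names illustrative).
def train(network, x_train, t_train, pretrain_epochs=1000, finetune_epochs=1000):
    # 1. Pre-training: greedy, layer-wise, unsupervised.
    #    Each hidden layer learns parameters that reproduce its own input
    #    (contrastive divergence for DBN, a denoising autoencoder for SDA).
    layer_input = x_train
    for layer in network.hidden_layers:
        for _ in range(pretrain_epochs):
            layer.pretrain(layer_input)
        layer_input = layer.forward(layer_input)

    # 2. Fine-tuning: supervised training of the whole stack at once,
    #    starting from the pre-trained weights instead of random ones.
    for _ in range(finetune_epochs):
        network.finetune(x_train, t_train)
```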
Both methods use exactly the same model, and both can properly learn test data generated in the same way. But if different methods yield almost identical results, what actually differs between them? That was what I found myself wondering about.

What the two methods have in common is that, layer by layer, they obtain the parameters of a two-layer network whose output matches its input data. In the case of DBN, owing to the nature of the Boltzmann machine, it is the ***input data*** and the network ***state*** that are compared, but if you squint past the details, it is doing essentially the same thing.
They also share the idea of training on data with various kinds of noise added, so that the model becomes more robust to noise when it identifies data in production. The difference can be interpreted as lying in how that noise is introduced; a rough sketch of the contrast follows below.

In the case of DBN, whether a neuron is activated is decided stochastically: even in exactly the same state, it may or may not fire. This characteristic amounts to noise being added automatically while the parameters are trained; the algorithm injects noise on its own.

In the case of SDA, on the other hand, noise is explicitly added to the training data before it is fed to the learner. Because the algorithm proceeds deterministically, it has no way of generating noise by itself.
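Here is a rough sketch of that contrast (illustrative NumPy only; not the code from the repository):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# DBN/RBM style: the activation itself is a stochastic binary sample,
# so noise arises inside the algorithm even for identical inputs.
def stochastic_activation(x, W, b):
    p = sigmoid(x @ W + b)                           # firing probability
    return (rng.random(p.shape) < p).astype(float)   # may differ each call

# SDA style: the forward pass is deterministic, so noise must be
# added explicitly to the input (here: masking noise).
def corrupt(x, corruption_level=0.3):
    mask = rng.random(x.shape) >= corruption_level
    return x * mask                                  # randomly zero components
```

Calling `stochastic_activation` twice with the same `x` can return different binary states, which is exactly the implicit noise described above; `corrupt` produces the same kind of perturbation explicitly on the data side.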
Seen from this angle, the difference in character between the two comes down to this: DBN introduces noise implicitly through its stochastic activations, whereas SDA introduces it explicitly by corrupting the input data.
As for how these characteristics affect learning, and why initializing a deep network this way makes its training go so well, it seems there are still various views. My own thoughts on it are not yet in order, so I will refrain from commenting here lol ^^;
Even with a reference sample program available, implementing it with my own hands deepened my understanding of deep learning at least a little.
If the information posted here is of any use to anyone, I would be more than happy.
I have not implemented dropout or CNNs yet, but I will get to them when I feel like it ^^
- [Sugomori, Deep Learning Java Programming: Theory and Implementation of Deep Learning, Impress, 2016.](https://px.a8.net/svt/ejp?a8mat=2NZCQW+6MQUCY+249K+BWGDT&a8ejpredirect=https%3A%2F%2Fwww.amazon.co.jp%2Fdp%2F4844381288%2F%3Ftag%3Da8-affi-271202-22)
- [Okatani, Deep Learning, Kodansha, 2015.](https://px.a8.net/svt/ejp?a8mat=2NZCQW+6MQUCY+249K+BWGDT&a8ejpredirect=https%3A%2F%2Fwww.amazon.co.jp%2Fdp%2F4061529021%2F%3Ftag%3Da8-affi-271202-22)
- [Aso et al., Deep Learning, Kindai Kagaku-sha, supervised by the Japanese Society for Artificial Intelligence, edited by Kamishima, 2015.](https://px.a8.net/svt/ejp?a8mat=2NZCQW+6MQUCY+249K+BWGDT&a8ejpredirect=https%3A%2F%2Fwww.amazon.co.jp%2Fdp%2F476490487X%2F%3Ftag%3Da8-affi-271202-22)
- [Saito, Deep Learning from Scratch: Theory and Implementation of Deep Learning with Python, O'Reilly Japan, 2016.](https://px.a8.net/svt/ejp?a8mat=2NZCQW+6MQUCY+249K+BWGDT&a8ejpredirect=https%3A%2F%2Fwww.amazon.co.jp%2Fdp%2F4873117585%2F%3Ftag%3Da8-affi-271202-22)