Over the last few days of the year, I worked through Deep Learning from Scratch ❸: Framework Edition (Zero D3 [^1]), which is currently under public review. Personally, I found it an even better book than the original (unnumbered) volume, so I'd like to write down what I found attractive and who would enjoy reading it.
Through from-scratch implementation in NumPy, the book aims at understanding recent frameworks (PyTorch and Chainer) at the code level. True to the "Framework Edition" title, the focus is not only on DL itself but also on how to write efficient code and on Define-by-Run, the design concept underlying these frameworks.
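(To make that concrete, here is a tiny PyTorch sketch of my own, not from the book, of what Define-by-Run means: the graph isn't declared up front, it's built as the code runs, so ordinary Python control flow takes part in it.)

```python
import torch

x = torch.randn(3, requires_grad=True)

# The graph is recorded as this code executes, so plain Python
# loops and branches become part of it -- that is Define-by-Run.
y = x
for _ in range(2):
    if y.sum() > 0:
        y = y * 2
    else:
        y = y + 1

y.sum().backward()
print(x.grad)  # gradients reflect whichever branches actually ran
```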
Here are some things I felt and gained from working through it, in no particular order.
I feel backprop shows up even more often than in the original book. There are five stages (the book's term for chapters) in all, and two of them are devoted to backprop.
Specifically, beyond basic differentiation and backprop, there are quite a few new topics the original book didn't cover, such as automatic differentiation (autograd), higher-order derivatives, and an explanation and implementation of Newton's method. It made me feel, viscerally, that it's "No Backprop, No Deep Learning."
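(As a taste of what higher-order differentiation buys you, here is a sketch of Newton's method using PyTorch's double backprop. This is my own illustration, not the book's code, and the function f(x) = x⁴ − 2x² is just an assumed example.)

```python
import torch

# Minimize f(x) = x^4 - 2x^2 with Newton's method:
#   x <- x - f'(x) / f''(x)
# Computing f''(x) requires differentiating through the gradient,
# i.e. higher-order differentiation.
x = torch.tensor(2.0, requires_grad=True)

for _ in range(10):
    y = x ** 4 - 2 * x ** 2
    gx, = torch.autograd.grad(y, x, create_graph=True)  # f'(x), kept on the graph
    gx2, = torch.autograd.grad(gx, x)                   # f''(x) via double backprop
    with torch.no_grad():
        x -= gx / gx2                                   # Newton update

print(x.item())  # converges toward 1.0, a minimum of f
```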
I found it striking that the Define-by-Run explanation comes not at the very beginning but after you've read a fair amount. By that point the book has already explained nodes, input, creator, output, and so on, so the structure lets you grasp the difference between Define-by-Run and Define-and-Run smoothly.
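(For reference, a heavily simplified sketch of mine, not the book's full implementation, of how input/creator/output can link Variables and Functions so that `backward()` just walks the creator chain.)

```python
import numpy as np

class Variable:
    def __init__(self, data):
        self.data = data
        self.grad = None
        self.creator = None  # the Function that produced this Variable

    def backward(self):
        # Walk the graph backwards via creator links.
        self.grad = np.ones_like(self.data)
        funcs = [self.creator]
        while funcs:
            f = funcs.pop()
            f.input.grad = f.backward(f.output.grad)
            if f.input.creator is not None:
                funcs.append(f.input.creator)

class Function:
    def __call__(self, input):
        output = Variable(self.forward(input.data))
        output.creator = self           # remember who created the output
        self.input, self.output = input, output
        return output

class Square(Function):
    def forward(self, x):
        return x ** 2
    def backward(self, gy):
        return 2 * self.input.data * gy

x = Variable(np.array(3.0))
y = Square()(x)
y.backward()
print(x.grad)  # 6.0
```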
That's only natural given what the book is aiming for, but as you read you notice how strongly it keeps PyTorch in mind. Even if you haven't touched PyTorch yet, I think this will be a big help.
As an example, when learning PyTorch you write code like the snippet below. This book covers, at the implementation level, how a line like the highlighted one works behind the scenes.
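(The original post showed its snippet as an image with a red line, which doesn't survive here; the following is a hypothetical stand-in, a typical beginner training loop. The line the book dissects is presumably something like `loss.backward()`.)

```python
import torch
import torch.nn as nn

# Hypothetical minimal example in the spirit of the post's missing snippet.
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

x = torch.randn(32, 10)
t = torch.randn(32, 1)

for epoch in range(100):
    y = model(x)
    loss = loss_fn(y, t)
    optimizer.zero_grad()
    loss.backward()   # <- the kind of line whose inner workings the book explains
    optimizer.step()
```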
I had a chance to read the official PyTorch book due out this spring, but my impression was that it stops at explaining how to use the framework. In that respect this book is unique; as the author says, there is almost nothing else like it. [^2]
Python's special methods come up frequently, including ones I personally rarely see in the wild (such as `__rmul__`).
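(A toy illustration, not the book's code, of why `__rmul__` matters: without it, `2.0 * v` fails because `float.__mul__` doesn't know about our class.)

```python
class Variable:
    def __init__(self, data):
        self.data = data

    def __mul__(self, other):   # called for v * 2.0
        return Variable(self.data * other)

    __rmul__ = __mul__          # called for 2.0 * v, after float.__mul__ gives up

v = Variable(3.0)
print((v * 2.0).data)  # 6.0
print((2.0 * v).data)  # 6.0 -- works only because __rmul__ is defined
```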
The book also touches on the conditions under which the operators in expressions like x * w + b (+, *, and so on) can be used, and on how to manage memory when the amount of computation becomes enormous.
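(If I remember right, the book breaks the resulting reference cycles with weak references; here is a toy sketch of that idea, again not the book's actual code.)

```python
import weakref
import numpy as np

class Variable:
    def __init__(self, data):
        self.data = data
        self.creator = None

class Function:
    def __call__(self, input):
        output = Variable(self.forward(input.data))
        output.creator = self
        self.input = input
        # Holding the output only weakly breaks the Function <-> Variable
        # reference cycle, so large graphs can be garbage-collected
        # as soon as they become unreachable.
        self.output = weakref.ref(output)
        return output

class Square(Function):
    def forward(self, x):
        return x ** 2

y = Square()(Variable(np.array(3.0)))
print(y.data)                   # 9.0
print(y.creator.output() is y)  # True while y is still alive
```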
If you've used a framework such as PyTorch, the code gradually converges toward something familiar as the book progresses. Personally, it was an uplifting feeling as the background of each PyTorch line I had only vaguely understood clicked into place.
It's like walking to a destination you normally reach by train, and feeling the map in your head expand ("oh, so this road connects here" is maybe closer to the feeling).
Since the book's main theme is frameworks, the first half is devoted to implementing detailed methods (transpose and the like) and to memory-management techniques.
In the second half, once the framework's foundation has solidified, the book moves on to implementing Optimizer, Loss, CNN, LSTM, and so on. Some parts overlap with the original book, so my personal impression is that if you know NumPy and Python to some extent, there's no problem starting from Zero D3.
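(As a flavor of that second half, here is a minimal SGD-style Optimizer sketch in the spirit of what the book builds; the names and structure are my assumption, not its exact code.)

```python
import numpy as np
from types import SimpleNamespace

class SGD:
    """Minimal gradient-descent optimizer; a sketch, not the book's code."""
    def __init__(self, lr=0.01):
        self.lr = lr

    def update(self, params):
        for p in params:              # each parameter carries .data and .grad
            p.data -= self.lr * p.grad

# Toy usage with a stand-in parameter object.
p = SimpleNamespace(data=np.array(1.0), grad=np.array(0.5))
SGD(lr=0.1).update([p])
print(p.data)  # 0.95
```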
Natural language processing (NLP) isn't touched on, but RNNs and LSTMs are introduced in the context of time-series analysis. If you want NLP itself, Zero D2 is the one to read.
I think what characterizes this book is not just explaining DL itself but "understanding the structure of the framework itself through code." My impression is that most books so far offered either "explanation of DL + NumPy implementation" or "explanation of DL + how to use a framework" (though this is my own small sample, so the reality may differ).
In my own case, I had a rough grasp of DL, NumPy, and the framework, and could run the framework on its own without trouble, but there was a gap between that and real understanding (no bridge between them).
So for those who have used PyTorch, I think this book offers the value of strengthening the connection between NumPy (or the formulas) and PyTorch.
On the other hand, for those who are interested in DL but have never used a framework, it offers a smooth on-ramp to one.
Practicality aside, it was fun to unravel the black boxes one by one, find relationships between things I already knew, and connect them. I haven't fully digested it yet, so it's a book I'd like to take a few more laps through.
The public review seems to run until 1/13, so it might be interesting to read it while the review is open.
[^1]: I don't know the standard abbreviation. Is it Zero D, Zero Deep, or just Zero?
[^2]: fast.ai's Deep Learning from the Foundations is probably the closest in positioning: it implements the fastai library using PyTorch. It's overwhelmingly harder than Zero D3 and covers a wider range.