Getting started with deep learning without a GPU

Training a deep network without a GPU-equipped PC can take more than a few weeks, which makes deep learning hard to try. The fact that training sets often need tens of thousands of examples or more raises the bar further. This post therefore introduces a way to train on a PC without a GPU, using only a small amount of data.

Target audience

--People who do not have a PC with a GPU and do not want to rent a GPU instance on AWS, but still want to play with CNNs.
--People who want to build a network specialized for their own purpose instead of using a ready-made one. (Examples: face recognition, judging whether a picture is likely to be popular, etc.)

Problems with deep learning

--On a PC without a GPU, only small problems such as MNIST and CIFAR-10 can be trained in a reasonable amount of time.
--Even a simple image classification problem requires 50,000 or more training images.
--The pretrained networks available online usually target different classes from the ones you actually want to classify. (Example: you want to distinguish faces, vehicle types, or illnesses, but the network is optimized for categories you do not care about.)

A method called transfer learning[^1] addresses these problems. In transfer learning, you take a network that has already been trained on another problem, such as ImageNet, and train only part of it. Many variants have been devised, but what matters in practice is that a pretrained network often works well even when it is repurposed for a different image classification problem.[^2] (Examples: classification of grass and birds, face recognition, emotion recognition, etc.)
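To make "training only part of the network" concrete, here is a toy sketch in NumPy. It is not a real pretrained network: the matrix `W_frozen` is a hypothetical stand-in for layers trained on another problem, and the function name and parameters are my own. The point is only that the early layer is never updated while a small output layer is fitted on top of it.

```python
import numpy as np


def train_head_only(X, y, W_frozen, epochs=300, lr=0.1):
    """Fit a logistic-regression head on top of a frozen feature layer.

    X: (n, d) inputs, y: (n,) labels in {0, 1}.
    W_frozen: (d, h) weights standing in for pretrained layers (assumption).
    """
    # Frozen part: features are computed once and W_frozen is never updated.
    H = np.maximum(X @ W_frozen, 0.0)  # ReLU features, shape (n, h)

    # Trainable part: only these head parameters are learned.
    w = np.zeros(H.shape[1])
    b = 0.0
    losses = []
    eps = 1e-12  # avoids log(0)
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(H @ w + b)))  # predicted probabilities
        loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
        losses.append(float(loss))
        grad = p - y  # gradient of the cross-entropy w.r.t. the logits
        w -= lr * (H.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b, losses
```

Because the frozen features are computed only once, each training step touches just the small head, which is why this kind of training stays cheap on a CPU.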

Benefits of transfer learning

Transfer learning also has the following advantages, which make it a good choice when you just want to try something out.

--It needs little data: a classifier can be built from a small number of training images.
--It can handle fairly large images. (Training a CNN from scratch on 224x224x3 image data takes more than a few weeks.)
--It runs in a realistic amount of time without a GPU.

There are several ways to do transfer learning; recently I have been using the following procedure.

1. Collect the images you want to classify. (Please do your best to collect them.)
2. Remove the images that were collected by mistake.
3. Map each image to a low-dimensional feature vector using a pretrained network with its fully connected layers removed.
4. Learn the relationship between that low-dimensional space and the correct labels with a classifier such as an SVM[^3] or gradient boosting[^4].

Note: when you collect images with Google image search, expect about 30% of them to be wrong.

Because deep learning is used only to map the images into a low-dimensional space, no time-consuming network training is needed. The rest is handled by a separate, fast learning method, so the whole problem can be solved even on a PC without a GPU.
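The feature extraction and classification steps can be sketched roughly as follows. This is a minimal sketch and not the code from the repository: it assumes TensorFlow's bundled Keras and scikit-learn are installed, that the images are already loaded as an RGB array of shape (n, 224, 224, 3), and the function names `extract_features` and `train_svm` are hypothetical.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def extract_features(images, batch_size=32):
    """Map RGB images (n, 224, 224, 3), values 0-255, to (n, 2048) features."""
    # Imported lazily so the SVM stage also works on features computed elsewhere.
    from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input

    # include_top=False drops the fully connected ImageNet classifier;
    # pooling="avg" averages the last feature map into one vector per image.
    extractor = ResNet50(weights="imagenet", include_top=False, pooling="avg")
    x = preprocess_input(np.array(images, dtype="float32"))  # copy, then normalize
    return extractor.predict(x, batch_size=batch_size)


def train_svm(features, labels):
    """Fit a fast classical classifier on the extracted features."""
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    clf.fit(features, labels)
    return clf
```

Usage would look like `clf = train_svm(extract_features(train_images), train_labels)` followed by `clf.predict(extract_features(new_images))`. The network is only run forward once per image, and the SVM itself trains quickly on a CPU. The kernel and `C` value here are placeholders that would need tuning for real data.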

Numerical experiment

I tried to classify a certain anime character using the following three approaches:

A) A simple CNN of the kind that appears in most tutorials (32x32x3 input)
B) A 50-layer ResNet (32x32x3 input; I believe this is what most recent classification work uses)
C) A 50-layer ResNet (pretrained network) + SVM

The correct answer rates were A) 94.33%, B) 94.57%, and C) 93.83%. In other words, the pretrained network with an SVM performed almost as well as solving the problem with a CNN alone.

Note: I seem to have made some mistakes when creating the training data, so someone who prepares the data more carefully could probably improve the performance a little.
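The correct answer rate is simply the accuracy on images that were held out from training. As a rough sketch of how such a number could be measured for the final classification step, the following compares an SVM and gradient boosting on the same feature matrix. This is not the evaluation script behind the figures above: the split ratio, the default hyperparameters, and the function name `compare_classifiers` are all assumptions.

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def compare_classifiers(features, labels, test_size=0.25, seed=0):
    """Return held-out accuracy for an SVM and for gradient boosting."""
    # Stratify so both splits keep the same class proportions.
    X_train, X_test, y_train, y_test = train_test_split(
        features, labels, test_size=test_size, stratify=labels, random_state=seed
    )
    models = {
        "svm": SVC(kernel="rbf"),
        "gradient_boosting": GradientBoostingClassifier(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores
```

Gradient boosting can be noticeably slower than an SVM on vectors with thousands of dimensions, so it may be worth trying it on a subset of the data first.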

Repository

https://github.com/namakemono/keras-anime-face-recognition

The repository contains code for steps 3 and 4 of the procedure above. I would be glad if you find it useful as a reference.

Summary

This post introduced transfer learning, a method that can achieve reasonably good performance without a GPU-equipped PC. How well it works will depend on the application, but since it lets you build a classifier even on a PC without a GPU, why not give it a try?

References