I'm a complete beginner in machine learning, and I came up with an idea for a web application that uses images. I assumed that "deep learning will somehow take care of it", but it did not work, so I consulted senior colleagues at my company who are well known for machine learning.
**A website that lets you answer when someone at a drinking party asks, "Which celebrity is your type?"**
Wouldn't you really want that? That standard drinking-party question has been a nuisance ever since I started working, so I really want a site that lets me answer it along these lines.
I already had some knowledge of deep learning, so my first idea for meeting this requirement was:
--Train a deep learning model on "images of models" so that its output is the user's 5-level rating (from "my type" to "not my type")
By doing so,
--I could predict which of the five levels a "celebrity image" would be rated.
The plan was to build a trained model like this. With tens of thousands of training images, as with JINS BRAIN, the prediction accuracy would presumably be good, but this time I was training on only about 50 images of models. That was far too few, and the model could not predict anything useful.
When I wrote in the in-house daily report that "I am struggling because the accuracy does not improve with 50 images", I received comments from senior engineers who are knowledgeable about AI and from peers who joined the company at the same time as me, so I would like to summarize them here together with my impressions. It is just a memo, so nothing here is especially novel, but I hope other machine learning beginners will read it and get a sense of how deep this field is.
First, although it is not a fundamental solution in this case, the standard technique when you have few images is to generate similar images programmatically and pad out the dataset. http://qiita.com/bohemian916/items/9630661cd5292240f8c7 Following this article, you can train on variants made by, for example, changing the contrast. However, relying too heavily on padded data can easily lead to overfitting, so it should not be overused.
I also received the following comment.
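As a rough illustration of padding a small dataset, here is a minimal sketch in plain Python. It assumes a grayscale image stored as a nested list of 0-255 values. The function names are hypothetical, and the horizontal flip is an extra variant I added for illustration; only the contrast change comes from the text above. A real project would more likely use an image library.

```python
def adjust_contrast(image, factor):
    """Scale each pixel's distance from the mean brightness by `factor`."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    # Clamp to the valid 0-255 range after scaling.
    return [[min(255, max(0, round(mean + factor * (p - mean)))) for p in row]
            for row in image]


def flip_horizontal(image):
    """Mirror the image left to right (an assumed extra augmentation)."""
    return [row[::-1] for row in image]


def augment(image, factors=(0.5, 1.5)):
    """Return the original, its contrast variants, and mirrored copies of each."""
    variants = [image]
    for factor in factors:
        variants.append(adjust_contrast(image, factor))
    variants += [flip_horizontal(v) for v in variants]
    return variants


sample = [[0, 100], [200, 100]]
augmented = augment(sample)  # 1 original + 2 contrast variants, each also mirrored
```

Every variant is derived from the same source image, which is why heavy padding adds little new information and can encourage overfitting.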
With a plain DNN implementation, 50 images is not enough. However, the accuracy can be improved even with a small amount of data by adding, besides the images, other features that are easy to obtain as training data. For example, suppose you want to determine from a photograph of a building's exterior whether it is an apartment, a condominium, or a detached house, but can collect only 50 images. Adding separately obtained data such as the number of floors and the price of the building to the image data raises the accuracy.
I see. In my case, such features would be things like black hair or big eyes.
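To make the idea of adding non-image features concrete, here is a small sketch under my own assumptions: each image has already been turned into a numeric feature vector, and the extra columns (number of floors and price in the building example) are min-max scaled before being appended. The function names and numbers are made up for illustration.

```python
def min_max_scale(column):
    """Scale a list of numbers into the range 0.0-1.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(value - lo) / (hi - lo) for value in column]


def combine_features(image_vectors, extra_columns):
    """Append the scaled non-image features to each image feature vector."""
    scaled = [min_max_scale(column) for column in extra_columns]
    return [list(vector) + [column[i] for column in scaled]
            for i, vector in enumerate(image_vectors)]


# Two buildings: hypothetical image features, then floors and price.
image_vectors = [[0.1, 0.2], [0.3, 0.4]]
floors = [2, 10]
prices = [100, 300]
combined = combine_features(image_vectors, [floors, prices])
```

The combined vectors can then be passed to whatever classifier is used, so the model no longer has to infer everything from 50 images alone.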
I think there is also a way to extract feature points from the image and process those, for example by feeding them into a classifier such as an SVM.
Hmm, so there are ways to classify images other than a DNN. I had of course heard of SVMs, but I had personally assumed they were a relic from an earlier stage of AI development.
For example, if the rule is something like "the class is determined uniquely if a certain logo is present", I think SIFT or SURF features could be used. Personally, I feel that assuming "image recognition = machine learning" means you have stopped thinking.
That's right. There is also the view that machine learning is unnecessary if you can explain in words what features a human face has. SIFT and SURF features were new to me. Reference: https://www.slideshare.net/lawmn/siftsurf
Since what I am trying to do is output people the user is likely to like in the future based on past input, I was also told that it is intuitively close to recommendation. ▼ Reference http://qiita.com/ynakayama/items/59beb40b7c3829cc0bf2
However, collaborative filtering makes predictions by referring to other people's input, so I decided it did not fit the requirements of this service. It suits, for example, a music app some time after its release, where you can say that users who like song A generally also like song B.
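For reference, the way collaborative filtering leans on other users' input can be sketched as follows. This is a minimal user-based version under my own assumptions (cosine similarity over co-rated items, a similarity-weighted average, made-up user and song names), not the method from the linked article.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two users over the items both have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_rating(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        similarity = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += similarity * other_ratings[item]
        weight_total += similarity
    # None means there is no usable overlap with any other user.
    return weighted_sum / weight_total if weight_total else None


ratings = {
    "alice": {"song_a": 5, "song_b": 1},
    "bob": {"song_a": 5, "song_b": 1, "song_c": 4},
    "carol": {"song_a": 1, "song_b": 5, "song_c": 2},
    "dave": {},  # a brand-new user with no history
}
alice_guess = predict_rating(ratings, "alice", "song_c")
dave_guess = predict_rating(ratings, "dave", "song_c")
```

A user with no history, like `dave`, gets no prediction at all, and the same is true when few other users exist, which matches the reason this approach does not fit a service at its initial release.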
Around this point I stopped being able to follow. Still, I will make a note for now. http://blog.amedama.jp/entry/2017/04/02/130530 I do not fully understand it, but is it a mechanism that reduces the dimensionality of multidimensional data while losing as little of its meaning as possible? If so, could the features of an image be expressed in two dimensions...?
Factorization Machines http://qiita.com/wwacky/items/b402a1f3770bee2dd13c At this point I can no longer tell what it would be used for. I could not understand it in one pass, so I will ask again.
http://pythondatascience.plavox.info/scikit-learn/%E3%82%AF%E3%83%A9%E3%82%B9%E3%82%BF%E5%88%86%E6%9E%90-k-means This (cluster analysis with k-means) is what clicked when I thought about the service specification. It is categorized as unsupervised learning. With this method it seems possible to group the model images and find out in advance which group each celebrity image is closest to. In other words, from the images the user entered, you can work out which group is closest to their taste and output it. Rather than building a trained model, it feels like a really simple classification.
This differs from what I originally pictured, but if images can be classified with techniques other than deep learning, there seems to be no need to use deep learning.
The question is what to cluster on. Do I simply use the pixel values of the image as the values, or do I assess skin color, hair color, eye size, and so on, quantify them, and cluster on those?
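The clustering idea above can be sketched in plain Python. This is a bare-bones k-means under several assumptions of mine: each face is already reduced to a short feature vector (for example, hypothetical "hair darkness" and "eye size" scores), the initial centroids are simply the first k points, and the iteration count is fixed. In practice a library implementation such as the scikit-learn one in the linked article would be the natural choice.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def k_means(points, k, iterations=20):
    """Return k centroids after a fixed number of assign/update rounds."""
    centroids = [list(p) for p in points[:k]]  # naive init: first k points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids


# Hypothetical feature vectors for six model images.
model_features = [[0.0, 0.0], [10.0, 10.0], [0.0, 1.0],
                  [10.0, 11.0], [1.0, 0.0], [11.0, 10.0]]
centroids = k_means(model_features, k=2)

# A celebrity image is assigned to the group whose centroid is nearest.
celebrity_group = nearest([9.0, 9.0], centroids)
```

Whether the vectors hold raw pixel values or hand-quantified traits, the algorithm is the same; only the quality of the grouping changes.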
http://qiita.com/PonDad/items/2410c55b2d21e7cad7bc Reinforcement learning also seems like a relatively plausible method: load an image, give a reward if it is "my type", and give a penalty if it is "not my type". However, its purpose seems different, so I concluded there is no reason to choose it over cluster analysis.
http://qiita.com/ynakayama/items/ca3f5e9d762bbd50ad1f This approach seems to learn that people who like one model are highly likely to also like another particular model. It could be used as the service grows, but I feel it cannot be introduced in the initial release. In that respect it is similar to collaborative filtering.
As an extension of the clustering idea, another idea came up: go as far as computing image similarity. Compute in advance the similarity between each sample model image and each celebrity's face, and then choose the celebrity whose total similarity to the user's input is highest. After reading the following articles, I feel I could do this.
--Face recognition without deep learning 3 CNN edition --Blog (Hatena branch office) programmed by NEET http://suzuichiblogpg.hatenadiary.jp/entry/2016/10/31/220809
--Let's judge "regular" from the face image using AI! | Future Tech Blog --Future Architect https://future-architect.github.io/articles/20170526/
--Calculate image similarity with Python + OpenCV by @best_not_best on @Qiita http://qiita.com/best_not_best/items/c9497ffb5240622ede01
In particular, the article that classifies Shiba Inu with Python + OpenCV seems close to my use case of finding a favorite celebrity from face photos.
--Deep Learning that permeates recommendations: An overview of the latest algorithms from practical examples of major services | DeepAge https://deepage.net/deep_learning/2016/09/26/recommend_deeplearning.html
These turned up in a quick search, and articles along these lines are likely to be helpful.
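The similarity idea described earlier (score each celebrity by how similar they are to the model images the user rated, then pick the highest total) could look roughly like this. It is only a sketch under my own assumptions: every image is already a feature vector such as a color histogram, similarity is cosine similarity, and each rating is centered on 3 so that disliked models count against a celebrity. All names and numbers are hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_celebrities(model_features, celebrity_features, user_ratings):
    """Order celebrities by the rating-weighted sum of their similarities."""
    scores = {}
    for name, celebrity_vector in celebrity_features.items():
        scores[name] = sum(
            # Assumption: a 5-level rating centered on 3 acts as the weight.
            (rating - 3) * cosine_similarity(model_features[model], celebrity_vector)
            for model, rating in user_ratings.items()
        )
    return sorted(scores, key=scores.get, reverse=True)


model_features = {"model_1": [1.0, 0.0], "model_2": [0.0, 1.0]}
celebrity_features = {"celeb_x": [0.9, 0.1], "celeb_y": [0.1, 0.9]}
user_ratings = {"model_1": 5, "model_2": 1}  # the user's 5-level ratings

ranking = rank_celebrities(model_features, celebrity_features, user_ratings)
```

Because the model-to-celebrity similarities do not depend on the user, they could be computed once and stored, leaving only the weighted sum to be done per request.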
I had thought deep learning was the only option, but after hearing about the various alternatives, using an algorithm that computes image similarity seems the most realistic.
Because I knew so few approaches, I was surprised at how many there are. I knew the names and outlines of some of them but had written them off as fossilized technology from the road to modern AI. From now on I will keep in mind that each tool has its place.