Writing up a rehash of someone else's project is a bit odd, but I can't help but write about what I made.
A software engineer completely new to machine learning and deep learning built an app that uses a convolutional neural network to identify the faces of the members of "Momoiro Clover Z".
It runs on EC2, but it is slow because it is only a t2.micro instance.
Momoclo image classifier: http://momomind.kenmaz.net
Code: https://github.com/kenmaz/momo_mind
For the last few years at work, I (@kenmaz) have written only iOS apps (Objective-C / Swift). Before that I also wrote web applications, server applications, development tools, and so on in Java / Ruby / PHP, so my knowledge of machine learning and artificial intelligence was close to zero.
I'll describe how I ended up making the app in the title, along with introductions to some of the books I read along the way.
"Will Artificial Intelligence Surpass Humans?" was a hot topic at the end of last year, and it was so interesting that it made me want to actually learn deep learning and try it myself.
Will Artificial Intelligence Surpass Humans? (Kadokawa EPUB Sensho), Yutaka Matsuo: https://www.amazon.co.jp/dp/B00UAAK07S
"The singularity is scary!" I thought, but as science fiction it's good fun, isn't it?
First, I read "An Introduction to Machine Learning Theory for IT Engineers". This is another popular book, the kind you see stacked face-up at the bookstore.
http://gihyo.jp/book/2015/978-4-7741-7698-7
It covers everything from the basics of machine learning to the related mathematics and statistics, such as least squares and maximum likelihood estimation. Deep learning is barely mentioned, but the topics around the perceptron, classification with logistic regression, and Bayesian inference really felt like doing machine learning, and working through them was rewarding.
There is also a web serialization, apparently by the same author as the book above, and I read that as well.
http://gihyo.jp/dev/serial/01/machine-learning
At this point I was in a "differentiation and integration... yes, yes, I did that once" state, so I took a short detour and read "Mathematics for Physics", a document that Mr. Haruaki Tazaki of Gakushuin University has released for free. It had been recommended somewhere as a textbook for high school and first-year university mathematics.
http://www.gakushuin.ac.jp/~881791/mathbook/
I printed it out and read it while back home over the New Year holidays. Looking back now, I'm not sure I really needed to read it.
Feeling that my mathematics was now in decent shape, I reread the "Introduction to Machine Learning Theory" above, but I wanted to sink my teeth into deep learning itself, so the next book was "Deep Learning (Machine Learning Professional Series)".
Deep Learning (Machine Learning Professional Series), Takayuki Okatani: http://www.amazon.co.jp/dp/4061529021
As the title suggests, this book teaches deep learning: neural networks, backpropagation, autoencoders, convolutional neural networks, recurrent neural networks, and so on. I'm no specialist, but it seems to cover the important topics in deep learning. Even for the parts I didn't understand on a first read, going back to them after actually writing some code often produced an "oh, this formula I couldn't follow corresponds to that piece of code I wrote earlier" moment, which really made it feel like learning.
The most striking part was the statement that even researchers don't really understand why adding hidden layers improves accuracy; there was a certain romance in that.
Around this time I felt I had read enough books and wanted to write something, so I started thinking about building something with TensorFlow, which was a hot topic at the time.
This, at last, is the main subject (after a long introduction).
As you can tell from the title, what I made this time is almost identical to what sugyan-san describes in his blog post.
Identify idol faces with deep learning and TensorFlow: http://d.hatena.ne.jp/sugyan/20160112/1452558576
Since I was going to make essentially the same thing, I thought the subject should at least be something other than Momoclo, even if the functionality was the same. But as a fairly serious Mononofu (Momoclo fan) engineer, what sugyan-san had done was simply too appealing, so in the end I used the same material. Sorry for the rehash.
The app consists of four components:

- crawler: a Ruby script that collects Momoclo images from the web
- face_detect: a Python script that crops the face regions out of the images collected by the crawler
- deeplearning: TensorFlow + Python scripts that train a model on the face images as training / test data
- web: a Python script that serves the face-recognition feature as a web application, using the trained model
I don't know Python well, so my first thought was to write most of it in Ruby, but Python has the clear advantage for working with OpenCV (and of course TensorFlow), so I switched to Python as the main language partway through. That's why only the crawler is in Ruby and everything else is Python. I should have just used Python from the start.
crawler
https://github.com/kenmaz/momo_mind/blob/master/crawler/url_fetch.rb
First, `url_fetch.rb` runs a Bing image search with each Momoclo member's name as the keyword and collects the image URLs. Nothing special here. One point of reflection: the Bing image search results mostly hit somewhat old Momoclo images, so it was hard to gather recent ones. It would probably be better to also pull images from the members' Ameba blogs, but for now everything goes through Bing.
https://github.com/kenmaz/momo_mind/blob/master/crawler/download.rb
Once the URL list is ready, `download.rb` downloads the Momoclo images. I ran it overnight, and by the time I woke up in the morning I had a big pile of them.
Since the same URL sometimes appears more than once and images get duplicated, I also tried to avoid duplicates as much as possible by using the hash value of the image binary as the file name when saving.
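The downloader itself is a Ruby script, but the naming-by-content-hash idea looks roughly like this in Python; this is a minimal sketch, and the function name and the choice of SHA-1 are my own assumptions rather than what download.rb actually does:

```python
import hashlib
import os
import urllib.request

def save_image(url, out_dir):
    # Download the image bytes and name the file after their SHA-1 digest,
    # so the same binary fetched from different URLs lands on the same file
    # instead of creating a duplicate.
    data = urllib.request.urlopen(url).read()
    digest = hashlib.sha1(data).hexdigest()
    path = os.path.join(out_dir, digest + ".jpg")
    with open(path, "wb") as f:
        f.write(data)
    return path
```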
So far, an entirely ordinary crawler.
face_detect

Next, the face regions are extracted from the collected images to serve as training data.
https://github.com/kenmaz/momo_mind/blob/master/face_detect/detect.py

`detect.py` uses one of the haarcascade models that ships with OpenCV to detect the face region in each image and crop it out as a separate image.
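For reference, plain haarcascade face detection with OpenCV looks roughly like the sketch below; the cascade path and detection parameters are assumptions of mine, and the real detect.py adds the rotation and eye/mouth checks described next:

```python
import cv2

# Frontal-face cascade bundled with OpenCV; adjust the path to your install
# (for example, cv2.data.haarcascades + "haarcascade_frontalface_default.xml").
CASCADE_PATH = "haarcascade_frontalface_default.xml"

def crop_faces(image_path, out_prefix="face"):
    """Detect frontal faces in one image and save each as a cropped file."""
    image = cv2.imread(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    cascade = cv2.CascadeClassifier(CASCADE_PATH)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1,
                                     minNeighbors=3, minSize=(30, 30))
    for i, (x, y, w, h) in enumerate(faces):
        cv2.imwrite("%s_%d.png" % (out_prefix, i), image[y:y + h, x:x + w])
    return faces
```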
However, detecting faces with this model alone runs into the following problems:
- If the face is tilted even a little, it is not detected as a face
- Regions that are not faces are mistakenly detected as faces
For these issues, I again referred to one of sugyan-san's blog posts.
Simple face detection API with Heroku + OpenCV: http://d.hatena.ne.jp/sugyan/20151203/1449137219
My processing is almost the same as his, and works as follows:
- Rotate the image itself in 5-degree steps and run face detection each time => solves the tilted-face problem
- Because the same image is only being rotated, the same face gets detected multiple times => for each candidate face image, additionally run eye detection and mouth detection, and keep only the candidates that actually look like a face
The eye and mouth detections are evaluated as follows:
- There is one eye each in the upper-left and upper-right of the image
- The left and right eyes are at roughly the same height
- There is one mouth in the lower center of the image
Because the above steps are repeated while rotating the image in 5-degree increments from -50 to +50 degrees, the whole thing is a fairly heavy process.
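A rough sketch of that rotate-and-check loop is below. The cascade files, thresholds, and function names here are my own assumptions for illustration, not the actual contents of detect.py:

```python
import cv2

# Cascades bundled with OpenCV / opencv_contrib (paths assumed).
face_cascade = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier("haarcascade_eye.xml")
mouth_cascade = cv2.CascadeClassifier("haarcascade_mcs_mouth.xml")

def looks_like_face(gray_face):
    """Apply the criteria above: two eyes in the upper half at roughly the
    same height, and one mouth in the lower center (thresholds assumed)."""
    h, w = gray_face.shape[:2]
    eyes = [e for e in eye_cascade.detectMultiScale(gray_face)
            if e[1] + e[3] / 2 < h / 2]                       # upper half only
    mouths = [m for m in mouth_cascade.detectMultiScale(gray_face)
              if m[1] > h / 2 and w / 4 < m[0] + m[2] / 2 < 3 * w / 4]
    if len(eyes) != 2 or len(mouths) != 1:
        return False
    return abs(int(eyes[0][1]) - int(eyes[1][1])) < h * 0.15  # similar height

def detect_rotated(image):
    """Rotate from -50 to +50 degrees in 5-degree steps and keep the face
    crops that pass the eye/mouth check."""
    h, w = image.shape[:2]
    results = []
    for angle in range(-50, 55, 5):
        m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
        rotated = cv2.warpAffine(image, m, (w, h))
        gray = cv2.cvtColor(rotated, cv2.COLOR_BGR2GRAY)
        for (x, y, fw, fh) in face_cascade.detectMultiScale(gray, 1.1, 3):
            if looks_like_face(gray[y:y + fh, x:x + fw]):
                results.append((angle, (x, y, fw, fh),
                                rotated[y:y + fh, x:x + fw]))
    return results
```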
This script is used not only to generate the training data, but also to detect faces in the images users upload once the app is published as a web application. The full processing is far too heavy to reuse as-is, so in the web application I only try the angles -5, 0, and +5 degrees. As a result, face detection often fails there, but that can't be helped.
As for the problem of the same face being detected in several rotated copies of an image, the duplicate face images can be grouped by rotating the coordinates of each detected face back into the original image's coordinate system. The kind of thing I did back in high school.
Linear transformation for rotation: http://www.geisya.or.jp/~mwm48961/kou2/linear_image3.html
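As a sketch, mapping a detection's center back through the rotation and comparing distances could look like the following; the function names, the sign convention, and the distance threshold are assumptions for illustration:

```python
import math

def unrotate_point(x, y, cx, cy, angle_deg):
    """Map a point found in an image rotated by angle_deg around (cx, cy)
    back to the original image's coordinates. The sign of the angle depends
    on how the rotation was applied, so adjust as needed."""
    t = math.radians(-angle_deg)
    dx, dy = x - cx, y - cy
    return (cx + dx * math.cos(t) - dy * math.sin(t),
            cy + dx * math.sin(t) + dy * math.cos(t))

def is_same_face(p1, p2, threshold=30.0):
    """Treat two detections as duplicates of one face if their back-rotated
    centers are within an (assumed) pixel threshold of each other."""
    return math.hypot(p1[0] - p2[0], p1[1] - p2[1]) < threshold
```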
With that, I finally had a large number of face images of the Momoclo members, but to use them as training data, each face also needs a label saying which member it is. The whole point is to build a program that classifies them automatically, but it is a chicken-and-egg problem: a human has to create the initial training data first.
sugyan-san apparently built a web tool for creating training data, but that felt like too much trouble, so for the time being I lined the images up in the Mac Finder, selected batches with cmd + click, and moved them into folders by hand.
The folder structure looks like this:

/out
  /train
    /reni
      ..about 150 images..
    /kanako
      ..about 150 images..
    /shiori
      ..about 150 images..
    /arin
      ..about 150 images..
    /momoka
      ..about 150 images..
  /test
    /reni
      ..about 150 images..
    /kanako
      ..about 150 images..
    /shiori
      ..about 150 images..
    /arin
      ..about 150 images..
    /momoka
      ..about 150 images..
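As a sketch of how this layout can be consumed later, the folders can be walked into (image path, label) pairs; the function and variable names below are illustrative assumptions, not the repository's actual loading code:

```python
import os

# Label order follows the folder names above.
MEMBERS = ["reni", "kanako", "shiori", "arin", "momoka"]

def load_dataset(root):
    """Walk out/train or out/test and return (image_path, label_index) pairs."""
    samples = []
    for label, member in enumerate(MEMBERS):
        member_dir = os.path.join(root, member)
        for name in sorted(os.listdir(member_dir)):
            if name.lower().endswith((".jpg", ".jpeg", ".png")):
                samples.append((os.path.join(member_dir, name), label))
    return samples

train_samples = load_dataset("out/train")
test_samples = load_dataset("out/test")
```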
I wasn't sure how to split the images between training and test, but I ended up with about 150 images for each. My first instinct was that the training data should be as rich as possible, so I tried a training : test = 7 : 3 split, but then the test results could easily be biased depending on which images happen to be chosen as test data.
I did my best to classify a total of 1,500 images by hand, and I honestly think it helped that they were face images of Momoiro Clover Z. Someone suggested, "Why don't you try it with the members of SMAP?", but I really would not want to hand-classify 1,500 face images of middle-aged men. I'm glad it was Momoclo.
That's it for part one. Maybe nobody is asking for it, but I will definitely write part two.
Part 2 is here: http://qiita.com/kenmaz/items/ef0a1308582fe0fc5ea1#_reference-6ae9a51ee7a7a346d3c1