This is the Day 6 article of the SLP_KBIT Part 2 Advent Calendar 2016.
**\* I'm not a lolicon**
In this article I explain Lolinco, a little-girl image judgment system developed at the hackathon held at SLP. The source code can be found here (https://github.com/SLP-KBIT/Lolinco). The system was developed together with @uchiyu, and we split the write-up according to who was in charge of each part.
As the title suggests, Lolinco is a system that determines **whether a given image shows a little girl**. Before going into the internals, please take a look at how the system behaves.
This is the top page of Lolinco. Choose the image file you want to classify with "Select File". When you press the "Loli Judgment" button, the system determines whether a little girl appears in the picture.
You will be notified of the results with a sad expression.
I'm very impressed.
Only one person ... It's okay to see only one person ... I'm satisfied with that ... (Weird things are also recognized as faces, but don't worry)
The work of developing Lolinco was divided as follows.
- Creating the training data (@gembaf)
- Machine learning on the training data (@uchiyu)
- Classifying arbitrary images (@gembaf)

From here, I will explain **creating the training data** and **classifying arbitrary images**, the two parts I was in charge of.
The procedure for creating the training data is roughly divided as follows.
For the part that extracts faces from an image, I referred to the following blog post.
- Face recognition with OpenCV, trimming and saving only the face part [Python]
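As a rough illustration of the face extraction step, here is a minimal sketch assuming the `opencv-python` package and its bundled Haar cascade for frontal faces. The function names `padded_box` and `extract_faces`, the 10% margin, and the 64x64 output size are all assumptions for this example, not values taken from Lolinco itself.

```python
from pathlib import Path


def padded_box(x, y, w, h, img_w, img_h, margin=0.1):
    """Expand a face rectangle by `margin` on each side, clamped to the image."""
    dx, dy = int(w * margin), int(h * margin)
    left, top = max(x - dx, 0), max(y - dy, 0)
    right, bottom = min(x + w + dx, img_w), min(y + h + dy, img_h)
    return left, top, right, bottom


def extract_faces(image_path, out_dir, size=(64, 64)):
    """Detect faces in one image, save each crop, and return the saved paths."""
    import cv2  # imported lazily so padded_box can be used without OpenCV

    # Haar cascade shipped with the opencv-python package (assumed installed).
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    image = cv2.imread(str(image_path))
    if image is None:  # unreadable or missing file
        return []

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=3)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    img_h, img_w = gray.shape
    saved = []
    for i, (x, y, w, h) in enumerate(faces):
        left, top, right, bottom = padded_box(
            int(x), int(y), int(w), int(h), img_w, img_h
        )
        # Crop the face region and resize it to a fixed size for training.
        crop = cv2.resize(image[top:bottom, left:right], size)
        out_path = out_dir / f"{Path(image_path).stem}_{i}.jpg"
        cv2.imwrite(str(out_path), crop)
        saved.append(out_path)
    return saved
```

Calling `extract_faces("girl_01.jpg", "faces/")` (hypothetical file names) would write one cropped file per detected face. A Haar cascade also produces false positives, which fits the observation that odd things sometimes get recognized as faces.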
The procedure for classifying an arbitrary image is roughly divided as follows.
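One plausible shape for this flow is: cut the faces out of the uploaded image, score each face with the trained model, and combine the per-face results. The sketch below only covers the last two steps. The `classify` callable stands in for the trained model covered in @uchiyu's article, and both the 0.5 threshold and the "one positive face is enough" rule are assumptions for illustration, not confirmed details of Lolinco.

```python
def judge_image(face_crops, classify, threshold=0.5):
    """Return (is_positive, per_face_scores) for one uploaded image.

    face_crops: face regions already cut out of the image
    classify:   hypothetical callable returning a score in [0, 1] per crop
    """
    scores = [classify(crop) for crop in face_crops]
    # Assumption: a single face at or above the threshold flags the image.
    is_positive = any(score >= threshold for score in scores)
    return is_positive, scores


# Usage with a dummy classifier standing in for the real model.
dummy_scores = {"face_a": 0.1, "face_b": 0.9}
result, scores = judge_image(["face_a", "face_b"], dummy_scores.get)
```

Keeping the classifier as a parameter means the web application code does not need to know how the model was trained. An image with no detected faces simply comes back negative.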
Creating the training data requires a large number of images, and this time I used Google image search to find them. However, mechanically collecting image search result pages without using the API provided by Google appears to be against the rules. For that reason I collected all of the images by hand, probably because the long development session had put me in a strange mood. I would also like to share what I learned along the way.
To collect loli images as training data, I looked for the search term that worked best. In the end, searching for "(age) girl" returned good images. This time I collected images of girls aged 5 to 12. The training data also needs negative examples, of course, so I gathered those with some ad hoc searches.
The results of trial and error are summarized below.
Search term | Impressions |
---|---|
(age) girl | Works well |
(age) Girl image | Same as above |
(age) Girl | Girls do appear, but in many images the face is hard to make out. More images like the "children in conflict areas" shown in the news |
Loli girl | More adults who look like lolis. 2D images also increase |
Loli girl image | The number of 2D (R18) images increases significantly |
Girl loli image | The number of 3D (R18) images increases significantly |
For the time being, I found that combining **loli** and **image** in a search is dangerous.
In this article I covered the web application we developed. If you are interested in the machine learning part itself, please look forward to tomorrow's article by @uchiyu. Lolinco is not very accurate, probably because of the small amount of training data, and it often recognizes landscapes and clothes as faces. Still, I think machine learning and image recognition can be used to build more interesting things depending on the idea, so I would like to keep exploring them. Please feel free to try it out!