Part 1: Creating an AI that identifies Zuckerberg's face with deep learning ① (Preparing the training data)
Part 2: Creating an AI that identifies Zuckerberg's face with deep learning ② (Building the AI model)

Continuing from those, this article is Part 3. What we are building is an "AI that identifies Zuckerberg's face", using TensorFlow, Google's deep learning library. Sample videos of what I made this time are here.
Continuing from last time, I will write the TensorFlow processing. The overall flow of the TensorFlow part is **"(1) Design the TensorFlow learning model (done!) → (2) Train it on the face data → (3) Use the trained result to judge the face in any image"**.
In Part 2, **"(1) Design the TensorFlow learning model (neural network)"** was completed, so this time we will actually **train it on the face image data** of Zuckerberg, Bill Gates, and Elon Musk collected in large quantities in [Part 1](http://qiita.com/AkiyoshiOkano/items/72f3e4ba9caf514460ee).
It finally feels like we are getting to the real deep learning with TensorFlow. (The minimum set of books and articles I referred to as prerequisites for deep learning and TensorFlow is summarized in Part 2, so please refer to that if needed.)
Let's actually have the AI learn the data! I'm looking forward to it!
We will now write the TensorFlow code that actually trains on the training data. As I wrote in the previous article, the directory structure of this project looks like this.
Directory structure
/tensorflow
  main.py (the learning model and the training code go here)
  eval.py (a file that returns the judgment result for any image)
  /data (the face data collected in the previous article)
    /train
      /zuckerbuerg
      /elonmusk
      /billgates
      data.txt
    /test
      /zuckerbuerg
      /elonmusk
      /billgates
      data.txt
Apart from these, the tensorflow folder also contains the folders and files created when TensorFlow was installed.
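The contents of data.txt are not shown in the article, but judging from the reading code in `main.py` below (first column: image path, second column: label number), each line should look roughly like this (the file names and the label order here are just an assumed example):

```
./data/train/zuckerbuerg/image_001.jpg 0
./data/train/elonmusk/image_002.jpg 1
./data/train/billgates/image_003.jpg 2
```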
We will add the training code to the `main.py` file used last time.

(The code that **builds the learning model created last time** also belongs in this `main.py` file, but including it again would make this long and duplicated, so I have omitted it from the listing below. When you actually run it, please insert the learning model code from the previous part into this file. It may also be good to put it in a separate file and import it, as sketched below.)
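If you do split it out, a minimal sketch of the idea might look like this (the file name `model.py` is only an assumption; `inference()` is the model-building function from Part 2, unchanged):

```python
# main.py -- minimal sketch of the "separate file" idea, assuming the
# inference() function from Part 2 was moved unchanged into a file named
# model.py (hypothetical name) in the same directory.
from model import inference

# inference(images_placeholder, keep_prob) builds the network and returns
# the softmax output, which is used below as `logits`.
```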
main.py (data training part)
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
import cv2
import random
import numpy as np
import tensorflow as tf
import tensorflow.python.platform
# Number of identification labels (3 this time)
NUM_CLASSES = 3
# Image size used for training (px)
IMAGE_SIZE = 28
# Number of dimensions of the image (28px * 28px * 3 (color))
IMAGE_PIXELS = IMAGE_SIZE*IMAGE_SIZE*3
# flags is a TensorFlow built-in mechanism for registering constant-like values
# together with default values and help-text descriptions
flags = tf.app.flags
FLAGS = flags.FLAGS
# Training data
flags.DEFINE_string('train', './data/train/data.txt', 'File name of train data')
# Test data
flags.DEFINE_string('test', './data/test/data.txt', 'File name of test data')
# Folder where the TensorBoard data is stored
flags.DEFINE_string('train_dir', './data', 'Directory to put the training data.')
# Number of training steps
flags.DEFINE_integer('max_steps', 100, 'Number of steps to run trainer.')
# How many images to use in one training batch
flags.DEFINE_integer('batch_size', 20, 'Batch size. Must divide evenly into the dataset sizes.')
# If the learning rate is too small, training will not progress; if it is too large,
# the error will not converge or will diverge. Delicate.
flags.DEFINE_float('learning_rate', 1e-4, 'Initial learning rate.')
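# (Aside, not in the original listing) Because these are tf.app.flags definitions,
# the defaults above can presumably be overridden from the command line, e.g.:
#   python main.py --max_steps 200 --batch_size 10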
# ------------------------------------------------
##################################################
# The code that builds the learning model, written in the previous article's #
# main.py, goes in here.                                                      #
# It may be good to put just this part in a separate file and import it.     #
##################################################
# ------------------------------------------------
# Calculate how much "error" there is between the prediction result and the correct answer
# logits is the calculation result: float - [batch_size, NUM_CLASSES]
# labels is the correct label:      int32 - [batch_size, NUM_CLASSES]
def loss(logits, labels):
    # Calculate the cross entropy
    cross_entropy = -tf.reduce_sum(labels*tf.log(logits))
    # Register it so that it is displayed in TensorBoard
    tf.scalar_summary("cross_entropy", cross_entropy)
    # Return the error value (cross_entropy)
    return cross_entropy
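# (Aside, not in the original listing) Taking tf.log(logits) directly can produce NaN
# if the softmax output ever becomes exactly 0. A commonly used, more numerically
# stable alternative in this old TensorFlow API is
#   tf.nn.softmax_cross_entropy_with_logits(logits, labels)
# but it expects the unscaled values from the layer *before* the softmax, so
# inference() would have to return those instead.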
# Train the designed learning model with error backpropagation, based on the error (loss)
# I'm not entirely sure what happens behind the scenes, but my understanding is that
# the weights (w) and other parameters of each layer of the model are adjusted and
# optimized based on the error (?)
# (per the explanation in the book "Will Artificial Intelligence Surpass Humans?")
def training(loss, learning_rate):
    # It seems this single function takes care of all of that
    train_step = tf.train.AdamOptimizer(learning_rate).minimize(loss)
    return train_step
# Calculate the accuracy of the prediction results that the model produces at inference
def accuracy(logits, labels):
    # Compare whether the predicted label and the correct label are equal; True if they match
    # argmax returns the index of the largest value in the array
    # (= the label number that looks most likely to be the correct answer)
    correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
    # Cast the boolean correct_prediction to float and take the mean to get the accuracy
    # (false is converted to 0 and true to 1 for the calculation)
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    # Register it so that it is displayed in TensorBoard
    tf.scalar_summary("accuracy", accuracy)
    return accuracy
if __name__ == '__main__':
    # Load the training images and convert them into a format that TensorFlow can read
    # Open the file
    f = open(FLAGS.train, 'r')
    # Arrays to hold the data
    train_image = []
    train_label = []
    for line in f:
        # Strip the newline and split on spaces
        line = line.rstrip()
        l = line.split()
        # Read the image data and shrink it to 28x28
        img = cv2.imread(l[0])
        img = cv2.resize(img, (IMAGE_SIZE, IMAGE_SIZE))
        # Flatten the image into one row and normalize it to float values between 0 and 1
        train_image.append(img.flatten().astype(np.float32)/255.0)
        # Prepare the label in 1-of-k (one-hot) format
        tmp = np.zeros(NUM_CLASSES)
        tmp[int(l[1])] = 1
        train_label.append(tmp)
    # Convert to numpy format
    train_image = np.asarray(train_image)
    train_label = np.asarray(train_label)
    f.close()

    # Similarly, convert the test images into a format that TensorFlow can read
    f = open(FLAGS.test, 'r')
    test_image = []
    test_label = []
    for line in f:
        line = line.rstrip()
        l = line.split()
        img = cv2.imread(l[0])
        img = cv2.resize(img, (IMAGE_SIZE, IMAGE_SIZE))
        test_image.append(img.flatten().astype(np.float32)/255.0)
        tmp = np.zeros(NUM_CLASSES)
        tmp[int(l[1])] = 1
        test_label.append(tmp)
    test_image = np.asarray(test_image)
    test_label = np.asarray(test_label)
    f.close()

    # Everything below is added to the graph that is output to TensorBoard
    with tf.Graph().as_default():
        # Placeholder Tensor for the images
        # (room for any number (None) of images with IMAGE_PIXELS (28*28*3) dimensions)
        images_placeholder = tf.placeholder("float", shape=(None, IMAGE_PIXELS))
        # Placeholder Tensor for the labels
        # (room for any number (None) of labels with NUM_CLASSES (3) dimensions)
        labels_placeholder = tf.placeholder("float", shape=(None, NUM_CLASSES))
        # Placeholder Tensor for the dropout keep rate
        keep_prob = tf.placeholder("float")

        # Build the model with inference()
        logits = inference(images_placeholder, keep_prob)
        # Calculate the loss with loss()
        loss_value = loss(logits, labels_placeholder)
        # Train and adjust the parameters of the model with training()
        train_op = training(loss_value, FLAGS.learning_rate)
        # Calculate the accuracy
        acc = accuracy(logits, labels_placeholder)

        # Prepare for saving the model
        saver = tf.train.Saver()
        # Create a Session (TensorFlow calculations must be run inside a Session)
        sess = tf.Session()
        # Initialize the variables (done when the Session starts)
        sess.run(tf.initialize_all_variables())
        # Settings for the values displayed in TensorBoard
        summary_op = tf.merge_all_summaries()
        # train_dir specifies the path where the TensorBoard logs are written
        summary_writer = tf.train.SummaryWriter(FLAGS.train_dir, sess.graph_def)

        # Actually run the training max_steps times
        for step in range(FLAGS.max_steps):
            for i in range(len(train_image)/FLAGS.batch_size):
                # Train on batch_size images at a time
                batch = FLAGS.batch_size*i
                # Specify the data to put into the placeholders with feed_dict
                sess.run(train_op, feed_dict={
                    images_placeholder: train_image[batch:batch+FLAGS.batch_size],
                    labels_placeholder: train_label[batch:batch+FLAGS.batch_size],
                    keep_prob: 0.5})

            # Calculate the accuracy after each step
            train_accuracy = sess.run(acc, feed_dict={
                images_placeholder: train_image,
                labels_placeholder: train_label,
                keep_prob: 1.0})
            print "step %d, training accuracy %g"%(step, train_accuracy)

            # Add the values to display in TensorBoard after each step
            summary_str = sess.run(summary_op, feed_dict={
                images_placeholder: train_image,
                labels_placeholder: train_label,
                keep_prob: 1.0})
            summary_writer.add_summary(summary_str, step)

        # Display the accuracy on the test data after training has finished
        print "test accuracy %g"%sess.run(acc, feed_dict={
            images_placeholder: test_image,
            labels_placeholder: test_label,
            keep_prob: 1.0})

        # Save the final model after training
        # "model.ckpt" is the output file name
        save_path = saver.save(sess, "model.ckpt")
This completes the processing of the learning part of TensorFlow.
By the way, this code ends up with almost the same structure as kivantium's article "Identifying the production company of the anime Yuruyuri with TensorFlow". At first I tried various things on my own, but I couldn't get them to work well, so in the end I followed the same approach. m(_ _)m
For the data reading part, it seems you can write it more cleanly using the `tf.TextLineReader` and `decode_jpeg` functions built into TensorFlow instead of OpenCV, but I couldn't get that to work well either... orz
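For reference, an untested sketch of that TensorFlow-native reading pipeline might look like the following (same old queue-based API as the rest of this post; batching and resizing are omitted, and this is an assumption rather than code actually used in the article):

```python
# Untested sketch of the tf.TextLineReader + decode_jpeg approach.
filename_queue = tf.train.string_input_producer([FLAGS.train])
reader = tf.TextLineReader()
_, line = reader.read(filename_queue)

# Each line of data.txt is "<image path> <label>", so split it on spaces.
image_path, label = tf.decode_csv(line, record_defaults=[[''], [0]], field_delim=' ')

# Read and decode the JPEG, then scale the pixel values to the 0-1 range.
image = tf.image.decode_jpeg(tf.read_file(image_path), channels=3)
image = tf.cast(image, tf.float32) / 255.0

# To actually feed the training loop you would still need resizing to 28x28,
# tf.train.batch(), and tf.train.start_queue_runners() inside the Session.
```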
I did think, "Writing an article that ends up basically the same as an existing one is a bit...", so at least I wrote the explanatory comments in the code in more detail for super beginners like me. (Please let me know if anything in the explanation is wrong. m(_ _)m)
Now that the learning model is designed and the training code is written, place the `main.py` file in the directory together with the face image data, activate the TensorFlow environment with `source bin/activate`, and run `python main.py`. Training should actually start and output a `model.ckpt` file containing the final result. Let's actually run the training!
① `cd tensorflow` (move to the tensorflow directory)
② `source bin/activate` (activate the TensorFlow environment)
③ `python main.py` (run the training!)
④ A `model.ckpt` file with the final training result is generated
Once the code is written, running it is very easy. In my case, training for 200 steps took about 30 minutes. With a much larger number of training steps it would probably take a very long time.
You can see how the accuracy (accuracy) and the error (cross_entropy) change during training, because they are output as graphs in TensorBoard.
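Since the `train_dir` flag above points at `./data`, the TensorBoard logs end up there, and the graphs can presumably be viewed by starting TensorBoard with something like:

```
tensorboard --logdir=./data
```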
Even with the same learning model, I think the training result changes depending on the training data, but in my case it looked like this. The accuracy during training is also printed to the console. By the way, I chose the number of training steps (100 to 200) and the batch size (10 to 20) rather arbitrarily, referring to other people's articles. I'd like to know what values are actually good.
** Accuracy graph **
I changed the batch size and the number of training steps and ran the training three times. In the first and second runs the accuracy stayed at 0.33 (1/3) to the very end and did not change at all, but when I ran the training a third time under the same conditions as the second, the accuracy rose properly as the steps progressed. (I cannot make sense of why the accuracy suddenly started rising only on the third run; I would love for someone to tell me what happened. m(_ _)m)
(Console output from the third training run. The accuracy reached 1 at around step 40.)
In my case the accuracy reached 1 in about 40 steps, which felt strangely fast compared to other people's cases, so I suspected **"overfitting"**: a model optimized only for the training data that cannot actually be used. But the accuracy on the test data used for verification afterwards was about 97%, so I decided there was probably no problem and moved on. (I also thought, "Is it really okay for the training accuracy to be 1? lol" I wonder if it was really fine to proceed as it was...)
** Cross_entropy (error) graph **
The cross_entropy graph looks like this. (From the third training run.)
Having gotten this far, my impression was: **"The same goes for designing the learning model (neural network), but tuning the number of training steps and the batch size feels like a part that requires sense and experience."** It's hard!
That's it for the training. TensorFlow apparently has built-in functions that make it easy to augment (inflate) the training data by flipping images or changing their color tone, and I wanted to use them, but the third training run seemed to give the result I was looking for even without augmentation, so I skipped it this time. (Reference: Learning how to augment images from the TensorFlow code)
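For reference, the built-in augmentation functions referred to there are, as far as I know, along these lines (a rough sketch, not code used in this article; they operate on a single `[height, width, 3]` image tensor before it is flattened):

```python
# Rough sketch of TensorFlow's built-in image augmentation (not used in this article).
augmented = tf.image.random_flip_left_right(image)                      # random horizontal flip
augmented = tf.image.random_brightness(augmented, max_delta=0.3)        # random brightness shift
augmented = tf.image.random_contrast(augmented, lower=0.7, upper=1.3)   # random contrast change
```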
This was my first time training anything with TensorFlow, so there was a lot of trial and error. If you have any advice like "you should do this here instead", please let me know. m(_ _)m
Next, we will use the trained model data to judge the face in any image, and implement it so that it can be run from a web interface using TensorFlow and Flask, a Python web application framework.
Part 4, the continuation, is here → Creating an AI that identifies Zuckerberg's face with deep learning ④ (building the web app)
**"Creating an AI that identifies Zuckerberg's face with deep learning"**
GitHub: https://github.com/AkiyoshiOkano/zuckerberg-detect-ai
Bonus: some judgment results. This time the specification is that **if none of the three people (Zuckerberg, Bill Gates, and Elon Musk) scores 90% or higher, the result is displayed as "none of the three"**.
Takahashi of the comedy duo Savanna, who is often teased for looking like Zuckerberg [^1]
The AI properly telling Zuckerberg and Savanna's Takahashi apart
Mr. Zuckerberg
[^1]: Source: http://matsukonews.com/1555