Recognize your boss and hide the screen with Deep Learning

Background

Everyone browses non-work content at work now and then, right?

When that happens, it gets awkward if the boss sneaks up behind me. I could switch screens in a hurry, but that looks suspicious, and when I'm concentrating I may not notice him at all. So, to switch screens without raising suspicion, I built a system that automatically recognizes the boss approaching and hides the screen.

Specifically, I use Keras to train a model on the boss's face, and a camera to recognize when he is approaching and switch the screen.

Mission

The mission is to switch screens automatically when the boss approaches. The situation is as follows. [Image: situation.png]

The distance from the boss's seat to mine is about 6 to 7 m. He reaches my seat 4 to 5 seconds after leaving his own, so the screen has to be hidden within that time. There is not much time to spare.

Strategy ~ How to hide the screen? ~

There are many possible strategies, but this is the one I came up with. First, use machine learning to teach the computer the boss's face in advance. Then, set up a webcam at my seat and switch the screen when the webcam catches the boss's face. The perfect strategy. Let's call this wonderful system "**Boss Sensor**". [Image: Screenshot 2016-07-05 16.16.06.png]

Boss sensor system configuration

The super-simple system configuration of the Boss Sensor is as follows. [Image: Screenshot 2016-07-05 16.24.23.png]

  • Acquire images in real time using a webcam.
  • Run face detection and face recognition on the acquired image using the trained model.
  • If the recognition result is the boss, switch the screen.
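
The three steps above can be sketched as a small control loop. This is only an illustration: `run_boss_sensor`, `detect_faces`, `is_boss`, and `hide_screen` are hypothetical names chosen for this sketch, not names from the project's source, and the real detector, classifier, and screen switcher are passed in as plain callables.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

Frame = TypeVar("Frame")
Face = TypeVar("Face")


def run_boss_sensor(
    frames: Iterable[Frame],
    detect_faces: Callable[[Frame], List[Face]],
    is_boss: Callable[[Face], bool],
    hide_screen: Callable[[], None],
) -> Optional[int]:
    """Scan frames in order; hide the screen at the first boss sighting.

    Returns the index of the frame that triggered the switch, or None if
    the stream ended without the boss being recognized.
    """
    for index, frame in enumerate(frames):
        # Step 2: detect faces, then classify each detected face.
        for face in detect_faces(frame):
            if is_boss(face):
                # Step 3: switch the screen once and stop watching.
                hide_screen()
                return index
    return None
```

Keeping the three parts behind plain callables means each one can be developed and checked on its own before they are wired together.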

The following techniques are required to do these:

  • Get face image
  • Face image recognition
  • Screen switching

Let's examine them one by one and finally integrate them.

Get face image

First, get the image from the webcam. The webcam used this time is the [BUFFALO BSW20KM11BK](https://www.amazon.co.jp/%E3%83%90%E3%83%83%E3%83%95%E3%82%A1%E3%83%AD%E3%83%BC-BSW20KM11BK-iBUFFALO-%E3%83%9E%E3%82%A4%E3%82%AF%E5%86%85%E8%94%B5200%E4%B8%87%E7%94%BB%E7%B4%A0WEB%E3%82%AB%E3%83%A1%E3%83%A9-120%C2%B0%E5%BA%83%E8%A7%92%E3%82%AC%E3%83%A9%E3%82%B9%E3%83%AC%E3%83%B3%E3%82%BA%E6%90%AD%E8%BC%89%E3%83%A2%E3%83%87%E3%83%AB/dp/B008AO4KXQ). [Image: s_DSC_0081.JPG]

I think any webcam will do as long as its performance is reasonable.

The bundled software can grab images from the camera, but with the later processing in mind it is better to capture them from a program. Face recognition comes next, so only the face region needs to be cut out. I therefore decided to capture face images using Python and OpenCV. Here is the code for that:

I was able to get a better face image than I expected. [Image: Screenshot 2016-09-15 9.49.32.png]
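
A minimal sketch of this step, assuming the frontal-face Haar cascade that ships with the `opencv-python` package. The function names `largest_box` and `capture_face`, the camera index 0, and the detector parameters are illustrative assumptions, not the project's actual code.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), as OpenCV returns


def largest_box(boxes: List[Box]) -> Box:
    """Pick the detection with the largest area (likely the nearest face)."""
    return max(boxes, key=lambda b: b[2] * b[3])


def capture_face(camera_index: int = 0):
    """Grab one frame from the webcam and return the cropped face, or None."""
    import cv2  # imported here so largest_box() works without OpenCV

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    capture = cv2.VideoCapture(camera_index)
    try:
        ok, frame = capture.read()
    finally:
        capture.release()
    if not ok:
        return None

    # Haar cascades operate on grayscale images.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(boxes) == 0:
        return None

    x, y, w, h = largest_box([tuple(int(v) for v in b) for b in boxes])
    # NumPy slicing is rows (y) first, then columns (x).
    return frame[y:y + h, x:x + w]
```

Calling `capture_face()` returns a NumPy image of the face region, or `None` when the camera read fails or no face is found.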

Learning and recognizing the face of the boss

Next, use machine learning so that the computer can recognize the boss's face. The rough procedure has three steps:

  1. Image collection
  2. Image processing
  3. Build a machine learning model

Let's look at these one by one.

Image collection

First, I needed images for training. I collected them in the following ways:

  • Google image search
  • Images on Facebook
  • Video shot with a camera

At first I collected images from web search and Facebook, but that did not give me enough. So I shot a video with a video camera and split it into individual frames.
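
Splitting a video into frames might look like the sketch below, assuming OpenCV is used for decoding. The tool actually used is not stated, and `frame_path`, `extract_frames`, and the `every_n` sampling step are hypothetical.

```python
import os


def frame_path(out_dir: str, index: int) -> str:
    """Build a zero-padded file name so frames sort in capture order."""
    return os.path.join(out_dir, f"frame_{index:06d}.jpg")


def extract_frames(video_path: str, out_dir: str, every_n: int = 5) -> int:
    """Save every `every_n`-th frame of the video as a JPEG; return the count."""
    if every_n < 1:
        raise ValueError("every_n must be at least 1")
    import cv2  # imported lazily so frame_path() works without OpenCV

    os.makedirs(out_dir, exist_ok=True)
    capture = cv2.VideoCapture(video_path)
    saved = 0
    index = 0
    try:
        while True:
            ok, frame = capture.read()
            if not ok:  # end of the video, or a read error
                break
            if index % every_n == 0:
                cv2.imwrite(frame_path(out_dir, index), frame)
                saved += 1
            index += 1
    finally:
        capture.release()
    return saved
```

Skipping frames with `every_n` avoids filling the training set with near-identical consecutive images.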

Image processing

I now had a large number of images containing faces, but they cannot be fed to the learner as they are, because most of each image has nothing to do with the face. So I cut out only the face region.

I mainly used ImageMagick for the cropping, which leaves just the face image.
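
As an illustration, a crop can be driven from Python by calling ImageMagick's `convert` command with a `-crop WIDTHxHEIGHT+X+Y` geometry. This sketch assumes the face's bounding box is already known (how the boxes were obtained is not described here) and that the legacy `convert` executable is on the PATH; on ImageMagick 7 the command is `magick`. The function names are hypothetical.

```python
import subprocess
from typing import List


def build_crop_command(src: str, dst: str, x: int, y: int,
                       width: int, height: int) -> List[str]:
    """Return the ImageMagick argv that crops a width x height box at (x, y)."""
    geometry = f"{width}x{height}+{x}+{y}"
    # +repage resets the virtual canvas so the output is exactly the crop.
    return ["convert", src, "-crop", geometry, "+repage", dst]


def crop_face(src: str, dst: str, x: int, y: int,
              width: int, height: int) -> None:
    """Run the crop; raises CalledProcessError if ImageMagick fails."""
    subprocess.run(build_crop_command(src, dst, x, y, width, height), check=True)
```

Building the argument list separately from running it makes the command easy to inspect before processing a whole directory.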

I ended up with a lot of face images like this: [Image: s_many_bosses.jpg]

I am probably the person who owns the most pictures of my boss's face in the world. I must have more than his parents do.

Now we are ready to train.

Building a machine learning model

Training is done by building a convolutional neural network (CNN) in Keras, with TensorFlow as the Keras backend. If you only want to recognize a face, you could call a web API for image recognition, but this time I decided to build it myself with real-time performance in mind.

The network has the following configuration. Keras is convenient because you can easily output the configuration.

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
convolution2d_1 (Convolution2D)  (None, 32, 64, 64)    896         convolution2d_input_1[0][0]      
____________________________________________________________________________________________________
activation_1 (Activation)        (None, 32, 64, 64)    0           convolution2d_1[0][0]            
____________________________________________________________________________________________________
convolution2d_2 (Convolution2D)  (None, 32, 62, 62)    9248        activation_1[0][0]               
____________________________________________________________________________________________________
activation_2 (Activation)        (None, 32, 62, 62)    0           convolution2d_2[0][0]            
____________________________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D)    (None, 32, 31, 31)    0           activation_2[0][0]               
____________________________________________________________________________________________________
dropout_1 (Dropout)              (None, 32, 31, 31)    0           maxpooling2d_1[0][0]             
____________________________________________________________________________________________________
convolution2d_3 (Convolution2D)  (None, 64, 31, 31)    18496       dropout_1[0][0]                  
____________________________________________________________________________________________________
activation_3 (Activation)        (None, 64, 31, 31)    0           convolution2d_3[0][0]            
____________________________________________________________________________________________________
convolution2d_4 (Convolution2D)  (None, 64, 29, 29)    36928       activation_3[0][0]               
____________________________________________________________________________________________________
activation_4 (Activation)        (None, 64, 29, 29)    0           convolution2d_4[0][0]            
____________________________________________________________________________________________________
maxpooling2d_2 (MaxPooling2D)    (None, 64, 14, 14)    0           activation_4[0][0]               
____________________________________________________________________________________________________
dropout_2 (Dropout)              (None, 64, 14, 14)    0           maxpooling2d_2[0][0]             
____________________________________________________________________________________________________
flatten_1 (Flatten)              (None, 12544)         0           dropout_2[0][0]                  
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 512)           6423040     flatten_1[0][0]                  
____________________________________________________________________________________________________
activation_5 (Activation)        (None, 512)           0           dense_1[0][0]                    
____________________________________________________________________________________________________
dropout_3 (Dropout)              (None, 512)           0           activation_5[0][0]               
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 2)             1026        dropout_3[0][0]                  
____________________________________________________________________________________________________
activation_6 (Activation)        (None, 2)             0           dense_2[0][0]                    
====================================================================================================
Total params: 6489634
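
The numbers in this summary can be checked by hand. The sketch below assumes 3x3 kernels, 2x2 pooling, and an RGB input, none of which is stated explicitly; those assumptions reproduce every parameter count and shape in the table.

```python
def conv_params(kernel: int, in_channels: int, filters: int) -> int:
    """Weights (kernel*kernel*in_channels per filter) plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs: int, units: int) -> int:
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units


def valid_conv_size(size: int, kernel: int) -> int:
    """Output width of an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1


# Spatial size through the network, assuming 3x3 kernels and 2x2 pooling.
size = 64                              # convolution2d_1 keeps 64 (padded)
size = valid_conv_size(size, 3)        # convolution2d_2 -> 62
size //= 2                             # maxpooling2d_1  -> 31
size = valid_conv_size(size, 3)        # convolution2d_4 -> 29 (conv 3 keeps 31)
size //= 2                             # maxpooling2d_2  -> 14
flat = 64 * size * size                # flatten_1       -> 12544

layer_params = [
    conv_params(3, 3, 32),             # convolution2d_1: 896
    conv_params(3, 32, 32),            # convolution2d_2: 9248
    conv_params(3, 32, 64),            # convolution2d_3: 18496
    conv_params(3, 64, 64),            # convolution2d_4: 36928
    dense_params(flat, 512),           # dense_1: 6423040
    dense_params(512, 2),              # dense_2: 1026
]
total = sum(layer_params)
print(flat, total)  # prints: 12544 6489634
```

Almost all of the parameters sit in `dense_1`, the first fully connected layer after flattening.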

The code is here:
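
A sketch of a model with the same layer sequence, written against the current `tensorflow.keras` API rather than the older Keras version that produced the summary above. The ReLU and softmax activations and the dropout rates are assumptions, since the summary does not show them, and the input here is channels-last (64, 64, 3) whereas the summary shows channels-first shapes. `build_model` is a hypothetical name.

```python
def build_model(num_classes: int = 2):
    """Build a CNN with the same layer sequence as the printed summary."""
    # Imported lazily so this file can be loaded without TensorFlow installed.
    from tensorflow import keras
    from tensorflow.keras import layers

    return keras.Sequential([
        keras.Input(shape=(64, 64, 3)),            # channels-last RGB input
        layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Dropout(0.25),                      # rate is an assumption
        layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Dropout(0.25),                      # rate is an assumption
        layers.Flatten(),
        layers.Dense(512, activation="relu"),
        layers.Dropout(0.5),                       # rate is an assumption
        layers.Dense(num_classes, activation="softmax"),
    ])
```

Here the activations are folded into the layers instead of being separate `Activation` layers, so the summary has fewer rows, but the parameter counts do not depend on that choice.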

With everything up to this point in place, the system can recognize the boss when he appears on camera.

Screen switching

Once the trained model has recognized the boss's face, the screen has to be switched. This time, I simply prepare an image that looks like I'm working and display it. Since I am a programmer, I prepared the following image.

[Image: s_pycharm.png]

It looks just like I'm working. All that remains is to display this image.

I want to display the image in full screen, so I will display it using PyQt. Here is the code for that:
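
A minimal sketch of the full-screen display, assuming PyQt5 (the PyQt version is not specified). `resolve_image` and `show_fullscreen` are hypothetical names.

```python
import os
import sys


def resolve_image(path: str) -> str:
    """Return the absolute path of the decoy image, failing early if missing."""
    full = os.path.abspath(path)
    if not os.path.isfile(full):
        raise FileNotFoundError(f"decoy image not found: {full}")
    return full


def show_fullscreen(path: str) -> int:
    """Display the image full screen until the window is closed."""
    # Imported lazily so resolve_image() works without PyQt5 installed.
    from PyQt5.QtGui import QPixmap
    from PyQt5.QtWidgets import QApplication, QLabel

    image = resolve_image(path)
    app = QApplication(sys.argv)
    label = QLabel()
    label.setPixmap(QPixmap(image))
    label.setScaledContents(True)   # stretch the image to fill the screen
    label.showFullScreen()
    return app.exec_()              # blocks until the window is closed
```

Checking the file up front avoids showing an empty window at the one moment the decoy actually matters.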

Now everything is ready.

Complete

Combining the pieces verified so far completes the system. I actually tried it out.

"The boss has left his seat and is approaching mine." [Image: standup.jpg]

"OpenCV has detected the face and passes the image to the trained model." [Image: approach.jpg]

"It recognized the boss and switched the screen. ヽ (‘∇’)" [Image: s_editor.jpg]

Reflection

As a reminder to myself.

Although accuracy was high on the images in the test set, images captured by the webcam were sometimes hard to recognize. This seems to be related to the lighting and to blur in the image. I need to normalize the images properly.

The classifier was a binary one, boss or someone else, but when something that was not a face was fed in, it was sometimes classified as the boss. Presumably the model found it closer to the boss than to anyone else. It might have been better to decide using the probability of being the boss rather than a hard binary label.
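
The idea of deciding on the boss probability could look like the sketch below. It assumes the model outputs a two-element softmax vector with the boss at index 0, and the 0.9 threshold is an arbitrary illustrative value; both are assumptions, and the function names are hypothetical.

```python
from typing import Sequence


def argmax_label(probabilities: Sequence[float]) -> int:
    """Hard binary decision: index of the most probable class."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


def is_boss(probabilities: Sequence[float], boss_index: int = 0,
            threshold: float = 0.9) -> bool:
    """Accept the boss label only when its probability clears the threshold."""
    return probabilities[boss_index] >= threshold


# A weak, ambiguous prediction: argmax still says "boss" (index 0),
# but the thresholded rule refuses to trigger on it.
uncertain = [0.6, 0.4]
print(argmax_label(uncertain), is_boss(uncertain))  # prints: 0 False
```

With a threshold, low-confidence inputs such as non-face images are less likely to trigger the screen switch, at the cost of occasionally missing a real approach.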

Image processing is difficult.

Source code

You can download the source code from the following repositories.

In conclusion

This time, I tried to recognize the boss and hide the screen by combining real-time image capture from a webcam with face recognition using Keras.

Face detection is currently done with OpenCV, but its detection accuracy does not feel good enough, so I would like to improve that part by using Dlib. I would also like to try a face detection model that I trained myself.

I used PyQt to display the fake work image, but it takes a while between issuing the command and the image actually appearing, so that part needs improvement.

The recognition accuracy on images captured from the webcam is not good enough, so I would like to keep improving it.

The reality was not sweet
