Recently, more and more meetings and classes are held on Zoom, but unless I'm face-to-face with people, I can't tell how interested they are in what I'm saying. So I thought: why not quantify it? That's what I built here.
Since this is my first post, some parts may be rough, but I hope you read it to the end :sweat:
The idea: capture an image (or video) of a Zoom meeting, detect the faces in it, and measure how interested each participant is in the conversation.
This time I decided to use Amazon Rekognition to recognize the faces of the people attending the Zoom meeting.
I referred to this article for how to use it: https://qiita.com/G-awa/items/477f2324552cb908ecd0
detect_face.py
import cv2
import numpy as np
import boto3

# Settings such as scale and color
scale_factor = .15
green = (0, 255, 0)
red = (0, 0, 255)
frame_thickness = 2
cap = cv2.VideoCapture(0)
rekognition = boto3.client('rekognition')

# Font size
fontscale = 1.0
# Font color (B, G, R)
color = (0, 120, 238)
# Font
fontface = cv2.FONT_HERSHEY_DUPLEX

# Loop until you press q
while True:
    # Capture a frame from the camera
    ret, frame = cap.read()
    height, width, channels = frame.shape

    # Convert to jpg; the image goes over the Internet to the API, so keep it small
    small = cv2.resize(frame, (int(width * scale_factor), int(height * scale_factor)))
    ret, buf = cv2.imencode('.jpg', small)

    # Send the frame to Amazon Rekognition
    faces = rekognition.detect_faces(Image={'Bytes': buf.tobytes()}, Attributes=['ALL'])

    # Draw a box around each face: green if smiling, red otherwise
    for face in faces['FaceDetails']:
        smile = face['Smile']['Value']
        cv2.rectangle(frame,
                      (int(face['BoundingBox']['Left'] * width),
                       int(face['BoundingBox']['Top'] * height)),
                      (int((face['BoundingBox']['Left'] + face['BoundingBox']['Width']) * width),
                       int((face['BoundingBox']['Top'] + face['BoundingBox']['Height']) * height)),
                      green if smile else red, frame_thickness)
        # List each emotion and its confidence in the top-left corner
        emotions = face['Emotions']
        i = 0
        for emotion in emotions:
            cv2.putText(frame,
                        str(emotion['Type']) + ": " + str(emotion['Confidence']),
                        (25, 40 + (i * 25)),
                        fontface,
                        fontscale,
                        color)
            i += 1

    # Show the result on the display
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
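Note that boto3 needs AWS credentials and a region before detect_faces will respond; the code above assumes they are already configured (for example with the AWS CLI's aws configure). If you want to be explicit in code, you can pass the region directly; ap-northeast-1 below is just my assumption:

import boto3

# Create the Rekognition client with an explicit region;
# replace ap-northeast-1 with whatever region your account uses
rekognition = boto3.client('rekognition', region_name='ap-northeast-1')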
When I ran this code (it is the code from the article I referenced), face recognition and emotion analysis worked! However, processing live video was too heavy and it stalled partway through, so I decided to work from still images instead.
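Incidentally, the heaviness is most likely because detect_faces goes over the network for every single frame. If you wanted to keep the video path, one option would be to throttle the API calls and reuse the last result in between. A sketch of that idea, modifying the loop above (the 30-frame interval is my own arbitrary choice):

API_INTERVAL = 30  # Call the API once every 30 frames (an arbitrary choice)

frame_count = 0
faces = {'FaceDetails': []}  # Last API result, reused for the frames in between

while True:
    ret, frame = cap.read()
    if not ret:
        break
    height, width, channels = frame.shape
    small = cv2.resize(frame, (int(width * scale_factor), int(height * scale_factor)))
    ret, buf = cv2.imencode('.jpg', small)

    # Only go over the network on every API_INTERVAL-th frame
    if frame_count % API_INTERVAL == 0:
        faces = rekognition.detect_faces(Image={'Bytes': buf.tobytes()},
                                         Attributes=['ALL'])
    frame_count += 1

    # ...draw the boxes and emotions from `faces` exactly as before...
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break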
For capturing the screen I used PIL, referring to this article: https://qiita.com/koara-local/items/6a98298d793f22cf2e36
capture.py
from PIL import ImageGrab

# Grab the whole screen and save it into the capture folder
ImageGrab.grab().save("./capture/PIL_capture.png")
I created a separate folder called capture and saved the screenshot there.
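One small pitfall: the save will fail if the capture folder does not exist yet. A one-line guard (assuming the script runs from the project root) takes care of it:

import os

# Create the capture folder if it is missing; do nothing if it already exists
os.makedirs("./capture", exist_ok=True)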
face_detect.py
import cv2
import numpy as np
import boto3
from PIL import ImageGrab

# Settings such as scale and color
scale_factor = .15
green = (0, 255, 0)
red = (0, 0, 255)
frame_thickness = 2
rekognition = boto3.client('rekognition')

# Font size
fontscale = 1.0
# Font color (B, G, R)
color = (0, 120, 238)
# Font
fontface = cv2.FONT_HERSHEY_DUPLEX

# Capture the whole screen instead of reading from a camera
ImageGrab.grab().save("./capture/PIL_capture.png")
frame = cv2.imread("./capture/PIL_capture.png")
height, width, channels = frame.shape
# Halve the image so the whole screen fits on the display
frame = cv2.resize(frame, (int(width / 2), int(height / 2)), interpolation=cv2.INTER_AREA)

# Convert to jpg; the image goes over the Internet to the API, so keep it small
small = cv2.resize(frame, (int(width * scale_factor), int(height * scale_factor)))
ret, buf = cv2.imencode('.jpg', small)

# Send the image to Amazon Rekognition
faces = rekognition.detect_faces(Image={'Bytes': buf.tobytes()}, Attributes=['ALL'])

for face in faces['FaceDetails']:
    # Draw a box around the face: green if smiling, red otherwise.
    # BoundingBox coordinates are relative to the original image, so scale
    # them by the halved width and height to match the resized frame.
    smile = face['Smile']['Value']
    cv2.rectangle(frame,
                  (int(face['BoundingBox']['Left'] * width / 2),
                   int(face['BoundingBox']['Top'] * height / 2)),
                  (int((face['BoundingBox']['Left'] + face['BoundingBox']['Width']) * width / 2),
                   int((face['BoundingBox']['Top'] + face['BoundingBox']['Height']) * height / 2)),
                  green if smile else red, frame_thickness)

    # Tally the emotions: HAPPY and SURPRISED count toward interest,
    # the other emotions count against it
    score = 0
    for emotion in face['Emotions']:
        if emotion["Type"] == "HAPPY":
            score = score + emotion["Confidence"]
        elif emotion["Type"] == "SURPRISED":
            score = score + emotion["Confidence"]
        elif emotion["Type"] == "DISGUSTED":
            score = score - emotion["Confidence"]
        elif emotion["Type"] == "ANGRY":
            score = score - emotion["Confidence"]
        elif emotion["Type"] == "CONFUSED":
            score = score - emotion["Confidence"]
        elif emotion["Type"] == "CALM":
            score = score - emotion["Confidence"]
        elif emotion["Type"] == "SAD":
            score = score - emotion["Confidence"]

    # Draw the interest score once per face, at its bounding box
    cv2.putText(frame,
                "interested" + ": " + str(round(score, 2)),
                (int(face['BoundingBox']['Left'] * width / 2),
                 int(face['BoundingBox']['Top'] * height / 2)),
                fontface,
                fontscale,
                color)

# Show the result on the display; press any key to close
cv2.imshow('frame', frame)
cv2.waitKey(0)
cv2.destroyAllWindows()
I used OpenCV to read the captured image itself. Amazon Rekognition returns a confidence value for each of seven emotions (HAPPY, DISGUSTED, SURPRISED, ANGRY, CONFUSED, CALM, and SAD), so I count HAPPY and SURPRISED as positive emotions (high interest) and the rest as negative emotions (low interest). Since the confidences together add up to roughly 100, the resulting interest score lands in the range -100 to 100, and it is drawn on each recognized face. I borrowed a photo of people for testing because I couldn't gather anyone on Zoom: https://tanachannell.com/4869
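To make the arithmetic concrete: if Rekognition returned HAPPY 75 and SURPRISED 10 with the remaining 15 spread across the negative emotions, the score would be 75 + 10 - 15 = 70. The if/elif chain above could also be collapsed with set membership tests; a small sketch of the same logic:

POSITIVE = {"HAPPY", "SURPRISED"}
NEGATIVE = {"DISGUSTED", "ANGRY", "CONFUSED", "CALM", "SAD"}

def interest_score(emotions):
    # Positive emotions add their confidence, negative ones subtract it
    score = 0
    for emotion in emotions:
        if emotion["Type"] in POSITIVE:
            score += emotion["Confidence"]
        elif emotion["Type"] in NEGATIVE:
            score -= emotion["Confidence"]
    return round(score, 2)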
Amazon Rekognition has other features, so if you are interested, please take a look! https://docs.aws.amazon.com/ja_jp/rekognition/latest/dg/faces-detect-images.html
- If there are many Zoom participants, the displayed labels overlap and become very hard to read.
- Because the capture grabs the whole screen rather than just the Zoom window, the command prompt shows up in the image unless it is minimized immediately after running the script.
I started writing this post because I'd made something I wanted people to see, and writing it up let me relive the whole build, which turned out to be a learning experience in itself. It would be great fun if something I made like this spread out into the world!
GitHub https://github.com/r-301/zoom-response-check