Lately I have had more and more meetings and classes on Zoom, and I find it hard to tell how interested people are in what I am saying unless we are face to face. So I thought: why not try quantifying it? That is what I built.
Since this is my first post, some parts may be rough, but I hope you will read it to the end.
The program captures an image or video of a Zoom meeting, detects the faces in it, and measures how interested each person is in the talk.
I decided to use Amazon Rekognition to recognize the faces of the people attending the Zoom meeting.
I referred to this article for how to use it.
import cv2
import numpy as np
import boto3

# Settings such as scale and color
scale_factor = .15
green = (0, 255, 0)
red = (0, 0, 255)
frame_thickness = 2
cap = cv2.VideoCapture(0)
rekognition = boto3.client('rekognition')

# Font size
fontscale = 1.0
# Font color (B, G, R)
color = (0, 120, 238)
fontface = cv2.FONT_HERSHEY_DUPLEX

# Loop until you press q
while True:
    # Capture a frame
    ret, frame = cap.read()
    if not ret:
        break
    height, width, channels = frame.shape

    # Convert to JPEG. The image is sent over the Internet through the API, so keep it small.
    small = cv2.resize(frame, (int(width * scale_factor), int(height * scale_factor)))
    ret, buf = cv2.imencode('.jpg', small)

    # Call the Amazon Rekognition API
    faces = rekognition.detect_faces(Image={'Bytes': buf.tobytes()}, Attributes=['ALL'])

    # Draw a box around each face (BoundingBox values are ratios of the image size)
    for face in faces['FaceDetails']:
        smile = face['Smile']['Value']
        box = face['BoundingBox']
        cv2.rectangle(frame,
                      (int(box['Left'] * width), int(box['Top'] * height)),
                      (int((box['Left'] + box['Width']) * width),
                       int((box['Top'] + box['Height']) * height)),
                      green if smile else red, frame_thickness)
        emotions = face['Emotions']
        i = 0
        for emotion in emotions:
            # List each emotion and its confidence in the top-left corner
            cv2.putText(frame,
                        str(emotion['Type']) + ": " + str(emotion['Confidence']),
                        (25, 40 + (i * 25)),
                        fontface, fontscale, color)
            i += 1

    # Show the result on the display
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the camera and close the window
cap.release()
cv2.destroyAllWindows()
For now, when I ran the code, face detection and emotion analysis both worked! However, with live video it was too heavy and froze partway through, so I decided to load a still image instead. (The code above is from the article I referred to.)
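One possible way to lighten the video version, which this article did not try, is to call the API only once every so often instead of on every frame. The sketch below is a minimal example under that assumption; `Throttle`, `min_interval`, and `clock` are hypothetical names, not part of OpenCV or boto3.

```python
import time


class Throttle:
    """Let an action run at most once every `min_interval` seconds (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock   # injectable so the class can be tested without waiting
        self._last = None    # time of the last allowed call, None before the first one

    def ready(self):
        """Return True if enough time has passed, and record the call if so."""
        now = self.clock()
        if self._last is None or now - self._last >= self.min_interval:
            self._last = now
            return True
        return False
```

Inside the capture loop, the `detect_faces` call could be wrapped in `if throttle.ready():` and the most recent result reused for drawing on the frames in between. Whether that is enough to stop the freeze depends on the network and the machine, which is not measured here.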
I referred to this article for image capture.
I used PIL to capture the screen.
from PIL import ImageGrab
ImageGrab.grab().save("./capture/PIL_capture.png")
I created a separate folder called capture and saved it in that folder.
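The folder has to exist before the screenshot is saved, otherwise the save fails. As a small sketch, the folder can also be created from the script itself. The helper names `capture_path` and `save_screenshot` are made up for this example; `os.makedirs` and `ImageGrab.grab` are the only library calls used.

```python
import os


def capture_path(folder="./capture", name="PIL_capture.png"):
    """Create `folder` if needed and return the path of the screenshot file inside it."""
    os.makedirs(folder, exist_ok=True)  # no error if the folder already exists
    return os.path.join(folder, name)


def save_screenshot(path):
    """Grab the whole screen and save it to `path` (needs Pillow and a desktop session)."""
    from PIL import ImageGrab  # imported here so capture_path works without Pillow
    ImageGrab.grab().save(path)
```

Calling `save_screenshot(capture_path())` would then do the same thing as the two lines above, without creating the folder by hand.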
import cv2
import numpy as np
import boto3
from PIL import ImageGrab

# Settings such as scale and color
scale_factor = .15
green = (0, 255, 0)
red = (0, 0, 255)
frame_thickness = 2
rekognition = boto3.client('rekognition')

# Font size
fontscale = 1.0
# Font color (B, G, R)
color = (0, 120, 238)
fontface = cv2.FONT_HERSHEY_DUPLEX

# Capture the screen and save it (the capture folder must already exist)
ImageGrab.grab().save("./capture/PIL_capture.png")

# Read the saved screenshot instead of a camera frame
frame = cv2.imread("./capture/PIL_capture.png")
height, width, channels = frame.shape

# Halve the screenshot for display and remember the displayed size for drawing
disp_width, disp_height = int(width / 2), int(height / 2)
frame = cv2.resize(frame, (disp_width, disp_height), interpolation=cv2.INTER_AREA)

# Convert to JPEG. The image is sent over the Internet through the API, so keep it small.
small = cv2.resize(frame, (int(width * scale_factor), int(height * scale_factor)))
ret, buf = cv2.imencode('.jpg', small)

# Call the Amazon Rekognition API
faces = rekognition.detect_faces(Image={'Bytes': buf.tobytes()}, Attributes=['ALL'])

for face in faces['FaceDetails']:
    # Draw a box around the face (BoundingBox values are ratios of the image size)
    smile = face['Smile']['Value']
    box = face['BoundingBox']
    left = int(box['Left'] * disp_width)
    top = int(box['Top'] * disp_height)
    right = int((box['Left'] + box['Width']) * disp_width)
    bottom = int((box['Top'] + box['Height']) * disp_height)
    cv2.rectangle(frame, (left, top), (right, bottom),
                  green if smile else red, frame_thickness)

    # HAPPY and SURPRISED add to the score; the other emotions subtract from it
    score = 0
    for emotion in face['Emotions']:
        if emotion["Type"] in ("HAPPY", "SURPRISED"):
            score = score + emotion["Confidence"]
        elif emotion["Type"] in ("DISGUSTED", "ANGRY", "CONFUSED", "CALM", "SAD"):
            score = score - emotion["Confidence"]

    # Write the interest score at the face
    cv2.putText(frame,
                "interested" + ":" + str(round(score, 2)),
                (left, top),
                fontface, fontscale, color)

# Show the result on the display until a key is pressed
cv2.imshow('frame', frame)
cv2.waitKey(0)
cv2.destroyAllWindows()
I used OpenCV to read the image itself.
Of the emotions Amazon Rekognition returns, I used seven: HAPPY, DISGUSTED, SURPRISED, ANGRY, CONFUSED, CALM, and SAD. HAPPY and SURPRISED are counted as positive emotions (high interest) and the others as negative emotions (low interest). Finally, the degree of interest, in the range of -100 to 100, is displayed on each recognized face.
I used a borrowed image of people because I could not gather participants on Zoom.
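The scoring rule can be written as a small standalone function, which makes it easy to try without calling the API. This is a minimal sketch: `interest_score`, `POSITIVE`, and `NEGATIVE` are names chosen for this example, and the sample confidences are made up for illustration, not real API output.

```python
POSITIVE = {"HAPPY", "SURPRISED"}
NEGATIVE = {"DISGUSTED", "ANGRY", "CONFUSED", "CALM", "SAD"}


def interest_score(emotions):
    """Add the confidences of positive emotions and subtract those of negative ones."""
    score = 0.0
    for emotion in emotions:
        if emotion["Type"] in POSITIVE:
            score += emotion["Confidence"]
        elif emotion["Type"] in NEGATIVE:
            score -= emotion["Confidence"]
        # Any type in neither set is ignored
    return round(score, 2)


# Made-up confidences for illustration only
sample = [
    {"Type": "HAPPY", "Confidence": 60.0},
    {"Type": "SURPRISED", "Confidence": 10.0},
    {"Type": "CALM", "Confidence": 25.0},
    {"Type": "SAD", "Confidence": 5.0},
]
print(interest_score(sample))  # 40.0
```

The function takes the same list of dictionaries that `face['Emotions']` holds, so it could replace the loop in the script if preferred.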
Amazon Rekognition has other features, so if you are interested, please take a look!
- If there are many Zoom participants, the displayed text overlaps and becomes very hard to read.
- Because the program captures the whole screen rather than the Zoom window, the command prompt appears in the image unless you minimize it right after starting the program.
I made this because I wanted people to see it, and I started writing with that in mind. Writing it up let me relive the process of building it, which was a learning experience in itself. It would be great fun if something I made like this spread through the world!
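For the overlapping text, one idea (not implemented in this article) is to shrink the label for small faces. The sketch below only computes where to draw and how large; `label_layout`, `reference_width`, and `min_scale` are hypothetical names and values picked for illustration, and the bounding box is assumed to be in the ratio format that `detect_faces` returns.

```python
def label_layout(bounding_box, frame_width, frame_height,
                 base_scale=1.0, reference_width=200, min_scale=0.3):
    """Return ((x, y), font_scale) for a label drawn at a face.

    `reference_width` is an assumed face width in pixels at which
    `base_scale` is used; narrower faces get proportionally smaller text.
    """
    face_width = bounding_box["Width"] * frame_width
    scale = base_scale * face_width / reference_width
    scale = max(min_scale, min(base_scale, scale))  # keep the text readable

    # Clamp the text origin so it stays inside the frame
    x = min(max(int(bounding_box["Left"] * frame_width), 0), frame_width - 1)
    y = min(max(int(bounding_box["Top"] * frame_height), 0), frame_height - 1)
    return (x, y), scale
```

The returned position and scale could be passed to `cv2.putText` as its `org` and `fontScale` arguments. With very many participants the labels may still collide, so this only reduces the problem.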