Last time (Part ③ Create miscellaneous Photoshop video) I managed to create a rough collage video, but it is not yet practical. Broadly, it has the following issues.
A. If several people appear, the image is overlaid on everyone's face (a specific person cannot be selected)
B. Objects that are not faces are misrecognized as faces (recognition accuracy)
C. Faces are sometimes not recognized (recognition accuracy)
This time the aim is to solve these problems.
First, A and B. The purpose of this rough collage is to "overwrite another person's face only onto the face of a specific person". However, automatically picking a specific person out of the recognized faces seems out of reach, so a human does that part by hand. The procedure is:
・ Output a video in which every recognized face is labeled with an ID
・ Visually check the ID of the face you want to overwrite
・ Enter that ID
However, if IDs are simply assigned to every face found, the number of IDs to enter becomes very large, so I would like to handle this together with the solution to C. To reduce the number of IDs, the preceding and following frames are used to judge from position and size whether a face is continuous, and continuous faces are given the same ID. Furthermore, in a sequence of frames X → Y → Z, if the face is recognized in X and Z but not in Y, Y is complemented on the assumption that the face is there as well.
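The two ideas above, judging continuity from position and size and filling a gap in frame Y from X and Z, can be sketched roughly as follows. This is only an illustration, not the implementation shown later: the function names `is_same_face` and `fill_gap`, the `(x, y, w, h)` box layout, the 640x480 frame size, and the sample boxes are all assumptions made for the example.

```python
import numpy as np

def is_same_face(box1, box2, frame_w, frame_h, tolerance_pct=5.0):
    """Hypothetical helper: True if two (x, y, w, h) boxes differ by less than
    tolerance_pct of the frame size in every component."""
    # x and w are measured against the frame width, y and h against the height
    scales = (frame_w, frame_h, frame_w, frame_h)
    for a, b, scale in zip(box1, box2, scales):
        if abs(float(a) - float(b)) / scale * 100 >= tolerance_pct:
            return False
    return True

def fill_gap(box_x, box_z):
    """Hypothetical helper: estimate the missing box in frame Y as the
    midpoint of the boxes found in frames X and Z."""
    return ((np.asarray(box_x) + np.asarray(box_z)) / 2).astype(int)

# Frame X and frame Z both contain a face; frame Y in between has none.
face_x = np.array([100, 80, 60, 60])
face_z = np.array([110, 84, 62, 62])

if is_same_face(face_x, face_z, frame_w=640, frame_h=480):
    face_y = fill_gap(face_x, face_z)
    print(face_y)  # [105  82  61  61]
```

Expressing the tolerance as a percentage of the frame size means the same threshold can be reused for videos of different resolutions.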
For the implementation I will write classes for the first time. All of them go in a file called frame_manager.py. First, create the **FacePosition** class. Although it is called a class, it is just a structure that holds the coordinates and ID of a face.
frame_manager.py(FacePosition)
class FacePosition:
    '''
    Class that holds a face's ID and position as a set.
    Just a structure with an ID and the face's coordinates and size.
    '''
    def __init__(self, id, coordinate):
        self.id = id
        self.coordinate = coordinate
This is used to hold face information.
Next, create the **FaceFrame** class, which holds a frame and the faces found in it. When a frame and its face coordinates are passed in, each face is given an initial ID and stored. The number of IDs issued so far is counted in a class variable (accessed statically) so that initial IDs never collide.
frame_manager.py(FaceFrame)
class FaceFrame:
    '''
    Class that holds the faces recognized in each frame.
    faceCount counts the IDs issued so far so that IDs are not duplicated anywhere in the application,
    so always access it as FaceFrame.faceCount.
    '''
    faceCount = 0

    def __init__(self, frame, coordinates):
        '''
        Pass a frame and the coordinates and sizes of the faces recognized in it.
        One FacePosition instance is created per face.
        coordinates: array of face recognition results. Pass the result of cascade.detectMultiScale as it is.
        '''
        # Allocate an array with one slot per face
        self.faces = [None]*len(coordinates)
        self.frame = frame
        # Create a FacePosition for each face passed in, assigning it a new ID
        for i in range(0, len(coordinates)):
            self.faces[i] = FacePosition(FaceFrame.faceCount, coordinates[i])
            FaceFrame.faceCount += 1

    # Function for adding a face to the frame afterwards
    def append(self, faceId, coordinate):
        self.faces.append(FacePosition(faceId, coordinate))
Now you can maintain the correspondence between the frame and the face.
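As a quick check of how the IDs are handed out, the sketch below repeats trimmed-down copies of the two classes so that it runs on its own. It passes plain tuples in place of real `detectMultiScale` output and `None` in place of a real frame; both are assumptions made purely for illustration.

```python
class FacePosition:
    def __init__(self, id, coordinate):
        self.id = id
        self.coordinate = coordinate

class FaceFrame:
    faceCount = 0  # class variable shared by all instances, so IDs never repeat

    def __init__(self, frame, coordinates):
        self.frame = frame
        self.faces = []
        for coordinate in coordinates:
            self.faces.append(FacePosition(FaceFrame.faceCount, coordinate))
            FaceFrame.faceCount += 1

# Two frames with dummy (x, y, w, h) boxes; the frame images are omitted here.
first = FaceFrame(None, [(10, 10, 50, 50), (200, 40, 48, 48)])
second = FaceFrame(None, [(12, 11, 50, 50)])

first_ids = [face.id for face in first.faces]
second_ids = [face.id for face in second.faces]
print(first_ids, second_ids)  # [0, 1] [2]
```

At this stage the face in the second frame still has its own ID (2) even though it sits almost where ID 0 was; merging such IDs is a separate step.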
Next is the **FrameManager** class, which is the core of this program. Seen from outside, the class works as follows.
■ When a frame and its face coordinates are passed in, it returns frame information (a FaceFrame) with IDs assigned and recognition failures complemented.
To do this, received frames are temporarily held in an array, and a frame is returned once its IDs have been assigned and its gaps complemented. The array length can be changed through LIST_SIZE; here it is 5. The processing flow is as follows.
・ Receive a frame and its face coordinates
・ Store it in the array. At this point the oldest element in the array becomes the return value.
・ (Taking the frame in the middle of the array (frameC) as the boundary, split the rest into the earlier frames (frameFs) and the later frames (frameBs))
・ Compare the position and size of the faces in frameF and frameC, and assign the same ID if they are judged to be continuous.
・ Compare frameF and frameB, and if a continuous face exists in both but not in frameC, complement it in frameC.
・ Repeat for every combination of frameFs and frameBs.
The tolerance used when judging that faces are continuous is set by ALLOWED_GAP; here it is 5%. (There are several frameF and frameB frames, so the trailing s distinguishes the whole group from an individual frame.) The source is below.
frame_manager.py(FrameManager)
class FrameManager:
    '''
    Class that checks face continuity and complements missing faces, based on the frames and face recognition results passed in.
    Assigns the same ID to consecutive faces.
    '''
    # Number of FaceFrames over which face continuity is checked
    LIST_SIZE = 5
    CENTER_INDEX = LIST_SIZE // 2
    # Allowed difference in position and size when judging whether faces in two frames are the same, in % of the frame size
    ALLOWED_GAP = 5

    def __init__(self, height, width):
        '''
        Specify the height and width of the video to be handled
        '''
        FrameManager.FRAME_HEIGHT = height
        FrameManager.FRAME_WIDTH = width
        self.__frames = [None]*self.LIST_SIZE

    def put(self, frame, coordinates):
        '''
        Add a frame together with its face recognition result.
        On each call, IDs are assigned, continuity is checked, missing faces are complemented, and the FaceFrame added LIST_SIZE calls earlier is returned.
        After all frames have been processed, LIST_SIZE frames remain in the FrameManager, so keep passing None until the remaining frames have all been returned.
        return: a FaceFrame instance, or None if there is no FaceFrame at that position.
        '''
        # None is passed when flushing the remaining frames at the end; in that case faceFrame is None as well.
        if frame is None:
            faceFrame = None
        else:
            faceFrame = FaceFrame(frame, coordinates)
        # Shift the list forward by one and add the new frame at the end.
        # The internal processing does a lot of random access, so an array seems the better fit.
        returnFrame = self.__frames[0]
        for i in range(0, len(self.__frames)-1):
            self.__frames[i] = self.__frames[i+1]
        self.__frames[FrameManager.LIST_SIZE-1] = faceFrame
        # Check continuity using the earlier and later frames:
        # with CENTER_INDEX as the boundary, try every combination of an earlier frame (i) and a later frame (j)
        for i in range(0, FrameManager.CENTER_INDEX):
            for j in range(FrameManager.CENTER_INDEX+1, FrameManager.LIST_SIZE):
                # Skip combinations that include None
                if self.__frames[i] is not None and self.__frames[FrameManager.CENTER_INDEX] is not None and self.__frames[j] is not None:
                    # Check continuity and complement every frame in between
                    for k in range(i+1, j):
                        self.connectFrame(self.__frames[i], self.__frames[k], self.__frames[j])
        return returnFrame

    def connectFrame(self, frameF, frameC, frameB):
        # If frameF.faces and frameC.faces contain a continuous face, give it the same id.
        # TODO The same id could end up assigned to more than one face. The current design does not handle that case, so it is on hold.
        frontFaceNum = len(frameF.faces)
        centerFaceNum = len(frameC.faces)
        backFaceNum = len(frameB.faces)
        for i in range(0, frontFaceNum):
            # Records whether the i-th face of the earlier frame matched any face in frameC
            matched = False
            for j in range(0, centerFaceNum):
                # If judged to be the same face, use the same ID
                if self.compare(frameF.faces[i], frameC.faces[j]):
                    frameC.faces[j].id = frameF.faces[i].id
                    matched = True
                    break
            # Even if the face is not in frameC, if it is in both frameF and frameB it is assumed to be in frameC too, and is complemented.
            if not matched:
                for k in range(0, backFaceNum):
                    if self.compare(frameF.faces[i], frameB.faces[k]):
                        # Add a face at the position and size midway between frameF and frameB
                        # (the original np.int alias has been removed from NumPy; the built-in int does the same job)
                        frameC.append(frameF.faces[i].id, ((frameF.faces[i].coordinate + frameB.faces[k].coordinate)/2).astype(int))
                        # Increase the face count by 1 (in case another face is found later in this process)
                        centerFaceNum += 1
                        # Guard against an infinite loop
                        if centerFaceNum > 10:
                            break

    def compare(self, face1, face2):
        '''
        Compare whether face1 and face2 are continuous.
        return: True if they are the same face, False if different
        '''
        result = True
        # Check the differences in coordinates and face size; the faces are judged the same only if every difference is within the tolerance (ALLOWED_GAP)
        # TODO If the frames are far apart, the tolerance should probably be widened accordingly.
        for i in range(0, 4):
            # coordinate is (x, y, width, height): even indices are horizontal values, odd indices are vertical values
            if i%2 == 0:
                gap = ((float(face1.coordinate[i])-float(face2.coordinate[i]))/FrameManager.FRAME_WIDTH)*100
            else:
                gap = ((float(face1.coordinate[i])-float(face2.coordinate[i]))/FrameManager.FRAME_HEIGHT)*100
            if not (-1*FrameManager.ALLOWED_GAP < gap < FrameManager.ALLOWED_GAP):
                result = False
                break
        return result
To use it, create an instance of FrameManager and feed it frames and face information; it returns FaceFrames with IDs assigned.
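The calling pattern, including the flush at the end that the `put` docstring asks for, can be sketched like this. `WindowBuffer` is a hypothetical stand-in that models only the buffering (no face matching), and the integers are dummy frames, so that the example runs without OpenCV; with the real class the flush call would presumably be `frameManager.put(None, None)`.

```python
LIST_SIZE = 5  # same window length as in the article

class WindowBuffer:
    """Hypothetical stand-in for FrameManager that only models the buffering."""
    def __init__(self):
        self.frames = [None] * LIST_SIZE

    def put(self, frame):
        oldest = self.frames[0]
        self.frames = self.frames[1:] + [frame]  # shift forward, append the new frame
        return oldest

buffer = WindowBuffer()
written = []

# Feed 8 dummy frames; the first LIST_SIZE calls return None.
for frame in range(8):
    out = buffer.put(frame)
    if out is not None:
        written.append(out)

# Flush: keep passing None until the remaining LIST_SIZE frames come out.
for _ in range(LIST_SIZE):
    out = buffer.put(None)
    if out is not None:
        written.append(out)

print(written)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

Without the flush loop, the last LIST_SIZE frames would stay inside the buffer and never reach the output.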
Looking back over it, the continuity of IDs is checked several times between the same pairs of frames, which is redundant. However, I am letting that slide for the reason described later.
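To get a feel for how much repetition there is, the loop structure of `put` can be replayed on indices alone. The sketch below assumes the buffer is full (no `None` entries), and the helper name `triples_per_put` is made up for this example.

```python
LIST_SIZE = 5
CENTER_INDEX = LIST_SIZE // 2

def triples_per_put():
    """List the (front, center, back) index triples visited in one call to put."""
    triples = []
    for i in range(0, CENTER_INDEX):
        for j in range(CENTER_INDEX + 1, LIST_SIZE):
            for k in range(i + 1, j):
                triples.append((i, k, j))
    return triples

triples = triples_per_put()
print(len(triples))  # 8

# The same front/center pair is compared once per matching back frame.
front_center_pairs = [(i, k) for i, k, _ in triples]
print(front_center_pairs.count((0, 1)))  # 2
```

On top of this, the window only shifts by one frame per call, so most of these pairs come up again on the next `put`.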
Now incorporate the FrameManager class into the overlay_movie.py created last time. After face recognition, first put the recognized faces into FrameManager, then write the ID on each face found, based on the FaceFrame instance that comes back.
overlay_movie2.py
# -*- coding:utf-8 -*-
import cv2
import datetime
import numpy as np
from PIL import Image
import frame_manager

def overlay_movie2():
    # Specify the input video and the output path.
    target = "target/test_input.mp4"
    result = "result/test_output2.m4v"  # an error occurs unless the extension is .m4v
    # Load the video and get its properties
    movie = cv2.VideoCapture(target)
    fps = movie.get(cv2.CAP_PROP_FPS)
    height = movie.get(cv2.CAP_PROP_FRAME_HEIGHT)
    width = movie.get(cv2.CAP_PROP_FRAME_WIDTH)
    # Specify MP4V as the format
    fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
    # Open the output file
    out = cv2.VideoWriter(result, int(fourcc), fps, (int(width), int(height)))
    # Load the feature data of the cascade classifier
    cascade_path = "haarcascades/haarcascade_frontalface_alt.xml"
    cascade = cv2.CascadeClassifier(cascade_path)
    # Create the FrameManager
    frameManager = frame_manager.FrameManager(height, width)
    # Color of the rectangle drawn around each recognized face. White here.
    color = (255, 255, 255)
    # Read the first frame
    if movie.isOpened():
        ret, frame = movie.read()
    else:
        ret = False
    count = 0
    # Keep writing frames for as long as frames are read successfully
    while ret:
        # Convert to grayscale
        frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Perform face recognition
        facerecog = cascade.detectMultiScale(frame_gray, scaleFactor=1.1, minNeighbors=1, minSize=(1, 1))
        # Put the recognized faces into FrameManager
        managedFrame = frameManager.put(frame, facerecog)
        # From the 6th call onward FrameManager returns a frame, so write it to the file
        if managedFrame is not None:
            # Label each recognized face with its number
            for i in range(0, len(managedFrame.faces)):
                # Unpack into plain ints for easier handling
                x, y, w, h = (int(v) for v in managedFrame.faces[i].coordinate)
                tmpId = managedFrame.faces[i].id
                print("Recognized face ID = "+str(tmpId))
                # Surround with a rectangle
                cv2.rectangle(managedFrame.frame, (x, y), (x+w, y+h), color, thickness=2)
                # Write the face ID
                cv2.putText(managedFrame.frame, str(tmpId), (x, y), cv2.FONT_HERSHEY_TRIPLEX, 2, (100,200,255), thickness=2)
            out.write(managedFrame.frame)
        if count%10 == 0:
            date = datetime.datetime.now().strftime("%Y/%m/%d %H:%M:%S")
            print(date + ' Current number of frames:'+str(count))
        count += 1
        ret, frame = movie.read()
        # Stop partway through
        if count > 200:
            break
    print("Number of frames processed:"+str(count))
    # Release the input and output so that the output file is finalized
    movie.release()
    out.release()

if __name__ == '__main__':
    overlay_movie2()
IDs are now assigned to the faces. A specific person among several can be identified by ID, and a run of consecutive faces is identified by a single ID.
All that remains is to enter the ID of the face to overwrite and overwrite the corresponding face... or so I would like to say, but that is not how it turned out. I have built this much, yet the program cannot serve its purpose. Face recognition accuracy on the target video is too poor, and complementing the gaps is not enough to make up for it. (I had a faint sense of this all along but kept looking away.) So I am shelving this program and will look for a different approach. To be continued in the next part.