Create miscellaneous Photoshop videos with Python + OpenCV ④ Deal with issues

0. Introduction

In the previous article (Part ③ Create miscellaneous Photoshop video) I managed to create a miscellaneous Photoshop video, but it is not yet practical. Broadly, the following issues remain.

A. When several people appear, the image is overlaid on everyone's face (a specific person cannot be selected)
B. Objects that are not faces are misrecognized as faces (recognition accuracy)
C. A face is sometimes not recognized (recognition accuracy)

This article aims to solve these problems.

1. Policy

First, A and B. The goal of this miscellaneous collage is to "overwrite another person's face onto the face of one specific person only". However, automatically picking out a specific person from the recognized faces seems infeasible, so I rely on a human for that step. The procedure is:

・ Output a video in which each recognized face is labeled with an ID
・ Visually check the ID of the face you want to overwrite
・ Enter that ID

However, if IDs were simply assigned to every face found, the number of IDs to enter would be enormous, so I deal with this together with the solution for C. To reduce the number of IDs, the preceding and following frames are used to judge from position and size whether a face is continuous, and a continuous face keeps the same ID. Furthermore, in a sequence of frames X → Y → Z, if a face is recognized in X and Z but not in Y, Y is complemented on the assumption that the face is present there as well.
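The continuity judgment and the X → Y → Z complement can be sketched as follows. This is only a minimal illustration, not the implementation used later: the function names `is_continuous` and `interpolate`, the 640×480 frame size, and the sample boxes are assumptions made for the example. Boxes are assumed to be (x, y, w, h), the layout returned by `detectMultiScale`, with x and w measured against the frame width and y and h against the frame height. The 5% tolerance is the value used later in this article.

```python
import numpy as np

ALLOWED_GAP = 5  # tolerance in percent of the frame size


def is_continuous(face_a, face_b, frame_width, frame_height, allowed_gap=ALLOWED_GAP):
    """Return True if two (x, y, w, h) boxes differ by less than allowed_gap percent."""
    # x and w are normalized by the width, y and h by the height (assumption of this sketch)
    scale = np.array([frame_width, frame_height, frame_width, frame_height], dtype=float)
    gap = (np.asarray(face_a, dtype=float) - np.asarray(face_b, dtype=float)) / scale * 100
    return bool(np.all(np.abs(gap) < allowed_gap))


def interpolate(face_x, face_z):
    """Midpoint box used to fill in frame Y when the face was missed there."""
    return ((np.asarray(face_x) + np.asarray(face_z)) // 2).astype(np.int32)


# Hypothetical detections in frame X and frame Z; frame Y had no detection
face_x = (100, 80, 60, 60)
face_z = (110, 84, 62, 62)

if is_continuous(face_x, face_z, 640, 480):
    face_y = interpolate(face_x, face_z)
    print(face_y)  # → [105  82  61  61]
```

With a 640-pixel-wide frame, 5% corresponds to 32 pixels horizontally, so a face that jumps 100 pixels between X and Z would not be treated as the same face.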

2. Class implementation (FacePosition, FaceFrame)

For this implementation I am writing classes for the first time. All the classes go in a file called frame_manager.py. First, create the **FacePosition** class. Although it is called a class, it is just a structure that holds the coordinates and ID of a face.

`frame_manager.py(FacePosition)`


class FacePosition:
    '''
    Class that holds a face's position and ID as a set.
    Just a structure with an ID and the face's coordinates and size.
    '''

    def __init__(self, id, coordinate):
        self.id = id
        self.coordinate = coordinate

Use this to retain face information.

Next, create the **FaceFrame** class, which holds a frame and the faces found in it. When you pass a frame and the coordinates of its face(s), an initial ID is assigned to each face and stored. The number of IDs issued so far is counted in a class variable (accessed statically as FaceFrame.faceCount) so that initial IDs never collide.

`frame_manager.py(FaceFrame)`


class FaceFrame:
    '''
    Class that holds the faces recognized in each frame.
    faceCount counts the IDs issued so far so that IDs never collide across the whole application,
    so always access it as FaceFrame.faceCount.
    '''
    
    faceCount = 0

    def __init__(self, frame, coordinates):

        '''
        Pass a frame together with the coordinates and sizes of the faces recognized in it.
        One FacePosition instance is created per face.
        coordinates: array of face recognition results. Pass the result of cascade.detectMultiScale as is.
        '''
        
        #Allocate an array with one slot per face
        self.faces = [None]*len(coordinates)
        self.frame = frame

        #Create an instance of FacePosition by assigning an id to each face passed
        for i in range(0, len(coordinates)):
            self.faces[i] = FacePosition(FaceFrame.faceCount, coordinates[i])
            FaceFrame.faceCount += 1

    #A function for adding faces in a frame later
    def append(self, faceId, coordinate):
        self.faces.append(FacePosition(faceId, coordinate))

Now you can maintain the correspondence between the frame and the face.
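As a quick check of how the IDs are handed out, the following sketch feeds two frames of dummy detections into condensed copies of the two classes (repeated here only so that the snippet runs on its own). The coordinate values are made up, and `None` stands in for the frame image; both are assumptions for illustration.

```python
import numpy as np


class FacePosition:
    def __init__(self, id, coordinate):
        self.id = id
        self.coordinate = coordinate


class FaceFrame:
    faceCount = 0  # class variable shared by every FaceFrame instance

    def __init__(self, frame, coordinates):
        self.frame = frame
        self.faces = []
        for coordinate in coordinates:
            self.faces.append(FacePosition(FaceFrame.faceCount, coordinate))
            FaceFrame.faceCount += 1


# Dummy detections in the (x, y, w, h) layout of detectMultiScale
first = FaceFrame(None, np.array([[10, 20, 50, 50], [200, 40, 48, 48]]))
second = FaceFrame(None, np.array([[12, 21, 50, 50]]))

first_ids = [face.id for face in first.faces]
second_ids = [face.id for face in second.faces]
print(first_ids, second_ids)  # → [0, 1] [2]
```

Because the counter lives on the class rather than on the instance, the face in the second frame receives ID 2 instead of starting again from 0. At this stage the IDs are only unique; linking the same person across frames is the job of FrameManager.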

3. Class implementation (FrameManager)

Next is the **FrameManager** class, the heart of this program. Seen from the outside, this class behaves as follows.
■ When a frame and its face coordinates are passed in, it returns frame information (FaceFrame) in which IDs have been assigned and recognition failures have been complemented.

To do this, the received frames are stored temporarily in an array, and a frame whose IDs have been assigned and whose gaps have been complemented is returned. The length of the array can be changed with LIST_SIZE; here it is 5. The processing flow is as follows.

・ Receive a frame and the coordinates of the face(s) in it
・ Store it in the array. The oldest element of the array becomes the return value.
・ (With the frame in the middle of the array (frameC) as the boundary, the frames are divided into the preceding frames (frameFs) and the following frames (frameBs))
・ Compare the positions and sizes of the faces in frameF and frameC, and assign the same ID if they are judged to be continuous.
・ Compare frameF and frameB, and if a continuous face exists in both but not in frameC, complement it in frameC.
・ Repeat for every combination of frameFs and frameBs.

The tolerance for judging faces to be continuous is specified by ALLOWED_GAP, here 5%. (There are several preceding and following frames, so a trailing "s" denotes the whole group, while the name without "s" denotes a single frame.) The source is below.

`frame_manager.py(FrameManager)`


class FrameManager:
    
    '''
    Class that checks face continuity and complements missing faces based on the passed frames and face recognition results.
    The same ID is assigned to continuous faces.
    '''
    #Number of FaceFrames over which face continuity is checked
    LIST_SIZE = 5
    CENTER_INDEX = LIST_SIZE//2
    #Allowed difference in position and size when judging whether faces in different frames are the same. Specified in %
    ALLOWED_GAP = 5
        
    def __init__(self, height, width):
        '''
        Specify the height and width of the video to be handled.
        '''
        FrameManager.FRAME_HEIGHT = height
        FrameManager.FRAME_WIDTH = width

        self.__frames = [None]*self.LIST_SIZE
        
    
    def put(self, frame, coordinates):
        '''
        Add a frame based on the passed frame and face recognition result.
        On each call, IDs are assigned, continuity is checked, missing faces are complemented, and the oldest FaceFrame in the buffer (the one put LIST_SIZE calls earlier) is returned.
        At the end, after all frames have been put, LIST_SIZE frames still remain in FrameManager, so keep putting None until the remaining frames have all been returned.

        return: an instance of FaceFrame, or None if no FaceFrame exists yet at that position.
        '''
        #Since None is passed when outputting the last remaining frame, in that case faceFrame is also set to None.
        if frame is None:
            faceFrame = None
        else:
            faceFrame = FaceFrame(frame, coordinates)

        #Shift the list forward by one and append the new frame at the end. The internal processing does a lot of random access, so an array seems preferable.
        returnFrame = self.__frames[0]
        for i in range(0,len(self.__frames)-1):
            self.__frames[i] = self.__frames[i+1]
        self.__frames[FrameManager.LIST_SIZE-1] = faceFrame

        #Check continuity using the preceding and following frames
        #With CENTER_INDEX as the boundary, check face continuity for each combination of a preceding frame (i) and a following frame (j)
        for i in range(0, FrameManager.CENTER_INDEX):
            for j in range(FrameManager.CENTER_INDEX+1, FrameManager.LIST_SIZE):
                #Skip the None part
                if self.__frames[i] is not None and self.__frames[FrameManager.CENTER_INDEX] is not None and self.__frames[j] is not None:

                    #Check continuity and complement all frames in between
                    for k in range(i+1, j):
                        self.connectFrame(self.__frames[i], self.__frames[k], self.__frames[j])

        return returnFrame
        
        
    def connectFrame(self, frameF, frameC, frameB):               
        #If frameF.faces and frameC.faces contain a continuous face, give it the same id.
        #TODO The same id could end up assigned to more than one face. The current design cannot handle that case anyway, so it is left on hold.
        frontFaceNum = len(frameF.faces)
        centerFaceNum = len(frameC.faces)
        backFaceNum = len(frameB.faces)
        for i in range(0, frontFaceNum):
            #Records whether the i-th face of the preceding frame matches any face in frameC
            matched = False
            for j in range(0, centerFaceNum):
                #If it is judged to be the same face, use the same ID
                if self.compare(frameF.faces[i], frameC.faces[j]):
                    frameC.faces[j].id = frameF.faces[i].id
                    matched = True
                    break
                
            #Even if the face is not in frameC, if it is in both frameF and frameB, assume it is also in frameC in between and complement it.
            if not matched:
                for k in range(0, backFaceNum):
                    if self.compare(frameF.faces[i], frameB.faces[k]):
                        #Add a face at the position / size midway between frameF and frameB
                        #(integer division keeps the integer dtype; the np.int alias used originally was removed in NumPy 1.24)
                        frameC.append(frameF.faces[i].id, (frameF.faces[i].coordinate + frameB.faces[k].coordinate) // 2)
                        #One complemented face per preceding face is enough, so stop searching
                        break

       
    def compare(self, face1, face2):
        '''
        Check whether face1 and face2 are continuous.
        return: True if they are the same face, False if different
        '''
        result = True
        #Check whether the differences in coordinates and face size are within the tolerance; if all of them are within ALLOWED_GAP, judge them to be the same face
        #TODO If the frames are far apart, it would be better to widen the tolerance accordingly.
        for i in range(0,4):
            #coordinate is (x, y, w, h): even indices are horizontal values, odd indices are vertical values
            if i%2 == 0:
                gap = ((float(face1.coordinate[i])-float(face2.coordinate[i]))/FrameManager.FRAME_WIDTH)*100
            else:
                gap = ((float(face1.coordinate[i])-float(face2.coordinate[i]))/FrameManager.FRAME_HEIGHT)*100
            if not (-1*FrameManager.ALLOWED_GAP < gap < FrameManager.ALLOWED_GAP):
                result = False
                break
        return result

To use this, create an instance of FrameManager and put in each frame together with its face information; it returns a FaceFrame with IDs assigned.
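One thing to keep in mind when calling put is the delay caused by the internal array: the first LIST_SIZE calls return None, and the last LIST_SIZE frames come out only if None is put afterwards, as the docstring of put describes. The sketch below reproduces only that buffering behavior. `DelayBuffer` is a hypothetical stand-in written for this illustration, not part of frame_manager.py, and plain frame numbers are used in place of real frames.

```python
LIST_SIZE = 5


class DelayBuffer:
    """Hypothetical stand-in that mimics only the buffering of FrameManager.put."""

    def __init__(self):
        self.slots = [None] * LIST_SIZE

    def put(self, item):
        # Return the oldest slot, shift everything forward, store the new item last
        oldest = self.slots[0]
        self.slots = self.slots[1:] + [item]
        return oldest


buffer = DelayBuffer()
written = []

# Pretend the video has 8 frames, numbered 0 to 7
for frame_no in range(8):
    out = buffer.put(frame_no)
    if out is not None:
        written.append(out)

before_flush = list(written)

# Flush: keep putting None until everything still inside has been returned
for _ in range(LIST_SIZE):
    out = buffer.put(None)
    if out is not None:
        written.append(out)

print(before_flush)  # → [0, 1, 2]
print(written)       # → [0, 1, 2, 3, 4, 5, 6, 7]
```

Without the flush loop, the last five frames of the video would never be written.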

Looking back at the code, the continuity of IDs is checked several times between the same pair of frames, which is redundant. However, I am turning a blind eye to this for the reason described later.
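To get a feel for the redundancy, the loops in put can be replayed without any frames at all. The sketch below only enumerates the (i, k, j) index triples passed to connectFrame during a single call of put, assuming all five slots are filled. The names `calls` and `pair_counts` are introduced here for illustration.

```python
from collections import Counter

LIST_SIZE = 5
CENTER_INDEX = LIST_SIZE // 2

# Same loop structure as FrameManager.put: i = preceding frame, j = following frame,
# k = every frame in between that gets connected / complemented
calls = [(i, k, j)
         for i in range(0, CENTER_INDEX)
         for j in range(CENTER_INDEX + 1, LIST_SIZE)
         for k in range(i + 1, j)]

# How often each (frameF, frameC) pair is compared within one put
pair_counts = Counter((i, k) for i, k, j in calls)

print(len(calls))           # → 8
print(pair_counts[(0, 1)])  # → 2
```

So a single put triggers eight connectFrame calls, and some pairs such as slots 0 and 1 are compared twice. Since the array shifts by one slot on every put, the same two frames also meet again in later calls.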

4. Incorporating FrameManager

Incorporate the FrameManager class into the overlay_movie.py created in the previous article. After face recognition, the recognized faces are first put into FrameManager, and the ID is then drawn on each face based on the FaceFrame instance that comes back.

`overlay_movie2.py`


# -*- coding:utf-8 -*-

import cv2
import datetime
import numpy as np
from PIL import Image

import frame_manager

def overlay_movie2():

    #Specify the video to be input and the output path.
    target = "target/test_input.mp4"
    result = "result/test_output2.m4v"  #An error occurs unless the extension is .m4v

    #Loading videos and getting video information
    movie = cv2.VideoCapture(target) 
    fps    = movie.get(cv2.CAP_PROP_FPS)
    height = movie.get(cv2.CAP_PROP_FRAME_HEIGHT)
    width  = movie.get(cv2.CAP_PROP_FRAME_WIDTH)

    #Specify MP4V as the format
    fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
    
    #Open the output file
    out = cv2.VideoWriter(result, int(fourcc), fps, (int(width), int(height)))

    #Acquire the features of the cascade classifier
    cascade_path = "haarcascades/haarcascade_frontalface_alt.xml"
    cascade = cv2.CascadeClassifier(cascade_path)
    
    #Creating a FrameManager
    frameManager = frame_manager.FrameManager(height, width)

    #Specify the color of the rectangle that surrounds the recognized face. White here.
    color = (255, 255, 255) 
    
    #Read the first frame
    if movie.isOpened() == True:
        ret,frame = movie.read()
    else:
        ret = False

    count = 0

    #Continue to export frames while successfully reading frames
    while ret:
        
        #Convert to grayscale
        frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        #Perform face recognition
        facerecog = cascade.detectMultiScale(frame_gray, scaleFactor=1.1, minNeighbors=1, minSize=(1, 1))
            
        #Put the recognized face in FrameManager
        managedFrame = frameManager.put(frame, facerecog)

        #From the 6th call onward FrameManager returns a frame, so write it to the file
        if managedFrame is not None:

            #Add a number to the recognized face
            for i in range(0,len(managedFrame.faces)):

                #Variables are prepared for easy handling
                tmpCoord = managedFrame.faces[i].coordinate
                tmpId = managedFrame.faces[i].id
                    
                print("Recognized face ID = "+str(tmpId))
                                        
                #Surround with a rectangle
                cv2.rectangle(managedFrame.frame, tuple(tmpCoord[0:2]),tuple(tmpCoord[0:2]+tmpCoord[2:4]), color, thickness=2)
                    
                #Write face ID
                cv2.putText(managedFrame.frame,str(tmpId),(tmpCoord[0],tmpCoord[1]),cv2.FONT_HERSHEY_TRIPLEX, 2, (100,200,255), thickness=2)
    
            out.write(managedFrame.frame)
        if count%10 == 0:
            date = datetime.datetime.now().strftime("%Y/%m/%d %H:%M:%S")
            print(date + 'Current number of frames:'+str(count))
        
        count += 1
        ret,frame = movie.read()

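        #Stop partway through (for testing)
        if count > 200 :
            break

    #Release the input video and finalize the output file
    #(the frames still held inside FrameManager at this point, up to LIST_SIZE of them, are not written out)
    movie.release()
    out.release()

    print("Number of frames processed:"+str(count))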

    
if __name__ == '__main__':
    overlay_movie2()

5. Result

IDs are now successfully assigned to faces.

[Image: ID割り振り1人.JPG (ID assignment, one person)]

A specific person can now be picked out from several people by ID, and a face that continues across frames is now identified by a single ID.

[Image: ID割り振り3人.JPG (ID assignment, three people)]

6. Finally

All that remains is to enter the ID of the face to overwrite and overwrite the corresponding face... or so I would like to say, but it did not turn out that way. I have built this much, but the program cannot serve its purpose. The face recognition accuracy in the target video is poor, and even complementing the gaps cannot make up for it. (I had half noticed this and was looking the other way.) So I am shelving this program and will explore a different approach. To be continued in the next article.