As a programming exercise, I decided to implement smile detection. Deep learning gets all the attention these days, but this is built in a fairly old-fashioned way. Since it is a rough system based on my own ideas, I would appreciate any comments or suggestions.
The flow of smile detection is as follows: detect the face with an OpenCV Haar cascade classifier, extract HOG features from the face region, reduce their dimensionality with PCA, and judge smile / non-smile with a One Class SVM.
To use the cascade classifier, you will need haarcascade_frontalface_alt.xml from here, so please clone it.
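One easy pitfall here: CascadeClassifier.load() returns False instead of raising an exception when the xml file cannot be found. A minimal sanity check (assuming the file was cloned into ./cascades/) looks like this:

import cv2

cascade = cv2.CascadeClassifier()
# load() returns False on failure rather than raising, so check explicitly
if not cascade.load("./cascades/haarcascade_frontalface_alt.xml"):
    raise IOError("failed to load haarcascade_frontalface_alt.xml")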
One Class SVM is often used for problems where labeled training data is difficult to collect. Here, I frame smile detection as an anomaly detection problem: the smiling state is defined as "normal", and every other state as "abnormal".
An ordinary SVM (Support Vector Machine) is trained with supervision to find a decision boundary, so for the smile detection problem you would need training data such as "neutral face" in addition to "smile". A One Class SVM, on the other hand, learns from a single class and decides whether a new sample is "normal" or "abnormal". Therefore, with training data of smiles alone, you can distinguish "smile" from "everything else".
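As a minimal sketch of this idea (toy 2-D points, not the HOG features used later in this article), scikit-learn's OneClassSVM is fit only on "normal" samples and then returns +1 for points it considers normal and -1 for abnormal ones:

import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=0.3, size=(100, 2))  # the "smile" class only

clf = OneClassSVM(nu=0.1, gamma="scale").fit(normal)

print(clf.predict([[0.0, 0.1]]))  # likely +1: close to the training cluster
print(clf.predict([[3.0, 3.0]]))  # likely -1: far from it, flagged as abnormal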
There are three phases: data collection (data_collect), training (train), and detection (main).
main.py
import pickle

import cv2
import numpy as np
from skimage.feature import hog
from sklearn.decomposition import PCA
from sklearn.svm import OneClassSVM

n_dim = 4  # number of PCA components
nu = 0.2   # assumed fraction of outliers in the training data (also tried 0.1)
font = cv2.FONT_HERSHEY_COMPLEX

train_data = "./dataset/train/train.csv"
weights = "./dataset/weights/weights.sav"
weights_pca = "./dataset/weights/weights_pca.sav"

f_ = cv2.CascadeClassifier()
f_.load(cv2.samples.findFile("./cascades/haarcascade_frontalface_alt.xml"))


def preprocess(image):
    frame = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    frame = cv2.equalizeHist(frame)
    return frame


def data_collect():
    feature = np.empty(0)
    capture = cv2.VideoCapture(0)
    while True:
        ret, frame = capture.read()
        if not ret:
            break
        frame = preprocess(frame)
        face = f_.detectMultiScale(frame)  # could also pass scaleFactor=1.2
        for rect in face:
            cv2.rectangle(frame, tuple(rect[0:2]), tuple(rect[0:2] + rect[2:4]),
                          (255, 255, 0), thickness=2)
            face_frame = frame[rect[1]:rect[1] + rect[3], rect[0]:rect[0] + rect[2]]
            face_frame = cv2.resize(face_frame, (60, 60))
            hog_f_ = hog(face_frame, transform_sqrt=True)  # 2025-dim with skimage defaults
            feature = np.append(feature, hog_f_)
            # Overwrite the csv every frame so the data survives an abrupt exit
            np.savetxt(train_data, feature.reshape(-1, 2025), delimiter=",")
        cv2.putText(frame, "please smile for collecting data!", (10, 100), font,
                    1, (255, 255, 0), 1, cv2.LINE_AA)
        cv2.imshow("face", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to stop collecting
            break
    capture.release()
    cv2.destroyAllWindows()


def train():
    x_train = np.loadtxt(train_data, delimiter=",")
    pca = PCA(n_components=n_dim)
    clf = OneClassSVM(nu=nu, gamma=40 / n_dim)  # gamma tuned by hand; 1/n_dim is the usual heuristic
    z_train = pca.fit_transform(x_train)
    clf.fit(z_train)
    pickle.dump(pca, open(weights_pca, "wb"))
    pickle.dump(clf, open(weights, "wb"))


def main():
    clf = pickle.load(open(weights, "rb"))
    pca = pickle.load(open(weights_pca, "rb"))
    capture = cv2.VideoCapture(0)
    while True:
        ret, frame = capture.read()
        if not ret:
            break
        frame = preprocess(frame)
        face = f_.detectMultiScale(frame)
        for rect in face:
            cv2.rectangle(frame, tuple(rect[0:2]), tuple(rect[0:2] + rect[2:4]),
                          (255, 255, 0), thickness=2)
            face_frame = frame[rect[1]:rect[1] + rect[3], rect[0]:rect[0] + rect[2]]
            face_frame = cv2.resize(face_frame, (60, 60))
            feature = hog(face_frame, transform_sqrt=True)
            z_feature = pca.transform(feature.reshape(1, 2025))
            score = clf.predict(z_feature.reshape(1, n_dim))
            if score[0] == 1:  # +1 means "normal", i.e. smiling
                cv2.putText(frame, "smile!", (10, 100), font,
                            1, (255, 255, 0), 1, cv2.LINE_AA)
        cv2.imshow("face", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    capture.release()
    cv2.destroyAllWindows()


if __name__ == '__main__':
    data_collect()  # Phase 1: run only this function to collect smile data
    # train()       # Phase 2: load the smile data (csv) and fit the One Class SVM
    # main()        # Phase 3: detect smiles with the trained model
(Please refer to here for the exact implementation. The One Class SVM uses the scikit-learn implementation.) When actually running it, create a directory structure like the following. I would appreciate it if you could point out any mistakes in the code.

OneClassSVM/
├ dataset/
│ ├ train/
│ └ weights/
├ cascades/
└ main.py
After the face image is cut out by the cascade classifier, it is resized to 60x60 (the size was chosen somewhat arbitrarily). Before training the SVM, PCA reduces the dimensionality of the HOG features. The reason is that I was worried about the curse of dimensionality: the HOG vector has over 2000 dimensions (with skimage's default HOG settings of 9 orientations, 8x8-pixel cells, and 3x3-cell blocks, a 60x60 image yields 5x5 blocks x 3x3 cells x 9 orientations = 2025 features, hence the reshape(-1, 2025) in the code). The number of dimensions after reduction is specified with n_dim; this time it is set to 4. Tuning the hyperparameters of the One Class SVM also took some time. OneClassSVM takes nu and gamma as constructor arguments: nu is roughly the fraction of outliers assumed in the training data, and a common heuristic for gamma is 1 / (number of feature dimensions). However, since the training data contains no abnormal samples in the first place, I was unsure how to set them, and ended up searching for good values by alternately running train() and main() (a scripted version of this check is sketched below). In the end I settled on nu = 0.3 and gamma = 50 / (number of feature dimensions). You may need to re-tune nu and gamma for your own use.
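Since there is no labeled abnormal data to validate against, one rough proxy for this manual search is the inlier rate on the training data itself, which should land near 1 - nu. A minimal sketch of such a scripted check, assuming the HOG features are already saved in train.csv (the candidate values below are illustrative):

import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import OneClassSVM

x_train = np.loadtxt("./dataset/train/train.csv", delimiter=",")

pca = PCA(n_components=4)
z_train = pca.fit_transform(x_train)
print("explained variance kept:", pca.explained_variance_ratio_.sum())

for nu in (0.1, 0.2, 0.3):
    for gamma in (1 / 4, 40 / 4, 50 / 4):
        clf = OneClassSVM(nu=nu, gamma=gamma).fit(z_train)
        inlier_rate = (clf.predict(z_train) == 1).mean()
        print(f"nu={nu:.1f} gamma={gamma:5.2f} train inlier rate={inlier_rate:.2f}")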
I experimented by running main(). The accuracy with the face facing forward felt reasonable (roughly 80% by subjective impression), but because I included slight, grin-level smiles in the training data, the boundary between a neutral face and a smile may have become harder to learn.
References:
Face detection with OpenCV Haar Cascades
Anomaly detection using One Class SVM