[DOCKER] Extract face images from videos with ffmpeg and OpenVINO

I previously tried OpenVINO's [interactive_face_detection_demo](https://docs.openvinotoolkit.org/2020.4/omz_demos_interactive_face_detection_demo_READme). This time I combine it with ffmpeg to extract only the face images from a video.

1. Preparation of video

The video is assumed to be named `input.mp4` and to be located in the Downloads folder on Windows 10.

2. Modify video to 1 fps

interactive_face_detection_demo runs inference on every frame, so convert `input.mp4` to 1 frame per second. This shortens the processing time and also makes the ffmpeg extraction described later easier.

cd /cygdrive/c/Users/${USER}/Downloads/
ls -l input.mp4

mkdir -p output/
ffmpeg -i input.mp4 -r 1 output/input_r1.mp4 -y
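The reason 1 fps helps later is that a frame index and a timestamp in seconds become the same number. A minimal sketch of that arithmetic, assuming a constant frame rate and a video that starts at 0 seconds (`frame_to_seconds` is a hypothetical helper, not part of ffmpeg):

```sh
#!/bin/sh
# frame_to_seconds FRAME FPS
# Hypothetical helper: prints the timestamp in seconds of a 0-based
# frame index, assuming a constant frame rate starting at 0 seconds.
frame_to_seconds() {
  awk -v frame="$1" -v fps="$2" 'BEGIN { printf "%.3f\n", frame / fps }'
}
```

At 1 fps, `frame_to_seconds 32 1` gives 32 seconds, so the frame index can be used directly as a seek position. At any other rate a division would be needed.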

3. Face detection with OpenVINO demo

This is almost the same as my earlier attempt, except that the `-r` option is added so that the raw detection output is printed.

echo 'source ${INTEL_OPENVINO_DIR}/bin/setupvars.sh

cd ${INTEL_CVSDK_DIR}/inference_engine/demos/
sed -i "s/*)/interactive_face_detection_demo)/g" CMakeLists.txt
./build_demos.sh

${INTEL_CVSDK_DIR}/deployment_tools/tools/model_downloader/downloader.py \
  --name face-detection-adas-0001 \
  --output_dir /content/model/ \
  --precisions FP32

echo `date`: start detection

/root/omz_demos_build/intel64/Release/interactive_face_detection_demo \
  -i /Downloads/output/input_r1.mp4 \
  -m /content/model/intel/face-detection-adas-0001/FP32/face-detection-adas-0001.xml \
  -no_show \
  -no_wait \
  -async \
  -r > /Downloads/output/raw.txt

echo `date`: end detection' | docker run -v /c/Users/${USER}/Downloads:/Downloads -u root -i --rm openvino/ubuntu18_dev:2020.4

4. Extract from raw.txt with ffmpeg

raw.txt


~~
[116,1] element, prob = 0.0198675    (-4,209)-(48,48)
[117,1] element, prob = 0.0198515    (444,146)-(68,68)
[0,1] element, prob = 0.999333    (222,115)-(205,205) WILL BE RENDERED!
[1,1] element, prob = 0.0601832    (405,393)-(94,94)
~~

As shown above, raw.txt lists the candidates for each frame in order of face likelihood (index 0 is the most face-like), and candidates that look like faces (score of 0.5 or more) are marked with `WILL BE RENDERED!`.
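To get a quick feel for a raw.txt before generating any commands, the frames and marked candidates can be counted. This is a rough sketch that assumes the line format shown in the sample above; `summarize_raw` is a hypothetical name:

```sh
#!/bin/sh
# summarize_raw RAW_FILE
# Hypothetical helper: counts frames (one "[0,1]" line per frame) and
# candidates that the demo marked with "WILL BE RENDERED!".
summarize_raw() {
  awk '
    $1 == "[0,1]"        { frames++ }     # candidate index 0 starts a new frame
    /WILL BE RENDERED!/  { rendered++ }   # score above the demo threshold
    END { printf "frames=%d rendered=%d\n", frames, rendered }
  ' "$1"
}
```

For example, `summarize_raw raw.txt` prints a single line such as `frames=... rendered=...`, which makes it easy to see whether the detection step produced anything at all.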

THRESHOLD=0.9
perl -ne '$i++ if m{^\[0,1\]}; printf "ffmpeg -loglevel error -ss ".($i-1)." -i input_r1.mp4 -vframes 1 -vf crop=$4:$5:$2:$3 %05d.jpg -y\n", ++$j if m{([0-9.]+)\s+\((\d+),(\d+)\)-\((\d+),(\d+)\)} and $1 > '${THRESHOLD} raw.txt > ffmpeg.sh

The running count of `[0,1]` lines gives the frame number, and that count minus one is passed to the `-ss` option as a 0-based position (because the video is 1 frame per second, a frame number can be used directly as the number of seconds that `-ss` expects). Only candidates with a high face score are kept: `WILL BE RENDERED!` is applied at 0.5 or more, but that is fairly loose, so the threshold above is set higher. The coordinates are turned into parameters for ffmpeg's crop filter, and one ffmpeg command is printed per face (the regions always seem to be square).
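For readers who find the Perl one-liner hard to follow, the same logic can be sketched step by step in awk. This is not the author's script. It is an illustrative equivalent that assumes the raw.txt line format shown earlier, and `gen_crop_cmds` is a hypothetical name:

```sh
#!/bin/sh
# gen_crop_cmds THRESHOLD RAW_FILE VIDEO
# Hypothetical helper: prints one ffmpeg crop command per candidate whose
# score exceeds THRESHOLD, mirroring the Perl one-liner above.
gen_crop_cmds() {
  awk -v th="$1" -v video="$3" '
    $1 == "[0,1]" { frame++ }                 # index 0 marks the start of a frame
    $3 == "prob" && $5 + 0 > th + 0 {
      box = $6                                # e.g. "(222,115)-(205,205)" = (x,y)-(w,h)
      sub(/\)-\(/, ",", box)                  # -> "(222,115,205,205)"
      gsub(/[()]/, "", box)                   # -> "222,115,205,205"
      split(box, b, ",")                      # b[1]=x b[2]=y b[3]=w b[4]=h
      if (b[1] + 0 < 0 || b[2] + 0 < 0) next  # the Perl \d+ never matches negatives either
      printf "ffmpeg -loglevel error -ss %d -i %s -vframes 1 -vf crop=%d:%d:%d:%d %05d.jpg -y\n", \
             frame - 1, video, b[3], b[4], b[1], b[2], ++n
    }
  ' "$2"
}
```

Usage would look like `gen_crop_cmds 0.9 raw.txt input_r1.mp4 > ffmpeg.sh`. The crop filter takes its arguments as width:height:x:y, which is why the box fields are reordered.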

ffmpeg.sh


ffmpeg -loglevel error -ss 32 -i input_r1.mp4 -vframes 1 -vf crop=36:36:109:178 00001.jpg -y
ffmpeg -loglevel error -ss 36 -i input_r1.mp4 -vframes 1 -vf crop=34:34:107:177 00002.jpg -y
ffmpeg -loglevel error -ss 37 -i input_r1.mp4 -vframes 1 -vf crop=32:32:108:178 00003.jpg -y
ffmpeg -loglevel error -ss 39 -i input_r1.mp4 -vframes 1 -vf crop=32:32:109:179 00004.jpg -y
ffmpeg -loglevel error -ss 40 -i input_r1.mp4 -vframes 1 -vf crop=37:37:97:178 00005.jpg -y
ffmpeg -loglevel error -ss 41 -i input_r1.mp4 -vframes 1 -vf crop=34:34:46:176 00006.jpg -y
ffmpeg -loglevel error -ss 44 -i input_r1.mp4 -vframes 1 -vf crop=64:64:552:236 00007.jpg -y

This creates a file containing the commands above. Run it in the shell:

sh ffmpeg.sh

Each face image is then written to its own numbered JPEG file.

5. Resize as needed

mogrify -resize 128x128! *.jpg

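It is convenient to bring every image to the same size with ImageMagick's mogrify, which overwrites the files in place. The `!` forces exactly 128x128 regardless of aspect ratio.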
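As an alternative to a separate mogrify pass, the resize could be folded into the generated ffmpeg commands by chaining ffmpeg's scale filter after crop. The sketch below rewrites ffmpeg.sh lines on standard input. `add_scale` is a hypothetical helper, and it assumes the crops are square, as observed in step 4, so that scaling does not distort them:

```sh
#!/bin/sh
# add_scale SIZE
# Hypothetical helper: reads ffmpeg commands on stdin and appends
# ",scale=SIZE:SIZE" to each crop filter, so ffmpeg resizes while extracting.
add_scale() {
  sed "s/\(crop=[0-9:]*\)/\1,scale=$1:$1/"
}
```

For example, `add_scale 128 < ffmpeg.sh > ffmpeg_128.sh` turns `-vf crop=36:36:109:178` into `-vf crop=36:36:109:178,scale=128:128`, and running the new script writes 128x128 images directly.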
