We have summarized DCGAN using the Microsoft Cognitive Toolkit (CNTK).
Part 1 prepares you to train DCGAN using the Microsoft Cognitive Toolkit.
This time, I will use DCGAN to train the face generation model of my favorite artist. We need a training dataset to do this, so we use Microsoft Azure's Bing Image Search API v7 to scrape the images.
I will introduce them in the following order.
The training dataset is often the bottleneck when trying out deep learning, and not only for this theme.
Scraping is the obvious way to collect data, but it is easy to stumble when writing a scraper.
Azure Cognitive Services, part of Microsoft Azure, offers a relatively good solution to this problem. This time we will use one of its services, Bing Image Search, to collect images.
Go to the page above and click Try Bing Image Search to see the available plans. You can either create a Microsoft Azure account or get a 7-day trial API key that does not require a credit card.
If you sign in with a Microsoft, Facebook, LinkedIn, or GitHub account, an email containing the API key is sent to the address associated with that account, and you can check the key there.
Once you have the Azure Cognitive Services API key, you can use the Bing Image Search REST API to get the URL of the image.
I created a program with reference to Quickstart: Search for images using the Bing Image Search REST API and Python.
The requests library is used for HTTP operations.
For more information on API parameters, see Image Search API v7 reference.
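As a rough illustration of the request described above, the following sketch builds the header and query parameters and extracts the image URLs from a parsed response. The helper names (`build_request`, `extract_image_urls`, `search_image_urls`) and the default `count` and `offset` values are assumptions made for this example and are not taken from dcgan_scraping.py; the `Ocp-Apim-Subscription-Key` header and the `contentUrl` field follow the quickstart.

```python
SEARCH_URL = "https://api.cognitive.microsoft.com/bing/v7.0/images/search"


def build_request(subscription_key, search_term, count=150, offset=0):
    """Return (headers, params) for one Image Search API v7 call."""
    headers = {"Ocp-Apim-Subscription-Key": subscription_key}
    params = {"q": search_term, "count": count, "offset": offset}
    return headers, params


def extract_image_urls(search_results):
    """Collect the full-size image URLs from a parsed JSON response."""
    return [item["contentUrl"]
            for item in search_results.get("value", [])
            if "contentUrl" in item]


def search_image_urls(subscription_key, search_term, count=150, offset=0):
    """Run one search request (needs network access and a valid key)."""
    import requests  # third-party; imported here so the helpers above work without it

    headers, params = build_request(subscription_key, search_term, count, offset)
    response = requests.get(SEARCH_URL, headers=headers, params=params, timeout=10)
    response.raise_for_status()
    return extract_image_urls(response.json())
```

The API reference lists a maximum of 150 results per request, so collecting around 1000 URLs would mean calling `search_image_urls` several times with an increasing `offset`.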
In this DCGAN we want to focus on the face, so only the face needs to be cropped from each acquired image.
To do this, we use face detection with Haar Cascades [1], which is also covered in the OpenCV tutorial. Haar Cascades classifies using the rectangular features shown in the figure below together with AdaBoost [2].
Face detection is itself a machine-learning-based detector whose training requires a lot of face images, but a trained XML file is included when you install the opencv-contrib-python package, so we can use it as is, and detection runs at high speed.
Finally, as before, create a text file for ImageDeserializer that loads the images used for training, and you are ready to go. However, we will not use category labels this time, so leave all labels at 0.
ImageDeserializer is introduced in Computer Vision : Image Classification Part1 - Understanding COCO dataset.
・ CPU Intel (R) Core (TM) i7-7700K 3.60GHz
・ Windows 10 Pro 1909
・ Python 3.6.6
・ opencv-contrib-python 4.1.1.26
・ requests 2.22.0
The implemented program is published on GitHub.
dcgan_scraping.py
I will pick out some parts of the program and explain them.
Set the obtained API key in subscription_key and the search keyword for the images to collect in search_term, both as strings. Japanese search keywords work without problems.
dcgan_scraping.py
subscription_key = "your-subscription-key"
search_url = "https://api.cognitive.microsoft.com/bing/v7.0/images/search"
search_term = "your-search-keyword"
The bing_image_search function downloads the images from the obtained URLs and saves them in the BingImageSearch directory. This time I aimed for about 1000 images.
A scraping program needs to handle many kinds of exceptions, but this time only ConnectionError and Timeout are handled.
bing_image_search
except ConnectionError:
    print("ConnectionError :", image_url)
    continue
except Timeout:
    print("TimeoutError :", image_url)
    continue
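The excerpt above shows only the except clauses. A minimal sketch of how such a download loop might look as a whole is given below. The function names `image_path` and `download_images`, the `timeout` value, and the extra `HTTPError` branch are assumptions for illustration and may differ from the published bing_image_search; the file naming follows the image_0000.jpg pattern of the saved files.

```python
import os


def image_path(directory, index):
    """Build a zero-padded file name such as image_0000.jpg."""
    return os.path.join(directory, "image_{:04d}.jpg".format(index))


def download_images(image_urls, directory="./BingImageSearch", timeout=10):
    """Download each URL, skip the ones that fail, and return the saved paths."""
    import requests  # third-party; imported lazily
    from requests.exceptions import ConnectionError, HTTPError, Timeout

    os.makedirs(directory, exist_ok=True)
    saved = []
    for image_url in image_urls:
        try:
            response = requests.get(image_url, timeout=timeout)
            response.raise_for_status()  # treat 4xx/5xx responses as failures
        except ConnectionError:
            print("ConnectionError :", image_url)
            continue
        except Timeout:
            print("TimeoutError :", image_url)
            continue
        except HTTPError:  # assumption: not handled in the original excerpt
            print("HTTPError :", image_url)
            continue
        path = image_path(directory, len(saved))
        with open(path, "wb") as f:
            f.write(response.content)
        saved.append(path)
    return saved
```

Numbering the files by the count of successful downloads keeps the sequence free of gaps when some URLs fail.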
After downloading, it is a good idea to look through the saved images once and check that they are what you wanted, because completely unrelated images may be included.
For the XML file used for face cropping, pass the path of haarcascade_frontalface_default.xml, which is in the data directory directly under cv2, as the argument.
face_detection
face_cascade = cv2.CascadeClassifier(path)
The cropped face images are saved in the faces directory. The minimum face size to detect is set to 50x50. Face detection by Haar Cascades sometimes fails, so it is a good idea to check here again that the faces were cropped correctly.
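The cropping step described above could be sketched roughly as follows. This is not the published face_detection function: the names `face_output_path` and `crop_faces`, the face_0000.jpg naming, and the `scaleFactor` and `minNeighbors` values are assumptions for illustration. Only the cascade file, the faces directory, and the 50x50 minimum size come from the text.

```python
import os


def face_output_path(directory, index):
    """Build a zero-padded output name such as face_0000.jpg (hypothetical naming)."""
    return os.path.join(directory, "face_{:04d}.jpg".format(index))


def crop_faces(image_files, out_dir="./faces", min_size=(50, 50)):
    """Detect frontal faces with a Haar cascade and save each crop to out_dir."""
    import cv2  # opencv-contrib-python; imported lazily

    # cv2.data.haarcascades is the directory holding the bundled XML files.
    cascade_path = os.path.join(cv2.data.haarcascades,
                                "haarcascade_frontalface_default.xml")
    face_cascade = cv2.CascadeClassifier(cascade_path)

    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for image_file in image_files:
        img = cv2.imread(image_file)
        if img is None:  # unreadable or non-image file
            continue
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        faces = face_cascade.detectMultiScale(
            gray, scaleFactor=1.1, minNeighbors=5, minSize=min_size)
        for (x, y, w, h) in faces:
            path = face_output_path(out_dir, len(saved))
            cv2.imwrite(path, img[y:y + h, x:x + w])  # crop the detected rectangle
            saved.append(path)
    return saved
```

Raising `minNeighbors` generally reduces false detections at the cost of missing some faces, which is one reason a manual check of the crops is still worthwhile.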
The function flip_augmentation flips each image left to right to pad out the training data.
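A minimal sketch of such a left-right flip is shown below, assuming the flipped copy is written next to the original with a `_flip` suffix. The suffix and the names `flipped_path` and `flip_augmentation_sketch` are hypothetical and not taken from the published program.

```python
import os


def flipped_path(path):
    """Insert a _flip suffix before the extension (hypothetical naming)."""
    root, ext = os.path.splitext(path)
    return root + "_flip" + ext


def flip_augmentation_sketch(face_files):
    """Write a horizontally mirrored copy of each face image."""
    import cv2  # opencv-contrib-python; imported lazily

    written = []
    for face_file in face_files:
        img = cv2.imread(face_file)
        if img is None:  # skip files that cannot be read as images
            continue
        out = flipped_path(face_file)
        cv2.imwrite(out, cv2.flip(img, 1))  # flipCode=1 mirrors around the vertical axis
        written.append(out)
    return written
```

Mirroring doubles the number of face images at most, since each original yields exactly one flipped copy.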
The function dcgan_mapfile creates a text file for ImageDeserializer.
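For illustration, a map file for ImageDeserializer lists one image per line as a path and a label separated by a tab. The sketch below writes such a file with every label set to 0. The function name `write_map_file` and the list of accepted extensions are assumptions for this example and are not taken from dcgan_mapfile.

```python
import os


def write_map_file(image_dir, map_path, label=0):
    """Write "<image path><TAB><label>" lines for every image in image_dir."""
    extensions = (".jpg", ".jpeg", ".png")
    files = sorted(name for name in os.listdir(image_dir)
                   if name.lower().endswith(extensions))
    with open(map_path, "w") as f:
        for name in files:
            f.write("{}\t{}\n".format(os.path.join(image_dir, name), label))
    return len(files)  # number of images listed
```

A call such as `write_map_file("./faces", "./map.txt")` (file names chosen here only as an example) would then produce the text file to hand to ImageDeserializer.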
When you run the program, it gets the URL of the image, follows each URL to download the image, and then applies face detection to generate the face image.
./BingImageSearch/image_0000.jpg
./BingImageSearch/image_0001.jpg
...
I wanted 1000 images, but I could prepare only 612 face images even after flip augmentation.
Now that we have the real images and the text file to use for training, Part 2 will train DCGAN with CNTK.
Bing Image Search
Quickstart: Search for images using the Bing Image Search REST API and Python
Image Search API v7 reference
Requests: HTTP for Humans™
Face Detection using Haar Cascades
Computer Vision : Image Classification Part1 - Understanding COCO dataset