GANs: Generative adversarial networks are advancing day by day. Even for someone like me who works outside of IT (I'm an engineer in the manufacturing industry), it is a technology of great interest. The best way to understand a technology deeply is to implement it and play with it. So this time I decided to implement **SinGAN**, an algorithm announced in 2019 that generates synthetic images from a single image, and this article describes my attempt to actually run it.
However, implementing this recent paper came with some hurdles for me, so this article focuses on those stumbling points.
Here is the paper I want to implement this time.
SinGAN: Learning a Generative Model from a Single Natural Image https://arxiv.org/abs/1905.01164
With only a single image as training data, it can generate new images close to that image. It can also generate an image close to the original from a hand-drawn sketch (Paint to image), or superimpose another image and convert it to the same style (Harmonization).
I cannot claim to fully understand the algorithm in detail, so for proper explanations please see these articles by others:
[Paper commentary] SinGAN: Learning a Generative Model from a Single Natural Image https://qiita.com/takoroy/items/27f918a2fe54954b29d6
When I read SinGAN's paper, it was amazing https://qiita.com/yoyoyo_/items/81f0b4ca899152ac8806
Well, to implement it, I first downloaded the set of programs from this GitHub repository and unzipped the zip file. https://github.com/tamarott/SinGAN
By the way, when reproducing the contents of a paper, you are often instructed to run commands from the terminal, as in the image below.
Terminal
python -m pip install -r requirements.txt
When I first tried to install the required libraries with this, I got the following error message.
This means that the `python` command cannot start python.exe, i.e. the path to it is not registered. Therefore, the PATH needs to be set up.
Right-click the Windows icon and open Settings (my machine runs Windows 10 Home).
Then type "environment" in the search field, and the option to edit the system environment variables will appear.
From System Properties, click Environment Variables.
Edit Path here: select New and enter the path to the folder containing python.exe. This registers the path and solves the problem.
If successful, you can confirm it like this.
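Besides running `python` in a terminal, you can also check from within Python itself whether the command resolves on the PATH. This is a minimal sketch using only the standard library (any Python interpreter you can already start will do):

```python
import shutil
import sys

# shutil.which() searches the directories listed in PATH,
# just like the shell does when you type "python".
resolved = shutil.which("python")
print(resolved)        # full path to the python.exe the terminal would launch, or None

# For comparison, the interpreter actually running this script:
print(sys.executable)
```

If `shutil.which("python")` prints `None`, the PATH entry has not taken effect yet (newly opened terminals pick up the change; already-open ones do not).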
Next, with the python command now working, I proceeded to the next step and encountered a command like this. I understood that it runs random_samples.py, but it is followed by options with double hyphens. On investigation, these are handled by a module called argparse, which lets you specify arguments in terminal commands.
Reference URL https://qiita.com/kzkadc/items/e4fc7bc9c003de1eb6d0
Terminal
python random_samples.py --input_name <training_image_file_name> --mode random_samples --gen_start_scale <generation start scale number>
It is convenient to be able to specify arguments from the command line, but what if you want to run the script in a kernel on VS Code or Jupyter? This URL describes it in detail. http://flat-leon.hatenablog.com/entry/python_argparse
# 3.Parse startup parameters using ArgumentParser object
args = parser.parse_args()
It seems the startup parameters are parsed here, so by building a list and passing it to parse_args() you should be able to run the script in a kernel.
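To confirm, here is a minimal, self-contained sketch of that pattern. The flag names mirror the SinGAN command shown above, but the parser definition and defaults here are illustrative, not taken from the SinGAN source:

```python
import argparse

# Define a parser with flags similar to the random_samples.py command.
parser = argparse.ArgumentParser()
parser.add_argument("--input_name", required=True)
parser.add_argument("--mode", default="random_samples")
parser.add_argument("--gen_start_scale", type=int, default=0)

# parse_args() normally reads sys.argv, but in a notebook or kernel
# you can hand it the arguments explicitly as a list of strings:
args = parser.parse_args(["--input_name", "cows.png", "--gen_start_scale", "2"])
print(args.input_name, args.mode, args.gen_start_scale)  # cows.png random_samples 2
```

Passing a list this way sidesteps the terminal entirely, which is handy when experimenting interactively.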
Now that the file paths and arguments are understood, let's run it. However, I found that training takes a very long time on a small PC.
In this first training run, about 3 hours had already passed by the point where 6 of the 9 scales (up to scale 5, of scales 0 through 8) had been trained. Image-processing computations involving GANs clearly take a very long time, so I decided to simply use the GPU on Google Colab.
Upload the whole folder to Google Drive. Then, first change directory to that folder.
GoogleColab
cd /content/drive/My Drive/SinGAN-master
Now you can run the .py files and so on using Linux commands.
We train the model to generate images resembling the original from a noise image. Training starts at a very small image size and gradually grows toward the original size.
GoogleColab
!python main_train.py --input_name cows.png
When using Linux commands in Colab, prefix them with `!`. Running it this way goes very quickly: the computation finished in about 30 minutes. Installing the libraries is also very easy, so for anything that takes a lot of processing time, Google Colab is the way to go.
Now, let's compare the generated images with the original. A higher scale number corresponds to more accumulated training. **Hmm, it's indistinguishable from the real thing.** The image size is actually small at low scale numbers, but the images are shown at the same size here for easy comparison. You can see the image gradually becoming clearer and closer to the original. Moreover, not only does the image quality improve, but the placement of the cows differs each time, so you can see this is not a process that simply improves image quality.
Next, run the program that makes a hand-drawn image resemble the training image. To do this, the training image you want to imitate must first have been trained as above.
GoogleColab
!python paint2image.py --input_name volacano.png --ref_name volacano3.png --paint_start_scale 1
Let's see the result. **I couldn't reproduce it well.** The original image is at the bottom right, and the smaller the start_scale value, the more training stages are applied. This time, I feel start_scale 3 and 4 come closest.
It probably seems difficult to imitate unless the hand-drawn image resembles the original to begin with.
Next, the image size is changed based on the original image.
GoogleColab
!python random_samples.py --input_name cows.png --mode random_samples_arbitrary_sizes --scale_h 5 --scale_v 4
scale_h is the horizontal scale factor (1 means 1x), and scale_v is the vertical scale factor.
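To make the scale flags concrete, here is a hypothetical sketch of how the output size follows from them, assuming scale_h and scale_v simply multiply the trained image's width and height. The function name and the 250x188 example size are illustrative, not taken from SinGAN:

```python
def scaled_size(width, height, scale_h, scale_v):
    """Return the (width, height) of the output image, assuming the
    scale flags multiply the original dimensions directly."""
    return int(width * scale_h), int(height * scale_v)

# With --scale_h 5 --scale_v 4, a hypothetical 250x188 original would become:
print(scaled_size(250, 188, 5, 4))  # (1250, 752)
```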
As a test, I generated a large image. **But it looks off: it has become an image of cows crowded together on the prairie. Sorry, cows.**
Finally, there is the process of modifying a pasted image to match the style of the original. For this too, you need to run training first. This time, I combined a photo I took with a free stock image of fish.
GoogleColab
!python harmonization.py --input_name fish.png --ref_name fish1.png --harmonization_start_scale 1
The big fish turned into a school of small light-blue fish, like Swimmy (or the Pokemon Yowashi). It must have been processed based on the original vermilion school of fish. It's a very interesting algorithm.
I actually got hands-on and played with SinGAN, a recent GAN paper. It turned out to be very easy to use.
I learned a lot about setting up the environment during the implementation. In particular, I was impressed that Google Colab makes it easy to run even computationally heavy models and see the results. I felt the greatness of Google once again.
This time I focused on implementing and playing with it, so next I would like to deepen my understanding of the theory. Several derivative papers have already been published, and I would like to study how they relate.