Have you ever wanted to convert a low-resolution image into a high-resolution one?
In June 2020, Duke University announced a method called **PULSE** that raises the upscaling factor from the previous 8x to 64x. This makes it possible to convert a mosaic-like face image, in which the eyes and mouth are indistinct, **into an image of such high quality that you can see pores and skin texture.**
This time, I will introduce the results of trying the source code published on GitHub. The code was created with Google Colab and posted on GitHub, so if you want to try it yourself, click this **"link"** (/Pulse_test.ipynb) and then click the **"Colab on Web"** button at the top of the displayed sheet.
The algorithm uses a **trained GAN model** to generate candidate high-resolution images, downscales each candidate back to low resolution, and takes the difference between this **"image downscaled from a high-resolution image"** and the input **"low-resolution image"** as the **loss**. It then searches for the high-resolution image that minimizes this **loss**.
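The downscale-and-compare loss can be sketched as below. This is a minimal NumPy illustration of the idea only: the `downscale` block-averaging and the L2 loss are my assumptions, and the real PULSE instead optimizes StyleGAN latent vectors with a learned generator in the loop.

```python
import numpy as np

def downscale(hr, factor):
    """Block-average an HxW image down by `factor` (a simple stand-in
    for the downscaling operator used in the loss)."""
    h, w = hr.shape
    return hr.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def pulse_loss(hr_candidate, lr_target, factor):
    """L2 distance between the downscaled candidate and the LR input;
    PULSE searches for a candidate that minimizes this."""
    diff = downscale(hr_candidate, factor) - lr_target
    return float(np.mean(diff ** 2))

# A candidate whose downscaled version exactly matches the LR input has zero loss.
lr = np.random.rand(32, 32)
hr = np.repeat(np.repeat(lr, 32, axis=0), 32, axis=1)  # 1024 x 1024
print(pulse_loss(hr, lr, 32))  # → 0.0
```

In the actual method, the candidate is not a free-form image but the output of the trained GAN, which is what keeps the result looking like a realistic face.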
One concern is that, because a trained GAN model is used, images similar to the training data should convert to high resolution well, but will images that were not used for training also work?
So this time I will intentionally create low-resolution images (32 x 32 pixels) from various face images and see how accurately they can be converted to high resolution (1024 x 1024 pixels). The number of optimization steps is 1,000.
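Creating the 32 x 32 input is a simple resize. A minimal sketch with Pillow, using a synthetic solid-color image as a stand-in for a real face photo (the file handling and bicubic filter choice here are my assumptions):

```python
from PIL import Image

# Stand-in for a high-quality face photo; in the actual experiment this
# would be loaded from a file, e.g. Image.open("face.png").convert("RGB").
hr_img = Image.new("RGB", (1024, 1024), (180, 140, 120))

# Shrink to the 32x32 low-resolution input that is fed to PULSE.
lr_img = hr_img.resize((32, 32), Image.BICUBIC)
print(lr_img.size)  # → (32, 32)
```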
The trained GAN model used in **PULSE** is StyleGAN, which was trained on the FFHQ dataset, so faces from that dataset should work at a minimum. Let's first take three images from the FFHQ dataset.
Low Resolution is the low-resolution image (32 x 32), High Resolution is the image after conversion (1024 x 1024), and Real is the original image from which Low Resolution was created. The accuracy of the conversion can therefore be judged by **how close High Resolution is to Real.** As expected, images from the training data convert to high resolution without any problem.
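To put a number on "how close High Resolution is to Real," a common choice is PSNR. This metric is my addition, not something the post measures; a small sketch:

```python
import numpy as np

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means the two images
    are closer (infinite for identical images)."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)

real = np.full((1024, 1024), 100, dtype=np.uint8)
hires = real.copy()
hires[0, 0] += 10  # one slightly-off pixel
print(psnr(hires, real))  # very high dB value, i.e. nearly identical
```

Note that PULSE's own goal is perceptual plausibility rather than pixel fidelity, so a visually convincing High Resolution can still differ from Real in detail.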
Then, what about face images not used for training (non-Japanese)? Let's try with the face images of three non-Japanese people obtained by a web search.
Hmm, for non-Japanese faces, even images not used for training seem to work well.
Then, what about Japanese face images not used for training? Let's try with the face images of three Japanese people obtained by a web search. FFHQ certainly does not contain many Japanese faces, so what will happen?
It works better than expected. Compared to Real, High Resolution has slightly different eyes and facial wrinkles, but it is acceptable. Increasing the number of optimization steps seems to improve it a little more.
By the way, the tests so far have intentionally reduced a high-quality image to 32 x 32 pixels and then converted it back. What we actually want to do in the real world, however, is to take an image that is low quality from the start and convert it to high quality. Let's run that test here.
Former Nogizaka46 first captain Reika Sakurai's [Wikipedia page](https://ja.wikipedia.org/wiki/Reika Sakurai) has an image like this at the top. The image size is 190 x 253 pixels, and the face portion alone is only about 90 x 90 pixels. Let's crop the face from this image and convert it to high quality. I will set the number of optimization steps to 6,000.
Oh! Don't you think this is pretty good?
I was surprised that the conversion quality exceeded expectations. If you think about it, a 32 x 32 pixel image carries the same amount of information as a 1024-dimensional vector, so there is plenty of potential to recover a high-quality image from it.
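The arithmetic behind that claim is simple: a 32 x 32 grayscale image is 1,024 values, the same dimensionality as a 1024-d vector (and three times that for RGB, which is my added note):

```python
# Information content of the low-resolution input.
pixels = 32 * 32
print(pixels)      # → 1024 values per channel
print(pixels * 3)  # → 3072 values for an RGB image
```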
Even so, hats off to Duke University. PULSE, whose approach is completely different from conventional methods, is fascinating.