[GAN] I saw the darkness while trying to evolve Pokemon past their final evolution

Pokemon generative model

Let's try to push Pokemon past their final evolutions using the StyleGAN2 Pokemon generative model summarized in a past article. I know this is a mischievous idea. My recent implementations have been working well, and it came to me while I was feeling rather god-like.

In this post, I explain the techniques involved in a little more detail and walk through my attempt at Pokemon evolution as an application example.

The concrete approach is as follows. My hypothesis is that a Pokemon evolutionary line lies along a straight line in latent space. Pikachu → Raichu → ? I will explain it in terms of extending past the final evolution (a rough code sketch follows the list below).

- Load the trained StyleGAN2 model
- Estimate the latent variables that generate Pikachu and Raichu
- Compute the direction vector passing through the two points, Pikachu and Raichu
- Move further along that vector; a Pikachu-derived Pokemon should appear (hopefully)
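As a rough sketch, the whole pipeline looks like this. All names here, such as `embed`, `load_image`, and the generator `G`, are hypothetical helpers that the sections below flesh out:

```python
# Hypothetical pipeline sketch: embed() inverts an image into latent space,
# G generates an image from a latent (details in the sections below).
w_pikachu = embed(G, load_image("pikachu.png"))  # invert Pikachu
w_raichu = embed(G, load_image("raichu.png"))    # invert Raichu

direction = w_raichu - w_pikachu       # "evolution" direction in latent space
next_stage = G(w_raichu + direction)   # extrapolate one step past Raichu
```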

evolution.png

Related research

Image2StyleGAN

Well-trained, expressive generative models such as PGGAN and StyleGAN are known to be able to generate new faces that do not exist in the training dataset. The idea of Image2StyleGAN is to exploit this property and estimate, in an extended latent space, the latent variables that generate an arbitrary given image.

ss01.png

There are two possible methods for estimating the latent variable corresponding to a given image.

- Train an encoder that corresponds to the generative model (decoder)
- Search the latent space directly, using the dissimilarity between the generated image and the target image as the loss function to minimize

The former is empirically known not to work well, and Image2StyleGAN takes the latter, optimization-based approach.

Perceptual loss is used as the loss function: both images are fed into a VGG16 model pretrained on ImageNet, and the distance between the extracted features serves as the loss. By optimizing this perceptual loss with Adam, we can find a generated image that matches the target in terms of the perceptual features that VGG16 sees. The metric of Zhang et al., 2018 (LPIPS) is well known as a perceptual model, and I use it this time as well.
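Here is a minimal sketch of this optimization in PyTorch, assuming a hypothetical generator `G` that maps a latent `w` to an image in [-1, 1] and a `target` tensor preprocessed the same way. The real Image2StyleGAN optimizes the extended W+ space and adds a pixel-wise MSE term, both omitted here:

```python
import torch
import lpips  # perceptual metric of Zhang et al., 2018 (pip install lpips)

def embed(G, target, steps=1000, lr=0.01, device="cuda"):
    """Search the latent space for a w whose generated image matches target."""
    percep = lpips.LPIPS(net="vgg").to(device)   # VGG-based perceptual loss
    w = torch.zeros(1, 512, device=device, requires_grad=True)  # initial latent
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        img = G(w)                          # image generated from current latent
        loss = percep(img, target).mean()   # perceptual distance to the target
        loss.backward()
        opt.step()                          # move w to reduce the distance
    return w.detach()
```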

If all you want is a similarity measure between images, you could probably also fine-tune something like FaceNet, which is trained for exactly that purpose, or build your own model. I have not tried.

As a reproduction experiment, I swapped in a pretrained StyleGAN2 (config-f) model and reconstructed Prime Minister Abe.

shinzo.png
compare.png

InterFaceGAN

This study also concerns how images behave in the latent space of a well-trained generative model. It suggests that for a given attribute, a separating hyperplane exists in the latent space. If that hyperplane can be estimated, the attribute can be changed by moving the latent variable in the direction of its normal vector.

ss02.png

The method for estimating the separating hyperplane is simple and rather brute-force. It takes the following steps; as an example, consider estimating the hyperplane for glasses (a code sketch follows the list).

- Train a model that scores the presence or absence of glasses on a 0-1 scale
- Randomly generate tens of thousands of sample images from the generative model
- Score every sample for glasses, associating each latent variable with its score
- Fit an SVM that best separates the with-glasses latents from the without-glasses latents to obtain the hyperplane
- Change the attribute by moving the latent variable in the direction of the normal vector of the estimated hyperplane
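A minimal sketch of the hyperplane estimation and the edit step, assuming we already have the sampled latents and glasses scores (the array names are hypothetical):

```python
import numpy as np
from sklearn.svm import LinearSVC

def attribute_direction(latents, scores, threshold=0.5):
    """Fit a linear SVM and return the unit normal of its separating hyperplane."""
    labels = (scores > threshold).astype(int)   # binarize the attribute scores
    svm = LinearSVC().fit(latents, labels)      # linear separating hyperplane
    normal = svm.coef_[0]
    return normal / np.linalg.norm(normal)

def edit(z, direction, alpha):
    # Move along the normal; larger alpha pushes the attribute further.
    return z + alpha * direction
```

For brevity this binarizes at a fixed threshold; the paper, as I recall, fits the SVM only on the most confidently scored samples.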

For reference, when I swapped in the StyleGAN2 model and changed attributes as a reproduction experiment, it looked like this.

Gender
gender.png

Age
age.png

The preamble has run long, but thanks to this research we can see that separating hyperplanes for attributes form in the latent space during training. In other words, Pokemon evolution should also be expressed along some direction in that space. However, unlike human face attributes, there are far too many kinds of Pokemon, so I suspect this holds only within a single evolutionary line.

Trying it out

The model used this time is StyleGAN2 (config-f) trained on the MonsterGAN dataset. The image size is 64x64, and roughly 15,000 images were trained for 1120 kimg.

The generated images look like this:
fakes001200.png

Honestly, the quality is not good, but I use it anyway because it should at least be able to handle images contained in the dataset. (The FID score has plateaued around 50, so I stopped training here.)

Image Embedding with Image2StyleGAN

First, I check whether latent variables exist that can reproduce Pikachu and Raichu.

Pikachu
pikachu_compare.png

Raichu
raichu_compare.png

These are total pachimon (knock-off Pokemon) ... At this point I was despairing at the model's low expressiveness, but I pressed on to the end.

Moving along the direction through the two points before and after evolution

I tried interpolating linearly from Pikachu, step by step, in the direction of evolution (a sketch of the computation is below).
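A minimal sketch of the sweep, assuming the latents recovered by the embedding step above were saved to the hypothetical files below, and reusing the hypothetical generator `G` from earlier:

```python
import numpy as np

# Hypothetical saved latents from the embedding step above.
w_pikachu = np.load("w_pikachu.npy")
w_raichu = np.load("w_raichu.npy")

direction = w_raichu - w_pikachu   # evolution direction in latent space

# alpha = 0 reproduces Pikachu, alpha = 1 Raichu; alpha > 1 extrapolates
# past the final evolution toward the hypothetical next stage.
for alpha in np.linspace(0.0, 2.0, 9):
    w = w_pikachu + alpha * direction
    # img = G(w)  # render with the generator as in the embedding sketch
```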

pikachu_interp.png

The result seems to evolve toward something like a Dark/Electric type. Since the shape gradually collapses, I suspect we move beyond the range the latent space can express. A lack of expressiveness, again.

It is frustrating, but having come this far, I tried a few more things.

diguda_interp.png

pikusi_interp.png

The mental-contamination level of these is pretty high ...

Summary

I tried to evolve Pokemon with a StyleGAN2 Pokemon generative model I trained myself, but the results came out underwhelming.

The fix should be to improve the trained model. The lack of expressiveness may mean the dataset never reached the size needed to capture the variety of Pokemon, so I plan to apply data augmentation to bulk up the dataset.

I already included shiny (alternate-color) data, and if alternate colors are acceptable, it should also be fine to add many images with a whole-image color-tone transformation applied (a sketch follows). This is a theme I would like to revisit after expanding the data to around 50,000 images and retraining!
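A minimal sketch of such a whole-image color-tone augmentation using a hue shift, imitating shiny palette swaps (the file paths are hypothetical):

```python
import random
from PIL import Image

def hue_shift(img, degrees):
    """Rotate the hue of every pixel by the given number of degrees."""
    h, s, v = img.convert("HSV").split()
    h = h.point(lambda px: (px + int(degrees / 360 * 255)) % 256)
    return Image.merge("HSV", (h, s, v)).convert("RGB")

img = Image.open("pikachu.png").convert("RGB")
for i in range(5):
    shifted = hue_shift(img, random.uniform(30, 330))  # random palette swap
    shifted.save(f"pikachu_aug_{i}.png")
```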

nuo-.png

If anyone familiar with this area has advice, I will weep with joy!
