This time, I learned about GANs (generative adversarial networks), an algorithm that is indispensable in the field of deep learning. The GAN was proposed in 2014 by the American researcher Ian Goodfellow. By training two networks adversarially against each other, you obtain a network that generates data indistinguishable from the real thing. In Japanese, the name is literally translated as "adversarial generation network", and **it sounds like something out of a teenager's fantasy (chuunibyou), which I like because it is cool.**
This time, I have summarized my own interpretation of the DCGAN tutorial on the TensorFlow site. https://www.tensorflow.org/tutorials/generative/dcgan?hl=ja The tutorial builds a model that generates MNIST-style handwritten digits using a technique called DCGAN (Deep Convolutional GAN).
DCGAN is a generative model proposed in a paper presented at ICLR 2016 (a conference in the AI field). The difference from a plain GAN is that, as "Deep Convolutional" suggests, it is built from convolutions instead of fully connected (affine) layers. A fully connected model has a very large number of weight parameters and tends to overfit, and it seems that building the model mainly from convolutions helps prevent this. On the other hand, convergence tends to be slow. Note that the tutorial code still uses a Dense layer to project the input noise in the generator and to produce the final score in the discriminator.
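To make the parameter-count argument concrete, here is a rough sketch that counts the weights of one fully connected layer and one 5x5 convolution applied to a 28x28 single-channel image. The helper names and the choice of 64 units / 64 filters are assumptions for illustration only, and the two layers do not produce outputs of the same shape, so treat this as an order-of-magnitude comparison.

```python
def dense_params(n_inputs, n_outputs, use_bias=True):
    """Weights in a fully connected layer: one weight per input/output pair."""
    return n_inputs * n_outputs + (n_outputs if use_bias else 0)


def conv2d_params(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    """Weights in a 2D convolution: the kernel is shared across all positions."""
    return kernel_h * kernel_w * in_channels * filters + (filters if use_bias else 0)


# A 28x28x1 image flattened to 784 inputs and mapped to 64 units
dense_count = dense_params(28 * 28 * 1, 64)
# A 5x5 convolution with 64 filters on the same image
conv_count = conv2d_params(5, 5, 1, 64)

print("dense:", dense_count, "conv:", conv_count)
```

Because the convolution kernel is reused at every image position, its weight count does not grow with the image size, whereas the dense layer's count does.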
A page I found helpful for GANs in general: https://blog.negativemind.com/2019/09/07/deep-convolutional-gan/
gan.py
from __future__ import absolute_import, division, print_function, unicode_literals
In Python 2.6 and later, this line switches the behavior of some functions and statements to that of the Python 3 series. As I understand it, you write it when you want to use Python 3 features in Python 2; on Python 3 itself it has no effect.
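As a small check of the point above, the standard `__future__` module records, for each feature, the release in which it became the default. The sketch below looks up the four features imported in the tutorial; the variable names are my own.

```python
import __future__

FEATURES = ("absolute_import", "division", "print_function", "unicode_literals")

# Each feature records the release in which it became the default behavior
mandatory = {name: getattr(__future__, name).getMandatoryRelease() for name in FEATURES}
for name, release in mandatory.items():
    print(name, "became standard in Python", ".".join(str(n) for n in release[:2]))

# With true division (the Python 3 behavior), / returns a float and // floors
true_quotient = 7 / 2
floor_quotient = 7 // 2
print(true_quotient, floor_quotient)
```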
gan.py
import tensorflow as tf
import glob
import imageio
import matplotlib.pyplot as plt
import numpy as np
import os
import PIL
from tensorflow.keras import layers
import time
from IPython import display
I loaded imageio and PIL (Pillow) as libraries for processing images. OpenCV is the well-known image processing library, but I find Pillow simpler and easier to understand. I also loaded glob, a module for collecting file paths. With special characters such as the wildcard *, you can get the paths of files and folders that match a pattern as a list or an iterator.
https://note.nkmk.me/python-glob-usage/
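As a minimal sketch of the wildcard matching described above, the following creates a few empty files in a temporary directory and collects only the PNGs. The file names are hypothetical and chosen just for this example.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create two image-like files and one unrelated file (all empty)
    for name in ("image_0001.png", "image_0002.png", "notes.txt"):
        with open(os.path.join(tmp_dir, name), "w") as f:
            f.write("")

    # The * wildcard matches any run of characters within a file name
    png_paths = sorted(glob.glob(os.path.join(tmp_dir, "image_*.png")))
    png_names = [os.path.basename(path) for path in png_paths]

print(png_names)
```

Sorting the result is a deliberate choice here, since `glob.glob` does not guarantee any particular order.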
Next is the IPython.display module, which can embed audio and video in a notebook. In this code it is used to clear the cell output between epochs.
https://knowledge.sakura.ad.jp/17727/
gan.py
(train_images, train_labels), (_, _) = tf.keras.datasets.mnist.load_data()
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1).astype('float32')
train_images = (train_images - 127.5) / 127.5 # Normalize the images to [-1, 1]
This time, MNIST (handwritten digit images) is loaded from the Keras datasets. The images are reshaped into an array of shape (number of images, 28, 28, 1) and the pixel values are normalized to the range [-1, 1].
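The normalization can be checked on a few sample pixel values. This is a small NumPy sketch, not part of the tutorial; the inverse mapping is included because generated images have to be scaled back before they can be displayed.

```python
import numpy as np

# Darkest, middle and brightest 8-bit pixel values
pixels = np.array([0.0, 127.5, 255.0], dtype="float32")

# Same formula as in the tutorial: map [0, 255] to [-1, 1]
normalized = (pixels - 127.5) / 127.5

# Inverse mapping: bring values in [-1, 1] back to [0, 255]
restored = normalized * 127.5 + 127.5

print(normalized, restored)
```

The [-1, 1] range matches the output range of the `tanh` activation used in the generator's last layer.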
gan.py
BUFFER_SIZE = 60000
BATCH_SIZE = 256
train_dataset = tf.data.Dataset.from_tensor_slices(train_images).shuffle(BUFFER_SIZE).batch(BATCH_SIZE)
Reference for tf.data.Dataset:
https://qiita.com/Suguru_Toyohara/items/820b0dad955ecd91c7f3
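To see what shuffling and batching do to the 60000 training images, here is a NumPy sketch that imitates the pipeline on indices only. It is an approximation under the assumption that the shuffle buffer covers the whole dataset (as it does with BUFFER_SIZE = 60000), and it does not use tf.data itself.

```python
import numpy as np

BUFFER_SIZE = 60000
BATCH_SIZE = 256

rng = np.random.default_rng(0)

# A buffer as large as the dataset amounts to a full random permutation
indices = rng.permutation(BUFFER_SIZE)

# Cut the shuffled indices into consecutive batches; the last one may be smaller
batches = [indices[i:i + BATCH_SIZE] for i in range(0, BUFFER_SIZE, BATCH_SIZE)]

num_batches = len(batches)
last_batch_size = len(batches[-1])
print(num_batches, last_batch_size)
```

So each epoch runs 235 training steps, and the final step sees a smaller batch because 60000 is not a multiple of 256.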
gan.py
def make_generator_model():
    model = tf.keras.Sequential()
    model.add(layers.Dense(7*7*256, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Reshape((7, 7, 256)))
    assert model.output_shape == (None, 7, 7, 256) # Note: None is the batch size

    model.add(layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))
    assert model.output_shape == (None, 7, 7, 128)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 14, 14, 64)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
    assert model.output_shape == (None, 28, 28, 1)

    return model
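The shape assertions in the generator follow from a simple rule: with padding='same', a transposed convolution multiplies the spatial size by its stride. The sketch below traces that rule through the three Conv2DTranspose layers; the helper name is hypothetical.

```python
def conv_transpose_same_size(size, stride):
    """Output height/width of a transposed convolution with padding='same'."""
    return size * stride


# Start from the 7x7 feature map produced by the Reshape layer
size = 7
sizes = [size]
for stride in (1, 2, 2):  # strides of the three Conv2DTranspose layers
    size = conv_transpose_same_size(size, stride)
    sizes.append(size)

print(sizes)
```

This is how 100 noise values end up as a 28x28 image: 7x7 after the reshape, then 7x7, 14x14 and 28x28.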
gan.py
def make_discriminator_model():
    model = tf.keras.Sequential()
    model.add(layers.Conv2D(64, (5, 5), strides=(2, 2), padding='same',
                            input_shape=[28, 28, 1]))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.3))

    model.add(layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same'))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.3))

    model.add(layers.Flatten())
    model.add(layers.Dense(1))

    return model
https://www.slideshare.net/HiroyaKato1/gandcgan-188544721
https://www.hellocybernetics.tech/entry/2018/05/28/180012
https://keras.io/ja/getting-started/sequential-model-guide/
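The discriminator runs the generator's shape arithmetic in reverse. As a sketch (the helper name is my own), a strided convolution with padding='same' divides the spatial size by the stride, rounding up, and Flatten then lines up every remaining value for the final Dense(1) layer.

```python
import math


def conv_same_size(size, stride):
    """Output height/width of a convolution with padding='same'."""
    return math.ceil(size / stride)


size = 28  # input image is 28x28x1
sizes = [size]
for stride in (2, 2):  # strides of the two Conv2D layers
    size = conv_same_size(size, stride)
    sizes.append(size)

# The second Conv2D has 128 filters, so Flatten yields size * size * 128 values
flattened_length = size * size * 128
print(sizes, flattened_length)
```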
gan.py
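# Binary cross-entropy on raw logits, as defined in the TensorFlow tutorial
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def discriminator_loss(real_output, fake_output):
    # Real images should be classified as 1, generated images as 0
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    total_loss = real_loss + fake_loss
    return total_loss

def generator_loss(fake_output):
    # The generator succeeds when its images are classified as 1
    return cross_entropy(tf.ones_like(fake_output), fake_output)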
In general, a neural network is trained by adjusting its weights, using the gradient of the loss function with respect to those weights, so that the loss becomes small. Here we define two losses: one that rewards the Discriminator for telling real images from fakes, and one that rewards the Generator for fooling the Discriminator. This pair of opposing losses is the part that is specific to GANs.
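To get a feel for how the two losses pull against each other, here is a NumPy re-implementation that can be run without TensorFlow. It assumes that `cross_entropy` is binary cross-entropy computed on raw logits (the discriminator ends in Dense(1) with no sigmoid), and the sample logit values are made up for illustration.

```python
import numpy as np


def bce_from_logits(labels, logits):
    """Mean binary cross-entropy on raw logits, in a numerically stable form."""
    labels = np.asarray(labels, dtype="float64")
    logits = np.asarray(logits, dtype="float64")
    per_example = np.maximum(logits, 0) - logits * labels + np.log1p(np.exp(-np.abs(logits)))
    return per_example.mean()


def discriminator_loss(real_logits, fake_logits):
    # Real images are labelled 1, generated images 0
    real_loss = bce_from_logits(np.ones_like(real_logits), real_logits)
    fake_loss = bce_from_logits(np.zeros_like(fake_logits), fake_logits)
    return real_loss + fake_loss


def generator_loss(fake_logits):
    # The generator wants its images to be labelled 1
    return bce_from_logits(np.ones_like(fake_logits), fake_logits)


undecided = np.zeros(4)            # discriminator outputs 50/50 for everything
confident_real = np.full(4, 5.0)   # strongly "real"
confident_fake = np.full(4, -5.0)  # strongly "fake"

print("D loss, undecided:", discriminator_loss(undecided, undecided))
print("D loss, correct and confident:", discriminator_loss(confident_real, confident_fake))
print("G loss, fakes detected:", generator_loss(confident_fake))
print("G loss, fakes accepted:", generator_loss(confident_real))
```

When the discriminator is correct and confident its own loss is small while the generator's loss is large, and the reverse holds when the fakes are accepted as real.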
gan.py
EPOCHS = 50
noise_dim = 100
num_examples_to_generate = 16
# Reuse the same noise every epoch so the generated images can be compared
seed = tf.random.normal([num_examples_to_generate, noise_dim])

# train_step, generate_and_save_images, generator, checkpoint and
# checkpoint_prefix are defined elsewhere in the full code (see the tutorial)
def train(dataset, epochs):
    for epoch in range(epochs):
        start = time.time()

        for image_batch in dataset:
            train_step(image_batch)

        # Produce images for the GIF as we go
        display.clear_output(wait=True)
        generate_and_save_images(generator,
                                 epoch + 1,
                                 seed)

        # Save the model every 15 epochs
        if (epoch + 1) % 15 == 0:
            checkpoint.save(file_prefix=checkpoint_prefix)

        print('Time for epoch {} is {} sec'.format(epoch + 1, time.time() - start))

    # Generate after the final epoch
    display.clear_output(wait=True)
    generate_and_save_images(generator,
                             epochs,
                             seed)
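One detail of the loop that is easy to miss is when checkpoints are actually written. The short sketch below replays only the epoch counter and the `% 15` test; `CHECKPOINT_EVERY` is a name I introduced for the example.

```python
EPOCHS = 50
CHECKPOINT_EVERY = 15  # assumption: mirrors the `% 15` test in the training loop

# Epochs are reported 1-based (epoch + 1), as in the training loop
checkpoint_epochs = [
    epoch + 1 for epoch in range(EPOCHS) if (epoch + 1) % CHECKPOINT_EVERY == 0
]
print(checkpoint_epochs)
```

With 50 epochs the last checkpoint is written after epoch 45, so the state after the final epoch is not saved unless you add an extra save.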
After 10 epochs
After 30 epochs
After 50 epochs
Once training passes about 30 epochs, the images look quite like digits. In my environment, one epoch took about 5 minutes, so the whole run took about 250 minutes. It is worth taking advantage of Google Colab, which offers GPUs.
This is the first GAN I have implemented, and I enjoyed it because it is a program that makes full use of the heavy computation that computers are good at. Many models have been proposed for generators, discriminators, and the design of loss functions, so I would like to learn and summarize them one by one.
All the code is available here: https://github.com/Fumio-eisan/dcgan20200306