What are GAN and DCGAN?

This time, I learned about GAN (generative adversarial networks), an algorithm that is indispensable in the field of deep learning. GAN is a technique devised in 2014 by the American researcher Ian Goodfellow. By training two networks adversarially, it produces a network that generates data indistinguishable from the real thing. In Japanese, the name is translated literally as "adversarial generative network", and **the name has a rather dramatic ring to it, which I like because it sounds cool.**

This time, I have summarized my own understanding of DCGAN, which is covered in a tutorial on the TensorFlow site. https://www.tensorflow.org/tutorials/generative/dcgan?hl=ja This tutorial builds a model that generates MNIST-style handwritten digits using a technique called DCGAN (Deep Convolutional GAN).

DCGAN is a generative model proposed in a paper presented at ICLR 2016 (a conference in the AI field). It differs from the original GAN in that it builds the networks from deep convolutional layers instead of fully connected (affine) layers. A fully connected model has a very large number of weight parameters and tends to overfit, and it seems that overfitting can be prevented by building the model from convolutions only. On the other hand, convergence tends to be slow.

A URL that was helpful for GAN in general: https://blog.negativemind.com/2019/09/07/deep-convolutional-gan/

What is the __future__ module?

`gan.py`



from __future__ import absolute_import, division, print_function, unicode_literals

In Python 2.6 and later, this line switches the behavior of certain functions and statements to that of Python 3. As I understand it, it is written when you want to use Python 3 features from Python 2. On Python 3 itself it has no effect.
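
To make this more concrete, here is a minimal sketch that inspects one of these features, `division`, through the standard `__future__` module. It only reads the release information that Python records for the feature and then shows the Python 3 behavior of `/` and `//`; the variable names are my own and are not part of the tutorial.

```python
import __future__

# Each feature records the release in which it became available on request
# and the release in which it became the default behavior.
optional = __future__.division.getOptionalRelease()
mandatory = __future__.division.getMandatoryRelease()
print("division available from:", optional[:3])   # (2, 2, 0)
print("division default from:", mandatory[:3])    # (3, 0, 0)

# With true division in effect (always the case on Python 3),
# "/" returns a float and "//" performs floor division.
true_quotient = 7 / 2
floor_quotient = 7 // 2
print(true_quotient, floor_quotient)               # 3.5 3
```

Because every feature imported in `gan.py` is already the default in Python 3, the import line can be kept for compatibility or removed without changing the result.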

About loaded libraries and modules

`gan.py`


import tensorflow as tf
import glob
import imageio
import matplotlib.pyplot as plt
import numpy as np
import os
import PIL
from tensorflow.keras import layers
import time

from IPython import display

I loaded imageio and PIL (Pillow) as libraries for handling images. OpenCV is the well-known image processing library, but Pillow is simpler and easier to understand. I also loaded glob, a module for getting file path names. With special characters such as the wildcard *, you can get the paths of files and folders that match a pattern, either as a list or as an iterator.

https://note.nkmk.me/python-glob-usage/
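
As a small illustration of the wildcard matching described above, the sketch below creates a few empty files in a temporary folder and collects only those matching `image*.png`. The file names are hypothetical and chosen just for this example; `glob.glob` returns a list and `glob.iglob` returns an iterator.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file names, created only for this example
    for name in ["image_0001.png", "image_0002.png", "notes.txt"]:
        open(os.path.join(tmp_dir, name), "w").close()

    pattern = os.path.join(tmp_dir, "image*.png")

    # glob.glob returns a list of matching paths (order is not guaranteed, so sort)
    matched = sorted(os.path.basename(path) for path in glob.glob(pattern))

    # glob.iglob yields the same paths lazily, one at a time
    lazy_count = sum(1 for _ in glob.iglob(pattern))

print(matched)      # ['image_0001.png', 'image_0002.png']
print(lazy_count)   # 2
```

A pattern like this is convenient for gathering the images saved at each epoch, for example when combining them into a GIF.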

Next is the IPython.display module, which can embed audio and video in a Notebook.

https://knowledge.sakura.ad.jp/17727/

Data set loading

`gan.py`


(train_images, train_labels), (_, _) = tf.keras.datasets.mnist.load_data()

train_images = train_images.reshape(train_images.shape[0], 28, 28, 1).astype('float32')
train_images = (train_images - 127.5) / 127.5 # Normalize the images to [-1, 1]

Here, MNIST (images of handwritten digits) is loaded from the Keras datasets. The images are reshaped into an array of shape (number of images, 28, 28, 1) and normalized to the range [-1, 1].
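
The scaling `(x - 127.5) / 127.5` maps pixel values from 0 to 255 onto the range from -1 to 1, which matches the output range of the `tanh` activation used in the last layer of the generator. The following plain-Python sketch checks this on a few sample values. The helper names `normalize` and `denormalize` are my own, and the inverse mapping is an assumption about how one would convert generated values back to pixels for display.

```python
def normalize(pixel):
    """Map a pixel value in [0, 255] to [-1, 1], as in gan.py."""
    return (pixel - 127.5) / 127.5


def denormalize(value):
    """Inverse mapping (assumed here for display): [-1, 1] back to [0, 255]."""
    return value * 127.5 + 127.5


raw_pixels = [0, 64, 255]
scaled = [normalize(p) for p in raw_pixels]
restored = [denormalize(v) for v in scaled]

print(scaled[0], scaled[-1])   # -1.0 1.0
print(restored)
```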

`gan.py`



BUFFER_SIZE = 60000
BATCH_SIZE = 256
train_dataset = tf.data.Dataset.from_tensor_slices(train_images).shuffle(BUFFER_SIZE).batch(BATCH_SIZE)

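The training images are wrapped in a tf.data.Dataset, shuffled with a buffer that covers all 60,000 images, and split into batches of 256. The following article was helpful for understanding Dataset.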

https://qiita.com/Suguru_Toyohara/items/820b0dad955ecd91c7f3
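
To get a feel for what `shuffle(BUFFER_SIZE).batch(BATCH_SIZE)` produces, here is a rough plain-Python imitation using a list of indices instead of images. It is only a sketch of the idea, not how tf.data works internally, and it assumes that the last incomplete batch is kept, which is the default behavior of `batch`.

```python
import random

BUFFER_SIZE = 60000
BATCH_SIZE = 256

# Stand-in for the 60,000 training images: just their indices
indices = list(range(BUFFER_SIZE))

# A buffer as large as the dataset amounts to shuffling everything
random.shuffle(indices)

# Cut the shuffled sequence into consecutive batches of 256
batches = [indices[i:i + BATCH_SIZE] for i in range(0, len(indices), BATCH_SIZE)]

num_batches = len(batches)
last_batch_size = len(batches[-1])
print(num_batches, last_batch_size)   # 235 96
```

So one epoch consists of 235 training steps, and the last step sees only 96 images.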

Generator model definition

`gan.py`



def make_generator_model():
    model = tf.keras.Sequential()
    model.add(layers.Dense(7*7*256, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Reshape((7, 7, 256)))
    assert model.output_shape == (None, 7, 7, 256) # Note: None is the batch size

    model.add(layers.Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False))
    assert model.output_shape == (None, 7, 7, 128)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 14, 14, 64)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
    assert model.output_shape == (None, 28, 28, 1)

    return model
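
The shape assertions in the generator can be checked by hand. With `padding='same'`, a `Conv2DTranspose` layer multiplies the height and width by its stride, so the 7x7 feature map grows to 28x28 over the three layers. The sketch below reproduces that arithmetic in plain Python; the helper `conv_transpose_same` is a hypothetical name used only for this example.

```python
def conv_transpose_same(size, stride):
    """Output height/width of Conv2DTranspose with padding='same'."""
    return size * stride


# The Dense layer outputs 7*7*256 values, reshaped to (7, 7, 256)
dense_units = 7 * 7 * 256

size = 7
sizes = [size]
# (filters, stride) of the three Conv2DTranspose layers in the generator
for filters, stride in [(128, 1), (64, 2), (1, 2)]:
    size = conv_transpose_same(size, stride)
    sizes.append(size)

print(dense_units)   # 12544
print(sizes)         # [7, 7, 14, 28]
```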

Discriminator model definition

`gan.py`



def make_discriminator_model():
    model = tf.keras.Sequential()
    model.add(layers.Conv2D(64, (5, 5), strides=(2, 2), padding='same',
                                     input_shape=[28, 28, 1]))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.3))

    model.add(layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same'))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(0.3))

    model.add(layers.Flatten())
    model.add(layers.Dense(1))

    return model

URLs that were helpful for the model definitions: https://www.slideshare.net/HiroyaKato1/gandcgan-188544721 https://www.hellocybernetics.tech/entry/2018/05/28/180012 https://keras.io/ja/getting-started/sequential-model-guide/
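
The discriminator goes in the opposite direction from the generator. With `padding='same'`, a `Conv2D` layer with stride 2 halves the height and width (rounding up), so the 28x28 image shrinks to 7x7 before being flattened. The following plain-Python sketch traces those sizes; `conv_same` is a hypothetical helper name.

```python
import math


def conv_same(size, stride):
    """Output height/width of Conv2D with padding='same'."""
    return math.ceil(size / stride)


size = 28
sizes = [size]
last_filters = None
# (filters, stride) of the two Conv2D layers in the discriminator
for filters, stride in [(64, 2), (128, 2)]:
    size = conv_same(size, stride)
    sizes.append(size)
    last_filters = filters

# Flatten turns the (7, 7, 128) feature map into one vector for Dense(1)
flattened = size * size * last_filters

print(sizes)       # [28, 14, 7]
print(flattened)   # 6272
```

The final `Dense(1)` layer therefore receives 6,272 values and outputs a single score per image.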

Definition of the loss function

`gan.py`



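# Binary cross-entropy on raw logits (the Discriminator's final Dense(1) layer has no sigmoid)
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def discriminator_loss(real_output, fake_output):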
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    total_loss = real_loss + fake_loss
    return total_loss

def generator_loss(fake_output):
    return cross_entropy(tf.ones_like(fake_output), fake_output)

In general, a neural network is trained by computing the gradient of the loss function with respect to the weight parameters and adjusting the weights so that the loss becomes small. Here, two loss functions are defined: one that improves the Discriminator's ability to tell genuine images from fakes, and one that improves the Generator's ability to deceive the Discriminator. This part is what is specific to GAN.
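
To see how the two losses pull in opposite directions, here is a scalar sketch in plain Python. It assumes the Discriminator outputs a single raw score (logit) per image and uses the standard numerically stable form of binary cross-entropy on logits. The function names with the `_scalar` suffix are my own, and the real code averages over a whole batch instead of one value.

```python
import math


def bce_from_logit(label, logit):
    """Binary cross-entropy for one raw score (logit), in a stable form."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def discriminator_loss_scalar(real_logit, fake_logit):
    # Real images should be scored as 1, generated images as 0
    return bce_from_logit(1.0, real_logit) + bce_from_logit(0.0, fake_logit)


def generator_loss_scalar(fake_logit):
    # The Generator wants its images to be scored as 1 (real)
    return bce_from_logit(1.0, fake_logit)


undecided = discriminator_loss_scalar(0.0, 0.0)    # Discriminator cannot tell
confident = discriminator_loss_scalar(4.0, -4.0)   # Discriminator is right
fooled = generator_loss_scalar(4.0)                # fake judged as real
caught = generator_loss_scalar(-4.0)               # fake judged as fake

print(undecided, confident)
print(fooled, caught)
```

When the Discriminator scores the fake image low, its own loss is small but the Generator's loss is large, and the reverse holds when the Discriminator is fooled. This tug-of-war is the adversarial part of the training.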

Definition of training function

`gan.py`



EPOCHS = 50
noise_dim = 100
num_examples_to_generate = 16
# Fixed noise vectors, reused every epoch so the generated images are comparable
seed = tf.random.normal([num_examples_to_generate, noise_dim])

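# train_step, generate_and_save_images, generator, checkpoint and checkpoint_prefix
# are not shown in this article; see the full code linked at the end.
def train(dataset, epochs):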
  for epoch in range(epochs):
    start = time.time()

    for image_batch in dataset:
      train_step(image_batch)

    # Produce images for the GIF as we go
    display.clear_output(wait=True)
    generate_and_save_images(generator,
                             epoch + 1,
                             seed)

    # Save the model every 15 epochs
    if (epoch + 1) % 15 == 0:
      checkpoint.save(file_prefix = checkpoint_prefix)

    print('Time for epoch {} is {} sec'.format(epoch + 1, time.time()-start))

  # Generate after the final epoch
  display.clear_output(wait=True)
  generate_and_save_images(generator,
                           epochs,
                           seed)
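
One detail of the loop that is easy to overlook is the checkpoint schedule. The condition `(epoch + 1) % 15 == 0` can be checked without running any training, as in the sketch below. The constant name `SAVE_EVERY` is hypothetical; the code above writes the value 15 directly.

```python
EPOCHS = 50
SAVE_EVERY = 15  # hypothetical name for the interval written as 15 above

# Epochs (counted from 1) at which the condition in train() is true
saved_epochs = [epoch + 1 for epoch in range(EPOCHS) if (epoch + 1) % SAVE_EVERY == 0]

print(saved_epochs)   # [15, 30, 45]
```

With 50 epochs, the last checkpoint is written after epoch 45, so the state after the final epoch is not saved unless a checkpoint is also written after the loop ends.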

Image after calculation

After 10 epochs

After 30 epochs

After 50 epochs

Once training passes 30 epochs, the images look quite like digits. In my environment*, one epoch took about 5 minutes, so the 50 epochs took about 250 minutes in total. I recommend taking advantage of Google Colab, which provides a GPU.

*Current PC environment: Windows 10 Home, CPU: Intel Core i7 3.6 GHz, RAM: 8 GB

At the end

This was the first GAN I have implemented, and I enjoyed it because the program makes full use of the kind of computation that computers are good at. Many models have been proposed for the generator, the discriminator, and the design of the loss functions, so I would like to learn and summarize them one by one.

All of the code is available here: https://github.com/Fumio-eisan/dcgan20200306

MNIST image generation program creation by DCGAN (tensorflow tutorial)
