On New Year's Eve, back at my parents' house with nothing to do, I came up with the idea of generating images of **delicious**-looking meat with DCGAN. (I threw it together in about four hours, so the quality is pretty low...)
To gather image data of **delicious**-looking meat, I collected pictures from the following site.
I had wanted to use this site ever since it came up on my Twitter timeline, so this was my chance. I obtained 60 images from it and used them as the real images.
(If you were doing this properly you would scrape a much larger set of images, but I didn't have the time.)
My implementation is here; please refer to it if you want to try this properly. It is based on the official PyTorch DCGAN tutorial and hkthirano's article.
For preprocessing, every image is resized and center-cropped to 64 × 64, following the official tutorial.
```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

image_size = 64
batch_size = 2
workers = 0
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Resize, crop, and normalize to [-1, 1] (matching the Generator's Tanh output)
dataset = datasets.ImageFolder(
    IMG_DIR,
    transform=transforms.Compose([
        transforms.Resize(image_size),
        transforms.CenterCrop(image_size),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    ]))
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True, num_workers=workers)
```
Taking a peek at the preprocessed images:
```python
import matplotlib.pyplot as plt
import numpy as np
import torchvision.utils as vutils

# Take a look at one batch from the dataloader
real_batch = next(iter(dataloader))
plt.figure(figsize=(8, 8))
plt.axis('off')
plt.title('Training images')
plt.imshow(np.transpose(vutils.make_grid(real_batch[0].to(device)[:64], padding=2, normalize=True).cpu(), (1, 2, 0)))
plt.show()
```
The preprocessing seems to be working properly. However, because I lowered the resolution, at this point the meat no longer looks all that **delicious**...
Generator
The Generator's job is to create fake images that the Discriminator has a hard time telling apart from real ones. The latent variable is 100-dimensional, and a 64 × 64 × 3 image is generated from it.
```python
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.main = nn.Sequential(
            # latent z: (100, 1, 1) -> (256, 4, 4)
            nn.ConvTranspose2d(
                in_channels=100,
                out_channels=256,
                kernel_size=4,
                stride=1,
                padding=0,
                bias=False
            ),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            # (256, 4, 4) -> (128, 8, 8)
            nn.ConvTranspose2d(256, 128, 4, 2, 1, bias=False),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            # (128, 8, 8) -> (64, 16, 16)
            nn.ConvTranspose2d(128, 64, 4, 2, 1, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            # (64, 16, 16) -> (32, 32, 32)
            nn.ConvTranspose2d(64, 32, 4, 2, 1, bias=False),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            # (32, 32, 32) -> (3, 64, 64), squashed to [-1, 1] by Tanh
            nn.ConvTranspose2d(32, 3, 4, 2, 1, bias=False),
            nn.Tanh()
        )

    def forward(self, x):
        return self.main(x)
```
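As a quick sanity check (my addition, not in the original article), you can pass a batch of latent vectors through the Generator and confirm that 64 × 64 × 3 images come out:

```python
# Hypothetical check: noise of shape (N, 100, 1, 1) should become images of shape (N, 3, 64, 64)
netG = Generator().to(device)
z = torch.randn(4, 100, 1, 1, device=device)
print(netG(z).shape)  # torch.Size([4, 3, 64, 64])
```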
Discriminator
The Discriminator's job is to correctly distinguish the Generator's fake images from the real ones.
It takes a 64 × 64 × 3 image and outputs a single scalar score for real (1) vs. fake (0).
```python
class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.main = nn.Sequential(
            # (3, 64, 64) -> (32, 32, 32)
            nn.Conv2d(3, 32, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # (32, 32, 32) -> (64, 16, 16)
            nn.Conv2d(32, 64, 4, 2, 1, bias=False),
            nn.BatchNorm2d(64),
            nn.LeakyReLU(0.2, inplace=True),
            # (64, 16, 16) -> (128, 8, 8)
            nn.Conv2d(64, 128, 4, 2, 1, bias=False),
            nn.BatchNorm2d(128),
            nn.LeakyReLU(0.2, inplace=True),
            # (128, 8, 8) -> (256, 4, 4)
            nn.Conv2d(128, 256, 4, 2, 1, bias=False),
            nn.BatchNorm2d(256),
            nn.LeakyReLU(0.2, inplace=True),
            # (256, 4, 4) -> (1, 1, 1): one raw score (logit) per image
            nn.Conv2d(256, 1, 4, 1, 0, bias=False),
        )

    def forward(self, x):
        return self.main(x).squeeze()
```
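Likewise, a quick check (again my own addition) confirms the Discriminator turns a batch of 64 × 64 × 3 images into one raw score per image:

```python
# Hypothetical check: images of shape (N, 3, 64, 64) should become N scalar logits
netD = Discriminator().to(device)
imgs = torch.randn(4, 3, 64, 64, device=device)
print(netD(imgs).shape)  # torch.Size([4])
```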
While the Generator learns to create fakes that are hard to spot, the Discriminator learns to spot them correctly. Repeating the following loop many times trains the two adversarially.
```python
# Training function: runs one pass over the dataloader and returns the mean losses
def train_dcgan(model_G, model_D, params_G, params_D, dataloader):
    log_loss_G = []
    log_loss_D = []
    for real_img, _ in dataloader:
        batch_len = len(real_img)

        # === Train the Generator ===
        # Generate fake images from random latent vectors
        z = torch.randn(batch_len, nz, 1, 1).to(device)
        fake_img = model_G(z)
        # Keep a detached copy so we don't have to generate the fakes twice
        fake_img_tensor = fake_img.detach()
        # Loss that rewards the Generator for fooling the Discriminator (fakes labeled as real)
        out = model_D(fake_img)
        loss_G = loss_f(out, ones[:batch_len])
        log_loss_G.append(loss_G.item())
        # Update the Generator
        model_D.zero_grad()
        model_G.zero_grad()
        loss_G.backward()
        params_G.step()

        # === Train the Discriminator ===
        # Real images
        real_img = real_img.to(device)
        # Loss for recognizing real images as real
        real_out = model_D(real_img)
        loss_D_real = loss_f(real_out, ones[:batch_len])
        # Fake images saved earlier (detached, so the Generator is not updated here)
        fake_out = model_D(fake_img_tensor)
        # Loss for recognizing fake images as fake
        loss_D_fake = loss_f(fake_out, zeros[:batch_len])
        # Total Discriminator loss over real and fake
        loss_D = loss_D_real + loss_D_fake
        log_loss_D.append(loss_D.item())
        # Update the Discriminator
        model_D.zero_grad()
        model_G.zero_grad()
        loss_D.backward()
        params_D.step()

    return mean(log_loss_G), mean(log_loss_D)
```
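The training function above relies on a few globals (nz, ones, zeros, loss_f, the optimizers, and mean) that are set up outside it. A minimal sketch of that setup, assuming BCEWithLogitsLoss (since the Discriminator outputs raw logits, not probabilities) and the Adam settings from the DCGAN paper, might look like this:

```python
from statistics import mean  # used for the per-epoch loss averages
import torch.optim as optim

nz = 100  # dimensionality of the latent vector

model_G = Generator().to(device)
model_D = Discriminator().to(device)

# Adam with lr=0.0002 and beta1=0.5, as in the DCGAN paper / official tutorial
params_G = optim.Adam(model_G.parameters(), lr=0.0002, betas=(0.5, 0.999))
params_D = optim.Adam(model_D.parameters(), lr=0.0002, betas=(0.5, 0.999))

# Label tensors reused each iteration, sliced to the current batch length
ones = torch.ones(batch_size).to(device)
zeros = torch.zeros(batch_size).to(device)

# The Discriminator has no final Sigmoid, so use the logits version of BCE
loss_f = nn.BCEWithLogitsLoss()
```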
I trained with a batch size of 2 for 1000 epochs. Below is a GIF summarizing the results every 100 epochs. What do you think? Doesn't it look a bit like meat?
At first the output is just noise; around epochs 300-500 it looks like there is meat on a white plate against a white background. After epoch 500, though, it goes back to meat on a plain black background... (epoch 500 probably looks the most like a real image.)
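For reference, the outer loop that produced those per-100-epoch snapshots just calls train_dcgan once per epoch and saves a grid of samples from a fixed noise vector; this is a rough sketch of that idea, and the file names are placeholders of my own:

```python
# Fixed noise so the saved grids are comparable from epoch to epoch
fixed_z = torch.randn(64, nz, 1, 1, device=device)

n_epochs = 1000
for epoch in range(1, n_epochs + 1):
    loss_G, loss_D = train_dcgan(model_G, model_D, params_G, params_D, dataloader)
    if epoch % 100 == 0:
        print(f"epoch {epoch}: loss_G={loss_G:.4f}, loss_D={loss_D:.4f}")
        with torch.no_grad():
            fake = model_G(fixed_z)
        # Save one frame; the frames can be stitched into a GIF afterwards
        vutils.save_image(fake, f"fake_epoch_{epoch:04d}.png", normalize=True)
```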
Postscript (2021/1/1): I trained again with a batch size of 8 for 5000 epochs. The meat looks meatier than last time, but only very similar images are generated, i.e. mode collapse. Is a 100-dimensional latent vector too weak? The result lacks diversity, but it does feel closer to real meat.
The image quality and the number of images are probably why the results were not cleaner. Even though I collected **delicious**-looking images, it felt like a waste to lower the resolution for training. Also, since everything came from a single site, 60 images are nowhere near enough.
It was great to go from the initial idea to generated meat before the end of 2020. DCGAN is impressive, producing something meat-like from only 60 images!
If I find the time, I would like to generate higher-quality **delicious** meat!