Implemented pix2pix in Windows environment (with kind command line execution results and common error examples)

https://github.com/phillipi/pix2pix

Precautions when implementing this in a Windows environment.

First copy the file

Open the command line and type the following command

command


C:\Users\hoge>git clone https://github.com/phillipi/pix2pix

result


Cloning into 'pix2pix'...
remote: Enumerating objects: 22, done.
remote: Counting objects: 100% (22/22), done.
remote: Compressing objects: 100% (20/20), done.
remote: Total 479 (delta 5), reused 8 (delta 2), pack-reused 457R
Receiving objects: 100% (479/479), 2.45 MiB | 1.02 MiB/s, done.
Resolving deltas: 100% (255/255), done.

This will copy the project to your home directory (often C: \ Users \ hoge \). Since this project is written in the lua language, I will also copy the files implemented by kind person in python.

command


C:\Users\hoge>git clone https://github.com/tdeboissiere/DeepLearningImplementations.git

result


Cloning into 'DeepLearningImplementations'...
remote: Enumerating objects: 1616, done.
remote: Total 1616 (delta 0), reused 0 (delta 0), pack-reused 1616 eceiving objects:  96% (1552/1616), 50.29 MiB | 1.13 Receiving objects: 100% (1616/1616), 50.34 MiB | 1.10 MiB/s, done.

Resolving deltas: 100% (754/754), done.

Merge these two files. On Linux, there is a convenient command called rsync, but at the command prompt, you can do the same with the command robocopy. Convenient!

command


robocopy /E DeepLearningImplementations\pix2pix\ pix2pix\

result


-------------------------------------------------------------------------------
   ROBOCOPY     ::Windows robust file copy
-------------------------------------------------------------------------------

start:July 6, 2020 18:47:00
Original: C:\Users\hoge\DeepLearningImplementations\pix2pix\
Copy to: C:\Users\hoge\pix2pix\

File: *.*

option: *.* /S /E /DCOPY:DA /COPY:DAT /R:1000000 /W:30

------------------------------------------------------------------------------

(Omitted because it is long)

------------------------------------------------------------------------------

Total copied skip mismatch failure Extras
directory:        10         7         3         0         0         5
File:        16        16         0         0         0         9
Part-Time Job:   560.2 k   560.2 k         0         0         0    61.4 k
Times of Day:   0:00:00   0:00:00                       0:00:00   0:00:00


speed:30196526 bytes/Seconds
speed:            1727.859 MB/Minutes
End:July 6, 2020 18:47:01

Dataset DL

Datasets must be downloaded individually. However, there is a problem here, and I need to run a file called download_dataset.sh to download the dataset, but this format file ** basically cannot be run ** in a Windows environment. This file is just a script, it just contains multiple commands, because the commands are for Linux.

Various measures such as rewriting the contents of the command to Windows format and inserting Cygwin can be considered, but here as a minimal solution, we will use wsl as a means to reproduce the Linux environment on Windows. For a description of wsl, click here (https://qiita.com/Aruneko/items/c79810b0b015bebf30bb). If wsl is installed, you can enter the Linux environment by typing wsl on the command line.

command


C:\Users\hoge>cd pix2pix
C:\Users\hoge\pix2pix>cd datasets
C:\Users\hoge\pix2pix\datasets>wsl
hoge@DESKTOP-EPGPMTG:/mnt/c/Users/hoge/pix2pix/datasets$

If you run the file here, it will be solved! …… It does not become

command


hoge@DESKTOP-EPGPMTG:/mnt/c/Users/hoge/pix2pix/datasets$ bash download_dataset.sh facades

result


download_dataset.sh: line 2: $'\r': command not found
download_dataset.sh: line 17: syntax error: unexpected end of file

An error will occur. This is a problem due to the ** line feed code being different ** between Windows and Linux. At the stage of git clone in the Windows environment, the cloned file also became Windows specification! Therefore, if you run it in a Linux environment on Windows, an error will occur. It's complicated!

There is no help for it, so change the file on Windows to the Linux specification.

command


hoge@DESKTOP-EPGPMTG:/mnt/c/Users/hoge/pix2pix/datasets$ tr -d '\r' <download_dataset.sh> win2linux.sh
hoge@DESKTOP-EPGPMTG:/mnt/c/Users/hoge/pix2pix/datasets$ bash win2linux.sh facades

But I get an error again.

result


Specified [facades]
win2linux.sh: line 13: wget: command not found
tar (child): ./datasets/facades.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
rm: cannot remove './datasets/facades.tar.gz': No such file or directory

This is because the instruction wget is not installed. Install it. It's annoying.

command


hoge@DESKTOP-EPGPMTG:/mnt/c/Users/hoge/pix2pix/datasets$ sudo apt install wget

result


Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
  liblua5.1-0-dev libreadline-dev libtinfo-dev libtool-bin lua-any lua-sec lua-socket unzip zip
Use 'sudo apt autoremove' to remove them.
The following NEW packages will be installed:
  wget
0 upgraded, 1 newly installed, 0 to remove and 81 not upgraded.
Need to get 316 kB of archives.
After this operation, 954 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 wget amd64 1.19.4-1ubuntu2.2 [316 kB]
Fetched 316 kB in 2s (190 kB/s)
Selecting previously unselected package wget.
(Reading database ... 79492 files and directories currently installed.)
Preparing to unpack .../wget_1.19.4-1ubuntu2.2_amd64.deb ...
Unpacking wget (1.19.4-1ubuntu2.2) ...
Setting up wget (1.19.4-1ubuntu2.2) ...
Processing triggers for install-info (6.5.0.dfsg.1-2) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...

Run again

command


hoge@DESKTOP-EPGPMTG:/mnt/c/Users/hoge/pix2pix/datasets$ bash win2linux.sh facades

result


Specified [facades]
WARNING: timestamping does nothing in combination with -O. See the manual
for details.

--2020-07-06 19:33:36--  http://efrosgans.eecs.berkeley.edu/pix2pix/datasets/facades.tar.gz
Resolving efrosgans.eecs.berkeley.edu (efrosgans.eecs.berkeley.edu)... 128.32.189.73
Connecting to efrosgans.eecs.berkeley.edu (efrosgans.eecs.berkeley.edu)|128.32.189.73|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 30168306 (29M) [application/x-gzip]
Saving to: ‘./datasets/facades.tar.gz’

./datasets/facades.tar.gz     100%[=================================================>]  28.77M  1.13MB/s    in 26s
(Omitted because it is long)

Once this is done, the Linux environment is obsolete. Let's get out.

command


hoge@DESKTOP-EPGPMTG:/mnt/c/Users/hoge/pix2pix/datasets$ exit

result


logout

C:\Users\hoge\pix2pix\datasets>

Data set processing

Some dataset processing is required. From here, we will use python. python is installed, but it's supposed to be vanilla. Here, Anaconda prepares a new environment called pix and reproduces it.

text:command:


C:\Users\hoge\pix2pix\datasets>cd ..
C:\Users\hoge\pix2pix>conda create -n pix python=3.6
(Omitted in the middle)
(pix) C:\Users\hoge\pix2pix>

I need various libraries, so I will install them.

command


(pix) C:\Users\hoge\pix2pix>conda install numpy
(pix) C:\Users\hoge\pix2pix>conda install keras
(pix) C:\Users\hoge\pix2pix>conda install -c conda-forge parmap
(pix) C:\Users\hoge\pix2pix>conda install matplotlib
(pix) C:\Users\hoge\pix2pix>conda install tqdm
(pix) C:\Users\hoge\pix2pix>conda install opencv
(pix) C:\Users\hoge\pix2pix>conda install h5py
(pix) C:\Users\hoge\pix2pix>conda install tensorflow-gpu

parmap and ʻopencvneed a little attention.parmap cannot be installed with conda, so the above command is required. It seems that the environment of Anaconda will be messed up if you rush to install it with pip. I have no experience, but ... OpenCV is ʻopencv-python in pip, but it seems that ʻopencv is fine in conda`.

command


(pix) C:\Users\hoge\pix2pix\src\data>python make_dataset.py ../../datasets/datasets/facades/ 3 --img_size 256

result


100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:01<00:00,  3.22it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  3.12it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  3.25it/s]

Learning

command


(pix) C:\Users\hoge\pix2pix\src\data>cd ..
(pix) C:\Users\hoge\pix2pix\src>cd model
(pix) C:\Users\hoge\pix2pix\src\model>python main.py 64 64 --backend tensorflow --nb_epoch 10

result


Using TensorFlow backend.
2020-07-06 20:02:48.456473: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-06 20:02:51.764917: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-07-06 20:02:51.835745: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2080 SUPER computeCapability: 7.5
coreClock: 1.815GHz coreCount: 48 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 462.00GiB/s
2020-07-06 20:02:51.842601: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-06 20:02:51.848476: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-07-06 20:02:51.854786: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-07-06 20:02:51.859952: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-07-06 20:02:51.867498: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-07-06 20:02:51.874415: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-07-06 20:02:51.884355: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-06 20:02:51.887931: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-07-06 20:02:51.890744: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2020-07-06 20:02:51.895380: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2080 SUPER computeCapability: 7.5
coreClock: 1.815GHz coreCount: 48 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 462.00GiB/s
2020-07-06 20:02:51.901639: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2020-07-06 20:02:51.904406: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2020-07-06 20:02:51.908518: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2020-07-06 20:02:51.911282: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2020-07-06 20:02:51.915043: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2020-07-06 20:02:51.918430: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2020-07-06 20:02:51.921086: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-07-06 20:02:51.924511: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-07-06 20:02:52.470406: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-06 20:02:52.474907: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0
2020-07-06 20:02:52.477401: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N
2020-07-06 20:02:52.480235: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6267 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 SUPER, pci bus id: 0000:01:00.0, compute capability: 7.5)
Traceback (most recent call last):
  File "main.py", line 73, in <module>
    launch_training(**d_params)
  File "main.py", line 8, in launch_training
    train.train(**kwargs)
  File "C:\Users\hoge\pix2pix\src\model\train.py", line 72, in train
    do_plot)
  File "C:\Users\hoge\pix2pix\src\model\models.py", line 310, in load
    model = generator_unet_upsampling(img_dim, bn_mode, model_name=model_name)
  File "C:\Users\hoge\pix2pix\src\model\models.py", line 92, in generator_unet_upsampling
    if K.image_dim_ordering() == "channels_first":
AttributeError: module 'keras.backend' has no attribute 'image_dim_ordering'

I was angry. Actually, this is due to the difference in the version of keras, and in the new keras, part of the api (command system) is different. trap! Actually, I knew that this would happen and I dared to install it without specifying the version, but for example, Google Colaboratory has the latest keras and tensorflow built-in, so it would be pitiful if anyone stepped on this trap without knowing it. I thought it was a demonstration (?). As an aside, even when I get angry that there is no contrib command in tensorflow, it seems that it will be managed by downgrading.

That's why the version will be downgraded.

command


(pix) C:\Users\hoge\pix2pix\src\model>conda install keras==2.0.8

result


Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: C:\Users\hoge\Anaconda3\envs\pix

  added / updated specs:
    - keras==2.0.8


The following packages will be REMOVED:

  keras-applications-1.0.8-py_0
  keras-base-2.3.1-py36_0
  keras-preprocessing-1.1.0-py_1
  tensorflow-estimator-2.1.0-pyhd54b08b_0

The following packages will be SUPERSEDED by a higher-priority channel:

  tensorboard        pkgs/main/noarch::tensorboard-2.2.1-p~ --> pkgs/main/win-64::tensorboard-1.10.0-py36he025d50_0

The following packages will be DOWNGRADED:

  cudatoolkit                           10.1.243-h74a9793_0 --> 9.0-1
  cudnn                                    7.6.5-cuda10.1_0 --> 7.6.5-cuda9.0_0
  keras                                             2.3.1-0 --> 2.0.8-py36h65e7a35_0
  tensorflow                       2.1.0-gpu_py36h3346743_0 --> 1.10.0-gpu_py36h3514669_0
  tensorflow-base                  2.1.0-gpu_py36h55f5790_0 --> 1.10.0-gpu_py36h6e53903_0
  tensorflow-gpu                           2.1.0-h0d30ee6_0 --> 1.10.0-hf154084_0


Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

You can see that it has done a good job of matching the version of the related tensorflow (which also supports gpu!). Then execute it again.

command


(pix) C:\Users\hoge\pix2pix\src\model>python main.py 64 64 --backend tensorflow --nb_epoch 10

result


(Omitted)
Start training
2020-07-06 20:11:14.114015: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2020-07-06 20:11:14.256531: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 0 with properties:
name: GeForce RTX 2080 SUPER major: 7 minor: 5 memoryClockRate(GHz): 1.815
pciBusID: 0000:01:00.0
totalMemory: 8.00GiB freeMemory: 6.55GiB
2020-07-06 20:11:14.264273: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2020-07-06 20:11:14.649123: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-06 20:11:14.653675: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971]      0
2020-07-06 20:11:14.656771: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0:   N
2020-07-06 20:11:14.660571: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6286 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 SUPER, pci bus id: 0000:01:00.0, compute capability: 7.5)
396/400 [============================>.] - ETA: 0s - D logloss: 0.7649 - G tot: 12.9429 - G L1: 1.1987 - G logloss: 0.9557
Epoch 1/10, Time: 85.63616681098938
396/400 [============================>.] - ETA: 0s - D logloss: 0.7681 - G tot: 12.1516 - G L1: 1.1377 - G logloss: 0.7749
Epoch 2/10, Time: 80.52165269851685
396/400 [============================>.] - ETA: 0s - D logloss: 0.7439 - G tot: 11.6468 - G L1: 1.0932 - G logloss: 0.7151
Epoch 3/10, Time: 28.542045831680298
396/400 [============================>.] - ETA: 0s - D logloss: 0.7270 - G tot: 11.7284 - G L1: 1.0991 - G logloss: 0.7370
Epoch 4/10, Time: 28.93921661376953
396/400 [============================>.] - ETA: 0s - D logloss: 0.7150 - G tot: 11.4890 - G L1: 1.0733 - G logloss: 0.7556
Epoch 5/10, Time: 28.971989393234253
396/400 [============================>.] - ETA: 0s - D logloss: 0.7448 - G tot: 11.6123 - G L1: 1.0889 - G logloss: 0.7231
Epoch 6/10, Time: 29.158592700958252
396/400 [============================>.] - ETA: 0s - D logloss: 0.7278 - G tot: 11.3660 - G L1: 1.0649 - G logloss: 0.7168
Epoch 7/10, Time: 28.717032432556152
396/400 [============================>.] - ETA: 0s - D logloss: 0.7282 - G tot: 11.4358 - G L1: 1.0718 - G logloss: 0.7182
Epoch 8/10, Time: 29.007369995117188
396/400 [============================>.] - ETA: 0s - D logloss: 0.7159 - G tot: 11.3449 - G L1: 1.0643 - G logloss: 0.7019
Epoch 9/10, Time: 29.249063730239868
396/400 [============================>.] - ETA: 0s - D logloss: 0.7063 - G tot: 10.9784 - G L1: 1.0294 - G logloss: 0.6847
Epoch 10/10, Time: 29.128608226776123

I was able to run the sample safely. You did it!

Postscript

After that, I tried to create an environment instantly on another computer, but if I follow the same procedure, the GPU is recognized, but just before learning, the error CUDNN_STATUS_ALLOC_FAILED appears and it stops. It was. In the end, updating the GPU driver from NVIDIA fixed it.

Helpful page (thanks)

-Try running pix2pix on Google Colaboratory. -How to change the line feed code from CR + LF (Windows) to LF (Linux)

Recommended Posts

Implemented pix2pix in Windows environment (with kind command line execution results and common error examples)
[RHEL7 / CentOS7] LWP execution error in the environment where Perl is installed with the yum command
Error when entering virtual environment with workon command
Notify error and execution completion by LINE [Python]
Build PyPy and Python execution environment with Docker