For my graduation research, I needed to run code written in TensorFlow.
The program consumes a lot of memory, and the lab computer ground to a halt, so I ended up (tearfully) having to build an execution environment on GCP.
On top of that, there was a real worry that I would not finish in time for graduation without using GCP, so I built a Python execution environment that can also use GCP.
At first I had no idea how to do any of it and spent days struggling (and of course my research wasn't progressing in the meantime)!!
I'm writing this up in the hope that it helps, even a little, anyone stuck on something similar...!
This time, we will build the environment with the following settings.
Note that TensorFlow must be matched to the CUDA version it was built against. If the versions are out of sync, the GPU may not be recognized properly, so we recommend checking the compatibility first. (I got this wrong a few times myself, yes.)
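As a rough sketch, that compatibility check can be expressed as a lookup table. The table below is illustrative and partial; always confirm against TensorFlow's official tested-build-configurations page.

```python
# Illustrative (partial) mapping of tensorflow-gpu versions to the CUDA /
# cuDNN versions they were built against. Confirm against the official
# "tested build configurations" table before relying on it.
TF_GPU_COMPAT = {
    "1.12.0": {"cuda": "9.0", "cudnn": "7"},
    "1.13.1": {"cuda": "10.0", "cudnn": "7.4"},
}

def required_cuda(tf_version):
    """Return the CUDA version expected for a given tensorflow-gpu version."""
    entry = TF_GPU_COMPAT.get(tf_version)
    return entry["cuda"] if entry else None

print(required_cuda("1.12.0"))  # -> 9.0
```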
- Account registration with GCP
- Enable billing for VM instances
- Create a VM instance
Reference site: How to do deep learning (NVIDIA DIGITS) using GCP GPU (NVIDIA Tesla K80) for free
Execute the following command on the VM instance to install CUDA and driver
$ curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_9.0.176-1_amd64.deb
$ sudo dpkg -i cuda-repo-ubuntu1604_9.0.176-1_amd64.deb
$ sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
$ sudo apt-get update
$ sudo apt-get install cuda-9-0
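To confirm that the toolkit actually installed matches the version you intended, you can parse the output of `nvcc --version`. A minimal sketch (the sample string below is illustrative of CUDA 9.0 output):

```python
import re

def cuda_release(nvcc_output):
    """Extract the CUDA release number (e.g. '9.0') from `nvcc --version` output."""
    m = re.search(r"release (\d+\.\d+)", nvcc_output)
    return m.group(1) if m else None

# Illustrative sample of what `nvcc --version` prints for CUDA 9.0:
sample = "Cuda compilation tools, release 9.0, V9.0.176"
print(cuda_release(sample))  # -> 9.0
```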
In addition, run the following command to enable persistence mode and optimize GPU performance.
$ sudo nvidia-smi -pm 1
Create a developer account on the NVIDIA site and download the following three cuDNN files from the cuDNN download page.
version: for Ubuntu 16.04 / CUDA 9.0
Once the download is complete, upload the three files to Cloud Storage.
Here, the bucket name is cuda_9
(change it to your liking!).
When the upload is complete, use the gsutil command to copy the files to the instance. Choose the directory to copy them into.
$ cd {UP_LOAD_PATH}
$ gsutil cp gs://cuda_9/libcudnn7_7.6.4.38-1+cuda9.0_amd64.deb .
$ gsutil cp gs://cuda_9/libcudnn7-dev_7.6.4.38-1+cuda9.0_amd64.deb .
$ gsutil cp gs://cuda_9/libcudnn7-doc_7.6.4.38-1+cuda9.0_amd64.deb .
After the transfer is complete, install the packages.
$ sudo dpkg -i *.deb
If you do not have a swap file, the program may run out of memory. When you create a Linux virtual machine on GCE, whether Ubuntu or CentOS, it is created without a swap file... apparently... (I had no idea about this, so I got stuck here.)
So, first check the existence of swap with the free command.
$ free -m
If it looks like the following, Swap: is zero, so you need to create a swap file.
total used free shared buff/cache available
Mem: 581 148 90 0 342 336
Swap: 0 0 0
Create the swap file. Its size is up to you (10G this time).
$ sudo fallocate -l 10G /swapfile
$ sudo chmod 600 /swapfile
$ sudo mkswap /swapfile
$ sudo swapon /swapfile
Check the swap file:
$ free -m
total used free shared buff/cache available
Mem: 581 148 88 0 344 336
Swap: 10239 0 10239
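If you want to check this from a script instead of by eye, parsing the `free -m` output is straightforward. A small sketch, using the sample output shown earlier:

```python
def swap_total_mb(free_output):
    """Return the Swap total (in MB) from the text output of `free -m`."""
    for line in free_output.splitlines():
        if line.startswith("Swap:"):
            return int(line.split()[1])
    return None

# Illustrative sample of `free -m` output before creating a swap file:
sample = """\
              total        used        free      shared  buff/cache   available
Mem:            581         148          90           0         342         336
Swap:             0           0           0"""
print(swap_total_mb(sample))  # -> 0, so a swap file is needed
```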
**Tips**: To mount the swap file automatically on reboot, add the following line to /etc/fstab.
/swapfile none swap sw 0 0
We will set up CUDA.
$ echo "export PATH=/usr/local/cuda-9.0/bin\${PATH:+:\${PATH}}" >> ~/.bashrc
$ source ~/.bashrc
$ sudo /usr/bin/nvidia-persistenced
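To double-check that the CUDA bin directory really ended up on PATH after `source ~/.bashrc`, a quick sketch:

```python
import os

def cuda_on_path(cuda_bin="/usr/local/cuda-9.0/bin"):
    """Return True if the CUDA bin directory is on the current PATH."""
    return cuda_bin in os.environ.get("PATH", "").split(os.pathsep)
```

Note that TensorFlow also needs the CUDA libraries on `LD_LIBRARY_PATH` (typically `/usr/local/cuda-9.0/lib64`); if `import tensorflow` later fails with a missing `libcublas` error, add that export to `~/.bashrc` as well.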
Then check if the GPU is recognized.
$ nvidia-smi
If you get the following response, GPU setting is complete!
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.72 Driver Version: 410.72 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla K80 Off | 00000000:00:04.0 Off | 0 |
| N/A 42C P0 65W / 149W | 0MiB / 11441MiB | 100% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
Finally, we will build a Python environment with Anaconda. (I usually use Anaconda, and the program didn't work with other setups, so I chose Anaconda this time.)
Download Anaconda with wget:
$ wget https://repo.anaconda.com/archive/Anaconda3-5.3.1-Linux-x86_64.sh
$ sh ./Anaconda3-5.3.1-Linux-x86_64.sh
$ echo ". /home/{USER_NAME}/anaconda3/etc/profile.d/conda.sh" >> ~/.bashrc
$ source ~/.bashrc
Next, create an Anaconda virtual environment. The Python version and `ENV_NAME` are up to you. (This time I want to use `tensorflow==1.12.0`, so `Python 3.6.5`.)
$ conda create -n {ENV_NAME} python=3.6.5
$ conda activate {ENV_NAME}
Install TensorFlow from conda (I always feel relieved once I get to this point...):
$ conda install tensorflow-gpu==1.12.0
Run the following program; if a GPU device appears in the output, tensorflow-gpu recognizes the GPU.
test.py
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 2319180638018740093
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 11324325888
locality {
bus_id: 1
}
incarnation: 13854674477442207273
physical_device_desc: "device: 0, name: Tesla K80, pci bus id: 0000:00:04.0, compute capability: 3.7"
]
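If you want to check for a GPU programmatically instead of reading the list by hand, you can filter on `device_type`. A minimal sketch (the helper name is mine, not a TensorFlow API; the stand-in objects below just mimic the shape of what `device_lib.list_local_devices()` returns):

```python
from collections import namedtuple

def gpu_device_names(devices):
    """Given the list returned by device_lib.list_local_devices(),
    return the names of the GPU devices."""
    return [d.name for d in devices if d.device_type == "GPU"]

# Illustrative stand-in for the DeviceAttributes objects TensorFlow returns:
Dev = namedtuple("Dev", ["name", "device_type"])
devices = [Dev("/device:CPU:0", "CPU"), Dev("/device:GPU:0", "GPU")]
print(gpu_device_names(devices))  # -> ['/device:GPU:0']
```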
In addition, install any other libraries you need from conda, like this:
$ conda install numpy==1.15.4
$ conda install scipy==1.1.0
$ conda install scikit-learn==0.20.0
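To make sure the environment ended up with the exact versions you pinned, a quick sketch comparing `__version__` attributes (the module names and versions in the usage comment are just the ones installed above):

```python
def version_mismatches(modules, expected):
    """Return the names of modules whose __version__ differs from `expected`.

    modules:  {"numpy": numpy_module, ...}
    expected: {"numpy": "1.15.4", ...}
    """
    return [name for name, mod in modules.items()
            if getattr(mod, "__version__", None) != expected.get(name)]

# Usage sketch:
# import numpy, scipy, sklearn
# version_mismatches({"numpy": numpy, "scipy": scipy, "sklearn": sklearn},
#                    {"numpy": "1.15.4", "scipy": "1.1.0", "sklearn": "0.20.0"})
```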
You should now be able to run your Python program on the GPU of a Compute Engine instance...! Thanks for your hard work...!!
If you spot any mistakes, please leave a comment :bow_tone2: