Here are the steps to install PyCUDA on a Mac. This guide is aimed at readers who:
- use Python 3.x
- want to run calculations on the GPU without a deep learning library such as Chainer, TensorFlow, or Keras
- want to do GPGPU programming in Python on a Mac
- own a Mac / MacBook Pro with an NVIDIA GPU
The specifications of the Mac used are as follows.
## Xcode 7 Installation
At the time of writing, Xcode 7.3.1 was the latest release of Xcode 7. From Downloads for Apple Developer, select **See more downloads** and type Xcode 7 in the search box. Download **Xcode 7.3.1** and **Command Line Tools for Xcode 7.3.1**.
Open the downloaded dmg file and install Xcode 7.3.1. Then open the Command Line Tools for Xcode 7.3.1 dmg and install it as well.
If you already have Xcode 8 installed, copy Xcode 7 to /Applications as Xcode7.app. To activate Xcode 7:
$ sudo mv /Applications/Xcode.app /Applications/Xcode8.app
$ sudo mv /Applications/Xcode7.app /Applications/Xcode.app
$ sudo xcode-select -s /Applications/Xcode.app
Conversely, to activate Xcode 8
$ sudo mv /Applications/Xcode.app /Applications/Xcode7.app
$ sudo mv /Applications/Xcode8.app /Applications/Xcode.app
$ sudo xcode-select -s /Applications/Xcode.app
**Activate Xcode 7 to use CUDA / PyCUDA. CUDA 8 does not support Xcode 8.**
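Because it is easy to forget which Xcode is currently active, a small check can help before building anything with CUDA. The following is a minimal sketch, not an official tool: `parse_xcode_major` and `active_xcode_major` are hypothetical helper names, and the sketch assumes that `xcodebuild -version` prints a first line of the form `Xcode 7.3.1`.

```python
import re
import shutil
import subprocess


def parse_xcode_major(version_output):
    """Return the major Xcode version found in `xcodebuild -version` text, or None."""
    # Assumption: the output contains a line such as "Xcode 7.3.1".
    match = re.search(r"Xcode\s+(\d+)", version_output)
    return int(match.group(1)) if match else None


def active_xcode_major():
    """Query the active toolchain; return None if xcodebuild is not on PATH."""
    if shutil.which("xcodebuild") is None:
        return None
    result = subprocess.run(
        ["xcodebuild", "-version"], capture_output=True, text=True
    )
    return parse_xcode_major(result.stdout)


if __name__ == "__main__":
    major = active_xcode_major()
    if major is None:
        print("xcodebuild not found; is Xcode installed and selected?")
    elif major != 7:
        print(f"Xcode {major} is active; switch to Xcode 7 before using CUDA 8.")
    else:
        print("Xcode 7 is active.")
```

Keeping the parsing separate from the subprocess call makes the version logic easy to check on any machine, including one without Xcode.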
## CUDA 8 Installation
CUDA is NVIDIA's development platform for GPGPU: a compiler, libraries, and related tools for parallel computation on NVIDIA GPUs. PyCUDA requires CUDA, so install CUDA 8 first. Select the OS and version from CUDA Toolkit Download | NVIDIA Developer, download the package, and install it as instructed by the installer.
After installation, select **CUDA** from **System Preferences** on macOS and click **Install CUDA Update** to update CUDA.
## cuDNN Installation
Download cuDNN from NVIDIA cuDNN | NVIDIA Developer. You may need to register as a member.
Extract the downloaded cudnn-8.0-osx-x64-v6.0.tgz.
$ tar zxvf cudnn-8.0-osx-x64-v6.0.tgz
Move the extracted files into the CUDA directories.
$ sudo mv <Unzip folder>/lib/libcudnn* /usr/local/cuda/lib
$ sudo mv <Unzip folder>/include/cudnn.h /usr/local/cuda/include
Set **DYLD_LIBRARY_PATH** and **PATH** by adding the following lines to .bashrc.
export DYLD_LIBRARY_PATH=/usr/local/cuda/lib:/Developer/NVIDIA/CUDA-8.0/lib:$DYLD_LIBRARY_PATH
export PATH=/Developer/NVIDIA/CUDA-8.0/bin:$PATH
Apply the changes.
$ source ~/.bashrc
### CUDA sample program
You can check the installation with the sample program. However, it takes a long time to compile, so you can skip it.
$ /Developer/NVIDIA/CUDA-8.0/bin/cuda-install-samples-8.0.sh ~/Downloads
$ cd ~/Downloads/NVIDIA_CUDA-8.0_Samples
$ make -C 1_Utilities/deviceQuery
$ 1_Utilities/deviceQuery/deviceQuery
$ make -C 5_Simulations/smokeParticles
$ 5_Simulations/smokeParticles/smokeParticles
If you get a compile error, check that Xcode 7 is the active Xcode, that DYLD_LIBRARY_PATH and PATH are set correctly, and that **Automatic graphics switching** is turned off under **System Preferences** > **Energy Saver**.
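The environment variable part of that checklist can be automated. The sketch below is only an illustration: `missing_entries` is a hypothetical helper, and the directories it looks for are the ones used in the export lines of this article, so adjust them if CUDA is installed elsewhere.

```python
import os

# Assumed install locations, taken from the export lines in this article.
CUDA_BIN = "/Developer/NVIDIA/CUDA-8.0/bin"
CUDA_LIBS = ["/usr/local/cuda/lib", "/Developer/NVIDIA/CUDA-8.0/lib"]


def missing_entries(env):
    """Return (variable, directory) pairs that are absent from the given environment."""
    missing = []
    # macOS separates path entries with a colon.
    path_dirs = env.get("PATH", "").split(":")
    if CUDA_BIN not in path_dirs:
        missing.append(("PATH", CUDA_BIN))
    lib_dirs = env.get("DYLD_LIBRARY_PATH", "").split(":")
    for lib in CUDA_LIBS:
        if lib not in lib_dirs:
            missing.append(("DYLD_LIBRARY_PATH", lib))
    return missing


if __name__ == "__main__":
    problems = missing_entries(os.environ)
    for variable, directory in problems:
        print(f"{variable} does not contain {directory}")
    if not problems:
        print("PATH and DYLD_LIBRARY_PATH contain the CUDA directories.")
```

Run it from the same shell session in which you build or run CUDA code, since that is the environment that matters.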
## Homebrew Installation
If you haven't installed Homebrew yet:
$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
If you haven't installed Python 3 yet
$ brew install python3
Python3 confirmation
$ python3 --version
## pip Installation
If you haven't installed pip yet
$ curl -kL https://raw.github.com/pypa/pip/master/contrib/get-pip.p
## NumPy Installation
If you haven't installed numpy yet
$ pip3 install numpy
$ pip3 install scipy
**Note: even if NumPy is installed for Python 2.x, it must be installed separately for Python 3.x.**
numpy confirmation
$ python3
>>> import numpy as np
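Since packages installed for Python 2.x are invisible to Python 3.x, it can be useful to ask the Python 3 interpreter itself which packages it can find. This is a minimal sketch using only the standard library; `importable` is a hypothetical helper name.

```python
import importlib.util
import sys


def importable(names):
    """Map each top-level package name to True if this interpreter can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, found in importable(["numpy", "scipy"]).items():
        hint = "found" if found else f"missing (try: pip3 install {name})"
        print(f"{name}: {hint}")
```

Run it with `python3` so that the result reflects the Python 3 installation rather than the system Python.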
## PyCUDA Installation
It's finally time to install PyCUDA.
$ pip3 install pycuda
PyCUDA confirmation
$ python3 -c "import pycuda.autoinit"
If there is no error, the installation is working for now. If you get an error, check that Xcode 7 is the active Xcode, that DYLD_LIBRARY_PATH and PATH are set correctly, and that **Automatic graphics switching** is turned off under **System Preferences** > **Energy Saver**.
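A slightly longer check can also report which device PyCUDA selected. The sketch below is one possible way to do this; `describe_gpu` is a hypothetical helper, and it deliberately catches any exception so that a missing package and a CUDA driver failure are both reported instead of producing a traceback.

```python
def describe_gpu():
    """Return a one-line description of the default CUDA device, or None on failure."""
    try:
        # Importing pycuda.autoinit initializes CUDA and creates a context.
        import pycuda.autoinit
    except Exception as error:  # ImportError, or an error raised by the CUDA driver
        print("PyCUDA could not initialize:", error)
        return None
    device = pycuda.autoinit.device
    major, minor = device.compute_capability()
    memory_mib = device.total_memory() // (1024 * 1024)
    return f"{device.name()} (compute capability {major}.{minor}, {memory_mib} MiB)"


if __name__ == "__main__":
    description = describe_gpu()
    if description is not None:
        print("PyCUDA is using:", description)
```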
Then install scikit-cuda (skcuda).
$ pip3 install scikit-cuda
import pycuda.autoinit
import pycuda.driver as cuda
from pycuda import gpuarray
from skcuda import linalg
import numpy as np
linalg.init()
a = np.random.rand(2, 1).astype(np.float32)
b = np.ones(2).astype(np.float32).reshape(1, 2)
a_gpu = gpuarray.to_gpu(a)
b_gpu = gpuarray.to_gpu(b)
c_gpu = linalg.dot(a_gpu, b_gpu)
c = c_gpu.get()
print(a)
print(b)
# Matrix product computed on the CPU
print(np.dot(a, b))
# Matrix product computed on the GPU
print(c)
Save the above script as test_gpu.py and run it.
$ python3 test_gpu.py
[[ 0.85600704]
[ 0.02441464]]
[[ 1. 1.]]
[[ 0.85600704 0.85600704]
[ 0.02441464 0.02441464]]
[[ 0.85600704 0.85600704]
[ 0.02441464 0.02441464]]
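The result is a 2×2 matrix because a has shape (2, 1) and b has shape (1, 2), so the dot product here is a matrix product. The following pure-Python sketch reproduces the same calculation without NumPy or a GPU; `matmul` is a hypothetical helper written only for illustration, and the input values are chosen for readability rather than taken from the run above.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists (a list of rows)."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


a = [[0.5], [0.25]]  # shape (2, 1)
b = [[1.0, 1.0]]     # shape (1, 2)
c = matmul(a, b)     # shape (2, 2): each value of a is repeated across b's columns
print(c)             # [[0.5, 0.5], [0.25, 0.25]]
```

Because b contains only ones, every row of the result simply repeats the corresponding value of a, which matches the pattern in the output above.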
The following explains the test script test_gpu.py.
First, import pycuda.autoinit and call linalg.init() to initialize the GPU for use from Python.
import pycuda.autoinit
...
linalg.init()
Create the data as NumPy ndarrays first, before processing it on the GPU. The data type is float32.
a = np.random.rand(2, 1).astype(np.float32)
b = np.ones(2).astype(np.float32).reshape(1, 2)
Then convert the ndarray to a gpuarray.
a_gpu = gpuarray.to_gpu(a)
b_gpu = gpuarray.to_gpu(b)
This step copies the array data from CPU memory to GPU memory (CPU → GPU).
Next, compute the matrix product on the GPU. linalg.dot is the equivalent of NumPy's np.dot.
c_gpu = linalg.dot(a_gpu, b_gpu)
To see the result, you need to convert the gpuarray (GPU) back to an ndarray (CPU).
c = c_gpu.get()
print(c)
This time the data flows GPU → CPU.
When programming with PyCUDA, data goes back and forth between the CPU and the GPU, so always keep track of whether a variable is an ndarray or a gpuarray. Transfers between CPU and GPU also have a significant impact on performance and should be kept to a minimum.
On a Mac with little GPU memory, increasing the gpuarray size can make the system extremely sluggish or even unresponsive. When running a script on such a machine, it is safer to close other windows and quit other applications completely. Note that Safari in particular consumes a lot of memory.
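Before allocating a large gpuarray, it may help to estimate how much memory it needs. The arithmetic is simply the number of elements times the element size (4 bytes for float32). The sketch below uses hypothetical helper names (`array_bytes`, `fits`), and the 1 GiB figure is an assumed example value, not a measurement; on a real machine, the free amount can be obtained from `pycuda.driver.mem_get_info()`, which returns free and total bytes.

```python
import math


def array_bytes(shape, itemsize=4):
    """Bytes needed for a dense array; itemsize=4 corresponds to float32."""
    return math.prod(shape) * itemsize


def fits(shape, free_bytes, itemsize=4, copies=1):
    """Check whether `copies` arrays of this shape fit into free_bytes."""
    return array_bytes(shape, itemsize) * copies <= free_bytes


size = array_bytes((10000, 10000))
print(size)                          # 400000000 bytes
print(round(size / 1024 ** 2, 1))    # 381.5 MiB

# Assumed example: 1 GiB of free GPU memory.
free = 1024 ** 3
print(fits((10000, 10000), free, copies=2))  # True: two operands fit
print(fits((10000, 10000), free, copies=3))  # False: operands plus result do not
```

Remember that an operation such as a matrix product needs room for the result as well as the operands, which is why the `copies` parameter is included.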
gpuarray and linalg provide the following useful functions, where x, y, and z are gpuarrays:
- linalg.dot(x, y): dot product (matrix product for 2-D arrays)
- linalg.transpose(x): transpose
- gpuarray.max(x): maximum value
- abs(x): absolute value
- x.shape: same as the shape of an ndarray
- gpuarray.if_positive(x > z, x, z): ReLU when z is **0** (a zero-filled gpuarray)
- linalg.mdot(x, y): can be used as a cross product with reshape
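The ReLU idiom in the list above may be easier to follow with a CPU-side reference. The sketch below is not PyCUDA code: `if_positive_reference` and `relu_reference` are hypothetical pure-Python helpers that mimic, element by element, what `gpuarray.if_positive(criterion, then_, else_)` is understood to compute, namely picking the "then" value wherever the criterion is positive.

```python
def if_positive_reference(criterion, then_values, else_values):
    """Pick then_values[i] where criterion[i] > 0, otherwise else_values[i]."""
    return [
        t if c > 0 else e
        for c, t, e in zip(criterion, then_values, else_values)
    ]


def relu_reference(x):
    """ReLU built the same way as if_positive(x > z, x, z) with z all zeros."""
    z = [0.0] * len(x)
    mask = [1 if xi > zi else 0 for xi, zi in zip(x, z)]  # plays the role of x > z
    return if_positive_reference(mask, x, z)


print(relu_reference([-1.5, 0.0, 2.0]))  # [0.0, 0.0, 2.0]
```

On the GPU the same selection happens for all elements in parallel, and both x and z stay in GPU memory, so no CPU ↔ GPU transfer is needed for this step.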
API Docs