Introduction

Target audience

This article walks through installing PyCUDA on a Mac. It is aimed at readers who:

- use Python 3.x
- want to compute on the GPU without a deep learning library such as Chainer, TensorFlow, or Keras
- want to do GPGPU programming in Python on a Mac
- own a Mac / MacBook Pro with an NVIDIA GPU

Development environment

The specifications of the Mac used are as follows.

- MacBook Pro, Mid 2014
- macOS Sierra 10.12.6
- Core i7 2.8GHz / 16GB memory
- NVIDIA GeForce GT 750M 2GB GDDR5
- Xcode 7.3.1 & Command Line Tools (Xcode 8.x is not supported by CUDA 8)

Regarding the last item: if Xcode 8 is already installed, Xcode 7 can coexist with it, as explained later.

PyCUDA environment construction

Xcode7 Installation

At the time of writing, Xcode 7.3.1 is the latest release of Xcode 7. From Downloads for Apple Developer, select **See more downloads** and type Xcode 7 in the search box. Download **Xcode 7.3.1** and **Command Line Tools for Xcode 7.3.1**.

Open the downloaded dmg file and install Xcode 7.3.1. Then open the Command Line Tools dmg and install Command Line Tools for Xcode 7.3.1 as well.

If Xcode 8 is already installed

If Xcode 8 is already installed, copy Xcode 7 to /Applications as Xcode7.app.

To activate Xcode 7:

`$ sudo mv /Applications/Xcode.app /Applications/Xcode8.app`

`$ sudo mv /Applications/Xcode7.app /Applications/Xcode.app`

`$ sudo xcode-select -s /Applications/Xcode.app`

Conversely, to activate Xcode 8:

`$ sudo mv /Applications/Xcode.app /Applications/Xcode7.app`

`$ sudo mv /Applications/Xcode8.app /Applications/Xcode.app`

`$ sudo xcode-select -s /Applications/Xcode.app`

**Activate Xcode 7 to use CUDA / PyCUDA. CUDA 8 does not support Xcode 8.**
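Because the CUDA 8 toolchain only works while Xcode 7 is active, it can help to check the active version before building anything. The following is a minimal sketch that parses the text printed by `xcodebuild -version`. The helper names `parse_xcode_version` and `active_xcode_version` are illustrative, not part of any tool, and the code assumes the first line of that output looks like `Xcode 7.3.1`.

```python
import re
import subprocess


def parse_xcode_version(text):
    """Return the version found in `xcodebuild -version` output as a tuple of ints, or None."""
    match = re.search(r"Xcode\s+(\d+(?:\.\d+)*)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def active_xcode_version():
    """Ask xcodebuild for the active Xcode version; None if it cannot be determined."""
    try:
        result = subprocess.run(
            ["xcodebuild", "-version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
    except OSError:  # xcodebuild is not installed or not on PATH
        return None
    return parse_xcode_version(result.stdout)


if __name__ == "__main__":
    version = active_xcode_version()
    if version is None:
        print("Could not determine the active Xcode version.")
    elif version[0] != 7:
        print("Active Xcode is", version, "- switch to Xcode 7 before using CUDA 8.")
    else:
        print("Xcode 7 is active:", version)
```

Keeping the parsing separate from the subprocess call means the version check can be exercised on any machine, including one without Xcode.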

CUDA8 Installation

CUDA is NVIDIA's development platform for GPGPU, bundling libraries, a compiler, and related tools. In short, it is a set of tools for parallel computation on NVIDIA GPUs. PyCUDA requires CUDA, so install CUDA 8.

Select the OS and version from CUDA Toolkit Download | NVIDIA Developer and download the package. Install as instructed by the installer.

After installation, select **CUDA** from **System Preferences** on macOS and click **Install CUDA Update** to update CUDA.

cuDNN Installation

Download cuDNN from NVIDIA cuDNN | NVIDIA Developer. You may need to register as a member.

Extract the downloaded cudnn-8.0-osx-x64-v6.0.tgz.

`$ tar zxvf cudnn-8.0-osx-x64-v6.0.tgz`



Move the extracted files into place on your system.

`$ sudo mv <Unzip folder>/lib/libcudnn* /usr/local/cuda/lib`

`$ sudo mv <Unzip folder>/include/cudnn.h /usr/local/cuda/include`

Set **DYLD_LIBRARY_PATH** and **PATH** by adding the following lines to .bashrc.

```bash
export DYLD_LIBRARY_PATH=/usr/local/cuda/lib:/Developer/NVIDIA/CUDA-8.0/lib:$DYLD_LIBRARY_PATH
export PATH=/Developer/NVIDIA/CUDA-8.0/bin:$PATH
```

Reflect the changes.

`$ source ~/.bashrc`


### CUDA sample program

You can verify the installation by building the sample programs. Compiling takes a long time, so this step can be skipped.


`$ /Developer/NVIDIA/CUDA-8.0/bin/cuda-install-samples-8.0.sh ~/Downloads`

`$ cd ~/Downloads/NVIDIA_CUDA-8.0_Samples`

`$ make -C 1_Utilities/deviceQuery`

`$ 1_Utilities/deviceQuery/deviceQuery`

`$ make -C 5_Simulations/smokeParticles`

`$ 5_Simulations/smokeParticles/smokeParticles`

If you get a compile error, check that Xcode 7 is active, the DYLD_LIBRARY_PATH and PATH settings, and the **Energy Saver** pane of **System Preferences** (**Automatic graphics switching** should be off).
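The PATH and DYLD_LIBRARY_PATH part of that checklist can be automated. The sketch below assumes the CUDA 8 directories used in this article's export lines. The function names `missing_entries` and `check_cuda_environment` are hypothetical helpers written for this example.

```python
import os
import shutil

# Directories taken from the export lines in this article.
CUDA_BIN = "/Developer/NVIDIA/CUDA-8.0/bin"
CUDA_LIBS = ["/usr/local/cuda/lib", "/Developer/NVIDIA/CUDA-8.0/lib"]


def missing_entries(value, required):
    """Return the required directories absent from a colon-separated path list."""
    entries = [entry.rstrip("/") for entry in value.split(":") if entry]
    return [directory for directory in required if directory.rstrip("/") not in entries]


def check_cuda_environment(environ):
    """Collect human-readable problems found in the given environment mapping."""
    problems = []
    for directory in missing_entries(environ.get("PATH", ""), [CUDA_BIN]):
        problems.append("PATH is missing " + directory)
    for directory in missing_entries(environ.get("DYLD_LIBRARY_PATH", ""), CUDA_LIBS):
        problems.append("DYLD_LIBRARY_PATH is missing " + directory)
    return problems


if __name__ == "__main__":
    for problem in check_cuda_environment(os.environ):
        print(problem)
    # nvcc is the CUDA compiler; it should be found once PATH is set correctly.
    print("nvcc found at:", shutil.which("nvcc"))
```

Run it from the same shell session that will run the build, since the variables only apply to shells that have sourced .bashrc.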

Homebrew Installation

If you haven't installed Homebrew yet:

`$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"`

If you haven't installed Python 3 yet:

`$ brew install python3`

Confirm the Python 3 installation:

`$ python3 --version`

pip Installation

If you haven't installed pip yet:

`$ curl -kL https://raw.github.com/pypa/pip/master/contrib/get-pip.p`

NumPy Installation

If you haven't installed NumPy yet:

`$ pip3 install numpy`

`$ pip3 install scipy`

**Note: even if NumPy is already installed for Python 2.x, it must be installed separately for Python 3.x.**

Confirm that NumPy can be imported:

`$ python3`

`>>> import numpy as np`

PyCUDA Installation

It's finally time to install PyCUDA.

`$ pip3 install pycuda`

Confirm that PyCUDA can initialize:

`$ python3 -c "import pycuda.autoinit"`

If no error appears, the installation is working for now. If you get an error, check that Xcode 7 is active, the DYLD_LIBRARY_PATH and PATH settings, and the **Energy Saver** pane of **System Preferences** (**Automatic graphics switching** should be off).
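For a slightly more informative check than a bare import, PyCUDA can report which device it initialized and how much memory is available. This is a sketch rather than a required step. `describe_gpu` is an illustrative name, and the function returns `None` instead of failing when PyCUDA or a usable GPU is missing.

```python
def describe_gpu():
    """Return basic facts about the GPU PyCUDA selected, or None if it cannot initialize."""
    try:
        import pycuda.autoinit  # creates a CUDA context on import
        import pycuda.driver as cuda
    except Exception as error:  # ImportError, or a CUDA error when no GPU is usable
        print("PyCUDA is not usable:", error)
        return None

    device = pycuda.autoinit.device
    free_bytes, total_bytes = cuda.mem_get_info()
    return {
        "name": device.name(),
        "compute_capability": device.compute_capability(),
        "total_memory_mib": total_bytes // (1024 * 1024),
        "free_memory_mib": free_bytes // (1024 * 1024),
    }


if __name__ == "__main__":
    print(describe_gpu())
```

The free-memory figure is worth a glance on a laptop GPU, because the desktop itself already occupies part of the GPU memory.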

Then install scikit-cuda (skcuda).

`$ pip3 install scikit-cuda`
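Since packages installed for Python 2.x are invisible to Python 3.x, a quick way to confirm that everything landed in the right interpreter is to ask Python 3 itself. The sketch below uses only the standard library. `REQUIRED` and `missing_packages` are names chosen for this example, and the check only confirms that the packages can be located, not that the GPU works.

```python
import importlib.util
import sys

# Top-level import names of the packages installed in this article.
REQUIRED = ["numpy", "scipy", "pycuda", "skcuda"]


def missing_packages(names):
    """Return the names that this interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_packages(REQUIRED)
    print("Missing:", ", ".join(missing) if missing else "none")
```

Run it with `python3` so that it inspects the same interpreter that `pip3` installs into.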

GPGPU programming with PyCUDA

Test script test_gpu.py

import pycuda.autoinit
import pycuda.driver as cuda
from pycuda import gpuarray
from skcuda import linalg
import numpy as np

linalg.init()

a = np.random.rand(2, 1).astype(np.float32)
b = np.ones(2).astype(np.float32).reshape(1, 2)

a_gpu = gpuarray.to_gpu(a)
b_gpu = gpuarray.to_gpu(b)

c_gpu = linalg.dot(a_gpu, b_gpu)
c = c_gpu.get()

print(a)
print(b)
#The result of calculating the inner product with the CPU
print(np.dot(a, b))
#The result of calculating the inner product with GPU
print(c)

Save the above script as test_gpu.py and run it.

$ python3 test_gpu.py

[[ 0.85600704]
 [ 0.02441464]]
[[ 1.  1.]]
[[ 0.85600704  0.85600704]
 [ 0.02441464  0.02441464]]
[[ 0.85600704  0.85600704]
 [ 0.02441464  0.02441464]]
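Comparing the printed matrices by eye works for a 2 x 2 result, but a numerical comparison scales better. The sketch below assumes only NumPy. `results_match` is an illustrative helper, the tolerances are a guess suited to float32, and the GPU result is replaced by a copy of the CPU result so the snippet runs without a GPU.

```python
import numpy as np


def results_match(gpu_result, cpu_result, rtol=1e-5, atol=1e-6):
    """True when shapes agree and values agree within float32-level tolerances."""
    return gpu_result.shape == cpu_result.shape and bool(
        np.allclose(gpu_result, cpu_result, rtol=rtol, atol=atol)
    )


a = np.random.rand(2, 1).astype(np.float32)
b = np.ones(2).astype(np.float32).reshape(1, 2)
expected = np.dot(a, b)  # (2, 1) x (1, 2) -> (2, 2)

# Stand-in for c_gpu.get(); in test_gpu.py, pass the real `c` instead.
c = expected.copy()
print(results_match(c, expected))
```

In test_gpu.py itself, the final lines could call `results_match(c, np.dot(a, b))` instead of printing both matrices.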

The following explains the test script test_gpu.py.

First, perform the initialization needed to use the GPU from Python.

import pycuda.autoinit
...
linalg.init()

Before processing on the GPU, first create NumPy ndarrays. The data type is float32.

a = np.random.rand(2, 1).astype(np.float32)
b = np.ones(2).astype(np.float32).reshape(1, 2)

Then convert the ndarray to a gpuarray.

a_gpu = gpuarray.to_gpu(a)
b_gpu = gpuarray.to_gpu(b)

In this step, the array data is converted and transferred from CPU memory to GPU memory (CPU → GPU).

Compute the inner product on the GPU, equivalent to NumPy's np.dot.

c_gpu = linalg.dot(a_gpu, b_gpu)

To see the result, convert the gpuarray (GPU) back to an ndarray (CPU).

c = c_gpu.get()
print(c)

This time the data flows GPU → CPU.

PyCUDA programming notes

When programming with PyCUDA, data moves back and forth between the CPU and GPU, so always keep track of whether a variable is an ndarray or a gpuarray. Data transfers between the CPU and GPU also have a significant impact on performance and should be kept to a minimum.
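One way to keep transfers to a minimum is to chain operations on gpuarrays and call `.get()` only on the final result. The sketch below reuses the calls from test_gpu.py. `chained_dot` is an illustrative name, and the `try` block is only there so the snippet degrades gracefully on a machine without PyCUDA, scikit-cuda, or a usable GPU.

```python
import numpy as np


def chained_dot(a, b):
    """Compute (a . b) . a on the GPU with two uploads and a single download."""
    import pycuda.autoinit  # creates a CUDA context on import
    from pycuda import gpuarray
    from skcuda import linalg

    linalg.init()
    a_gpu = gpuarray.to_gpu(a)               # CPU -> GPU
    b_gpu = gpuarray.to_gpu(b)               # CPU -> GPU
    ab_gpu = linalg.dot(a_gpu, b_gpu)        # intermediate stays on the GPU
    result_gpu = linalg.dot(ab_gpu, a_gpu)   # still no transfer
    return result_gpu.get()                  # GPU -> CPU, once


a = np.random.rand(2, 1).astype(np.float32)
b = np.ones((1, 2), dtype=np.float32)
reference = np.dot(np.dot(a, b), a)  # CPU reference with shape (2, 1)

try:
    gpu_out = chained_dot(a, b)
    print(gpu_out)
except Exception as error:  # libraries missing, or no usable GPU
    gpu_out = None
    print("GPU path unavailable:", error)
```

Calling `.get()` on `ab_gpu` and uploading it again would give the same numbers, but at the cost of two extra transfers.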

On a Mac with little GPU memory, increasing the gpuarray size can make the system extremely sluggish and even unresponsive. When running a script on such a machine, it is safer to close other windows and quit other applications completely. Note that Safari in particular consumes a lot of memory.
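A rough size estimate before allocating can help avoid this situation. The sketch below uses only the standard library and assumes dense float32 arrays on the 2 GB GPU described earlier. The 50% budget is an arbitrary safety margin chosen for this example, and `array_bytes` and `fits_on_gpu` are illustrative names.

```python
from functools import reduce
from operator import mul

FLOAT32_BYTES = 4
GPU_MEMORY_BYTES = 2 * 1024 ** 3  # the 2 GB GeForce GT 750M used in this article


def array_bytes(shape, itemsize=FLOAT32_BYTES):
    """Bytes needed to hold a dense array of the given shape."""
    return reduce(mul, shape, 1) * itemsize


def fits_on_gpu(shapes, budget_fraction=0.5, total=GPU_MEMORY_BYTES):
    """True if all arrays together stay under the chosen fraction of GPU memory."""
    needed = sum(array_bytes(shape) for shape in shapes)
    return needed <= total * budget_fraction


if __name__ == "__main__":
    shapes = [(8192, 8192)] * 3  # two inputs and one result
    print(array_bytes(shapes[0]) // 1024 ** 2, "MiB per matrix")
    print("fits:", fits_on_gpu(shapes))
```

Remember to count the result array as well as the inputs, since `linalg.dot` allocates its output on the GPU too.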

Convenient functions

gpuarray and linalg provide the following useful functions, where x, y, and z are gpuarrays.

- linalg.dot(x, y): inner product
- linalg.transpose(x): transposed matrix
- gpuarray.max(x): maximum value
- abs(x): absolute value
- x.shape: same as the shape of an ndarray
- gpuarray.if_positive(x > z, x, z): ReLU when z is **0**
- linalg.mdot(x, y): can be used as a cross product by reshaping
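To see why `gpuarray.if_positive(x > z, x, z)` behaves as ReLU when z is 0, it helps to write the same selection in NumPy. The sketch below is a CPU stand-in, not PyCUDA code. `if_positive_reference` is a hypothetical helper that mirrors the "take `then_` where the criterion is positive, otherwise `else_`" rule.

```python
import numpy as np


def if_positive_reference(criterion, then_, else_):
    """NumPy stand-in for gpuarray.if_positive: pick then_ where criterion is positive."""
    return np.where(criterion > 0, then_, else_)


x = np.array([[-1.5, 0.0], [2.0, -0.25]], dtype=np.float32)
z = np.zeros_like(x)

relu = if_positive_reference(x > z, x, z)
print(relu)
print(np.array_equal(relu, np.maximum(x, 0)))
```

With a nonzero z, the same expression computes an elementwise maximum of x and z rather than ReLU.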

API Docs

From PyCUDA environment construction to GPGPU programming on Mac (MacOS 10.12 Sierra)