- I decided to try the popular gradient-boosting ensembles at work, and tried LightGBM, which is rumored to be faster than XGBoost, from Python
- It is indeed easy to use and trains in far less time than a neural network, so I wanted to explore it further
- The LightGBM I installed from conda-forge is not GPU-enabled, so I tried compiling it manually
- Compiling by the method introduced by others → error
That was the flow. Below is the solution that worked in my environment, but this problem seems to occur when a different version of OpenCL is also installed, so I recommend trying the method described at the URL above first.
Also, this article does not explain how to use LightGBM itself. Qiita already has great commentary articles on LightGBM, so please refer to those.
I installed CUDA according to the official method.
First, try installing LightGBM using the method in the link above.
# Following http://yutori-datascience.hatenablog.com/entry/2017/07/07/162509 as-is
sudo apt-get update
sudo apt-get install --no-install-recommends nvidia-375
sudo apt-get install --no-install-recommends nvidia-opencl-icd-375 nvidia-opencl-dev opencl-headers
sudo init 6
sudo apt-get install --no-install-recommends git cmake build-essential libboost-dev libboost-system-dev libboost-filesystem-dev
cd ~/tmp/
git clone --recursive https://github.com/Microsoft/LightGBM
cd LightGBM
mkdir build ; cd build
cmake -DUSE_GPU=1 ..
make -j$(nproc)
cd ..
This completes the compilation of LightGBM itself, so let's check that it works:
cd ~/tmp/LightGBM/examples/binary_classification
../../lightgbm config=train.conf data=binary.train valid=binary.test device=gpu
After executing this, if the training log is output without errors, the GPU build is working.
So far, everything went well, but when I tried to build the Python module,
cd python-package/
python setup.py install --gpu
compilation stopped with the following error:
OSError: /usr/local/lib/python3.5/dist-packages/lightgbm/lib_lightgbm.so: symbol clCreateCommandQueueWithProperties, version OPENCL_2.0 not defined in file libOpenCL.so.1 with link time reference
After some googling, I found a report of a similar problem in the official GitHub issues.
Following the discussion there, the cause is that two OpenCL drivers are installed (the generic version 2.0 and Nvidia's version 1.2). Depending on the LD_LIBRARY_PATH setting, **a version different from the one LightGBM was linked against (the Nvidia driver) is loaded first, which causes the error**.
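As a quick diagnostic sketch (not from the original article), you can ask the dynamic linker which libOpenCL.so candidates it knows about; seeing both a generic ICD loader and Nvidia's library would confirm the conflict described above:

```shell
# List every libOpenCL the dynamic linker has registered.
# Two entries (e.g. a generic OpenCL 2.0 loader plus Nvidia's 1.2 library)
# indicate the conflicting-driver situation described above.
candidates=$(ldconfig -p 2>/dev/null | grep -i 'libOpenCL' || true)
if [ -n "$candidates" ]; then
  msg="$candidates"
else
  msg="no libOpenCL registered with the dynamic linker"
fi
echo "$msg"
```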
This can be resolved in the following ways:
** Uninstall with pip **
pip uninstall lightgbm
** Or uninstall manually ** (The following is just an example)
# remove manually installed LightGBM
rm -rf /home/so1/.pyenv/versions/anaconda3-4.3.0/lib/python3.6/site-packages/lightgbm
rm -rf /home/so1/.pyenv/versions/anaconda3-4.3.0/lightgbm
# also remove downloaded LightGBM source, just in case
rm -rf ~/tmp/LightGBM
git clone --recursive https://github.com/Microsoft/LightGBM
cd ./LightGBM
mkdir build; cd build
sudo cmake -DUSE_GPU=1 -DOpenCL_LIBRARY=/usr/local/cuda-8.0/lib64/libOpenCL.so -DOpenCL_INCLUDE_DIR=/usr/local/cuda-8.0/include/ ..
sudo make -j$(nproc)
cd ../python-package/
python setup.py install --precompile
By the way, the cmake paths above differ depending on your environment, so rewrite them as appropriate. The important point here is that the last line installs with ** --precompile ** rather than the ** --gpu ** option (building again with --gpu would link the wrong version of OpenCL).
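If you want to double-check the result, a small verification sketch (the path lookup is an assumption about where setup.py installed the library) is to locate the installed lib_lightgbm.so and inspect which libOpenCL it is linked against:

```shell
# Locate lib_lightgbm.so inside the installed lightgbm package (path depends
# on your Python environment) and check its OpenCL linkage with ldd.
libpath=$(python -c "import lightgbm, os; print(os.path.join(os.path.dirname(lightgbm.__file__), 'lib_lightgbm.so'))" 2>/dev/null || true)
if [ -n "$libpath" ] && [ -f "$libpath" ]; then
  linkage=$(ldd "$libpath" 2>/dev/null | grep -i opencl || echo "no OpenCL linkage (CPU build)")
else
  linkage="lightgbm is not installed in this environment"
fi
echo "$linkage"
```

The libOpenCL path printed here should match the one you passed to cmake via -DOpenCL_LIBRARY.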
If the installation finishes without problems, check that it works from Python. For example, borrow the test code from this commentary article and try it.
# Borrowed from https://analyticsai.wordpress.com/2017/04/04/lightgbm/
from sklearn import datasets, metrics
from sklearn.model_selection import train_test_split
from lightgbm.sklearn import LGBMRegressor

diabetes = datasets.load_diabetes()
x = diabetes.data
y = diabetes.target
clf = LGBMRegressor(max_depth=50,
                    num_leaves=21,
                    n_estimators=3000,
                    min_child_weight=1,
                    learning_rate=0.001,
                    nthread=24,
                    subsample=0.80,
                    colsample_bytree=0.80,
                    seed=42)
x_t, x_test, y_t, y_test = train_test_split(x, y, test_size=0.2)
clf.fit(x_t, y_t, eval_set=[(x_test, y_test)])
print("Mean Square Error: ", metrics.mean_squared_error(y_test, clf.predict(x_test)))
If a training log like the following is output, the installation succeeded.
[1] valid_0's multi_logloss: 1.83493
[2] valid_0's multi_logloss: 1.73867
[3] valid_0's multi_logloss: 1.6495
[4] valid_0's multi_logloss: 1.56938
[5] valid_0's multi_logloss: 1.49485
[6] valid_0's multi_logloss: 1.42784
[7] valid_0's multi_logloss: 1.36532
....
Please refer to this article for a comparison of training times when using the GPU versus the CPU.
Happy Machine Learning!