- I decided to try the popular gradient-boosting ensembles at work, and tried LightGBM, which is rumored to be faster than XGBoost, from Python
- It is indeed easy to use and trains in far less time than a neural network, so I wanted to explore it further
- The LightGBM I installed from conda-forge is not GPU-enabled, so I tried compiling it manually
- Compiling by the method introduced by others → error
That was the flow. Below is the solution that worked in my environment, but this problem seems to occur when a different version of OpenCL is also installed, so I recommend trying the method described at the URL above first.
Also, this article does not explain how to use LightGBM itself. Qiita already has great commentary articles on LightGBM, so please refer to those.
I installed CUDA according to the official method.
First, try installing LightGBM using the method in the link above.
# Following http://yutori-datascience.hatenablog.com/entry/2017/07/07/162509 as-is
sudo apt-get update
sudo apt-get install --no-install-recommends nvidia-375
sudo apt-get install --no-install-recommends nvidia-opencl-icd-375 nvidia-opencl-dev opencl-headers
sudo init 6
sudo apt-get install --no-install-recommends git cmake build-essential libboost-dev libboost-system-dev libboost-filesystem-dev
cd ~/tmp/
git clone --recursive https://github.com/Microsoft/LightGBM
cd LightGBM
mkdir build ; cd build
cmake -DUSE_GPU=1 ..
make -j$(nproc)
cd ..
This completes the compilation of LightGBM itself, so let's check that it works:
cd ~/tmp/LightGBM/examples/binary_classification
../../lightgbm config=train.conf data=binary.train valid=binary.test device=gpu
After executing this, if the training log is output without errors, the GPU build is working.
So far, everything went well, but when I tried to build the Python module,
cd python-package/
python setup.py install --gpu
compilation stopped with the following error:
OSError: /usr/local/lib/python3.5/dist-packages/lightgbm/lib_lightgbm.so: symbol clCreateCommandQueueWithProperties, version OPENCL_2.0 not defined in file libOpenCL.so.1 with link time reference
After some googling, I found a report of a similar problem in the official GitHub issues.
Following the discussion there, the cause is that two OpenCL drivers are installed (the generic version 2.0 and Nvidia's version 1.2). Depending on the LD_LIBRARY_PATH setting, **a version different from the one LightGBM was linked against (the Nvidia driver) is loaded first, which causes the error**.
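As a quick diagnostic sketch (not from the original article), you can ask the dynamic linker which libOpenCL.so candidates it knows about; seeing both a generic ICD loader and Nvidia's library would confirm the conflict described above:

```shell
# List every libOpenCL the dynamic linker has registered.
# Two entries (e.g. a generic OpenCL 2.0 loader plus Nvidia's 1.2 library)
# indicate the conflicting-driver situation described above.
candidates=$(ldconfig -p 2>/dev/null | grep -i 'libOpenCL' || true)
if [ -n "$candidates" ]; then
  msg="$candidates"
else
  msg="no libOpenCL registered with the dynamic linker"
fi
echo "$msg"
```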
This can be resolved in the following ways:
** Uninstall with pip **
pip uninstall lightgbm
** Or uninstall manually ** (The following is just an example)
# remove manually installed LightGBM
rm -rf /home/so1/.pyenv/versions/anaconda3-4.3.0/lib/python3.6/site-packages/lightgbm
rm -rf /home/so1/.pyenv/versions/anaconda3-4.3.0/lightgbm
# also remove downloaded LightGBM source, just in case
rm -rf ~/tmp/LightGBM
git clone --recursive https://github.com/Microsoft/LightGBM
cd ./LightGBM
mkdir build; cd build
sudo cmake -DUSE_GPU=1 -DOpenCL_LIBRARY=/usr/local/cuda-8.0/lib64/libOpenCL.so -DOpenCL_INCLUDE_DIR=/usr/local/cuda-8.0/include/ ..
sudo make -j$(nproc)
cd ../python-package/
python setup.py install --precompile
By the way, the cmake paths above differ depending on your environment, so rewrite them as appropriate. The important point here is that the last line installs with ** --precompile ** rather than the ** --gpu ** option (building again with --gpu would link the wrong version of OpenCL).
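If you want to double-check the result, a small verification sketch (the path lookup is an assumption about where setup.py installed the library) is to locate the installed lib_lightgbm.so and inspect which libOpenCL it is linked against:

```shell
# Locate lib_lightgbm.so inside the installed lightgbm package (path depends
# on your Python environment) and check its OpenCL linkage with ldd.
libpath=$(python -c "import lightgbm, os; print(os.path.join(os.path.dirname(lightgbm.__file__), 'lib_lightgbm.so'))" 2>/dev/null || true)
if [ -n "$libpath" ] && [ -f "$libpath" ]; then
  linkage=$(ldd "$libpath" 2>/dev/null | grep -i opencl || echo "no OpenCL linkage (CPU build)")
else
  linkage="lightgbm is not installed in this environment"
fi
echo "$linkage"
```

The libOpenCL path printed here should match the one you passed to cmake via -DOpenCL_LIBRARY.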
If the installation finishes without problems, check that it works from Python. For example, borrow the test code from this commentary article and try it.
# Borrowed from https://analyticsai.wordpress.com/2017/04/04/lightgbm/
from sklearn import datasets, metrics
from sklearn.model_selection import train_test_split
from lightgbm.sklearn import LGBMRegressor

diabetes = datasets.load_diabetes()
x = diabetes.data
y = diabetes.target
clf = LGBMRegressor(max_depth=50,
                    num_leaves=21,
                    n_estimators=3000,
                    min_child_weight=1,
                    learning_rate=0.001,
                    nthread=24,
                    subsample=0.80,
                    colsample_bytree=0.80,
                    seed=42)
x_t, x_test, y_t, y_test = train_test_split(x, y, test_size=0.2)
clf.fit(x_t, y_t, eval_set=[(x_test, y_test)])
print("Mean Square Error: ", metrics.mean_squared_error(y_test, clf.predict(x_test)))
If a training log like the following is output, the installation succeeded.
[1] valid_0's multi_logloss: 1.83493
[2] valid_0's multi_logloss: 1.73867
[3] valid_0's multi_logloss: 1.6495
[4] valid_0's multi_logloss: 1.56938
[5] valid_0's multi_logloss: 1.49485
[6] valid_0's multi_logloss: 1.42784
[7] valid_0's multi_logloss: 1.36532
....
Please refer to this article for a comparison of training times when using the GPU versus the CPU.
Happy Machine Learning!