This is a note on how to install faiss on CentOS7 without Anaconda. I haven't checked whether installing through Anaconda works cleanly on CentOS7, but pip did not work out of the box for me, so I wrote down the steps that did.
For some reason, faiss would not install in my CentOS7 environment! Looking around, other people seem to have hit similar problems, but I couldn't find anything that clearly said "this is the fix" ... and I spent more than half a day on it. Oops. I confirmed that the installation fails on several other CentOS7 machines as well, so I concluded the problem is specific to CentOS7 and decided to establish a procedure.
Faiss is a library published by Facebook that implements fast algorithms for similarity search (and clustering) over vectors. I'm building a semantic search mechanism on top of vectors produced by SentenceBERT ... Initially, I simply computed the cosine similarity between the query vector and every candidate vector, sorted the results, and took the most similar one. However, as the number of search targets grew, I started to worry about the computational cost.
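For reference, the naive approach described above can be sketched roughly as follows. This is a minimal illustration with NumPy, not my actual code: the function name cosine_top_k and the toy vectors are made up for the example, and real SentenceBERT embeddings would have many more dimensions.

```python
import numpy as np

def cosine_top_k(query, corpus, k=1):
    """Return indices and scores of the k corpus vectors most similar to query."""
    corpus = np.asarray(corpus, dtype=np.float32)
    query = np.asarray(query, dtype=np.float32)
    # Normalize so that a plain dot product equals the cosine similarity.
    corpus_n = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    query_n = query / np.linalg.norm(query)
    scores = corpus_n @ query_n
    # Every query touches every corpus vector and then sorts all scores,
    # which is the cost that grows with the number of search targets.
    order = np.argsort(-scores)[:k]
    return order, scores[order]

# Toy data standing in for sentence embeddings (hypothetical values).
corpus = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
idx, sims = cosine_top_k([2.0, 0.1], corpus, k=2)
print(idx, sims)
```

Each query scans and sorts the whole corpus, so the work per query grows at least linearly with the number of stored vectors.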
After a little research, I found a library called "Faiss" that indexes the vectors (?) and searches them cheaply (in a short time). I immediately tried it on Google Colab!
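!pip3 install faiss-cpu
import numpy as np
import faiss
# sentence_vectors / query_embeddings: SentenceBERT embeddings computed beforehand
d = max([len(v) for v in sentence_vectors])  # vector dimensionality
index = faiss.IndexFlatL2(d)  # exact (brute-force) L2 index
index.add(np.array(sentence_vectors).astype('float32'))  # faiss expects float32
closest_n = 1  # number of neighbors to return per query
D, I = index.search(np.array(query_embeddings).astype('float32'), closest_n)  # D: distances, I: indices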
This is still a test with a small number of vectors to search (100 or fewer), so I didn't see a dramatic change ... but the search time did get shorter, and the results were no different from those extracted by cosine similarity, so I set about incorporating it into the real system.
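Why would an L2 index return the same hits as cosine similarity? One explanation, assuming the embeddings are normalized to unit length (I did not verify this for my vectors), is that for unit vectors the squared L2 distance equals 2 - 2 * cos, so both rankings coincide. IndexFlatL2 reports squared L2 distances. The sketch below checks the identity with NumPy only, on random stand-in data.

```python
import numpy as np

rng = np.random.default_rng(0)
# Random stand-ins for sentence embeddings (hypothetical data).
vectors = rng.normal(size=(50, 8)).astype(np.float32)
query = rng.normal(size=8).astype(np.float32)

# Normalize everything to unit length.
unit_vectors = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
unit_query = query / np.linalg.norm(query)

cos_sim = unit_vectors @ unit_query
sq_l2 = np.sum((unit_vectors - unit_query) ** 2, axis=1)

# For unit vectors: ||a - b||^2 = 2 - 2 * cos(a, b)
identity_holds = np.allclose(sq_l2, 2.0 - 2.0 * cos_sim, atol=1e-5)
# The most similar vector by cosine is also the nearest by L2.
same_best = int(np.argmax(cos_sim)) == int(np.argmin(sq_l2))
print(identity_holds, same_best)
```

If the vectors are not normalized, the two rankings can differ, so normalizing before adding to the index is one way to keep them consistent.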
On the server, I install with pip just as on Colab. Faiss seems to be installed mainly through Anaconda, but I don't use Anaconda, so pip it is. With pip, the package name is faiss-cpu or faiss-gpu ... To switch between CPU and GPU, it seems you uninstall one and install the other.
https://pypi.org/project/faiss-cpu/ https://pypi.org/project/faiss-gpu/
$ sudo pip3 install faiss-cpu
Collecting faiss-cpu
Downloading https://files.pythonhosted.org/packages/8b/3e/d64ff22504a70fb15457de8fb2f5fd84e35448fdcd9958880ae8d0438a82/faiss-cpu-1.6.4.post2.tar.gz
Building wheels for collected packages: faiss-cpu
Running setup.py bdist_wheel for faiss-cpu ... error
Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-i9sic395/faiss-cpu/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/tmp2c2gltlxpip-wheel- --python-tag cp36:
running bdist_wheel
running build
running build_py
running build_ext
building 'faiss._swigfaiss' extension
swigging faiss/faiss/python/swigfaiss.i to faiss/faiss/python/swigfaiss_wrap.cpp
swig -python -c++ -Doverride= -I/usr/local/include -Ifaiss -DSWIGWORDSIZE64 -o faiss/faiss/python/swigfaiss_wrap.cpp faiss/faiss/python/swigfaiss.i
unable to execute 'swig': No such file or directory
error: command 'swig' failed with exit status 1
----------------------------------------
Failed building wheel for faiss-cpu
Running setup.py clean for faiss-cpu
Failed to build faiss-cpu
Installing collected packages: faiss-cpu
Running setup.py install for faiss-cpu ... error
Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-i9sic395/faiss-cpu/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-q0l4dufw-record/install-record.txt --single-version-externally-managed --compile:
running install
running build
running build_py
running build_ext
building 'faiss._swigfaiss' extension
swigging faiss/faiss/python/swigfaiss.i to faiss/faiss/python/swigfaiss_wrap.cpp
swig -python -c++ -Doverride= -I/usr/local/include -Ifaiss -DSWIGWORDSIZE64 -o faiss/faiss/python/swigfaiss_wrap.cpp faiss/faiss/python/swigfaiss.i
unable to execute 'swig': No such file or directory
error: command 'swig' failed with exit status 1
----------------------------------------
Command "/usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-i9sic395/faiss-cpu/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-q0l4dufw-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-i9sic395/faiss-cpu/
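The key line in the log above is "unable to execute 'swig'": pip fell back to building from the source tarball, and the build needs the swig command. A quick way to check for such tools before retrying might look like this sketch. Only swig comes from the log; gcc and g++ in the default list are my assumption about what a C++ extension build typically needs, and the function name is made up.

```python
import shutil

def missing_build_tools(tools=("swig", "gcc", "g++")):
    """Return the subset of the given commands that is not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]

missing = missing_build_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All listed build tools were found.")
```

Having swig available may not be enough for the source build to succeed, and this article ends up taking a different route.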
Just in case, I tried it on my local Mac. Maybe it only works on Colab?! No, faiss-cpu installs on my local Mac without any problems ... Something smells fishy!
I searched the web in various ways, but I couldn't find a decisive fix ... Along the way, I found a package called faiss-centos, whose name combines exactly the keywords of my current trouble. This looks promising!
https://pypi.org/project/faiss-centos/
Full of hope, here we go!
$ sudo pip3 install faiss-centos
WARNING: Running pip install with root privileges is generally not a good idea. Try `pip3 install --user` instead.
Collecting faiss-centos
Could not find a version that satisfies the requirement faiss-centos (from versions: )
No matching distribution found for faiss-centos
I kept wandering around the web, and the following issue may have been the solution ... I'm not sure. https://github.com/facebookresearch/faiss/issues/866
Along the way: faiss-centos is distributed as an egg rather than a wheel, so I tried downgrading pip to version 8 ... tried unzipping the egg file ... tried installing openblas-serial and gmp-devel ...
However, oblivious to my anguish, it kept complaining that it couldn't find _swigfaiss or something. A tough problem. I'm tired ...
Take a break ... do something else for distraction, drink tea, feel sorry for myself ...
Well, after a rest my brain fatigue was gone, and it was already night! I logged in to the CentOS7 server again ...
$ python3
Python 3.6.8 (default, Apr 2 2020, 13:34:55)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import faiss
>>>
Before the break, this was an error ... Did the seven dwarfs finally pay me a visit while I rested??
So I moved to the directory I actually wanted to work in and tried again ... error. I went back to the directory I started from and tried again ... it works! What's the difference??
It turned out that I had downloaded faiss_centos-1.5.2-py3.6.egg from "https://pypi.org/project/faiss-centos/" and unzipped it, which created a faiss/ folder directly under that directory, and Python was importing from there. A ray of light ...
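The reason this works is that the interactive interpreter puts the current directory at the front of sys.path, so any package folder sitting there is importable. The sketch below reproduces the effect with a throwaway package in a temporary directory; demo_pkg is a made-up name, and inserting the directory into sys.path by hand stands in for starting python3 from that directory.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    # Create a tiny package, playing the role of the unzipped faiss/ folder.
    pkg_dir = os.path.join(workdir, "demo_pkg")
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write("VALUE = 42\n")

    # Emulate launching the interpreter from workdir.
    sys.path.insert(0, workdir)
    try:
        importlib.invalidate_caches()
        demo_pkg = importlib.import_module("demo_pkg")
        # Confirm the module really was loaded from the working directory.
        found_in_workdir = os.path.realpath(demo_pkg.__file__).startswith(
            os.path.realpath(workdir)
        )
    finally:
        sys.path.remove(workdir)

print(demo_pkg.VALUE, found_in_workdir)
```

Once the directory is no longer on sys.path, a fresh interpreter cannot find the package, which matches the error I saw after changing directories.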
If so ... what if I copy this faiss/ folder into site-packages/???
After that, I worked out which additional libraries need to be installed and pinned down the procedure for installing faiss on CentOS7.
Once I knew the answer, this was all it took ...
$ wget https://files.pythonhosted.org/packages/f6/8b/ab69a201ea1b8be759ba16f172f92d1fb935a8f4a94f02fe52c7d8ec579f/faiss_centos-1.5.2-py3.6.egg
$ unzip faiss_centos-1.5.2-py3.6.egg
$ sudo cp -r ./faiss /usr/local/lib/python3.6/site-packages
(Or $ sudo cp -r ./faiss /usr/lib/python3.6/site-packages, depending on the environment ...)
$ sudo yum install openblas-serial
$ sudo yum install gmp gmp-devel
$ python3
Python 3.6.8 (default, Apr 2 2020, 13:34:55)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import faiss
>>>
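Since the right site-packages directory depends on the environment, it may help to ask Python itself where it looks before copying. The following is a small sketch using only the standard library; the helper names are made up, and it merely lists candidate directories and reports whether faiss can currently be found, without importing it.

```python
import importlib.util
import site
import sysconfig

def candidate_site_packages():
    """List directories where this interpreter looks for third-party packages."""
    paths = []
    # site.getsitepackages() can be absent in some virtualenv setups.
    getter = getattr(site, "getsitepackages", None)
    if getter is not None:
        paths.extend(getter())
    paths.append(sysconfig.get_paths()["purelib"])
    # Remove duplicates while keeping the original order.
    unique = []
    for path in paths:
        if path not in unique:
            unique.append(path)
    return unique

def is_importable(name):
    """Check whether a top-level module can be located, without importing it."""
    return importlib.util.find_spec(name) is not None

for path in candidate_site_packages():
    print(path)
print("faiss importable:", is_importable("faiss"))
```

Running this with the same python3 used for the application, from a directory that does not contain the unzipped faiss/ folder, shows whether the copy landed somewhere the interpreter actually searches.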
If anyone runs into a similar problem, I hope this helps.