This is a memo on setting up an environment on Ubuntu 14.04 so that CaboCha, a dependency parser for Japanese sentences, can be used from Python 2.7.
CaboCha relies on MeCab for morphological analysis, so MeCab has to be installed first. On Ubuntu 14.04, MeCab 0.996 can be installed with apt-get.
$ sudo apt-get install build-essential mecab libmecab-dev mecab-ipadic mecab-ipadic-utf8 python-mecab
$ mecab --version
mecab of 0.996
$ mecab-config --version
0.996
$ mecab
すもももももももものうち
すもも	名詞,一般,*,*,*,*,すもも,スモモ,スモモ
も	助詞,係助詞,*,*,*,*,も,モ,モ
もも	名詞,一般,*,*,*,*,もも,モモ,モモ
も	助詞,係助詞,*,*,*,*,も,モ,モ
もも	名詞,一般,*,*,*,*,もも,モモ,モモ
の	助詞,連体化,*,*,*,*,の,ノ,ノ
うち	名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
EOS
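In MeCab's default output format, each token line is the surface form, a tab, and a comma-separated feature list, and the sentence ends with an EOS line. As a rough sketch, that text could be split into fields like this. `parse_mecab_output` is a hypothetical helper name, and the sample string is typed in by hand rather than produced by calling MeCab. The comma split is naive and assumes no feature value itself contains a comma.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals


def parse_mecab_output(text):
    """Split MeCab-style output into (surface, feature_list) pairs."""
    tokens = []
    for line in text.splitlines():
        # Skip blank lines and the end-of-sentence marker.
        if not line or line == "EOS":
            continue
        surface, _, feature_csv = line.partition("\t")
        tokens.append((surface, feature_csv.split(",")))
    return tokens


# Hand-typed sample in the default output format (not live MeCab output).
sample = (
    "すもも\t名詞,一般,*,*,*,*,すもも,スモモ,スモモ\n"
    "も\t助詞,係助詞,*,*,*,*,も,モ,モ\n"
    "EOS\n"
)

tokens = parse_mecab_output(sample)
for surface, features in tokens:
    # features[0] is the top-level part of speech.
    print(surface, features[0])
```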
Next, build CRF++ from CRF++-0.58.tar.gz. CaboCha appears to reference the CRF++ library, so CRF++ has to be installed before CaboCha. I am not sure whether this is because the *conditional random field* (CRF) that MeCab uses internally is driven through the crf_learn command, or simply because the headers and library are needed at compile time.
$ tar zxvf CRF++-0.58.tar.gz
$ cd CRF++-0.58/
$ ./configure
$ make
$ sudo make install
$ sudo ldconfig
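If the CaboCha build later complains that it cannot find CRF++, it may help to check whether the shared libraries are visible to the dynamic loader after `ldconfig`. The following is a small sketch using the standard library's `ctypes.util.find_library`. The library names `crfpp` and `mecab` are assumptions based on the usual `libcrfpp` and `libmecab` file names, and `describe_library` is an illustrative helper, not part of any of these packages.

```python
from __future__ import print_function

from ctypes.util import find_library


def describe_library(name):
    """Return a one-line status for a shared library name (without 'lib')."""
    found = find_library(name)  # None when the loader cannot locate it
    if found is None:
        return "{0}: not found".format(name)
    return "{0}: found as {1}".format(name, found)


# Assumed library names; adjust them if your build installs different ones.
for name in ("crfpp", "mecab"):
    print(describe_library(name))
```

A "not found" line for `crfpp` usually means the install prefix is not in the loader's search path, or that `sudo ldconfig` has not been run yet.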
Download cabocha-0.69.tar.gz from the official website and build CaboCha 0.69 + cabocha-python.
$ tar zxvf cabocha-0.69.tar.gz
$ cd cabocha-0.69
$ ./configure --with-mecab-config=`which mecab-config` --with-charset=UTF8
$ make
$ sudo make install
$ cabocha --version
cabocha of 0.69
$ cabocha
すもももももももものうち
すももも-D
ももも---D
ももの-D
うち
EOS
$ cd python
$ python setup.py install  # may need sudo, since it installs under /usr
$ python -c "import CaboCha; p=CaboCha.Parser(); print(p.parseToString('すもももももももものうち'))"
すももも-D
ももも---D
ももの-D
うち
EOS
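The tree drawing is easy to read but awkward to process. CaboCha can also emit a lattice format, in which each chunk starts with a header line such as `* 0 1D ...` (chunk id, then the id of the chunk it depends on followed by `D`), followed by MeCab-style token lines. Below is a minimal sketch of turning such text into a list of chunks. `parse_lattice` is a hypothetical helper, and the sample is hand-typed with shortened feature lists and placeholder header fields, not captured from a real run.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals


def parse_lattice(text):
    """Parse lattice-style text into dicts with 'id', 'head', 'surface'."""
    chunks = []
    for line in text.splitlines():
        if not line or line == "EOS":
            continue
        if line.startswith("* "):
            # Chunk header: "* <id> <head>D <rest...>"; head is -1 for the root.
            fields = line.split(" ")
            chunks.append({
                "id": int(fields[1]),
                "head": int(fields[2].rstrip("D")),
                "surface": "",
            })
        elif chunks:
            # Token line: the surface form comes before the first tab.
            chunks[-1]["surface"] += line.split("\t", 1)[0]
    return chunks


# With CaboCha installed, similar text should be obtainable via
#   CaboCha.Parser().parse(sentence).toString(CaboCha.FORMAT_LATTICE)
# The string below is a hand-typed illustration only.
sample = (
    "* 0 1D 0/1 0.000000\n"
    "すもも\t名詞,一般\n"
    "も\t助詞,係助詞\n"
    "* 1 3D 0/1 0.000000\n"
    "もも\t名詞,一般\n"
    "も\t助詞,係助詞\n"
    "* 2 3D 0/1 0.000000\n"
    "もも\t名詞,一般\n"
    "の\t助詞,連体化\n"
    "* 3 -1D 0/0 0.000000\n"
    "うち\t名詞,非自立\n"
    "EOS\n"
)

chunks = parse_lattice(sample)
for chunk in chunks:
    print(chunk["id"], chunk["surface"], "->", chunk["head"])
```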
Note that setup.py is not compatible with Python 3 syntax, so if you want to use it with Python 3 you will need to patch it (search the web for details).
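As a sketch of the kind of patch involved: assuming the incompatibility comes from small helper functions in setup.py that run a shell command and split its output with the Python 2-only `string.split` (an assumption, so check your own copy of the file), the helpers could be rewritten so that they run on both Python 2.7 and Python 3. `run_first_line` and `run_words` are illustrative names, and `echo` stands in for the cabocha-config queries.

```python
from __future__ import print_function

import subprocess


def run_first_line(command):
    """Run a shell command and return the first line of its output as text."""
    output = subprocess.check_output(command, shell=True)
    lines = output.decode("utf-8").splitlines()
    return lines[0] if lines else ""


def run_words(command):
    """Run a shell command and split its first output line on whitespace."""
    # str.split() works on Python 2 and 3, unlike string.split(...).
    return run_first_line(command).split()


# 'echo' is only a stand-in here so the sketch runs without CaboCha installed.
print(run_first_line("echo 0.69"))
print(run_words("echo one two"))
```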
That's all.