This article makes the morphological analysis engine "MeCab" usable from Python 3 installed via pyenv on a Mac.
The content is basically the same as the existing articles summarized here, but the patch has since been applied to the official GitHub repository, so the manual patching described in the original articles ~~is reduced to modifying a single line of the Python binding code.~~ (revised 2016/3/2) is no longer needed at all.
I have only collected information from the original articles, but I retried the installation several times, so I am recording the steps here.
Official site: http://mecab.googlecode.com/svn/trunk/mecab/doc/index.html
Repository: https://github.com/taku910/mecab
Original articles:
- Make MeCab available from Python 3
- MeCab with Python3
- Use MeCab from Python3 (follow-up) (article by the person who submitted the patch as a pull request)
git clone https://github.com/taku910/mecab.git
cd mecab/mecab
./configure --enable-utf8-only
make
make check
sudo make install
After installation, the following MeCab files are in place.
/usr/local/etc/mecabrc
/usr/local/bin/mecab
/usr/local/bin/mecab-config
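As a quick sanity check after `make install`, something like the following sketch can confirm that the two commands are reachable. It only looks them up on `PATH` with the standard library; the helper name `find_mecab_tools` is my own and not part of MeCab.

```python
import shutil


def find_mecab_tools(names=("mecab", "mecab-config")):
    """Return {tool name: absolute path or None} for each tool looked up on PATH.

    Hypothetical helper for illustration; it does not run the tools.
    """
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_mecab_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If either command is reported as not found, `/usr/local/bin` is probably missing from `PATH`.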
~~If you start mecab from the console and then enter Japanese, the morphological analysis result is displayed.~~ _Postscript (2016/3/2): In the first edition I explained the mecab command here, but it cannot be used until the dictionary is installed._
~~Download the "IPA Dictionary" from the official website.~~ ~~http://taku910.github.io/mecab/#install~~ ~~http://taku910.github.io/mecab/#download~~
tar zxfv mecab-ipadic-2.7.0-20070801.tar.gz
cd mecab-ipadic-2.7.0-20070801
./configure --with-charset=utf8
make
sudo make install
_Addendum 2 (2016/3/2): Please skip the step above as well._
The IPA dictionary is included in the git repository, so there is no need to download it separately.
cd ../mecab-ipadic
./configure --with-charset=utf8
make
sudo make install
_Addendum 2 (2016/3/2): The commands above are the current procedure._
At this point, start mecab from the console and enter some Japanese text, and the morphological analysis result is displayed.
$ mecab
MeCab is free software
MeCab noun,Proper noun,Organization,*,*,*,*
Is a particle,Particle,*,*,*,*,Is,C,Wow
Free noun,General,*,*,*,*,free,free,free
Software noun,General,*,*,*,*,software,software,software
Auxiliary verb,*,*,*,Special Death,Uninflected word,is,death,death
EOS
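In MeCab's default output, each line holds the surface form, a tab, and then a comma-separated feature list, and the sentence ends with `EOS`. A minimal sketch of splitting such a line in Python follows. The function name `parse_mecab_line` and the sample line are my own illustration, and the plain `split(",")` assumes that no feature contains a quoted comma.

```python
def parse_mecab_line(line):
    """Split one line of MeCab's default output into (surface, features).

    Returns None for the EOS marker or an empty line.
    Assumes features contain no quoted commas (a simplification).
    """
    line = line.rstrip("\n")
    if not line or line == "EOS":
        return None
    surface, _, feature_str = line.partition("\t")
    return surface, feature_str.split(",")


# Sample line in the IPA dictionary style (illustrative input).
sample = "です\t助動詞,*,*,*,特殊・デス,基本形,です,デス,デス"
surface, features = parse_mecab_line(sample)
print(surface, features[0])
```

With the IPA dictionary the first feature is the part of speech, so this is often enough to filter for nouns or verbs.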
_Postscript (2016/3/2): There is an easier method than the one in the first edition. Please skip the next few steps._
~~Next, prepare to use MeCab from Python. Bindings for various languages are provided in the directory that was git cloned earlier, so move to the python directory.~~
cd [MeCab git cloned directory]
cd mecab/mecab/python
# Addendum (2016/3/2): Please skip this step
~~Now one line of setup.py needs to be modified. Be careful not to erase the tab before return.~~
~~See the article "MeCab with Python 3".~~
vi setup.py
def cmd2(str):
	return string.split (cmd1(str))
Change this to:
def cmd2(str):
	return cmd1(str).split()
# Addendum (2016/3/2): Please skip this step as well
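For reference, the one-line change was needed because the module-level function `string.split` from Python 2 does not exist in Python 3, where only the `str.split` method remains. A small sketch that shows the difference; the helper `split_flags` and the flag string are made up for illustration and are not taken from setup.py.

```python
import string

# Python 3 has no module-level string.split, so the old call raises AttributeError.
print(hasattr(string, "split"))


def split_flags(text):
    """Split a whitespace-separated string with the str method, as in the fix."""
    return text.split()


# Hypothetical compiler flags, similar in shape to what a config command prints.
print(split_flags("-I/usr/local/include -L/usr/local/lib"))
```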
~~After making the fix, install it.~~
python setup.py build
sudo python setup.py install
# Addendum (2016/3/2): Please skip this step as well
_Postscript (2016/3/2): There is a simpler procedure. As described in the article below, MeCab can be used from Python 3 after a single pip command._
pip install mecab-python3
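With pyenv it is easy to install the package into one Python version and then run another. As a rough check, the sketch below asks the current interpreter whether a module named `MeCab` can be found, without importing it. The function name `mecab_binding_available` is my own.

```python
import importlib.util
import sys


def mecab_binding_available(module_name="MeCab"):
    """Return True if the current interpreter can locate the given module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("MeCab importable:", mecab_binding_available())
```

If this prints False, check that `pip` and `python` point to the same pyenv version.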
Try running the Python sample from the official website. The original code is for Python 2, so the only change is that print is called as a function.
import sys
import MeCab
m = MeCab.Tagger("-Ochasen")
# The input must be Japanese text; this is the sentence used in the official sample.
print(m.parse("今日もしないとね"))
Execution result
今日	キョウ	今日	名詞-副詞可能
も	モ	も	助詞-係助詞
し	シ	する	動詞-自立	サ変・スル	未然形
ない	ナイ	ない	助動詞	特殊・ナイ	基本形
と	ト	と	助詞-接続助詞
ね	ネ	ね	助詞-終助詞
EOS
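The `-Ochasen` output is tab-separated: surface form, reading, base form, part of speech, and then the conjugation type and form when the word conjugates. A minimal sketch of turning that text into dictionaries follows. The function name `parse_chasen` and the field names are my own choice, and the sample string is typed in by hand so that the code runs without MeCab installed.

```python
def parse_chasen(output):
    """Parse ChaSen-style MeCab output into a list of dicts, one per token."""
    keys = ("surface", "reading", "base", "pos", "conj_type", "conj_form")
    tokens = []
    for line in output.splitlines():
        if not line.strip() or line == "EOS":
            continue
        fields = line.split("\t")
        # Pad missing trailing columns (non-conjugating words) with empty strings.
        fields += [""] * (len(keys) - len(fields))
        tokens.append(dict(zip(keys, fields[: len(keys)])))
    return tokens


# Hand-written sample in the same layout as the execution result.
sample = "し\tシ\tする\t動詞-自立\tサ変・スル\t未然形\nね\tネ\tね\t助詞-終助詞\nEOS\n"
for token in parse_chasen(sample):
    print(token["surface"], token["base"], token["pos"])
```

In real use, the string returned by `m.parse(...)` would be passed in place of `sample`.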
Please let me know if any of the steps are wrong.