I switched to Python 3 to get away from Unicode problems, but mecab-python3 misbehaves and raises a UnicodeDecodeError.
What's worse, the error shows up intermittently: the same test sometimes passes and sometimes fails.
import MeCab

mecab = MeCab.Tagger()
# Walk the linked list of nodes and print each token's surface form
node = mecab.parseToNode("すもももももももものうち")
while node:
    print(node.surface)
    node = node.next
Running this gives:
UnicodeDecodeError Traceback (most recent call last)
<ipython-input-11-1f88b1ec9c08> in <module>()
1 while node:
----> 2 print(node.surface)
3 node = node.next
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 0: invalid start byte
So it fails with the error above.
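The message itself can be reproduced without MeCab, which helps in reading it. A minimal sketch: byte 0xb5 is a UTF-8 continuation byte, so it can never be the first byte of a character. That suggests node.surface was decoded from bytes that are not the start of valid UTF-8 text. Why MeCab ends up handing over such bytes is not something this snippet shows; the variable names here are only for illustration.

```python
# 0xb5 is 0b10110101. The leading bits "10" mark a UTF-8 continuation byte,
# so it is never valid as the first byte of an encoded character.
raw = b"\xb5"

try:
    raw.decode("utf-8")
    decode_error = None
except UnicodeDecodeError as exc:
    # Same exception type and reason as in the traceback above
    decode_error = exc

print(decode_error)
```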
The error would not go away, and the commonly suggested workaround is to call mecab.parse("") once first.
import MeCab

mecab = MeCab.Tagger()
mecab.parse("")  # added: parse an empty string once before parseToNode
node = mecab.parseToNode("すもももももももものうち")
while node:
    print(node.surface)
    node = node.next
Now the output is:
すもも
も
もも
も
もも
の
うち
That did it. I don't know why, but it started working.
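To avoid forgetting the extra call, the workaround can be wrapped in a small helper. This is only a sketch: iter_surfaces is a hypothetical name, not part of mecab-python3, and it assumes nothing about the tagger beyond the parse and parseToNode methods and the surface and next attributes used in this post. Unlike the loop above, it skips nodes with an empty surface, which is what the begin-of-sentence and end-of-sentence nodes have.

```python
def iter_surfaces(tagger, text):
    """Yield the surface form of each token in text.

    Hypothetical helper: `tagger` is expected to behave like MeCab.Tagger,
    i.e. to provide parse() and parseToNode().
    """
    tagger.parse("")  # workaround from this post: parse an empty string first
    node = tagger.parseToNode(text)
    while node:
        if node.surface:  # skip empty surfaces (BOS/EOS nodes)
            yield node.surface
        node = node.next


# Usage, assuming mecab-python3 and a dictionary are installed:
#   import MeCab
#   for surface in iter_surfaces(MeCab.Tagger(), "すもももももももものうち"):
#       print(surface)
```

Note that the helper calls parse("") on every use. That is redundant after the first call but harmless, and it keeps the helper free of state.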