The installation steps are listed below. They target Google Colab; if you want to run the code locally, adjust them accordingly.
# Install MeCab
!apt install mecab libmecab-dev mecab-ipadic-utf8
!pip install mecab-python3
# Install mecab-ipadic-NEologd
!apt install git make curl xz-utils file
!git clone --depth 1 https://github.com/neologd/mecab-ipadic-neologd.git
!echo yes | mecab-ipadic-neologd/bin/install-mecab-ipadic-neologd -n -a
# Ref: https://qiita.com/Fulltea/items/90f6ebe6dcceaf64eaef
# Ref: https://qiita.com/SUZUKI_Masaya/items/685000d569452585210c
!ln -s /etc/mecabrc /usr/local/etc/mecabrc
# Ref: https://qiita.com/Naritoshi/items/8f55d7d5cce9ce414395
# Libraries for sentiment analysis
!pip install asari oseti pymlask
The sample sentences used as input for sentiment analysis are taken from Aozora Bunko, from the Hans Christian Andersen story "The Puppet-show Man" (translated by Genkuro Yazaki).
list_text = [
    'This person must be the happiest person in the world.',
    'The playhouse was wonderful and the audience was wonderful.',
    'If it was in the Middle Ages, it would probably have been burned at the stake.',
    "When it came to everyone's annoyance, it was as if flies were buzzing in the bottle.",
    'If we humans can come up with these things, we should be able to live longer before they are buried in the earth.'
]
asari
# Simple operation check
from asari.api import Sonar
sonar = Sonar()
res = sonar.ping(text="Too many ads ♡")
res
{'classes': [{'class_name': 'negative', 'confidence': 0.9086981552962491},
  {'class_name': 'positive', 'confidence': 0.0913018447037509}],
 'text': 'Too many ads ♡', 'top_class': 'negative'}
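Since the two class confidences always sum to 1, the result can be collapsed into a single signed polarity score. A minimal sketch using the result dict shown above (this helper is my own, not part of asari's API):

```python
# Collapse asari's two-class output into one signed polarity score.
res = {'classes': [{'class_name': 'negative', 'confidence': 0.9086981552962491},
                   {'class_name': 'positive', 'confidence': 0.0913018447037509}],
       'text': 'Too many ads ♡', 'top_class': 'negative'}

# Map class names to confidences, then take positive minus negative.
conf = {c['class_name']: c['confidence'] for c in res['classes']}
score = conf['positive'] - conf['negative']  # in [-1, 1]; < 0 leans negative
print(round(score, 3))  # → -0.817
```

A single scalar like this is convenient when plotting polarity over many sentences.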
list(map(sonar.ping, list_text))
[{'classes': [{'class_name': 'negative', 'confidence': 0.10382535749585702},
   {'class_name': 'positive', 'confidence': 0.896174642504143}],
  'text': 'This person must be the happiest person in the world.', 'top_class': 'positive'},
 {'classes': [{'class_name': 'negative', 'confidence': 0.035517582235360945},
   {'class_name': 'positive', 'confidence': 0.964482417764639}],
  'text': 'The playhouse was wonderful and the audience was wonderful.', 'top_class': 'positive'},
 {'classes': [{'class_name': 'negative', 'confidence': 0.5815274190768989},
   {'class_name': 'positive', 'confidence': 0.41847258092310113}],
  'text': 'If it was in the Middle Ages, it would probably have been burned at the stake.', 'top_class': 'negative'},
 {'classes': [{'class_name': 'negative', 'confidence': 0.2692695045573754},
   {'class_name': 'positive', 'confidence': 0.7307304954426246}],
  'text': "When it came to everyone's annoyance, it was as if flies were buzzing in the bottle.", 'top_class': 'positive'},
 {'classes': [{'class_name': 'negative', 'confidence': 0.050528495655525495},
   {'class_name': 'positive', 'confidence': 0.9494715043444746}],
  'text': 'If we humans can come up with these things, we should be able to live longer before they are buried in the earth.', 'top_class': 'positive'}]
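When processing a batch like this, usually only the `top_class` labels are needed. A small sketch over two of the result dicts shown above (abbreviated confidences):

```python
# Pull out just the top_class labels from asari's batch results
# (two of the entries shown above, with confidences rounded for brevity).
results = [
    {'classes': [{'class_name': 'negative', 'confidence': 0.1038},
                 {'class_name': 'positive', 'confidence': 0.8962}],
     'text': 'This person must be the happiest person in the world.',
     'top_class': 'positive'},
    {'classes': [{'class_name': 'negative', 'confidence': 0.5815},
                 {'class_name': 'positive', 'confidence': 0.4185}],
     'text': 'If it was in the Middle Ages, it would probably have been burned at the stake.',
     'top_class': 'negative'},
]

labels = [r['top_class'] for r in results]
print(labels)  # → ['positive', 'negative']
```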
The sentence "When it came to everyone's annoyance, it was as if flies were buzzing in the bottle" intuitively reads as negative, but it was judged positive. The judgments for the other examples seem reasonable.
oseti
# Simple operation check
import oseti
analyzer = oseti.Analyzer()
analyzer.analyze("I'm waiting in heaven.")
[1.0]
list(map(analyzer.analyze, list_text))
[[0.0], [1.0], [0], [0], [1.0]]
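oseti returns one polarity score per sentence of each input string, so each text yields a list. To summarize a whole collection, the nested lists can be flattened and averaged; a minimal sketch using the output shown above:

```python
# oseti's per-text score lists from the run above: one inner list per text,
# one score per sentence inside that text.
scores = [[0.0], [1.0], [0], [0], [1.0]]

# Flatten and average to get a single overall polarity.
flat = [s for per_text in scores for s in per_text]
mean = sum(flat) / len(flat)
print(mean)  # → 0.4
```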
Only the second sentence, "The playhouse was wonderful and the audience was wonderful," and the fifth, "If we humans can come up with these things, we should be able to live longer before they are buried in the earth," are judged positive (+1); the other sentences are judged neutral. As expected, the dictionary-based approach seems weak against words that are not in its dictionary.
pymlask
This package has the same author as oseti.
# Simple operation check
import mlask
emotion_analyzer = mlask.MLAsk()
emotion_analyzer.analyze("I don't hate him!(;´Д`)")
# => {'text': "I don't hate him!(;´Д`)",
#     'emotion': defaultdict(<class 'list'>, {'yorokobi': ['Hate*CVS'], 'suki': ['Hate*CVS']}),
#     'orientation': 'POSITIVE',
#     'activation': 'NEUTRAL',
#     'emoticon': ['(;´Д`)'],
#     'intension': 2,
#     'intensifier': {'exclamation': ['!'], 'emotikony': ['´Д`', 'Д`', '´Д', '(;´Д`)']},
#     'representative': ('yorokobi', ['Hate*CVS'])
#    }
{'activation': 'NEUTRAL',
 'emoticon': ['(;´Д`)'],
 'emotion': defaultdict(list, {'suki': ['Hate*CVS'], 'yorokobi': ['Hate*CVS']}),
 'intensifier': {'emotikony': ['´Д`', 'Д`', '´Д', '(;´Д`)'], 'exclamation': ['!']},
 'intension': 2,
 'orientation': 'POSITIVE',
 'representative': ('yorokobi', ['Hate*CVS']),
 'text': "I don't hate him!(;´Д`)"}
# While we're at it, try the NEologd dictionary as well
# Find out where mecab-ipadic-neologd was installed
import subprocess
cmd = 'echo `mecab-config --dicdir`"/mecab-ipadic-neologd"'
path = (subprocess.Popen(cmd, stdout=subprocess.PIPE,
                         shell=True).communicate()[0]).decode('utf-8').strip()  # strip the trailing newline from echo
emotion_analyzer = mlask.MLAsk('-d {0}'.format(path))  # Use the NEologd dictionary
list(map(emotion_analyzer.analyze, list_text))
[{'activation': 'NEUTRAL',
  'emoticon': None,
  'emotion': defaultdict(list, {'yorokobi': ['happiness']}),
  'intensifier': {}, 'intension': 0, 'orientation': 'POSITIVE',
  'representative': ('yorokobi', ['happiness']),
  'text': 'This person must be the happiest person in the world.'},
 {'emotion': None, 'text': 'The playhouse was wonderful and the audience was wonderful.'},
 {'emotion': None, 'text': 'If it was in the Middle Ages, it would probably have been burned at the stake.'},
 {'emotion': None, 'text': "When it came to everyone's annoyance, it was as if flies were buzzing in the bottle."},
 {'emotion': None, 'text': 'If we humans can come up with these things, we should be able to live longer before they are buried in the earth.'}]
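As the output shows, ML-Ask returns `{'emotion': None, ...}` when no emotion word matches, and only then is there no `'orientation'` key. A small helper (my own, hypothetical, not part of pymlask) that maps either shape to a label:

```python
# Map an ML-Ask result dict to a label, falling back to 'NONE'
# for sentences where no emotion word matched ('emotion' is None).
def orientation_label(result):
    if result.get('emotion') is None:
        return 'NONE'
    return result['orientation']

# Two result shapes modeled on the output above.
sample = [
    {'orientation': 'POSITIVE', 'activation': 'NEUTRAL',
     'emotion': {'yorokobi': ['happiness']},
     'text': 'This person must be the happiest person in the world.'},
    {'emotion': None,
     'text': 'The playhouse was wonderful and the audience was wonderful.'},
]
print([orientation_label(r) for r in sample])  # → ['POSITIVE', 'NONE']
```

Guarding on `'emotion'` first avoids a `KeyError` on the unmatched sentences.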
This method, too, judges a sentence positive when a dictionary word ("happiness") appears, but cannot judge sentences whose words are not in the dictionary. Overall, the results are not impressive.
I tried out tools that make it easy to analyze the sentiment of Japanese sentences. Thanks to the authors for publishing these tools.
For serious sentiment analysis with more reasonable results, you would probably need to add processing tailored to the text category you are targeting, or use neural-network techniques (in which case building the dataset is the hard part).
- [[27 picks] Dataset summary that can be used for sentiment analysis of sentences, facial expressions, and voice | Lionbridge AI](https://lionbridge.ai/ja/datasets/15-free-sentiment-analysis-datasets-for-machine-learning/) - links to resources, polarity dictionaries, etc.
- [Natural language processing] How to proceed with sentiment analysis & points that are easy to get stuck on - Qiita
- Story of making and packaging a Japanese Sentiment Analyzer - Ahogrammer
- Sentiment analysis of corporate word-of-mouth data from job-change meetings using deep learning - Qiita
- I tried to analyze the emotions of the whole novel "Weathering with You" ☔️ - Qiita
- oseti, a Python sentiment analysis library using a Japanese evaluation polarity dictionary, has been released - Qiita
- Sentiment analysis of text with ML-Ask - Qiita
- SNOW D18: Japanese Emotional Expression Dictionary - Nagaoka University of Technology Natural Language Processing Laboratory - contains about 2,000 expressions, each tagged with one of 48 emotion categories defined independently by the lab.