When I browse the web, I get rattled by the taunts that fly around, with words like "Masakari" and "Ahat Ahat". When stray bullets like these come my way, how can I protect my mental health?
One solution is to defuse such mentally unhealthy sentences by converting them into "Slow" (Yukkuri) speech.
**Slow translation** http://needtec.sakura.ne.jp/yukkuri_translator/
Suppose someone tells you, "Don't just make shit videos, this de morons." If that is converted to "Yeah, don't make it, this Dote-san.", you won't get angry.
Here we use MeCab for morphological analysis and convert sentences that are bad for mental health into Slow speech, so that they no longer cause discomfort and instead feel soothing.
yukkuri_translator.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import MeCab
import jctconv
import sys
import codecs
reload(sys)
sys.setdefaultencoding('utf-8')
sys.stdout = codecs.getwriter('utf-8')(sys.stdout)

# Words that are replaced outright (surface form -> replacement)
converter = {
    'eat' : 'Mush Mush',
    'eat' : 'Squint',
    'sleep' : 'Suyasu',
    'Sleep' : 'Suyasuyashi',
    'shit' : 'Yes Yes',
    'Stool' : 'Yes Yes',
    'Flight' : 'Yes Yes',
    'urine' : 'Shishi',
    'Piss' : 'Shishi',
    'Sun' : 'Sun',
    'Sanctions' : 'At all',
    'Confectionery' : 'Fair',
    'candy' : 'Fair',
    'sugar' : 'Fair',
    'juice' : 'Fair',
    'Coordination' : 'Coordination',
    'pregnancy' : 'Ninshin'
}

class MarisaTranslator:
    # The part-of-speech tags compared below are English renderings. With the
    # IPA dictionary, MeCab reports them in Japanese (e.g. 名詞 for 'noun',
    # 助詞 for 'Particle', 記号 for 'symbol').
    def __init__(self, user_dic):
        # user_dic: path to the MeCab user dictionary (passed with -u)
        self.mecab = MeCab.Tagger("-u " + user_dic)

    def _check_san(self, n):
        """
        Decide whether to add "san" after node n
        """
        f = n.feature.split(',')
        if f[0] == 'noun':
            if f[1] == 'Proper noun' or f[1] == 'General':
                if n.next:
                    # Check the next word
                    nf = n.next.feature.split(',')
                    if nf[0] in ['noun', 'Auxiliary verb']:
                        # If a noun follows, do not add "san" here
                        return False
                    else:
                        if n.surface.endswith('san'):  # Do not add it if the word already ends with "san"
                            return False
                        if n.surface == 'san' or n.surface == 'sama':  # Do not add it to an honorific itself
                            return False
                        return True
                else:
                    return True
        return False

    def _check_separator(self, n):
        """
        Decide whether to add "、" after node n
        """
        f = n.feature.split(',')
        if f[0] == 'Particle':
            if n.next:
                # Check the next word
                nf = n.next.feature.split(',')
                if nf[0] in ['symbol', 'Particle']:
                    return False
            return True
        return False

    def _get_gobi(self, n):
        """
        Return the sentence ending that replaces node n, or None
        """
        if n.next:
            f_next = n.next.feature.split(',')
            if n.next.surface == '、':
                return None
            # Only a word directly before a symbol or the end of input is an ending
            if f_next[0] == 'BOS/EOS' or f_next[0] == 'symbol':
                f = n.feature.split(',')
                if f[0] in ['Particle', 'noun', 'symbol', 'Interjection']:
                    return None
                if f[5] in ['Command e', 'Continuous form']:
                    return None
                if n.surface in ['Is']:
                    return 'Nanoze'
                else:
                    return n.surface + 'Noze'
        return None

    def translate(self, src):
        n = self.mecab.parseToNode(src)
        text = ''
        while n:
            f = n.feature.split(',')
            if n.surface in converter:
                # Registered words are replaced outright
                text += converter[n.surface]
            elif len(f) > 8:
                gobi = self._get_gobi(n)
                if gobi is not None:
                    text += gobi
                elif f[8] != '*':
                    # Use feature field 8 (katakana) instead of the surface form
                    text += f[8]
                else:
                    text += n.surface
            else:
                text += n.surface
            if self._check_san(n):
                text += 'san'
            elif self._check_separator(n):
                text += '、'
            n = n.next
        # Convert the katakana readings to hiragana
        return jctconv.kata2hira(text.decode('utf-8')).encode('utf-8')

Example of using the above class:
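#!/usr/bin/env python
# -*- coding: utf-8 -*-
from yukkuri_translator import MarisaTranslator

if __name__ == "__main__":
    # 'yukkuri.dic' is the MeCab user dictionary handed to the tagger
    t = MarisaTranslator('yukkuri.dic')
    # Double quotes are needed because the sentence contains an apostrophe
    print t.translate("Don't just make shit videos, this de morons.")
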
Writing every character in hiragana makes the line sound as if it came from a brain made of bean paste.
To do this, first run morphological analysis with MeCab, which yields the reading of each word. The reading is taken from feature field 8 (counting from 0). Since it is written in katakana, use jctconv to convert everything to hiragana.
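The two steps above can be sketched without MeCab or jctconv installed. This is a minimal sketch in Python 3, not the article's code: the feature string is hand-written in the IPA dictionary layout, and `field8` and `kata_to_hira` are hypothetical helper names. `kata_to_hira` is only a stand-in for `jctconv.kata2hira` that relies on the fact that katakana U+30A1 to U+30F6 sit exactly 0x60 above the matching hiragana in Unicode.

```python
# Hand-written feature string in the IPA dictionary layout (illustrative only;
# real MeCab output depends on the dictionary in use).
SAMPLE_FEATURE = '名詞,一般,*,*,*,*,動画,ドウガ,ドーガ'


def field8(feature):
    """Return field 8 of a comma-separated feature string, or None if unavailable."""
    fields = feature.split(',')
    if len(fields) > 8 and fields[8] != '*':
        return fields[8]
    return None


def kata_to_hira(text):
    """Map katakana U+30A1..U+30F6 to hiragana by subtracting 0x60."""
    return ''.join(
        chr(ord(ch) - 0x60) if 0x30A1 <= ord(ch) <= 0x30F6 else ch
        for ch in text
    )


print(kata_to_hira(field8(SAMPLE_FEATURE)))  # prints どーが
```

Characters outside the katakana range, such as the long-vowel mark "ー", pass through unchanged.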
Some words may be misread, but that is **the specification**, because the brain is just bean paste.
By its nature, Slow speech uses a lot of hiragana. To keep it readable, add a "、" after particles wherever possible. See "_check_separator" for the exact conditions.
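The separator rule can be tried on hand-tagged tokens. The following is a minimal sketch in Python 3 that mirrors the conditions in "_check_separator" but works on plain (surface, tag) pairs instead of MeCab nodes. The function names, the sample tokens, and the 'verb' tag are assumptions made for illustration; the other tag names are the English ones used in the listing.

```python
def needs_separator(tokens, i):
    """Return True if a "、" should follow tokens[i], given (surface, tag) pairs."""
    if tokens[i][1] != 'Particle':
        return False
    # No separator when the next token is a symbol or another particle
    if i + 1 < len(tokens) and tokens[i + 1][1] in ('symbol', 'Particle'):
        return False
    return True


def insert_separators(tokens):
    """Join the surfaces, adding "、" after every particle that qualifies."""
    out = []
    for i, (surface, _tag) in enumerate(tokens):
        out.append(surface)
        if needs_separator(tokens, i):
            out.append('、')
    return ''.join(out)


# Hand-tagged sample sentence (an assumption, not real MeCab output)
TOKENS = [('わたし', 'noun'), ('は', 'Particle'), ('ねこ', 'noun'),
          ('に', 'Particle'), ('も', 'Particle'), ('あう', 'verb'), ('。', 'symbol')]

print(insert_separators(TOKENS))  # prints わたしは、ねこにも、あう。
```

Of two particles in a row only the last one gets a separator, which keeps the output from being chopped up too finely.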
Adding "san" after a noun conveys the Slow style. There are exceptions, for example when another noun follows, so see "_check_san" for details.
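The conditions in "_check_san" can be sketched on hand-tagged tokens as well. This is a minimal sketch in Python 3 under stated assumptions: `needs_san` is a hypothetical name, tokens are (surface, tag, subtype) triples rather than MeCab nodes, the sample words are romanized placeholders, and the honorifics are spelled "san" and "sama" as in the prose.

```python
def needs_san(tokens, i):
    """Return True if "san" should follow tokens[i], given (surface, tag, subtype) triples."""
    surface, tag, subtype = tokens[i]
    # Only proper nouns and general nouns are candidates
    if tag != 'noun' or subtype not in ('Proper noun', 'General'):
        return False
    # A candidate noun at the very end always gets "san"
    if i + 1 >= len(tokens):
        return True
    # Inside a run of nouns, or before an auxiliary verb, skip it
    if tokens[i + 1][1] in ('noun', 'Auxiliary verb'):
        return False
    # Do not stack honorifics
    if surface.endswith('san') or surface == 'sama':
        return False
    return True


# Hand-tagged placeholder tokens (an assumption, not real MeCab output)
TOKENS = [
    ('Marisa', 'noun', 'Proper noun'),
    ('wa', 'Particle', '*'),
    ('douga', 'noun', 'General'),
    ('sakusha', 'noun', 'General'),
    ('da', 'Auxiliary verb', '*'),
]

print([needs_san(TOKENS, i) for i in range(len(TOKENS))])
# prints [True, False, False, False, False]
```

Only "Marisa" qualifies here: "douga" is followed by another noun and "sakusha" by an auxiliary verb.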
Slow Marisa has a characteristic way of ending sentences, in many cases with "Noze" or "Nanoze", so I reproduced it.
For example, the input
Managing payments and spending is a matter of course
is converted to
It's natural to manage spending and spending.
See "_get_gobi" for ending conditions.
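The ending rule can be sketched in the same way. This is a minimal sketch in Python 3 that follows the branches of "_get_gobi" on (surface, tag, conjugation form) triples. `sentence_ending`, the sample tokens, and the 'Basic form' value are assumptions made for illustration; the tag names and the word 'Is' are the English renderings used in the listing, and the endings are spelled "Noze" and "Nanoze" as in the text above.

```python
def sentence_ending(tokens, i):
    """Return the text that replaces tokens[i] if it closes a sentence, else None."""
    if i + 1 >= len(tokens):
        return None
    surface, tag, form = tokens[i]
    next_surface, next_tag, _ = tokens[i + 1]
    if next_surface == '、':
        return None
    # Only a word directly before a symbol or the end of input counts as an ending
    if next_tag not in ('BOS/EOS', 'symbol'):
        return None
    if tag in ('Particle', 'noun', 'symbol', 'Interjection'):
        return None
    if form in ('Command e', 'Continuous form'):
        return None
    if surface == 'Is':
        return 'Nanoze'
    return surface + 'Noze'


# Hand-tagged placeholder tokens (an assumption, not real MeCab output)
TOKENS = [
    ('natural', 'noun', '*'),
    ('Is', 'Auxiliary verb', 'Basic form'),
    ('.', 'symbol', '*'),
    ('', 'BOS/EOS', '*'),
]

print([sentence_ending(TOKENS, i) for i in range(len(TOKENS))])
# prints [None, 'Nanoze', None, None]
```

In "translate", a non-None result is written in place of the word's reading, so the word 'Is' at the end of a sentence becomes "Nanoze" and any other qualifying final word gets "Noze" appended.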
Some words are replaced outright. For example, a dirty word such as "shit" is replaced with "Yes Yes" to calm the reader's mind. The replacement follows the entries registered in the converter variable.
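The order in which "translate" picks the text for one word can be sketched as follows. This is a minimal sketch in Python 3: `render` is a hypothetical helper, the three dictionary entries are copied from the converter listing, the readings are made-up placeholders, and the sentence-ending step is left out to keep the example short.

```python
# A few entries copied from the converter listing
CONVERTER = {'shit': 'Yes Yes', 'urine': 'Shishi', 'candy': 'Fair'}


def render(surface, reading=None):
    """Pick the output text for one word: converter entry, then reading, then surface."""
    if surface in CONVERTER:
        # A registered word wins over its reading
        return CONVERTER[surface]
    if reading is not None and reading != '*':
        return reading
    return surface


print(render('shit', 'KUSO'))    # prints Yes Yes
print(render('video', 'DOUGA'))  # prints DOUGA
print(render('video', '*'))      # prints video
```

The lookup is an exact match on the surface form, so every spelling or inflection that should be replaced needs its own entry.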
Using MeCab's morphological analysis, I confirmed that sentences that are bad for mental health can be disguised as cute Slow speech.
The same approach could probably be applied to translate text into the speech of "Slow Reimu", "Slow Youmu", or "Yaruo".
The web application and its source code are linked below.
**Slow translation** http://needtec.sakura.ne.jp/yukkuri_translator/ https://github.com/mima3/yukkuri_translator
That's all.