New Year's cards arrive every year. While tidying my room, I found a pile of New Year's cards from my seniors and juniors, and I wondered whether they could be put to some use.
Then it occurred to me that the cards could show me how others see me. Perhaps New Year's cards could serve as a kind of self-analysis.
Come to think of it, when I write a card to someone, I draw on my impressions of that person and on episodes from the past year. I figured the same would be true of the cards others write to me.
So morphologically analyzing the text of the cards should let me extract the impressions others have of me. I decided to turn the result into a word cloud to visualize those impressions.
First, the data to analyze has to be collected, so I typed the contents of the New Year's cards into Excel.
I left out the formulaic greetings such as "Happy New Year" and "Kotoyoro", and as far as possible entered only the words related to impressions and episodes.
Next, combine the text entered in Excel into a single string.
python
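import xlrd

# Note: xlrd 2.0 and later read only .xls files; for .xlsx, use xlrd 1.2 or a library such as openpyxl.
wb = xlrd.open_workbook('/nenga2020.xlsx')
sheet = wb.sheet_by_name('Sheet1')
# The first column holds the text entered from the cards
col_values = sheet.col_values(0)
# Concatenate every cell into one string
text = ""
for i in col_values:
    text = text + i
print(text)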
At this point, the variable text holds the full text of all the New Year's cards.
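The concatenation step can also be written as a small helper. The following is only a sketch under a few assumptions: the function name combine_cells, the space separator, and the sample values are all made up for illustration and are not part of the original script. It skips empty cells, converts non-text cells with str(), and puts a separator between cards, which may help keep the last word of one card from running into the first word of the next.

```python
def combine_cells(values, sep=" "):
    """Join spreadsheet cell values into one string, skipping empty cells."""
    parts = []
    for value in values:
        # Cells are not always str (numbers, for example), so convert first
        cell = str(value).strip()
        if cell:
            parts.append(cell)
    return sep.join(parts)


# Hypothetical cell values standing in for sheet.col_values(0)
sample_cells = ["いつも面白い先輩", "", "剣道の稽古ありがとうございました", 2020]
combined = combine_cells(sample_cells)
print(combined)
```

In the script above, the same helper could be called as combine_cells(col_values) in place of the loop.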
Finally, run the morphological analysis and create the word cloud.
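import MeCab
import wordcloud, codecs

m = MeCab.Tagger("")
text = text.replace('\r', '')
parsed = m.parse(text)
# Keep only the surface forms whose top-level part of speech is in the list.
# MeCab with a Japanese dictionary such as IPAdic reports tags in Japanese:
# 名詞 = noun, 形容詞 = adjective, 形容動詞 = adjectival noun.
splitted = ' '.join(
    [x.split('\t')[0] for x in parsed.splitlines()[:-1]
     if x.split('\t')[1].split(',')[0] in ["名詞", "形容詞", "形容動詞"]])
# font_path must point to a font that contains Japanese glyphs
wordc = wordcloud.WordCloud(font_path='HGRGM.TTC',
                            background_color='white',
                            contour_color='steelblue',
                            contour_width=2).generate(splitted)
wordc.to_file('nenga2020.png')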
This extracts the impressions written on the New Year's cards and saves them as a word cloud image.
python
# 名詞 = noun, 形容詞 = adjective, 形容動詞 = adjectival noun
splitted = ' '.join(
    [x.split('\t')[0] for x in parsed.splitlines()[:-1]
     if x.split('\t')[1].split(',')[0] in ["名詞", "形容詞", "形容動詞"]])
Note that the parts of speech are narrowed down to nouns, adjectives, and adjectival nouns (na-adjectives), because the goal is to extract impressions of me.
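To make the filtering step easier to follow, here is a sketch that applies the same idea to a hand-written sample instead of a live MeCab call. The sample text and the function name filter_by_pos are assumptions made for illustration; the sample only imitates the tab-separated layout of MeCab's default output with Japanese part-of-speech tags (as with IPAdic) and is not the result of an actual run.

```python
# Hand-written sample in the layout of MeCab's default output:
# surface form, a tab, then comma-separated features. Illustrative only.
SAMPLE_PARSED = (
    "面白い\t形容詞,自立,*,*,形容詞・イ段,基本形,面白い,オモシロイ,オモシロイ\n"
    "先輩\t名詞,一般,*,*,*,*,先輩,センパイ,センパイ\n"
    "です\t助動詞,*,*,*,特殊・デス,基本形,です,デス,デス\n"
    "EOS\n"
)


def filter_by_pos(parsed, keep):
    """Return surface forms whose top-level part of speech is in `keep`."""
    words = []
    for line in parsed.splitlines():
        # Skip the EOS marker and any line without a feature column
        if "\t" not in line:
            continue
        surface, features = line.split("\t", 1)
        if features.split(",")[0] in keep:
            words.append(surface)
    return words


kept = filter_by_pos(SAMPLE_PARSED, {"名詞", "形容詞"})
print(" ".join(kept))
```

Checking for the tab before splitting is a small design choice: it avoids an IndexError on lines such as EOS that carry no feature column.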
Whole code
import xlrd
import MeCab
import wordcloud, codecs

# Read the card text from the first column of the Excel sheet.
# Note: xlrd 2.0 and later read only .xls files.
wb = xlrd.open_workbook('/nenga2020.xlsx')
sheet = wb.sheet_by_name('Sheet1')
col_values = sheet.col_values(0)

# Concatenate every cell into one string
text = ""
for i in col_values:
    text = text + i

# Morphological analysis
m = MeCab.Tagger("")
text = text.replace('\r', '')
parsed = m.parse(text)

# Keep nouns (名詞), adjectives (形容詞), and adjectival nouns (形容動詞)
splitted = ' '.join(
    [x.split('\t')[0] for x in parsed.splitlines()[:-1]
     if x.split('\t')[1].split(',')[0] in ["名詞", "形容詞", "形容動詞"]])

# Build the word cloud and save it as an image
wordc = wordcloud.WordCloud(font_path='HGRGM.TTC',
                            background_color='white',
                            contour_color='steelblue',
                            contour_width=2).generate(splitted)
wordc.to_file('nenga2020.png')
And here is the completed word cloud for self-analysis.
Many of my New Year's cards came from the kendo club, so a lot of kendo-related words appear.
Words such as "funny", "competent", and "respect" can be read as the impressions others have of me.
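Since a word cloud draws frequent words larger, a plain frequency count can serve as a cross-check on what the image suggests. The following is a minimal sketch using collections.Counter; the sample string is made up for illustration and only has the same space-separated shape as the splitted variable.

```python
from collections import Counter

# Hypothetical space-separated words, in the same shape as `splitted`
sample_words = "剣道 面白い 剣道 尊敬 面白い 剣道"

# Count how often each word occurs and list the most frequent ones
counts = Counter(sample_words.split())
for word, count in counts.most_common(3):
    print(word, count)
```

With the real data, Counter(splitted.split()) would give the corresponding counts.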