Hello, this is sunfish. Data analysis with Python has become popular these days, but it is difficult to master. Wrestling with Python ends up becoming the goal, and the business improvement you originally wanted to achieve gets left behind. To address that problem, I would like to introduce an example of analyzing data with the GUI tool "nehan".
More than half a year has passed since the coronavirus became a social problem. Let's follow the number of occurrences of that word from the tweet data for the past two months.
nehan can import Twitter data directly, and this time I used that feature. I will introduce it later. Starting on July 27, 2020, 3,000 tweets containing "Corona" in the tweet text have been accumulated every day, giving about two months of data. Details of the data are available here (https://sunfish.nehan.io/datasources_v2/3424)
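import pandas as pd

# port_1 is the tweet DataFrame loaded by nehan (not shown here).
# Keep only the timestamp and the tweet text.
port_2 = port_1[['Created_At', 'Text']]
# Convert the timestamp to a date; unparseable values become NaT.
port_3 = port_2.copy()
port_3['Created_At'] = pd.to_datetime(
    port_3['Created_At'], errors='coerce', format=None)
port_3['Created_At'] = port_3['Created_At'].map(lambda x: x.date())
# Drop rows with missing values.
port_4 = port_3.copy()
port_4 = port_4.dropna(subset=None, how='any')
# Keep tweets whose text contains the literal string 'cluster'.
port_5 = port_4[(port_4['Text'].str.contains('cluster', na=False, regex=False))]
# Count tweets per day.
port_9 = port_5.copy()
port_9 = port_9.groupby(['Created_At']).agg(
    {'Created_At': ['size']}).reset_index()
port_9.columns = ['Created_At', 'Line count']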
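To try the same steps outside nehan, here is a minimal sketch that wraps them in a function and runs them on a tiny made-up table. The function name `count_keyword_per_day` and the sample rows are my own assumptions for illustration, not part of nehan's export, and the sketch uses `groupby(...).size()` rather than the `agg` call in the exported code.

```python
import pandas as pd


def count_keyword_per_day(tweets: pd.DataFrame, keyword: str) -> pd.DataFrame:
    """Count, per calendar day, the tweets whose text contains `keyword`."""
    df = tweets[['Created_At', 'Text']].copy()
    # Unparseable timestamps become NaT and are dropped in the next step.
    df['Created_At'] = pd.to_datetime(df['Created_At'], errors='coerce').dt.date
    df = df.dropna(how='any')
    # Literal, case-sensitive substring match (no regular expressions).
    hits = df[df['Text'].str.contains(keyword, na=False, regex=False)]
    return hits.groupby('Created_At').size().reset_index(name='Line count')


# Made-up sample rows, only to show the shape of the input and output.
sample = pd.DataFrame({
    'Created_At': ['2020-08-08 09:00:00', '2020-08-09 10:00:00',
                   '2020-08-09 21:30:00', '2020-08-10 08:15:00', 'not a date'],
    'Text': ['corona cluster reported', 'cluster festival in Shibuya',
             'another cluster today', 'staying home again', 'cluster'],
})
daily = count_keyword_per_day(sample, 'cluster')
print(daily)
```

Passing a different keyword to the same function gives the daily counts for any other word in the tweet text.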
The word "cluster" is widely recognized as a symbol of explosive infection. The spike on 8/9 is probably due to the [Cluster Festival](https://news.yahoo.co.jp/articles/76e47dc2ce6608e018fe37bc92be296e381f76fa?page=1) held in Shibuya.
I also looked at this word, which already feels a little nostalgic.
A new lifestyle is taking root, but the mood of self-restraint does not seem to be completely over. The number of mentions appears to be gradually decreasing.
To get exact results I would really have to do more pre-processing, but here I processed the data simply, both for a rough observation and as an introduction to nehan. The source code above is the code output by nehan's Python export function.
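As one illustration of what extra pre-processing could look like, here is a small sketch. The specific steps (collapsing whitespace, lowercasing, and dropping tweets with identical text) are my own assumptions and not something nehan or the analysis above actually did, and the function name `light_preprocess` is hypothetical.

```python
import pandas as pd


def light_preprocess(tweets: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleanup: normalise the text and drop repeated tweets."""
    df = tweets.copy()
    # Collapse runs of whitespace and lowercase the text so that
    # 'Cluster' and 'cluster' are matched alike.
    df['Text'] = (df['Text']
                  .str.replace(r'\s+', ' ', regex=True)
                  .str.strip()
                  .str.lower())
    # Treat tweets with identical text as one tweet, keeping the first.
    return df.drop_duplicates(subset='Text').reset_index(drop=True)


# Made-up rows for illustration only.
raw = pd.DataFrame({
    'Created_At': ['2020-08-09 10:00:00', '2020-08-09 10:05:00',
                   '2020-08-09 10:10:00'],
    'Text': ['Cluster  found', 'cluster found', 'RT cluster found'],
})
clean = light_preprocess(raw)
print(clean)
```

Whether identical texts should count once or many times depends on what you want to measure, so treat the de-duplication step as optional.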