[Python] I analyzed the diary of a first-year working adult and judged how positive or negative my working life has been

Introduction

Hello everyone. I have been a working adult since April 2020. When I started working, I also started keeping a **diary**. By writing down what I did and how I felt each day, I figured I could look back later and smile at what I enjoyed and what I struggled with.

Since I was going to keep a diary anyway, I wrote it in a format that is easy to analyze as data. I now have about 80 days of entries, so I would like to share the results of evaluating them quantitatively and qualitatively!

Purpose

By analyzing the contents of the diary quantitatively and qualitatively, I evaluate my satisfaction with working life (= how positive or negative it is) [^1] from multiple perspectives.

[^1]: The definition of "satisfaction with working life" is fuzzy, but here it means whether I can spend my days in a positive frame of mind. If the result is positive, I am enjoying my days. If it is negative, something should be improved, and if the negative score is high, I should consider measures such as changing jobs. Sorry for the rough definition.

What I did

The analysis was conducted from the viewpoints of subjective evaluation and objective evaluation. First, I will introduce the format of the diary.

month start end worktime diary
20200408 900 1730 8.5 <Diary of April 8, 2020>
20200409 900 1730 8.5 <Diary of April 9, 2020>

The diary is saved in csv format for easy data analysis. There are five columns in total.

- month: date (YYYYMMDD)
- start: work start time
- end: work end time
- worktime: working hours
- diary: diary text

Currently, about 80 days' worth of diary is saved as csv.
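As a rough sketch of how a file in this layout could be read, here is one way to do it with the standard `csv` module. The comma delimiter, the header row, and the sample string are my assumptions for illustration; the function name `load_diary` is hypothetical.

```python
import csv
import io

# Hypothetical sample in the five-column layout described above
# (comma-separated with a header row is an assumption).
SAMPLE_CSV = """month,start,end,worktime,diary
20200408,900,1730,8.5,<Diary of April 8 2020>
20200409,900,1730,8.5,<Diary of April 9 2020>
"""

def load_diary(fileobj):
    """Read diary rows and convert the numeric columns."""
    rows = []
    for row in csv.DictReader(fileobj):
        rows.append({
            "month": row["month"],               # date kept as a YYYYMMDD string
            "start": int(row["start"]),          # 900 means 9:00
            "end": int(row["end"]),              # 1730 means 17:30
            "worktime": float(row["worktime"]),  # working hours
            "diary": row["diary"],               # free text
        })
    return rows

entries = load_diary(io.StringIO(SAMPLE_CSV))
print(len(entries), entries[0]["worktime"])
```

For a real file, passing `open(path, newline="", encoding="utf-8")` instead of the `StringIO` object should work the same way.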

Subjective evaluation

month start end worktime diary
20200509 900 1730 8.5 I made a careless mistake today and caused trouble for my seniors, and I felt down. -40 However, I had a video call with my friends at night and cheered up. 60 Let's do our best tomorrow.
20200510 900 1730 8.5 I've done everything on time! 80 years old ~!

A diary sentence may end with a score, as in the diary column above. **Each value is an intuitive rating of how strongly I felt at the moment I wrote the sentence (the "event score").** Event scores range from -100 to 100: the stronger the positive emotion, the closer to 100, and the stronger the negative emotion, the closer to -100. **Because the scores are assigned intuitively, I define the evaluation based on them as the subjective evaluation.**

_What to do_

- Mean and standard deviation of all event scores
- Mean and standard deviation of the daily average event score
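The two statistics above could be computed roughly as follows. This is only a sketch under my own assumptions: scores are pulled out with a simple regular expression (so any other small number written in the diary would also be picked up), the sample entries are made up, and I use the sample standard deviation because the article does not say which variant was used.

```python
import re
import statistics

# Standalone integers of up to three digits, optionally negative.
SCORE_RE = re.compile(r"(?<!\d)-?\d{1,3}(?!\d)")

def event_scores(text):
    """Return every integer in [-100, 100] found in one diary entry."""
    return [int(m) for m in SCORE_RE.findall(text) if -100 <= int(m) <= 100]

# Hypothetical entries keyed by date.
diaries = {
    "20200509": "I made a careless mistake and felt down -40 "
                "But a video call with friends cheered me up 60",
    "20200510": "I finished everything on time! 80",
}

per_day = {day: event_scores(text) for day, text in diaries.items()}

# (1) statistics over every event score in the diary
all_scores = [s for scores in per_day.values() for s in scores]
overall_mean = statistics.mean(all_scores)
overall_std = statistics.stdev(all_scores)

# (2) statistics over the per-day average scores
daily_means = [statistics.mean(s) for s in per_day.values() if s]
daily_mean = statistics.mean(daily_means)
daily_std = statistics.stdev(daily_means)

print(overall_mean, overall_std, daily_mean, daily_std)
```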

Objective evaluation

Analyzing the diary only with intuitively assigned values measures my satisfaction with working life only on my own scale. To get a broader perspective, I also analyze the diary text itself with Python libraries instead of using the scores written in the text.

I call this the objective evaluation: using external libraries, I try to evaluate my satisfaction with working life on someone else's scale (= the external library). For the objective evaluation, I calculated and examined the following.

_What to do_

- Evaluate the diary qualitatively using wordcloud (partly because I just wanted to try it, lol)
- Judge whether each diary entry is positive or negative using sentiment analysis with the COTOHA API
- Investigate whether the diary's positive / negative score correlates with working hours

wordcloud is a library that draws a cool-looking image based on how frequently words appear in a text. The COTOHA API is an excellent service that offers various natural language processing analyses. Anyone can use it after creating an account, so please give it a try! (By the way, I also used it in an article I wrote before.)
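To illustrate what "based on word frequency" means for wordcloud, here is a minimal sketch. It assumes the diary text has already been split into words (Japanese text needs a separate tokenizer, which is not shown), and the token lists are made up. `save_wordcloud` is a hypothetical helper that imports the `wordcloud` package lazily, so the counting part runs without it.

```python
from collections import Counter

def word_frequencies(tokenized_entries):
    """Count how often each word appears across all diary entries."""
    counts = Counter()
    for tokens in tokenized_entries:
        counts.update(tokens)
    return counts

def save_wordcloud(frequencies, path, font_path=None):
    """Render the frequencies as an image (requires the wordcloud package).

    For Japanese text, font_path should point to a font with Japanese glyphs.
    """
    from wordcloud import WordCloud
    wc = WordCloud(font_path=font_path, width=800, height=400,
                   background_color="white")
    wc.generate_from_frequencies(frequencies)
    wc.to_file(path)

# Hypothetical, already-tokenized diary entries.
entries = [
    ["today", "Jotaro", "meeting"],
    ["today", "Polnareff", "review"],
    ["today", "Jotaro"],
]
freq = word_frequencies(entries)
print(freq.most_common(2))
```

The more often a word appears, the larger it is drawn, so a word that appears in almost every entry ends up dominating the image.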

This time, I want to make a positive / negative judgment in my diary, so I will use the sentiment analysis API.

Results

Subjective evaluation

(1) Mean value and standard deviation of event scores for the entire diary

Mean StdDev
6.35 49.31

The histogram of the points is as follows. histfig_all.png

(2) Average daily event score, standard deviation

Mean StdDev
9.99 33.71

The histogram of the points is as follows. histfig_day.png

**In the subjective evaluation, positive emotions seem to be stronger!** But only about 6 points overall and about 10 points on a daily basis, lol. I thought I was reasonably satisfied with my life, but the scores are this low. Then again, it's not as if there are no stressful days.

Objective evaluation

① Qualitatively evaluate the diary using wordcloud

Below is the image output by wordcloud. wordcloud_2020-08-23_11:11:52_.png

**The names of two senior colleagues showed up (and in large type), so I replaced them with Jotaro and Polnareff... (I like JOJO...)** The names are displayed large, aren't they? A large display means the word appears frequently in the text. I must have written my seniors' names in all sorts of situations, both good and bad. **It reminded me once again that human relationships are fundamental at work too. (Most of my stress came from seniors' vague instructions and from guidance policies that differed too much from person to person (dark side).)**

"Today" is so big because most of the diary entries were "Today is ...".

② Make a positive / negative judgment of the diary using sentiment analysis by COTOHA API

With the sentiment analysis API of the COTOHA API, you get the positive / negative state and its reliability score (range 0 <= x <= 1). There are four possible states ("Positive", "Negative", "Neutral", and "Positive / Negative"), but I only handle "Positive" and "Negative" here. The table below shows the number of days and the mean and standard deviation of the score after judging each day as "Positive" or "Negative".

Sentiment Days Mean StdDev
Positive 39 0.18 0.17
Negative 25 0.29 0.20
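A per-label summary like the table above could be produced along these lines. This is a sketch, not the code I actually ran: the `(label, score)` pairs are made-up stand-ins for whatever the sentiment API returned for each day, and `summarize` is a hypothetical helper.

```python
import statistics
from collections import defaultdict

# Hypothetical per-day results: (label, reliability score in [0, 1]).
daily_results = [
    ("Positive", 0.35), ("Negative", 0.60), ("Neutral", 0.10),
    ("Positive", 0.05), ("Negative", 0.28),
]

def summarize(results, keep=("Positive", "Negative")):
    """Group scores by label, ignoring labels outside `keep`."""
    grouped = defaultdict(list)
    for label, score in results:
        if label in keep:
            grouped[label].append(score)
    return {
        label: {
            "days": len(scores),
            "mean": statistics.mean(scores),
            # Sample standard deviation needs at least two values.
            "std": statistics.stdev(scores) if len(scores) > 1 else 0.0,
        }
        for label, scores in grouped.items()
    }

summary = summarize(daily_results)
for label, stats in summary.items():
    print(label, stats["days"], round(stats["mean"], 2), round(stats["std"], 2))
```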

There are more positive days, but negative days are rated more strongly than positive days (although COTOHA's score originally means "reliability"). This is the opposite of the subjective evaluation... **There are many fun days, but when something feels bad, it feels really bad. Isn't that just like real life? lol Surprisingly realistic results.**

③ Investigate whether there is a correlation between positive negatives in the diary and working hours

This is the last one. I was curious whether working hours are correlated with how positive or negative a day was, so I investigated. The worktime column from the diary and the positive / negative score are output as csv in the following format.

Sentiment Score Worktime
Negative -0.60 8.5
Negative -0.28 8.5
Positive 0.35 8.5

I had never computed a correlation coefficient before, so I looked it up as a beginner and used pandas' corr() to compute it. I used the Pearson product-moment correlation coefficient, which is the default. (I don't fully understand it.)

**The correlation result should tell us the following:**

- **Positive correlation: the higher the score, the longer the working time (= the more positive the day, the longer I worked)**
- **Negative correlation: the higher the score, the shorter the working time (= the more positive the day, the less I worked)**

In general, I assume that shorter working hours mean less stress (and a more positive mood), so I would be happy to see a negative correlation. The scatter plot is below. fig_sca2.png

**The slope of the fitted line was -0.11. In my case, at least, this suggests that more positive days tend to have shorter working hours.**
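For anyone else who is new to correlation, here is a small pure-Python sketch of what the Pearson coefficient and the slope of a least-squares line measure. It is not the pandas code I ran; the data points are made up, with Negative days given a negative sign as in the csv format above, and both function names are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation; NaN if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")  # no variation, so correlation is undefined
    return sxy / math.sqrt(sxx * syy)

def least_squares_slope(xs, ys):
    """Slope of the line y = a*x + b that minimizes squared error."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx if sxx else float("nan")

# Hypothetical data: signed sentiment score and working hours per day.
scores = [-0.60, -0.28, 0.35, 0.10]
hours = [9.5, 8.5, 8.0, 8.5]

r = pearson_r(scores, hours)
slope = least_squares_slope(scores, hours)
print(r, slope)
```

Note that if every day has the same working time (for example, always 8.5 hours), the correlation is undefined, which is why the sketch returns NaN in that case.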

Evaluation summary

- In the subjective evaluation (intuitively assigning a positive / negative score to each sentence in the diary and evaluating based on those scores), the mean was slightly on the positive side.
- Objective evaluation
  - In wordcloud, the names of senior colleagues were displayed in large type.
  - In the positive / negative judgment using COTOHA, positive days were more numerous, but negative days had higher scores.
  - The positive / negative score and working hours were negatively correlated (= working hours decrease as the day gets more positive).

Conclusion

Looking at the magnitude of the scores, the subjective and objective evaluations gave opposite results. I thought I was happy with my life (though of course I have complaints), but the system assigned fairly large values to the negative days. Maybe I'm just resilient to stress...? I doubt it, though, lol. Still, the system also found more positive days than negative ones in terms of day count. The histogram of the subjective evaluation looks consistent with that, too.

**In conclusion, my working life has many fun days, but on bad days I feel really bad, lol.** Well, my satisfaction is about 60 points, which at university would be just barely enough to earn the credit.

Impressions

Thank you for reading to the end, despite my clumsy writing! I had been thinking of someday analyzing my working-life diary and turning it into an article, so I'm glad I made it happen! I'm also glad I wrote this article because I got to try many new things by trial and error and apply tools I had used before. It's a good learning experience, so I hope to keep writing articles regularly. I also want to keep up the diary and add more data for analysis...!

Finally, to everyone else in their first year of working life like me! People call us the "Corona generation", but let's all do our best not to be beaten by such a negative label! (Personally, I hate the term "Corona generation", though there's not much I can do about it.)

That's it, thank you for reading!
