Motivation

I read this article, https://qiita.com/dely13/items/5e949a384161c961d8ce, and tried it myself as practice (~~play~~), but my results came out different. That article is from 2017, so I reran it on the latest data (as of 10:00 on June 29, 2020).

The first half is unchanged

I use the code from @dely13's article as is.

`dely13.py`


import pandas as pd
import requests
import numpy as np
import seaborn as sns
from scipy import stats
import matplotlib.pyplot as plt
%matplotlib inline

url = "http://api.syosetu.com/novelapi/api/"
# Specify the API parameters as a dictionary
# With these settings, the API returns JSON data sorted by overall rating
payload = {'of': 't-gp-gf', 'order': 'hyoka','out':'json'}

st = 1
lim = 500

data = []
while st < 2000:
    payload = {'of': 't-gp-gf-n', 'order': 'hyoka',
          'out':'json','lim':lim,'st':st}
    r = requests.get(url,params=payload)
    x = r.json()
    data.extend(x[1:])
    st = st + lim
df = pd.DataFrame(data)

# Preprocessing: add the 'year' and 'title_len' columns
df['general_firstup'] = pd.to_datetime(df['general_firstup'])
df['year'] = df['general_firstup'].apply(lambda x:x.year)

df['title_len'] = df['title'].apply(len)

This really is the original code unchanged, so please read the original article for details.
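The paging in the loop above can be sketched as a small pure function, which makes it easier to see why exactly 2000 rows come back. The helper name `page_starts` is hypothetical and not part of the original article; it only reproduces the `st` values that the `while` loop generates.

```python
def page_starts(total, lim):
    """Return the 1-based start offsets ('st') needed to cover `total` rows
    when each request returns at most `lim` rows."""
    return list(range(1, total + 1, lim))


starts = page_starts(2000, 500)
print(starts)  # [1, 501, 1001, 1501]
```

Four requests of 500 rows each give the 2000 rows used in the rest of the article. The loop slices every response with `x[1:]`, so it evidently treats the first element of each response as something other than a novel record.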

Main subject

In 2017, the original article said:

"Interesting numbers. The average is 17 characters, the same count as a haiku. In other words, Narou titles were haiku! An old pond, a frog jumps in, the sound of water ..."

That was then, but what about 2020 ...?

df['title_len'].hist()

df['title_len'].describe()

Histogram: df['title_len'].hist()

Data: df['title_len'].describe()

count    2000.000000
mean       24.179500
std        15.528356
min         2.000000
25%        12.000000
50%        21.000000
75%        32.000000
max       100.000000
Name: title_len, dtype: float64

The average has gone up by about 7 characters, lol.

And the really interesting part starts here.

`per_year.py`


title_by_year = df.groupby('year')['title_len'].agg(['mean','count','std']).reset_index()
#plot
title_by_year.plot(x='year',y='mean') 
#data
title_by_year

Plot: title_by_year.plot(x='year', y='mean') (mean = average title length)

Aggregated data: title_by_year

year	mean	count	std
2008	7.500000	2	2.121320
2009	12.428571	7	8.182443
2010	10.882353	17	5.278285
2011	10.180000	50	4.684712
2012	13.294737	95	6.963237
2013	14.115942	138	8.541930
2014	16.065476	168	8.780176
2015	18.218009	211	9.701245
2016	21.577358	265	12.326472
2017	24.476015	271	11.750113
2018	29.425856	263	13.890288
2019	31.327327	333	15.861156
2020	40.483333	180	22.348053

The 2020 figures only cover data up to June 29.
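Some years in the table rest on very few works (2008 has only 2), so their means are noisy. As a rough sketch of how the same aggregation could be filtered by sample size, the example below uses made-up toy data and a hypothetical threshold `min_count`; neither comes from the original article.

```python
import pandas as pd

# Toy data, made up for illustration (not taken from the API)
toy = pd.DataFrame({
    "year": [2019, 2019, 2020, 2020, 2020],
    "title_len": [10, 30, 20, 40, 60],
})

# Same aggregation as in the article: mean, count and std per year
by_year = (
    toy.groupby("year")["title_len"]
       .agg(["mean", "count", "std"])
       .reset_index()
)

# Keep only the years with enough works for the mean to be meaningful.
# min_count is an arbitrary, hypothetical threshold.
min_count = 3
reliable = by_year[by_year["count"] >= min_count]
print(reliable)
```

On the real data, a threshold like this would drop the earliest years while keeping the partial 2020 data, which already has 180 works.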

Conclusion

**"Titles in 2019 will be tanka."** The person who predicted this in the 2017 article is amazing: a tanka has 31 syllables, and the 2019 mean is 31.3 characters. Spot on.
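The tanka comparison can be checked with a few lines of arithmetic. This is only a sketch: the means are copied from the yearly table in this article, and the helper `closest_year` is a hypothetical name introduced for illustration.

```python
# Yearly mean title lengths, copied from the table in this article
mean_by_year = {
    2008: 7.500000, 2009: 12.428571, 2010: 10.882353, 2011: 10.180000,
    2012: 13.294737, 2013: 14.115942, 2014: 16.065476, 2015: 18.218009,
    2016: 21.577358, 2017: 24.476015, 2018: 29.425856, 2019: 31.327327,
    2020: 40.483333,
}

TANKA = 5 + 7 + 5 + 7 + 7  # 31 syllables


def closest_year(target):
    """Return the year whose mean title length is closest to `target`."""
    return min(mean_by_year, key=lambda y: abs(mean_by_year[y] - target))


print(closest_year(TANKA))  # 2019
```

The 2019 mean differs from 31 by only about 0.33 characters, while the neighbouring years are off by 1.6 (2018) and 9.5 (2020).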

Digression 1

While I'm at it, let's look at the maximum and minimum as well.

title_by_year = df.groupby('year')['title_len'].agg(['mean','min','max']).reset_index()
#plot
title_by_year.plot(x='year')
#data
title_by_year

Plot: title_by_year.plot(x='year')

Data: title_by_year

year	mean	min	max
2008	7.500000	6	9
2009	12.428571	5	25
2010	10.882353	2	23
2011	10.180000	4	26
2012	13.294737	3	40
2013	14.115942	3	54
2014	16.065476	4	63
2015	18.218009	3	59
2016	21.577358	2	77
2017	24.476015	4	69
2018	29.425856	5	74
2019	31.327327	4	100
2020	40.483333	4	100

Aren't these 100-character entries actually longer titles that went over a character limit?

`max_100.py`


df[['ncode','title','year','title_len']].set_index('ncode').query('title_len==100')

ncode	title	year	title_len
N7855GF	I was treated as incompetent and was banished from my childhood friend party. I made full use of the gift "Translation"....	2020	100
N6203GE	A blacksmith who was exiled from the dictatorship, in fact, with the protection of "Blacksmith Goddess", suddenly with "Super Legendary" armor full equipment...	2020	100
N0533FS	[Series version] I witnessed the chasing idol walking with a handsome guy, so I bought a part-time job...	2019	100
N4571GF	In the 7th week of the loop, I learned that I was fitted with my believing friends, so I actively partyed on the 8th lap....	2020	100

... So these aren't over 100 characters ...?

When I checked after writing the article, they were exactly 100 characters.

Is there a character limit? If so, these authors are pushing right up against it.
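Because `len` on a Python string counts characters rather than bytes, a Japanese title of 100 characters really has `title_len == 100`. A minimal sketch for picking out titles that sit exactly at a presumed cap follows; the cap of 100 is an assumption inferred from the maximum observed above, and `at_cap` is a hypothetical helper name.

```python
def at_cap(titles, cap=100):
    """Return the titles whose character count equals `cap`.

    cap=100 is an assumption based on the observed maximum, not a documented limit.
    """
    return [t for t in titles if len(t) == cap]


samples = ["あ" * 100, "あ" * 99, "a" * 100, "古池や"]
print(len(at_cap(samples)))           # 2
print(len("古池や"))                   # 3 characters
print(len("古池や".encode("utf-8")))   # 9 bytes in UTF-8
```

If many titles pile up at exactly the same length, that is a hint that a limit exists and that authors are writing right up to it.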

Digression 2

Conversely, I was also curious about short titles.

`mini_len.py`


df.groupby('title_len')['title_len'].agg(['count']).head(9).T

Number of works for each title length. The table got long, so it is transposed and laid out horizontally.

title_len	2	3	4	5	6	7	8	9	10
count	2	8	18	35	41	38	64	75	89

`title2_4.py`


df[['title','year','title_len']].set_index('title').sort_values('title_len').query('title_len<5')

The 4-character titles shown are only an excerpt.

title	year	title_len
letter	2016	2
dawn	2010	2
Bow and sword	2013	3
The reason for water	2012	3
Tomb King!	2013	3
Childhood friend	2016	3
Searcher	2013	3
The shadow of the tower	2012	3
Extermination person	2015	3
Cat and dragon	2013	3
Oblivion saint	2020	4
Ｊ／５３	2012	4
Black Demon King	2011	4
My servant	2019	4
Mob love	2015	4
Wise man's grandson	2015	4
Seventh	2014	4

Even among titles of only a few characters there are famous works. As a former Moba user, I was impressed that there was a "title" among the four-character ones.

Impressions

Even if not to the extent of Moba (now Ebu), is it because anime adaptations of mobile novels are drawing in many newcomers? I was trained on Moba, so I will read something a little hard to read as long as the content is interesting, but even so, these titles are long. That said, I am hooked on this one and this one, both of which have rather long titles. (On Ebu, it's this one. *Stealth marketing)

The Narou API lets you narrow down the search conditions, so I would like to try various things. But what do you do if you want to extract more than 2000 items ...?
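One possible approach, assuming a single query can only be paged up to about 2000 rows, is to split the request into several narrower queries and merge the results. The sketch below is not from the original article: `fetch_all` and `fake_fetch` are hypothetical names, and the `genre` filter key and its values are assumptions for illustration, so check the API documentation before relying on them. The fetch function is passed in so the example runs without network access.

```python
def fetch_all(fetch_page, filters, lim=500, max_st=2000):
    """Collect rows over several filtered queries, de-duplicated by ncode.

    fetch_page(params) must return a list of row dicts (novel records only).
    filters is a list of extra parameter dicts, one per sub-query.
    """
    seen = {}
    for extra in filters:
        st = 1
        while st <= max_st:
            params = {"of": "t-gp-gf-n", "order": "hyoka", "out": "json",
                      "lim": lim, "st": st, **extra}
            rows = fetch_page(params)
            for row in rows:
                # Keep the first occurrence of each work
                seen.setdefault(row["ncode"], row)
            if len(rows) < lim:
                break  # last page of this sub-query
            st += lim
    return list(seen.values())


def fake_fetch(params):
    """Stand-in for the real HTTP call: three rows per genre, one shared ncode."""
    g = params["genre"]
    return [{"ncode": f"N{g}A"}, {"ncode": f"N{g}B"}, {"ncode": "NSHARED"}]


rows = fetch_all(fake_fetch, [{"genre": 101}, {"genre": 102}])
print(len(rows))  # 5
```

With the real API, `fetch_page` could wrap `requests.get(url, params=params).json()[1:]` as in the loop at the top of the article. Note that the merged result is no longer a single global ranking, so it would need to be re-sorted before comparing it with the top-2000 figures here.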

I tried the Narou novel API