I found an interesting parameter while reading the Narou Novel API documentation, so I will introduce it and analyze the data it returns.

Conversation rate

|Parameters|value|Description|
|:--|:--|:--|
|kaiwaritu|int string|The conversation rate of the novels to extract, specified in % units. To specify a range, separate the minimum and maximum with a hyphen (-).|
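As a rough illustration of the range syntax described in the table, the sketch below builds a `kaiwaritu` value for a request. The helper name `range_param` and the bounds 30 and 50 are my own inventions, not part of the API; only the hyphen-separated format comes from the description.

```python
def range_param(minimum, maximum):
    """Join a minimum and maximum into the hyphen-separated range form."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return f"{minimum}-{maximum}"


# Hypothetical query: novels whose conversation rate is between 30% and 50%.
payload = {"out": "json", "kaiwaritu": range_param(30, 50)}
print(payload)
```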

I see. Conversation rate... presumably it is the share of the text that is dialogue rather than narration.

Let's try it right away.

First, load the libraries and set things up for fetching.

`before_load.py`


import pandas as pd
import requests
import numpy as np
import seaborn as sns
from scipy import stats
import matplotlib.pyplot as plt
# Jupyter/IPython magic; remove this line when running as a plain script
%matplotlib inline

url = "http://api.syosetu.com/novelapi/api/"

`narou_load.py`


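st = 1      # 1-based position of the first record to fetch
lim = 500   # number of records per request

data = []
while st < 2000:
    payload = {'of': 't-gp-gf-n-ka', 'order': 'hyoka',
               'out': 'json', 'lim': lim, 'st': st}
    r = requests.get(url, params=payload)
    r.raise_for_status()  # stop early on an HTTP error
    x = r.json()
    data.extend(x[1:])  # the first element is the hit count (allcount), not a novel
    st = st + lim
df = pd.DataFrame(data)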

df.head()
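The loop above can be checked without touching the network. This is only a sketch: `fake_page` is a hypothetical stand-in for `r.json()`, and it assumes that each response starts with one summary entry followed by the novel records, which is what the `x[1:]` slice relies on.

```python
def page_starts(first=1, lim=500, stop=2000):
    """Yield the `st` values the while-loop above would request."""
    st = first
    while st < stop:
        yield st
        st += lim


def fake_page(st, lim):
    """Hypothetical stand-in for r.json(): one summary entry, then records."""
    records = [{"title": f"novel {i}"} for i in range(st, st + lim)]
    return [{"allcount": 999999}] + records  # made-up count


data = []
for st in page_starts():
    x = fake_page(st, 500)
    data.extend(x[1:])  # drop the leading summary entry, keep the records

print(list(page_starts()))  # [1, 501, 1001, 1501]
print(len(data))            # 2000
```

Four requests of 500 records each give the 2000 rows that show up in the `describe()` output later on.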

Adding `ka` to the `of` value in `payload` (`'of': 't-gp-gf-n'`) loads the conversation rate. (It is already added in the code above.) Here is the data that comes out:

title	kaiwaritu(%)
When I was reincarnated, it was slime	14
The strongest in the world in a common profession	40
Wandering in another world with ridiculous skill	36
Mushoku Tensei-If you go to another world, you will get serious-	22
Another world fantasy song starting from Death March (web version)	38

I see. These look fairly high (just a naive impression). However, I don't know what counts as high in the first place, so let's try `describe()`.

	kaiwaritu
count	2000.00000
mean	38.00800
std	10.66831
min	0.00000
25%	31.00000
50%	38.00000
75%	45.00000
max	96.00000

I see. With a mean of 38%, are these values about typical? Or is it that these novels are so long that the rate ends up close to the average?

Let's narrow down the number of characters a little.

Reading time

Instead of specifying the number of characters directly, I will deliberately use the reading time. But what is the reading time?

|Parameters|value|Description|
|:--|:--|:--|
|time|int string|You can specify the reading time of the novels to extract. The reading time is the number of characters in the novel divided by 500. To specify a range, separate the minimum and maximum values with a hyphen (-).|

As you can see, the value is proportional to the number of characters, so it works just as well; the numbers are simply smaller.
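To get a feel for the scale, here is a small sketch that converts a reading time back into a range of character counts. The table only says the time is the character count divided by 500; that the result is rounded up to a whole number is my assumption, and `char_range` is a made-up helper.

```python
DIVISOR = 500  # divisor given in the API description


def char_range(time):
    """Return (lowest, highest) character count for a reading time,
    ASSUMING the API rounds chars / 500 up to the next integer."""
    highest = time * DIVISOR
    lowest = (time - 1) * DIVISOR + 1
    return lowest, highest


print(char_range(11))  # (5001, 5500)
```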

Add `ti` to `of` in `payload` and reload right away.
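The reload itself is not shown, so here is a minimal sketch of what the changed payload might look like. The `of_value` helper is hypothetical; the field codes are the ones already used in this article plus `ti`.

```python
def of_value(*codes):
    """Join output-field codes with hyphens, the form used for `of` above."""
    return "-".join(codes)


payload = {
    "of": of_value("t", "gp", "gf", "n", "ka", "ti"),  # ka = kaiwaritu, ti = time
    "order": "hyoka",
    "out": "json",
    "lim": 500,
    "st": 1,
}
print(payload["of"])  # t-gp-gf-n-ka-ti
```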

While we're at it, let's run `describe()` on `time`.

	time
count	2000.000000
mean	1395.985500
std	1823.680635
min	11.000000
25%	434.750000
50%	889.500000
75%	1608.250000
max	26130.000000

So the shortest novel here has at least 5,001 characters. (...Surely the max isn't "Summoner goes"?)

df[['title','time']].sort_values('time').tail()

title	time
Magi Craft Meister	14868
Boundary Labyrinth and the Wizard of the Other World	16410
Cooking with Wild Game	17653
Summoner goes	25536
legend	26130

**Nope.**

Relationship between reading time (number of characters) and conversation rate

`doku_kai.py`


#Quartile in time
df['part']=pd.qcut(df.time,4,labels=['D','C','B','A'])
#Average for each part
df.groupby('part').agg({'kaiwaritu':['mean']})

The number of characters increases in the order D < C < B < A.

part	kaiwaritu(average:％)
D	36.990
C	38.180
B	38.322
A	38.540

This was a surprise. The conversation rate hardly changes at all, whether the story is long or short.
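For anyone unsure how `qcut` hands out the labels, here is a toy example. The numbers are made up, not taken from the API; the point is only that the first label goes to the lowest quartile, so D is the shortest group and A the longest.

```python
import pandas as pd

# Toy data (made up for illustration, not from the API).
toy = pd.DataFrame({
    "time": [10, 20, 30, 40, 50, 60, 70, 80],
    "kaiwaritu": [30, 32, 35, 37, 40, 42, 45, 47],
})

# Split into four equal-sized bins by time; labels are assigned from the
# lowest bin upward, so D = shortest quartile and A = longest.
toy["part"] = pd.qcut(toy["time"], 4, labels=["D", "C", "B", "A"])

# Mean conversation rate per quartile.
means = toy.groupby("part", observed=False)["kaiwaritu"].mean()
print(means)
```

In this toy data the mean rises from D to A because I built it that way; the real data above shows almost no such trend.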

Writing style

I was disappointed, so I tried another parameter: the writing style. It seems to still be experimental, and for some works the value is not clearly determined (the notion is ambiguous to begin with). Since it cannot be requested through `of`, I will read the data into separate data frames instead.

|Parameters|value|Description|
|:--|:--|:--|
|buntai|int string|You can specify the writing style. Separate values with a hyphen (-) to perform an OR search. 1: works with no indentation and many consecutive line breaks. 2: works with no indentation but an average number of line breaks. 4: works with proper indentation but many consecutive line breaks. 6: works with proper indentation and an average number of line breaks.|

First, load them into df1, df2, df4, and df6, one per style code.
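Since `buntai` can only be used as a filter, one request per style code is needed. Below is a minimal sketch of the payloads, assuming the same ordering and output fields as before. `buntai_payload` is a hypothetical helper, and the actual fetching is left out.

```python
STYLES = (1, 2, 4, 6)  # buntai codes from the table above


def buntai_payload(style, st=1, lim=500):
    """Build request parameters for one writing-style code."""
    return {
        "of": "t-gp-gf-n-ka",
        "order": "hyoka",
        "out": "json",
        "lim": lim,
        "st": st,
        "buntai": style,
    }


# One payload per data frame name used in this article.
payloads = {f"df{style}": buntai_payload(style) for style in STYLES}
print(list(payloads))  # ['df1', 'df2', 'df4', 'df6']
```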

|df1|df2|df4|df6|
|:--|:--|:--|:--|
|The strongest sage of disqualification crest-The strongest sage in the world has reincarnated to become stronger-|Isekai Shokudo|The strongest in the world in a common profession|When I was reincarnated, it was slime|
|Duke's daughter's taste|Someone please explain this situation|Mushoku Tensei-I'm serious when I go to another world-|Wandering in another world with ridiculous skill|
|Another world life of a reincarnated sage-I got a second profession and became the strongest in the world-|Hariko Maiden|Another world fantasy song starting from Death March (web version)|I said that the ability is an average value!|
|I have reincarnated as a villain daughter who has only the ruin flag of the maiden game ...|I will quietly disappear|Re: Life in a different world starting from zero|It's a spider, but what is it?|
|Live dungeon!|Mid-career (middle-aged) office worker relaxing different world industrial revolution|I want to be a powerful person in the shadow![Web version]|The magical power of the saint is versatile|

Some of these classifications puzzle me, but I'll let that pass here.

	df1	df2	df4	df6
count	500.000000	500.000000	500.00000	500.000000
mean	36.506000	35.246000	38.74200	37.668000
std	11.489211	14.927396	9.70091	13.106691
min	1.000000	0.000000	6.00000	0.000000
25%	28.000000	25.000000	32.75000	30.000000
50%	36.000000	35.000000	39.00000	38.000000
75%	44.000000	44.250000	45.00000	46.000000
max	70.000000	98.000000	71.00000	96.000000

Looking at these results, there is no big difference, though df2 is somewhat low overall and df6 somewhat high. Each frame holds 500 entries because the first dataset had 2000 entries in total; when I loaded 2000 entries per frame instead, the df2 mean dropped further to 34%.
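A side-by-side summary like the one above could be assembled roughly as follows. This is a sketch with made-up toy frames standing in for df1 and df2, not the code actually used for the table.

```python
import pandas as pd

# Toy stand-ins for the per-style frames (made-up values, not API data).
frames = {
    "df1": pd.DataFrame({"kaiwaritu": [10, 20, 30]}),
    "df2": pd.DataFrame({"kaiwaritu": [20, 40, 60]}),
}

# describe() each kaiwaritu column, then line the results up as columns;
# the dict keys become the column names.
summary = pd.concat(
    {name: frame["kaiwaritu"].describe() for name, frame in frames.items()},
    axis=1,
)
print(summary)
```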

Judging from this, the conversation rate does not seem to be related to the writing style. ~~I wonder if it's the genre~~

Impressions

The analysis did not turn up much, but I think it was good practice for future work. If I come up with an interesting data analysis, I would like to try it. Reading this back, I was surprised at how low the conversation rate of Tensura ("When I was reincarnated, it was slime") is. Is it because so much of the talking happens inside the protagonist's head?

I tried the Narou Novel API 2