Sturges' formula gives a guideline for the number of classes to use when creating a frequency distribution table or histogram. It is calculated as follows, where N is the number of samples and k is the number of classes.
$k = 1 + \log_2 N$
As an example, suppose the data has 40 samples (N = 40). The number of classes for the histogram is then:
$1 + \log_2 40 = 6.3219280948874 \approx 6$
Rounding to the nearest integer gives 6 classes.
The number of classes obtained from Sturges' formula is only a **guideline**. (There is no single correct answer for the number of classes in a frequency distribution table or histogram.)
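One way to see why the result is only a guideline: the raw value of the formula is rarely an integer, so the class count depends on how it is rounded. The following is a minimal sketch, not part of the original post, that compares rounding to the nearest integer (used in this article) with rounding up (a variant that is also commonly used). The helper name `sturges_raw` and the sample sizes are assumptions chosen for illustration.

```python
import math


def sturges_raw(n):
    """Unrounded value of Sturges' formula (hypothetical helper)."""
    return 1 + math.log2(n)


# Sample sizes picked arbitrarily for illustration.
for n in (10, 40, 100, 1000):
    raw = sturges_raw(n)
    # Columns: N, raw value, rounded to nearest, rounded up.
    print(n, f"{raw:.3f}", round(raw), math.ceil(raw))
```

For N = 40 the two conventions give 6 and 7 classes respectively. Both are reasonable starting points to adjust after looking at the data.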
sturges.py
import math


def sturges_rule(n):
    """
    Sturges' formula: number of classes for n samples.
    """
    # Round 1 + log2(n) to the nearest integer.
    return round(1 + math.log2(n))
Check the function against the example above.
>>> from sturges import sturges_rule
>>> sturges_rule(40)
6
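Once the number of classes is known, it can be used to build the frequency distribution itself. The sketch below is one possible approach using only the standard library and equal-width classes. The `frequency_table` function and the sample data are hypothetical and not from the original post, and the sketch assumes the data contains at least two distinct values.

```python
import math


def sturges_rule(n):
    """Number of classes suggested by Sturges' formula."""
    return round(1 + math.log2(n))


def frequency_table(data):
    """Count values in equal-width classes (hypothetical helper).

    Assumes data has at least two distinct values, so the class width is non-zero.
    """
    k = sturges_rule(len(data))
    lo, hi = min(data), max(data)
    width = (hi - lo) / k
    counts = [0] * k
    for x in data:
        # Clamp the index so the maximum value falls in the last class.
        idx = min(int((x - lo) / width), k - 1)
        counts[idx] += 1
    # k classes are delimited by k + 1 edges.
    edges = [lo + i * width for i in range(k + 1)]
    return edges, counts


# Made-up data: 40 evenly spaced values.
edges, counts = frequency_table(list(range(40)))
for left, right, c in zip(edges, edges[1:], counts):
    print(f"{left:5.1f} - {right:5.1f}: {c}")
```

Equal-width classes are only one choice. If the resulting table looks too coarse or too sparse, the class count can be adjusted by hand.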