Statistics for Programmers-Table of Contents
When you calculate an average, the usual method is to add up all the values and divide by the number of elements. That method is called the arithmetic mean. There are several other averages, such as the weighted average, the geometric mean, and the harmonic mean, and the right one depends on the purpose.
Types of average value:
-- Arithmetic mean
-- Weighted average
-- Geometric mean
-- Harmonic mean
Below, we explain how to calculate each of them.
The arithmetic mean is the familiar average: the sum of the data divided by the number of data points.
\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n}
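As a minimal sketch of the formula above, the arithmetic mean can be written in a few lines of Python. The function name `arithmetic_mean` and the sample scores are assumptions made for illustration; they do not come from the original article.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("arithmetic_mean requires at least one value")
    return sum(values) / len(values)


# Hypothetical sample data, chosen only for illustration.
scores = [60, 70, 80, 90]
print(arithmetic_mean(scores))  # 75.0
```

The standard library's `statistics.mean` does the same job; the hand-written version is shown only to mirror the formula.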
To calculate the average from a frequency distribution table, use the class value (representative value) and the frequency of each class. Define the symbols as follows:
-- k: Number of classes
-- m: Class value
-- f: Frequency
The average of the frequency distribution table can then be calculated with the following formula.
\frac{\sum_{i=1}^{k} m_i f_i}{\sum_{i=1}^{k} f_i} = \frac{m_1 f_1 + m_2 f_2 + \cdots + m_k f_k}{f_1 + f_2 + \cdots + f_k}
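The frequency-table formula can be sketched in Python as follows. The function name `mean_from_frequency_table` and the three-class table are hypothetical, invented only to show the calculation.

```python
def mean_from_frequency_table(class_values, frequencies):
    """Return sum(m_i * f_i) / sum(f_i) for a frequency distribution table."""
    if len(class_values) != len(frequencies):
        raise ValueError("class_values and frequencies must have the same length")
    total_frequency = sum(frequencies)
    if total_frequency == 0:
        raise ValueError("the total frequency must not be zero")
    weighted_sum = sum(m * f for m, f in zip(class_values, frequencies))
    return weighted_sum / total_frequency


# Hypothetical table with k = 3 classes.
class_values = [5, 15, 25]  # m: representative value of each class
frequencies = [2, 5, 3]     # f: number of observations in each class
print(mean_from_frequency_table(class_values, frequencies))  # 16.0
```

Because only the class values are used, the result is an approximation of the mean of the underlying raw data.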
The weighted average calculates the average by taking the weight of each data value into account. For example, if you want the overall average from the average of 100 data points and the average of 10 data points, simply taking the arithmetic mean of the two averages does not give the correct value. In such a case, each average must be weighted by the number of data points behind it.
Define the symbols as follows:
-- x: Data
-- w: Data weight
The weighted average can then be calculated with the following formula.
\frac{\sum_{i=1}^{n} x_i w_i}{\sum_{i=1}^{n} w_i} = \frac{x_1 w_1 + x_2 w_2 + \cdots + x_n w_n}{w_1 + w_2 + \cdots + w_n}
On one test, the average score of Group A was 60 points and the average score of Group B was 50 points. If Group A has 20 people and Group B has 40 people, the average score of the two groups combined can be calculated as follows.
53.3 \simeq \frac{60\times20 + 50\times40}{20 + 40}
Therefore, the average score of Group A and Group B combined is 53.3 points (truncated to one decimal place).
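The Group A and Group B example can be reproduced with a short Python sketch. The function name `weighted_mean` and the variable names are assumptions for illustration; the scores and group sizes are the ones given above.

```python
def weighted_mean(values, weights):
    """Return sum(x_i * w_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(x * w for x, w in zip(values, weights)) / total_weight


group_means = [60, 50]  # average scores of Group A and Group B
group_sizes = [20, 40]  # number of people in each group, used as weights

combined = weighted_mean(group_means, group_sizes)
naive = sum(group_means) / len(group_means)  # ignores the group sizes

print(round(combined, 1))  # 53.3
print(naive)               # 55.0
```

The plain arithmetic mean of the two group averages gives 55, which overstates the combined score because it treats the smaller group as if it were the same size as the larger one.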
The geometric mean is the n-th root of the product of n values. It is used to calculate the average of quantities that multiply together, such as growth rates and interest rates. Note that the geometric mean can only handle positive numbers.
Define the symbols as follows:
-- a: Data
-- n: Number of data
The geometric mean can then be calculated with the following formula.
m_g = \sqrt[n]{a_1 a_2 a_3 \cdots a_n}
Suppose a company's sales increased by 3% in the first year, 5% in the second year, and 10% in the third year. Let's calculate the average annual sales growth rate of this company.
Year | Sales ratio to the previous year |
---|---|
2013 | - |
2014 | 3% |
2015 | 5% |
2016 | 10% |
A 3% increase over the previous year in 2014 means that sales were 103% of the previous year's sales. Likewise, sales were 105% of the previous year's in 2015 and 110% in 2016, so the following holds.
-- a: Data (1.03, 1.05, 1.1)
-- n: Number of data (3)
Substitute these two into the formula above.
1.06 \simeq \sqrt[3]{1.03\times1.05\times1.1}
The result is 1.059594599927647..., so the solution is about 1.06. In other words, the average annual sales growth rate is about 6%.
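The sales example can be checked with the following Python sketch. The function name `geometric_mean` is an assumption for illustration, and it computes the n-th root through logarithms, which is mathematically equivalent to the formula above and avoids very large intermediate products for long inputs.

```python
import math


def geometric_mean(values):
    """Return the n-th root of the product of n positive values."""
    if not values:
        raise ValueError("geometric_mean requires at least one value")
    if any(a <= 0 for a in values):
        raise ValueError("geometric_mean only handles positive numbers")
    # exp(mean(log(a_i))) equals (a_1 * a_2 * ... * a_n) ** (1 / n)
    return math.exp(math.fsum(math.log(a) for a in values) / len(values))


ratios = [1.03, 1.05, 1.10]  # sales relative to the previous year
average_ratio = geometric_mean(ratios)

print(f"{average_ratio:.4f}")               # 1.0596
print(f"{(average_ratio - 1) * 100:.1f}%")  # 6.0%
```

Python 3.8 and later also provide `statistics.geometric_mean`, which can be used instead of the hand-written function.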
The harmonic mean is used to average rates, for example when you want the average speed over an entire outbound and return journey.
The formula for the harmonic mean is:
m_H = \frac{1}{\frac{1}{n}\left(\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \cdots + \frac{1}{x_n}\right)} = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \cdots + \frac{1}{x_n}}
Suppose you make a round trip over a one-way distance of 10 km under the following conditions.
Leg | Speed | Time |
---|---|---|
Outbound | 40 km/h | 15 minutes |
Return | 4 km/h | 150 minutes |
This problem can be solved without using the formula. The round trip is the same as traveling a distance of 20 km in 165 minutes. In other words, if x is the average speed in km/h, then:
x \times \frac{165}{60} = 20
Solving this, x is about 7.3 km/h.
Now let's solve it using the formula.
Define the symbols as follows:
-- x: Data (travel speed)
-- n: Number of data
Substituting the two speeds (n = 2) into the harmonic mean formula gives:
x = \frac{2}{\frac{1}{40} + \frac{1}{4}}
Solving this, x is about 7.3 km/h, which matches the solution obtained without the formula.
With the formula, you can get the average speed without knowing the distance traveled, as long as each leg covers the same distance.
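Both ways of solving the round-trip problem can be compared in a short Python sketch. The function name `harmonic_mean` and the variable names are assumptions for illustration, and the harmonic mean is only the right average here because the outbound and return legs cover the same distance.

```python
def harmonic_mean(values):
    """Return n divided by the sum of the reciprocals of n positive values."""
    if not values:
        raise ValueError("harmonic_mean requires at least one value")
    if any(x <= 0 for x in values):
        raise ValueError("harmonic_mean only handles positive numbers")
    return len(values) / sum(1 / x for x in values)


speeds_kmh = [40, 4]  # outbound and return speeds

# With the formula: no distance needed.
by_formula = harmonic_mean(speeds_kmh)

# Without the formula: total distance divided by total time.
total_distance_km = 20
total_time_h = 165 / 60
by_distance = total_distance_km / total_time_h

print(round(by_formula, 1))   # 7.3
print(round(by_distance, 1))  # 7.3
```

Python's standard library also offers `statistics.harmonic_mean` for the same calculation.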
That's all.