I read "Bayesian Statistics from the Basics" and gave it some thought. The following is quoted from page 59.
Employment exam problem: A company's entrance exam asks 7 questions of equal difficulty every year. Mr. x from University X answered 3 correctly and 4 incorrectly; let his correct answer rate be $\theta_x$. Mr. y from University Y answered 4 correctly and 3 incorrectly; let his correct answer rate be $\theta_y$. Many candidates from University X and University Y take the exam every year. An investigation found that the correct answer rate of candidates from University X is approximated by a beta distribution with mean 0.8 and variance 0.04, and that of candidates from University Y by a beta distribution with mean 0.4 and variance 0.04. If you estimate $\theta_x$ and $\theta_y$ and hire only the one candidate with the larger parameter, should it be Mr. x or Mr. y?
**"If you evaluate not only the test result but also the ability of the group to which the person belongs, you can make a more accurate estimate."** This post considers the pros and cons of that idea.
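For reference, the quoted problem can be worked with the standard beta-binomial update. The sketch below is not taken from the book: the helper names `beta_params` and `posterior_mean` are made up for illustration, and comparing posterior means is an assumption about how "estimate" should be read.

```python
def beta_params(mean, var):
    """Convert a beta distribution's mean and variance into (alpha, beta)."""
    k = mean * (1 - mean) / var - 1  # k equals alpha + beta
    return mean * k, (1 - mean) * k


def posterior_mean(mean, var, correct, wrong):
    """Posterior mean of the correct answer rate after the exam result.

    Assumption: a beta prior combined with a binomial likelihood, so the
    posterior is Beta(alpha + correct, beta + wrong).
    """
    a, b = beta_params(mean, var)
    return (a + correct) / (a + b + correct + wrong)


theta_x = posterior_mean(0.8, 0.04, correct=3, wrong=4)  # Mr. x, University X
theta_y = posterior_mean(0.4, 0.04, correct=4, wrong=3)  # Mr. y, University Y
print(theta_x, theta_y)
```

Under these assumptions the posterior means come out at about 0.54 for Mr. x and about 0.50 for Mr. y, so the group information tips the choice toward Mr. x even though Mr. y answered more questions correctly.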
Here, instead of solving this problem as it is, we will consider changing the conditions as follows.
- Each candidate's ability is a single value, and the goal is to select the candidate with the higher ability value.
- Suppose the abilities of candidates from University X follow a normal distribution with mean 105 and standard deviation 10.
- Suppose the abilities of candidates from University Y follow a normal distribution with mean 100 and standard deviation 10.
- The observed value, i.e. the entrance exam result, is assumed to follow a normal distribution whose mean is the candidate's own ability value and whose standard deviation is 10.
- In the **test result method**, evaluation is based only on observed values.
- In the **group combination method**, evaluation uses $\text{correction value} \equiv \frac{\text{group mean} + \text{observed value}}{2}$.
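The equal weighting in the correction value is not arbitrary. With a normal prior and normal observation noise, the posterior mean of the ability is a precision-weighted average of the group mean and the observed value, and because both standard deviations are 10 here, the weights are 1/2 each. The sketch below illustrates this; the function name `posterior_mean_ability` and the sample score of 120 are made up for illustration.

```python
def posterior_mean_ability(group_mean, observed, prior_std=10.0, noise_std=10.0):
    """Posterior mean of ability under a normal prior and normal noise.

    Each source of information is weighted by its precision (1 / variance).
    """
    w_prior = 1.0 / prior_std ** 2
    w_obs = 1.0 / noise_std ** 2
    return (w_prior * group_mean + w_obs * observed) / (w_prior + w_obs)


# Hypothetical candidate from University X with an observed score of 120.
estimate = posterior_mean_ability(105, 120)
simple_average = (105 + 120) / 2
print(estimate, simple_average)
```

If the exam were noisier than the spread of abilities (for example `noise_std=20`), the same function would lean more heavily on the group mean.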
First, compute each value. The candidate from University X has the higher ability value in 63.8% of trials and the higher observed value in 59.9% of trials.
```python3
import numpy as np, pandas as pd

np.random.seed(1)
n = 1000000  # Number of simulated candidate pairs
xave, yave, std = 105, 100, 10  # Mean of X, mean of Y, standard deviation
gx = np.random.normal(xave, std, n)  # Ability values, group X
gy = np.random.normal(yave, std, n)  # Ability values, group Y
ox = np.random.normal(gx, std)  # Observed values, group X
oy = np.random.normal(gy, std)  # Observed values, group Y
ax = (xave + ox) / 2  # Correction values, group X
ay = (yave + oy) / 2  # Correction values, group Y
gf = gx > gy  # X has the higher ability value
of = ox > oy  # X has the higher observed value
af = ax > ay  # X has the higher correction value
print(gf.sum() / n, of.sum() / n)
```

```
0.638318 0.598666
```
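These two proportions can also be checked analytically. The difference of two independent normal variables is itself normal, so the ability difference has mean 5 and variance 200, and the observed difference has mean 5 and variance 400. A minimal sketch using only the standard library (the helper `normal_cdf` is introduced here for illustration):

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


xave, yave, std = 105, 100, 10
diff = xave - yave

# Ability difference gx - gy: variance std^2 + std^2.
p_ability = normal_cdf(diff / sqrt(2 * std ** 2))
# Observed difference ox - oy: each observation adds another std^2 of noise.
p_observed = normal_cdf(diff / sqrt(4 * std ** 2))
print(round(p_ability, 3), round(p_observed, 3))
```

The results, about 0.638 and 0.599, are in line with the simulated 0.638318 and 0.598666.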
When evaluating by observed values alone, the candidate with the higher ability value is selected 75.9% of the time.
```python3
# Rows: who the method selects; columns: who truly has the higher ability value
print(pd.DataFrame([[(gf&of).sum(), ((~gf)&of).sum()],
                    [(gf&(~of)).sum(), ((~gf)&(~of)).sum()]],
                   columns=['Ability X', 'Ability Y'],
                   index=['Observation X', 'Observation Y']) / n)
print('Correct answer rate= ', (gf==of).sum() / n)
```

```
               Ability X  Ability Y
Observation X   0.498147   0.100519
Observation Y   0.140171   0.261163
Correct answer rate= 0.75931
```
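The table above is a 2x2 confusion table: rows are the candidate the method selects, columns are the candidate who truly has the higher ability value. Since the same table is built for every method, a reusable helper could be sketched as follows. The name `evaluate` is hypothetical, and the sketch uses `numpy.random.default_rng` with a smaller sample, so its numbers differ slightly from those printed above.

```python
import numpy as np
import pandas as pd


def evaluate(true_x_higher, picked_x, label):
    """Return (confusion table as fractions, accuracy) for one selection rule."""
    selected = pd.Series(np.where(picked_x, f"{label} X", f"{label} Y"), name="selected")
    truth = pd.Series(np.where(true_x_higher, "Ability X", "Ability Y"), name="truth")
    table = pd.crosstab(selected, truth, normalize="all")
    accuracy = float(np.mean(true_x_higher == picked_x))
    return table, accuracy


rng = np.random.default_rng(1)
n = 200_000
gx = rng.normal(105, 10, n)  # Ability values, group X
gy = rng.normal(100, 10, n)  # Ability values, group Y
ox = rng.normal(gx, 10)      # Observed values, group X
oy = rng.normal(gy, 10)      # Observed values, group Y

table, acc = evaluate(gx > gy, ox > oy, "Observation")
print(table)
print("accuracy =", acc)
```

The diagonal of the table (selected X and truly X, selected Y and truly Y) sums to the accuracy.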
When evaluating by the correction value, the candidate with the higher ability value is selected 76.8% of the time.
```python3
# Rows: who the method selects; columns: who truly has the higher ability value
print(pd.DataFrame([[(gf&af).sum(), ((~gf)&af).sum()],
                    [(gf&(~af)).sum(), ((~gf)&(~af)).sum()]],
                   columns=['Ability X', 'Ability Y'],
                   index=['Correction X', 'Correction Y']) / n)
print('Correct answer rate= ', (gf==af).sum() / n)
```

```
              Ability X  Ability Y
Correction X   0.549088   0.142374
Correction Y   0.089230   0.219308
Correct answer rate= 0.768396
```
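It may help to see what the correction value does to the comparison. Rearranging $\frac{105 + o_x}{2} > \frac{100 + o_y}{2}$ gives $o_x + 5 > o_y$, so comparing correction values is the same as comparing observed values after giving the University X candidate a head start equal to the gap between the group means. A small check, using a fresh random generator rather than the variables above:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
xave, yave, std = 105, 100, 10
ox = rng.normal(rng.normal(xave, std, n), std)  # Observed values, group X
oy = rng.normal(rng.normal(yave, std, n), std)  # Observed values, group Y

ax = (xave + ox) / 2  # Correction values, group X
ay = (yave + oy) / 2  # Correction values, group Y

by_correction = ax > ay
by_handicap = ox + (xave - yave) > oy  # Observed values with a 5-point head start for X
agreement = np.mean(by_correction == by_handicap)
print(agreement)
```

This equivalence holds only because both groups share the same standard deviation, so both correction values use the same 1/2 weight.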
Indeed, using not only the individual's own result but also the ability of the group they belong to gave a more accurate evaluation. But is such a method really acceptable? For example, what if gender were used instead of university? Changing pass or fail depending on gender would presumably be a problem.
Therefore, we propose a new method. The method is as follows.
- Candidates take a separate test in advance. Its result is used as the pre-observed value.
- If a candidate does not like the result of that test (that is, if the pre-observed value is lower than their ability value), they do nothing.
- If they like it, they register the result themselves. Once registered, both the observed value and the pre-observed value must be used in the entrance exam. (If nothing is registered, the observed value alone is used.)
- Registration is voluntary because not everyone may be able to take the separate test.
```python3
bx = np.random.normal(gx, std)  # Pre-observed values, group X
by = np.random.normal(gy, std)  # Pre-observed values, group Y
# If the pre-observed value is less than the ability value, use the observed value only;
# otherwise use the average of the observed and pre-observed values.
px = ox*(bx < gx) + (ox+bx)/2*(bx >= gx)  # Proposal 1 value, group X
py = oy*(by < gy) + (oy+by)/2*(by >= gy)  # Proposal 1 value, group Y
pf = px > py  # X has the higher Proposal 1 value
# Rows: who the method selects; columns: who truly has the higher ability value
print(pd.DataFrame([[(gf&pf).sum(), ((~gf)&pf).sum()],
                    [(gf&(~pf)).sum(), ((~gf)&(~pf)).sum()]],
                   columns=['Ability X', 'Ability Y'],
                   index=['Proposal 1 X', 'Proposal 1 Y']) / n)
print('Correct answer rate= ', (gf==pf).sum() / n)
```

```
              Ability X  Ability Y
Proposal 1 X   0.518089   0.088829
Proposal 1 Y   0.120229   0.272853
Correct answer rate= 0.790942
```
The accuracy rate has improved to 79.1%. However, it costs money to take a separate test.
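One property of this rule is worth noting: because only favorable pre-observed values are registered, the registered values overstate ability on average. For noise with standard deviation 10, the mean of the positive half of the noise is $10\sqrt{2/\pi} \approx 7.98$. The sketch below checks this by simulation; it assumes, as the rule above does, that candidates know their own ability value, and it uses a fresh random generator.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
std = 10
gx = rng.normal(105, std, n)  # Ability values
bx = rng.normal(gx, std)      # Pre-observed values

registered = bx >= gx                        # Candidates who like their result
mean_excess = (bx - gx)[registered].mean()   # Average overstatement among them
theory = std * np.sqrt(2 / np.pi)            # Mean of a half-normal distribution
print(registered.mean(), mean_excess, theory)
```

About half of the candidates register. The overstatement applies to both universities alike, which is consistent with the method still improving accuracy: averaging two measurements reduces noise for the candidates who register.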
Consider a method that does not test separately.
- If the test result is at or above the average of the candidate's group: use the observed value.
- If the test result is below the average of the candidate's group: use the correction value.
```python3
qx = ox*(ox >= xave) + ax*(ox < xave)  # Proposal 2 value, group X
qy = oy*(oy >= yave) + ay*(oy < yave)  # Proposal 2 value, group Y
qf = qx > qy  # X has the higher Proposal 2 value
# Rows: who the method selects; columns: who truly has the higher ability value
print(pd.DataFrame([[(gf&qf).sum(), ((~gf)&qf).sum()],
                    [(gf&(~qf)).sum(), ((~gf)&(~qf)).sum()]],
                   columns=['Ability X', 'Ability Y'],
                   index=['Proposal 2 X', 'Proposal 2 Y']) / n)
print('Correct answer rate= ', (gf==qf).sum() / n)
```

```
              Ability X  Ability Y
Proposal 2 X   0.521504   0.119074
Proposal 2 Y   0.116814   0.242608
Correct answer rate= 0.764112
```
At 76.4%, it is a little better than using the observed values alone (75.9%).
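The Proposal 2 rule is a piecewise function of the observed value, which can be written more directly with `numpy.where`. In this sketch the function name `proposal2_score` and the two sample scores are made up for illustration.

```python
import numpy as np


def proposal2_score(observed, group_mean):
    """Observed value if at or above the group mean, otherwise the correction value."""
    observed = np.asarray(observed, dtype=float)
    corrected = (group_mean + observed) / 2
    return np.where(observed >= group_mean, observed, corrected)


# Two hypothetical University X candidates: one above and one below the group mean of 105.
scores = proposal2_score([120, 90], 105)
print(scores)
```

Written this way, it is easy to see that the rule never lowers anyone's score: a below-average observed value is pulled up halfway toward the group mean, and an above-average one is left alone.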
Even with Proposal 2, some candidates may be dissatisfied with how the ability of the group they belong to is used. How about the following method?
- Each candidate decides for themselves, at the time of the test, whether to register.
- If their ability value is at or above the average of the group they belong to: do not register. → The observed value is used.
- If their ability value is below the average of the group they belong to: register. → The correction value is used.
```python3
rx = ox*(gx >= xave) + ax*(gx < xave)  # Proposal 3 value, group X
ry = oy*(gy >= yave) + ay*(gy < yave)  # Proposal 3 value, group Y
rf = rx > ry  # X has the higher Proposal 3 value
# Rows: who the method selects; columns: who truly has the higher ability value
print(pd.DataFrame([[(gf&rf).sum(), ((~gf)&rf).sum()],
                    [(gf&(~rf)).sum(), ((~gf)&(~rf)).sum()]],
                   columns=['Ability X', 'Ability Y'],
                   index=['Proposal 3 X', 'Proposal 3 Y']) / n)
print('Correct answer rate= ', (gf==rf).sum() / n)
```

```
              Ability X  Ability Y
Proposal 3 X   0.518967   0.119357
Proposal 3 Y   0.119351   0.242325
Correct answer rate= 0.761292
```
- At 76.1%, it is a little better than estimating from observed values alone.
- Because the candidates take part in the decision themselves, they are not being discriminated against.
- There is almost no cost.
Proposal 3 (or a hybrid of Proposal 1 and Proposal 3) seems good.
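To compare the methods side by side, all five can be run in one pass. This is a self-contained sketch with a fresh random generator and a smaller sample, so the figures will differ slightly from those reported above; the dictionary keys are labels chosen here for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 200_000
xave, yave, std = 105, 100, 10
gx, gy = rng.normal(xave, std, n), rng.normal(yave, std, n)  # Ability values
ox, oy = rng.normal(gx, std), rng.normal(gy, std)            # Observed values
bx, by = rng.normal(gx, std), rng.normal(gy, std)            # Pre-observed values
ax, ay = (xave + ox) / 2, (yave + oy) / 2                    # Correction values

# Score pairs (X, Y) for each method described in this post.
scores = {
    "Observed value": (ox, oy),
    "Correction value": (ax, ay),
    "Proposal 1": (np.where(bx >= gx, (ox + bx) / 2, ox),
                   np.where(by >= gy, (oy + by) / 2, oy)),
    "Proposal 2": (np.where(ox >= xave, ox, ax),
                   np.where(oy >= yave, oy, ay)),
    "Proposal 3": (np.where(gx >= xave, ox, ax),
                   np.where(gy >= yave, oy, ay)),
}

truth = gx > gy  # X truly has the higher ability value
summary = pd.Series(
    {name: np.mean((sx > sy) == truth) for name, (sx, sy) in scores.items()},
    name="accuracy",
)
print(summary.round(3))
```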
That's all.