The chi-square test of independence between discrete variables in a crosstab was explained in a previous post.
For a fixed table size and number of cases, a larger chi-square value χ² indicates a stronger relationship between the two variables. However, χ² also depends on the size of the crosstab and on the number of cases, so its maximum attainable value differs from table to table. This makes it difficult to compare crosstabs with different numbers of rows and columns.
Cramér's V coefficient (written \Phi_c below) converts χ² with the following formula so that, for a crosstab of any size, the value is 0 when the two variables are completely unrelated and 1 when they are perfectly related.
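To make the dependence on the number of cases concrete, here is a minimal sketch in plain Python. The helper name `chi_square` and the two tables are made up for illustration, and the statistic is the ordinary Pearson χ² without continuity correction. Multiplying every cell by 10 leaves the pattern of association unchanged, yet χ² grows by the same factor.

```python
def chi_square(table):
    """Pearson chi-square (no continuity correction) for a list of rows of counts.

    Assumes every row and every column contains at least one case.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Hypothetical 2 x 2 table and the same proportions with 10 times the cases
small = [[12, 8], [8, 12]]
large = [[count * 10 for count in row] for row in small]

chi2_small = chi_square(small)  # about 1.6
chi2_large = chi_square(large)  # about 16
print(chi2_small, chi2_large)
```

The two tables describe the same strength of association, so the raw χ² value alone is not a usable measure of it.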
\Phi_c = \sqrt{\frac {\chi^2} {N(k-1)}}
Here N is the total frequency and k is the smaller of the number of rows and the number of columns in the crosstab.
The maximum possible value of χ² is N(k-1), so dividing by N corrects for the number of cases, and dividing by k-1 (using the smaller of the row and column counts) corrects for the size of the table. Also, since χ² is a squared quantity, the square root is taken.
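The formula can be sketched directly in plain Python. This is only an illustration under a few assumptions: the function name `cramers_v` and the example tables are hypothetical, χ² is computed without continuity correction, and every row and column is assumed to contain at least one case (otherwise an expected count would be zero).

```python
import math


def cramers_v(table):
    """Cramér's V for a crosstab given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)  # N: total frequency

    # Pearson chi-square without continuity correction
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # k: the smaller of the number of rows and the number of columns
    k = min(len(row_totals), len(col_totals))
    return math.sqrt(chi2 / (n * (k - 1)))


# Hypothetical tables for illustration
unrelated = [[10, 20], [20, 40]]      # rows are proportional
perfect = [[30, 0], [0, 20]]          # each row maps to exactly one column
partial = [[20, 10, 0], [0, 10, 20]]  # 2 x 3 table, so k = 2

v_unrelated = cramers_v(unrelated)
v_perfect = cramers_v(perfect)
v_partial = cramers_v(partial)
print(v_unrelated, v_perfect, v_partial)
```

The first table gives 0 and the second gives 1, matching the two extremes described above, while the 2 x 3 table lands in between.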
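There is reference code here, so I will quote it. Note that the quoted code implements Cramer's rule (solving a system of three linear equations with determinants), not Cramér's V.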
import numpy as np

def det2x2(A, v=False):
    if v: print('compute 2 x 2 det of')
    if v: print(A)
    assert A.shape == (2,2)
    return A[0][0]*A[1][1] - A[0][1]*A[1][0]

def det3x3(A):
    print('compute 3 x 3 det of')
    print(A)
    assert A.shape == (3,3)
    a,b,c = A[0]
    c1 = a * det2x2(A[1:3,[1,2]])
    c2 = b * det2x2(A[1:3,[0,2]])
    c3 = c * det2x2(A[1:3,[0,1]])
    return c1 - c2 + c3

def solve(A):
    print('solve')
    print(A, '\n')
    assert A.shape == (3,4)
    D = det3x3(A[:,:3])
    print('D = ', D, '\n')
    if D == 0:
        print('no solution')
        return
    Dx = det3x3(A[:,[3,1,2]])
    print('Dx = ', Dx, '\n')
    Dy = det3x3(A[:,[0,3,2]])
    print('Dy = ', Dy, '\n')
    Dz = det3x3(A[:,[0,1,3]])
    print('Dz = ', Dz, '\n')
    return Dx*1.0/D, Dy*1.0/D, Dz*1.0/D

def check(A,x,y,z):
    print('check')
    for i,r in enumerate(A):
        print('row', i, '=', r)
        pL = list()
        for coeff,var in zip(r[:3],(x,y,z)):
            c = str(round(coeff,2))
            v = str(round(var,2))
            pL.append(c + '*' + v)
        print(' + '.join(pL), end=' ')
        print(' =', r[0]*x + r[1]*y + r[2]*z, '\n')
With the code above saved as cramer.py, running it looks like this.
import numpy as np
import cramer

def run_cramer():
    L = [2, 3, 0, 5,
         1, 1, 1, 3,
         2,-1, 3, 7]
    A = np.array(L)
    A.shape = (3,4)
    result = cramer.solve(A)
    if result:
        x,y,z = result
        print('solution')
        print('x =', x)
        print('y =', y)
        print('z =', z, '\n')
        cramer.check(A,x,y,z)

run_cramer()
# =>
# solve
# [[ 2 3 0 5]
# [ 1 1 1 3]
# [ 2 -1 3 7]]
#
# compute 3 x 3 det of
# [[ 2 3 0]
# [ 1 1 1]
# [ 2 -1 3]]
# D = 5
#
# compute 3 x 3 det of
# [[ 5 3 0]
# [ 3 1 1]
# [ 7 -1 3]]
# Dx = 14
#
# compute 3 x 3 det of
# [[2 5 0]
# [1 3 1]
# [2 7 3]]
# Dy = -1
#
# compute 3 x 3 det of
# [[ 2 3 5]
# [ 1 1 3]
# [ 2 -1 7]]
# Dz = 2
#
# solution
# x = 2.8
# y = -0.2
# z = 0.4
#
# check
# row 0 = [2 3 0 5]
# 2*2.8 + 3*-0.2 + 0*0.4 = 5.0
#
# row 1 = [1 1 1 3]
# 1*2.8 + 1*-0.2 + 1*0.4 = 3.0
#
# row 2 = [ 2 -1 3 7]
# 2*2.8 + -1*-0.2 + 3*0.4 = 7.0
#
There is also an online calculator here, as mentioned in the source article.