It was necessary to test the difference between the average values of the count data according to the Poisson distribution in my work, but I did not have the information in Japanese and implemented it with reference to the following paper, so I will leave it as a memorandum.
[A more powerful test for comparing two Poisson means] (http://www.sciencedirect.com/science/article/pii/S0378375802004081)
It was said that there are two methods of testing, the conditional test (C-test) and the test using the P value proposed in this paper (E-test), but this time the detection power Implemented the higher E-test.
$ n1, n2 $: Number of elapsed unit times $ k1, k2 $: Total number of occurrences of events for the entire period $ d $: Difference between the means you want to test
When
Null hypothesis $ H_0: \ lambda_1 = \ lambda_2 + d $ Alternative hypothesis $ H_1: \ lambda_1 \ neq \ lambda_2 + d $
The P value when performing the hypothesis test of is calculated by the following formula.
\begin{align}
& \hat{\lambda}_{2k} = \frac{k_1 + k_2}{n_1 + n_2} - \frac{d n_1}{n_1 + n_2}\\[2mm]
& p = \sum_{x_1=0}^{k_1} \sum_{x_2=0}^{k_2}
\frac{e^{-n_1(\hat{\lambda}_{2k}+d)}(n_1(\hat{\lambda}_{2k} + d))^{x_1}}{x_1!}
\frac{e^{-n_2\hat{\lambda}_{2k}}(n_2(\hat{\lambda}_{2k}))^{x_2}}{x_2!}
\end{align}
The following is an implementation of the above expression in Python and Ruby. For example, if the counts you want to compare are 40 and 65, respectively, and you want to know if there is a significant difference between the two mean values,
pois_mean_diff_test(40, 65)
=> 0.04921332025427114
Then you can calculate the P value. This time, we can see that the P value is below 0.05, so we can see that there is a significant difference at the 5% level.
If the formula is implemented as it is, it will overflow in the floating point calculation, so the factorial calculation was decomposed and implemented as follows.
\frac{e^{-n_1(\hat{\lambda}_{2k}+d)}(n_1(\hat{\lambda}_{2k} + d))^{x_1}}{x_1!}
= e^{-n_1(\hat{\lambda}_{2k}+d)} \times
\frac{n_1(\hat{\lambda}_{2k} + d)}{x_1} \times \frac{n_1(\hat{\lambda}_{2k} + d)}{x_1 -1} \times \dots \times \frac{n_1(\hat{\lambda}_{2k} + d)}{1}
Python
import math
import numpy as np
def pois_mean_diff_test(k1, k2, n1=1, n2=1, d=0.0):
x1_seq = range(0, k1 + 1)
x2_seq = range(0, k2 + 1)
l2k = (k1+k2)/(n1+n2) - d*n1/(n1+n2)
p_value = sum([math.exp(-n1*(l2k+d)) * np.prod([n1*(l2k+d)/i for i in range(1, x1+1)]) * \
math.exp(-n2*l2k) * np.prod([n2*l2k/j for j in range(1, x2+1)]) \
for x2 in x2_seq for x1 in x1_seq])
return p_value
if __name__ == '__main__':
print "P-value is " + str(pois_mean_diff_test(40, 65))
Ruby
def pois_mean_diff_test(k1, k2, n1=1, n2=1, d=0.0, alpha=0.05)
x1_seq = Array(0..k1)
x2_seq = Array(0..k2)
l2k = (k1+k2)/(n1+n2) - d*n1/(n1+n2)
p_value = x1_seq.product(x2_seq).map{|x| x1=x[0]; x2=x[1];
Math.exp(-n1*(l2k+d)) * Array(1..x1).map{|i| n1*(l2k+d)/i }.inject(:*).to_f *
Math.exp(-n1*l2k) * Array(1..x2).map{|j| n2*l2k/j }.inject(:*).to_f
}.inject(:+)
return p_value
end
out = "P-value is " + pois_mean_diff_test(20,10).to_s
puts out
[K. Krishnamoorthy, Jessica Thomson, A more powerful test for comparing two Poisson means, Journal of Statistical Planning and Inference, Volume 119, Issue 1, 15 January 2004, Pages 23-35] (http://www.sciencedirect.com/science/article/pii/S0378375802004081)
Recommended Posts