9 stable releases
Version | Date |
---|---|
1.2.2 | May 17, 2021 |
1.2.0 | May 14, 2021 |
1.0.4 | Apr 22, 2021 |
0.1.0 | Apr 21, 2021 |
#318 in Science
38 downloads per month
Used in kda-tools
40KB
421 lines
Includes (Debian package, 6KB) poisson-rate-test_1.2.2_amd64.deb
poisson-rate-test
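Purpose
A Rust library that provides methods for comparing the rates of Poisson data and running hypothesis tests on that data.
Specifically, as of version 1.0, two kinds of tests are provided: rate-to-rate comparison (2 events) and ratio-to-ratio comparison (4 events).
Rate to rate
This tests the hypothesis that the counts of event A and event B in the given datasets occur with rates of the form r_a / r_b >= R for a constant R, against the null hypothesis that the two events occur at the same rate.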
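As a rough illustration of how a rate-ratio hypothesis of this form can be tested, the sketch below uses the classical conditional binomial test: given the total number of events k1 + k2, the count k1 is Binomial(k1 + k2, p0) under r1/r2 == R, with p0 = R*n1 / (R*n1 + n2). This is a textbook technique swapped in for illustration and is not necessarily what this crate implements. The helper names (`ln_factorial`, `binomial_pmf`, `conditional_rate_ratio_pvalue`) are hypothetical, and the p-value convention here (a small p is evidence for r1/r2 > R) may not match what the crate's functions return.

```rust
/// Natural log of n!, computed by direct summation (fine for small counts).
fn ln_factorial(n: u64) -> f64 {
    (1..=n).map(|i| (i as f64).ln()).sum()
}

/// P(X = k) for X ~ Binomial(n, p), evaluated in log space for stability.
fn binomial_pmf(k: u64, n: u64, p: f64) -> f64 {
    if p <= 0.0 {
        return if k == 0 { 1.0 } else { 0.0 };
    }
    if p >= 1.0 {
        return if k == n { 1.0 } else { 0.0 };
    }
    let ln_choose = ln_factorial(n) - ln_factorial(k) - ln_factorial(n - k);
    (ln_choose + k as f64 * p.ln() + (n - k) as f64 * (1.0 - p).ln()).exp()
}

/// One-sided p-value for H1: r1/r2 > ratio (hypothetical helper, not the crate's API).
/// k1, k2 are event counts observed over n1, n2 observation units.
fn conditional_rate_ratio_pvalue(k1: u64, n1: f64, k2: u64, n2: f64, ratio: f64) -> f64 {
    let total = k1 + k2;
    // Under H0 (r1/r2 == ratio), k1 given the total is Binomial(total, p0).
    let p0 = ratio * n1 / (ratio * n1 + n2);
    // Probability of at least k1 events landing in group 1 under H0.
    (k1..=total)
        .map(|k| binomial_pmf(k, total, p0))
        .sum::<f64>()
        .min(1.0)
}
```

With 1 event in 4 observations against 21 events in 6, `conditional_rate_ratio_pvalue(1, 4.0, 21, 6.0, 1.0)` comes out close to 1, so this sketch finds no evidence that the first rate is the larger one; swapping the two groups gives a p-value well below 0.01.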
Example: testing an event rate against a hypothesis
use poisson_rate_test::two_tailed_rates_equal;
//make some data that sure looks like it occurs with rate = 0.5;
let data = vec![0,1,1,0]; //note, 0,2,0,0 would be the same (2/4).
let n1 = data.len() as f64;
let sum1 = data.iter().sum::<usize>() as f64;
//are these rates equal to my hypothesized rate of 0.5?
let expected_n = n1;
let expected_sum = 0.5 * n1;
let p = two_tailed_rates_equal(sum1, n1, expected_sum, expected_n);
assert!(p>0.99); //<--confidently yes
Example: comparing event rates under a new condition
use claim::{assert_lt,assert_gt};
use poisson_rate_test::{one_tailed_ratio,two_tailed_rates_equal};
//say we made a change, and observed the new rates
let occurrences_observed = vec![0,0,1,0];
//and here's the "usual" data
let occurrences_usual = vec![1,1,5,3,3,8];
//need the basic n/sum statistics
let n1 = occurrences_observed.len() as f64;
let n2 = occurrences_usual.len() as f64;
let sum1 = occurrences_observed.iter().sum::<usize>() as f64;
let sum2 = occurrences_usual.iter().sum::<usize>() as f64;
//is rate of observed > rate usual?
let p = one_tailed_ratio(sum1, n1, sum2, n2, 1.0);
assert_lt!(p,0.01); //<--confidently no
//Maybe just check both tails to be sure (this tests r observed / r baseline != 1)
let p = two_tailed_rates_equal(sum1, n1, sum2, n2);
assert_lt!(p,0.01); //<--confidently no
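The n/sum bookkeeping repeated in these examples can be folded into a small helper. The name `summarize` is hypothetical and not part of the crate; it returns the pair in (sum, n) order only because that is the order in which the examples pass their arguments.

```rust
/// Returns (total event count, number of observations) as f64 values,
/// in the (sum, n) order used by the examples in this document.
fn summarize(counts: &[usize]) -> (f64, f64) {
    let sum = counts.iter().sum::<usize>() as f64;
    let n = counts.len() as f64;
    (sum, n)
}
```

For the "usual" data above, `summarize(&[1, 1, 5, 3, 3, 8])` yields 21 events over 6 observations.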
Example: more data helps
Here is a long example; see the documentation for more.
use claim::{assert_lt,assert_gt};
use poisson_rate_test::{one_tailed_ratio,two_tailed_rates_equal};
//create data where rate1 == 1/2 * rate2
let occurrences_one = vec![1,0,1,0,1,0];
let occurrences_two = vec![1,1,1,1,0,2];
let n1 = occurrences_one.len() as f64;
let n2 = occurrences_two.len() as f64;
let sum1 = occurrences_one.iter().sum::<usize>() as f64;
let sum2 = occurrences_two.iter().sum::<usize>() as f64;
//test hypothesis that r1/r2 > 1/2
let p = one_tailed_ratio(sum1, n1, sum2, n2, 0.5);
assert_eq!(p, 0.50); //<-- nope
//let's test the neighborhood around that
let p = one_tailed_ratio(sum1, n1, sum2, n2, 0.49999 );
assert_gt!(p, 0.49); //<-- still nope
//Two sided test. What is the likelihood of seeing the data we got
//given that r1/r2 == 1/2?
let p_half = one_tailed_ratio(sum1, n1, sum2, n2, 0.49999);
//other side
let p_double = one_tailed_ratio(sum2, n2, sum1, n1, 2.0001);
//just about 1.0!
assert_gt!(2.0*p_half.min(p_double),0.99);
//we *know* they are not equal, but can we prove it in general?
let mut p_double = two_tailed_rates_equal(sum2, n2, sum1, n1);
//note: p_double is in [.15,.25]
assert_lt!(p_double,0.25);//<--looking unlikely... maybe more data is required
assert_gt!(p_double,0.15);//<--looking unlikely... maybe more data is required
//get more of the same data
let trial2_one = vec![1,0,1,0,1,0,1,0,1,0,1,0,1,0];
let trial2_two = vec![1,1,1,1,0,2,0,2,1,1,0,2,1,1];
let t2n1 = trial2_one.len() as f64;
let t2n2 = trial2_two.len() as f64;
let t2sum1 = trial2_one.iter().sum::<usize>() as f64;
let t2sum2 = trial2_two.iter().sum::<usize>() as f64;
p_double = two_tailed_rates_equal(t2sum2, t2n2, t2sum1, t2n1);
assert_lt!(p_double,0.05);//<--That did the trick
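Comparing event ratios
Suppose there are two events, a and b, and two groups (baseline and treatment). We changed something in the treatment group and want to know whether that change affected the ratio a/b, so we count a and b for both baseline and treatment. Note that the p-values are estimated by simulation, so they can vary slightly from run to run (by around 0.01). Pass a higher sample count to stabilize them, at the cost of CPU time.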
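To get a feel for the trade-off between sample count and run-to-run wobble, the sketch below uses the binomial standard error sqrt(p(1-p)/N) of a proportion estimated from N independent simulated draws. It assumes the p-value estimate is a simple fraction of draws, which the source does not state, and `mc_standard_error` is a hypothetical helper rather than part of the crate.

```rust
/// Approximate standard error of a p-value estimated as the fraction of
/// `samples` independent simulated draws that satisfy some condition.
/// Assumption: the estimator is a plain binomial proportion.
fn mc_standard_error(p: f64, samples: usize) -> f64 {
    (p * (1.0 - p) / samples as f64).sqrt()
}
```

Under that assumption, a p-value near 0.05 estimated from 10,000 draws has a standard error of roughly 0.002, and quadrupling the number of draws only halves it, which is why extra stability gets expensive quickly.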
Example: comparing a new weapon in Hunt Showdown
This is how it is done in kda-tools.
use poisson_rate_test::bootstrap;
use claim::{assert_lt,assert_gt};
//57 matches, 50 kills, 27 deaths without Caldwell Conversion pistol (baseline)
let normal_matches = 57;
let normal_kills = 50;
let normal_deaths = 27;
//10 matches, 4 kills, 9 deaths with Caldwell Conversion pistol (treatment)
let cc_matches=10;
let cc_kills=4;
let cc_deaths=9;
let p_cc_treatment_greater= bootstrap::param::ratio_events_greater_pval(
normal_kills,normal_deaths, normal_matches,
cc_kills,cc_deaths, cc_matches,
).unwrap() ;
assert_gt!(p_cc_treatment_greater,0.90); //Hell no that's not greater (cc_kills/cc_deaths) is much less than normal_kills/normal_deaths
let p_cc_treatment_less = bootstrap::param::ratio_events_greater_pval(
cc_kills,cc_deaths, cc_matches,
normal_kills,normal_deaths, normal_matches,
).unwrap() ;
assert_lt!(p_cc_treatment_less,0.05); //very high significance / very low p-value
use poisson_rate_test::bootstrap;
use claim::{assert_lt,assert_gt};
let base_a = vec![0,0,1,0];
let base_b = vec![1,0,1,1];
let treat_a = vec![1,1,1,2];
let treat_b = vec![1,1,1,1];
//Did treatment increase ratio of a/b?
let p = bootstrap::param::ratio_events_equal_pval_n(
base_a.iter().sum::<usize>(),
base_b.iter().sum::<usize>(),
base_a.len() as usize,
treat_a.iter().sum::<usize>(),
treat_b.iter().sum::<usize>(),
treat_a.len() as usize,
10000
);
assert_lt!(p.unwrap(),0.15); //<--tentatively yes
assert_gt!(p.unwrap(),0.05);
//just need more data, right?
let base_a = vec![0,0,1,0, 1,0,0,0];
let base_b = vec![1,0,1,1, 0,1,1,1];
let treat_a = vec![1,1,1,2, 1,2,1,1];
let treat_b = vec![1,1,1,1, 1,1,1,1];
//Did treatment increase ratio of a/b?
let p = bootstrap::param::ratio_events_equal_pval_n(
base_a.iter().sum::<usize>(),
base_b.iter().sum::<usize>(),
base_a.len() as usize,
treat_a.iter().sum::<usize>(),
treat_b.iter().sum::<usize>(),
treat_a.len() as usize,
10000
);
assert_lt!(p.unwrap(),0.05); //<--confidently yes
assert_gt!(p.unwrap(),0.01);
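Ratio to ratio
This tests the hypothesis that two events occur at different ratios in two datasets, r1_a/r1_b >= r2_a/r2_b, against the null hypothesis that the ratios are equal.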
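For a deterministic cross-check of a ratio comparison like this, one option is Fisher's exact test on the 2x2 table of counts (a1, b1, a2, b2). This is a different technique from the simulation-based estimate described in this document, swapped in here for illustration. It assumes that, within each dataset, both events are counted over the same observations, so the observation counts cancel out of each ratio. The names `ln_factorial`, `ln_choose` and `fisher_ratio_greater_pvalue` are hypothetical.

```rust
/// Natural log of n!, computed by direct summation (fine for small counts).
fn ln_factorial(n: u64) -> f64 {
    (1..=n).map(|i| (i as f64).ln()).sum()
}

/// Natural log of the binomial coefficient C(n, k), for k <= n.
fn ln_choose(n: u64, k: u64) -> f64 {
    ln_factorial(n) - ln_factorial(k) - ln_factorial(n - k)
}

/// One-sided Fisher exact p-value (hypothetical helper, not the crate's API):
/// the probability, if both datasets share the same a/b ratio, that dataset 1
/// would contain at least `a1` a-events given the fixed table margins.
/// A small value is evidence that a1/b1 > a2/b2.
fn fisher_ratio_greater_pvalue(a1: u64, b1: u64, a2: u64, b2: u64) -> f64 {
    let group1 = a1 + b1; // all events recorded in dataset 1
    let total_a = a1 + a2; // all a-events across both datasets
    let total = a1 + b1 + a2 + b2;
    let ln_denominator = ln_choose(total, group1);
    // The a-count in dataset 1 cannot exceed either margin.
    let max_a1 = total_a.min(group1);
    (a1..=max_a1)
        .map(|x| {
            // Hypergeometric probability of exactly x a-events in dataset 1.
            (ln_choose(total_a, x) + ln_choose(total - total_a, group1 - x) - ln_denominator).exp()
        })
        .sum::<f64>()
        .min(1.0)
}
```

Applied to the Hunt Showdown counts used in this document (50 kills and 27 deaths at baseline, 4 kills and 9 deaths in treatment), `fisher_ratio_greater_pvalue(50, 27, 4, 9)` lands between 0.01 and 0.05, pointing the same way as the simulated result: the baseline kill/death ratio looks higher.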
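Why
An interesting statistic in games is the ratio of events (such as kills/deaths for various loadouts), or the rate of kills per match with and without an item.
I use this in kda-tools to run hypothesis tests on loadouts in Hunt Showdown.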
Dependencies
~6MB
~116K SLoC