3 versions
0.1.2 | Sep 5, 2019 |
---|---|
0.1.1 | Sep 5, 2019 |
0.1.0 | Aug 12, 2019 |
#457 in Science
185KB
3.5K SLoC
Saber
ScAlaBle Estimator Regressor
Build
Install Rust
curl https://sh.rustup.rs -sSf | sh
rustup toolchain install nightly
rustup default nightly
For more details, see https://rust-lang.net.cn/tools/install and https://github.com/rust-lang/rustup.rs#working-with-nightly-rust
Build Saber
RUSTFLAGS='-L /path/to/OpenBLAS -lopenblas -C target-cpu=native' cargo build --release
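where /path/to/OpenBLAS is the path to the directory containing the OpenBLAS library.
Run
From the top-level saber directory, the executables produced by the build are located in ./target/release
Some executables of interest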
./target/release/partition_by_chrom -h
partition_by_chrom 0.1
Aaron Zhou
USAGE:
partition_by_chrom --bim <BIM> --out <out_path>
FLAGS:
-h, --help Prints help information
-V, --version Prints version information
OPTIONS:
-b, --bim <BIM> required; the PLINK bim file
-o, --out <out_path> output path; each line will have two fields: variant_id chrom_partition_assignment
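To make the output format described above concrete, here is a minimal Rust sketch of how a "variant_id chrom_partition_assignment" listing could be derived from a PLINK .bim file. It is not Saber's own code. It assumes the partition label is simply the chromosome code in the first .bim column, which the help text does not confirm, and the function names partition_by_chromosome and render_partition_lines are hypothetical.

```rust
/// Pair each variant ID in PLINK .bim text with the chromosome code on its line.
/// A .bim line has six whitespace-separated fields:
/// chromosome, variant ID, genetic distance, base-pair position, allele 1, allele 2.
/// Assumption: the partition label is the chromosome code itself.
fn partition_by_chromosome(bim_text: &str) -> Result<Vec<(String, String)>, String> {
    let mut assignments = Vec::new();
    for (line_no, line) in bim_text.lines().enumerate() {
        // Skip blank lines, e.g. a trailing newline at the end of the file.
        if line.trim().is_empty() {
            continue;
        }
        let fields: Vec<&str> = line.split_whitespace().collect();
        if fields.len() != 6 {
            return Err(format!(
                "line {}: expected 6 fields, found {}",
                line_no + 1,
                fields.len()
            ));
        }
        // fields[1] is the variant ID, fields[0] is the chromosome code.
        assignments.push((fields[1].to_string(), fields[0].to_string()));
    }
    Ok(assignments)
}

/// Render the assignments as one "variant_id partition" pair per line.
fn render_partition_lines(assignments: &[(String, String)]) -> String {
    assignments
        .iter()
        .map(|(id, partition)| format!("{} {}\n", id, partition))
        .collect()
}
```

A file in this two-column form matches the SNP_ID PARTITION layout that the --partition option of estimate_heritability describes.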
./target/release/estimate_heritability -h
estimate_heritability 0.1
Aaron Zhou
USAGE:
estimate_heritability [OPTIONS] --nrv <num_random_vecs> --pheno <pheno_path> --bfile <plink_filename_prefix>
FLAGS:
-h, --help Prints help information
-V, --version Prints version information
OPTIONS:
--lowest-maf <lowest_allowed_maf>
Lowest allowed minor allele frequency (MAF)
Any SNPs with a MAF less than <lowest_allowed_maf> will be ignored
-k, --num-jackknifes <num_jackknife_partitions>
The number of jackknife partitions
SNPs will be divided into <num_jackknife_partitions> partitions
where each partition will be treated as a single point of observation [default: 20]
--nrv <num_random_vecs>
The number of random vectors used to estimate traces
Recommends at least 100 for small datasets, and 10 for huge datasets
--partition <partition_file>
A file to partition the SNPs into multiple components.
Each line consists of two values of the form:
SNP_ID PARTITION
For example,
rs3115860 1
will assign SNP with ID rs3115860 in the BIM file to a partition named 1
-p, --pheno <pheno_path>
The header line should be
FID IID PHENOTYPE_NAME
where PHENOTYPE_NAME can be any string without white spaces.
The rest of the lines are of the form:
1000011 1000011 -12.11363
-b, --bfile <plink_filename_prefix>
If we have files named
PATH/TO/x.bed PATH/TO/x.bim PATH/TO/x.fam
then the <plink_filename_prefix> should be path/to/x
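The --nrv option above controls how many random vectors are used to estimate traces. The help text does not say which estimator Saber uses, so the following Rust sketch is only an illustration of one common approach, the Hutchinson estimator with Rademacher (+1/-1) vectors. The names SplitMix64 and estimate_trace, the seed parameter, and the choice of random distribution are all assumptions made for this example.

```rust
/// Minimal SplitMix64 generator so the sketch needs no external crates.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Draw +1.0 or -1.0 from the top bit of the next output (a Rademacher variable).
    fn rademacher(&mut self) -> f64 {
        if self.next_u64() >> 63 == 0 {
            1.0
        } else {
            -1.0
        }
    }
}

/// Hutchinson estimate of tr(A): the average of z^T A z over random vectors z.
/// `matvec` computes A * z, so A itself never has to be stored explicitly.
/// Assumption: Rademacher vectors; Saber's actual choice is not documented here.
fn estimate_trace<F>(dim: usize, num_random_vecs: usize, seed: u64, matvec: F) -> f64
where
    F: Fn(&[f64]) -> Vec<f64>,
{
    assert!(num_random_vecs > 0, "need at least one random vector");
    let mut rng = SplitMix64(seed);
    let mut total = 0.0;
    for _ in 0..num_random_vecs {
        let z: Vec<f64> = (0..dim).map(|_| rng.rademacher()).collect();
        let az = matvec(&z);
        // Accumulate the quadratic form z^T (A z).
        total += z.iter().zip(az.iter()).map(|(a, b)| a * b).sum::<f64>();
    }
    total / num_random_vecs as f64
}
```

More random vectors reduce the variance of the estimate at the cost of more matrix-vector products. For a diagonal matrix every Rademacher sample already equals the exact trace, because each squared entry of z is 1.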
./target/release/estimate_g_gxg_heritability -h
estimate_multi_gxg_heritability 0.1
USAGE:
estimate_g_gxg_heritability [OPTIONS] --le <le_snps_filename_prefix> --nrv-gxg <num_rand_vecs_gxg> --nrv <num_random_vecs> --pheno <pheno_path>... --bfile <plink_filename_prefix>
FLAGS:
-h, --help Prints help information
-V, --version Prints version information
OPTIONS:
--gxg-partition <gxg_partition_file>
Form GxG for each of the partitions instead of
over the entire range of LE SNPs.
Taking the same file format as the --partition option
--le <le_snps_filename_prefix>
The SNPs that are in linkage equilibrium.
To be used to construct the GxG matrix.
If we have files named
PATH/TO/x.bed PATH/TO/x.bim PATH/TO/x.fam
then the <le_snps_filename_prefix> should be path/to/x
-k, --num-jackknifes <num_jackknife_partitions> The number of jackknife partitions [default: 20]
--nrv-gxg <num_rand_vecs_gxg>
The number of random vectors used to estimate traces related to the GxG matrix
--nrv <num_random_vecs>
The number of random vectors used to estimate traces
Recommends at least 100 for small datasets, and 10 for huge datasets
--partition <partition_file>
A file to partition the G SNPs into multiple components.
Each line consists of two values of the form:
SNP_ID PARTITION
For example,
rs3115860 1
will assign SNP with ID rs3115860 in the BIM file to a partition named 1
-p, --pheno <pheno_path>...
The header line should be
FID IID PHENOTYPE_NAME
where PHENOTYPE_NAME can be any string without white spaces.
The rest of the lines are of the form:
1000011 1000011 -12.11363
-b, --bfile <plink_filename_prefix>
If we have files named
PATH/TO/x.bed PATH/TO/x.bim PATH/TO/x.fam
then the <plink_filename_prefix> should be path/to/x
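The -k option above splits the SNPs into jackknife partitions, each treated as a single observation. As a rough illustration of what that means, the Rust sketch below assigns SNP indices to contiguous blocks and computes the standard delete-one jackknife standard error from the leave-one-block-out estimates. Contiguous blocks, the unweighted formula, and the names block_of and jackknife_standard_error are assumptions for this example; the help text does not state how Saber forms or weights its partitions.

```rust
/// Assign SNP `snp_index` (0-based) to one of `k` contiguous blocks.
/// Assumption: blocks are contiguous and as equal in size as possible.
fn block_of(snp_index: usize, num_snps: usize, k: usize) -> usize {
    assert!(k > 0 && snp_index < num_snps);
    snp_index * k / num_snps
}

/// Delete-one jackknife standard error.
/// `leave_one_out[i]` is the estimate recomputed with block i removed.
/// SE = sqrt((k - 1) / k * sum_i (theta_i - mean)^2), where mean is the
/// average of the leave-one-out estimates.
/// Returns None when fewer than two blocks are available.
fn jackknife_standard_error(leave_one_out: &[f64]) -> Option<f64> {
    let k = leave_one_out.len();
    if k < 2 {
        return None;
    }
    let kf = k as f64;
    let mean = leave_one_out.iter().sum::<f64>() / kf;
    let sum_sq = leave_one_out
        .iter()
        .map(|theta| (theta - mean).powi(2))
        .sum::<f64>();
    Some(((kf - 1.0) / kf * sum_sq).sqrt())
}
```

With the default of 20 partitions, leave_one_out would hold 20 heritability estimates, each computed without one block of SNPs.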
Dependencies
~9–17MB
~280K SLoC