#alignment #read #high #bam #filtering #numbers #clipped

bin+lib filter-clipped

A BAM/SAM tool for filtering highly clipped NGS reads out of alignment files

3 releases (breaking)

0.3.0 Jan 29, 2023
0.2.0 Aug 16, 2022
0.1.0 Aug 7, 2022

#22 in #bam

MIT license

20KB
282 lines of code

filter-clipped


[A Rust learning project]

Remove alignments with a high number of clipped bases

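Sometimes an aligner uses a very loose scoring scheme and writes alignments with a large number of soft- or hard-clipped bases into the output BAM file. This program filters those reads out by capping the number of clipped bases relative to the read sequence length.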
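To make the idea concrete, here is a minimal sketch of how a clipped fraction could be derived from a CIGAR string. It is not the crate's actual code: the function names `parse_cigar` and `clipped_fractions` are hypothetical, it uses only the Rust standard library, and it assumes that both `S` and `H` operations count as clipped and that the read length is the sum of the `M`, `I`, `S`, `=`, `X` and `H` operations.

```rust
/// Parse a CIGAR string such as "10S80M10S" into (length, operation) pairs.
/// Returns None when the string is empty or malformed.
fn parse_cigar(cigar: &str) -> Option<Vec<(u32, char)>> {
    let mut ops = Vec::new();
    let mut len: Option<u32> = None;
    for ch in cigar.chars() {
        if let Some(d) = ch.to_digit(10) {
            // Accumulate the decimal length that precedes an operation.
            len = Some(len.unwrap_or(0).checked_mul(10)?.checked_add(d)?);
        } else if "MIDNSHP=X".contains(ch) {
            // An operation without a preceding length is malformed.
            ops.push((len.take()?, ch));
        } else {
            return None;
        }
    }
    // Trailing digits without an operation, or no operations at all.
    if len.is_some() || ops.is_empty() {
        return None;
    }
    Some(ops)
}

/// Fractions of the read clipped on the left, on the right, and in total.
/// Assumption: read length = M + I + S + = + X + H (the full original read).
fn clipped_fractions(cigar: &str) -> Option<(f64, f64, f64)> {
    let ops = parse_cigar(cigar)?;
    let is_clip = |op: char| op == 'S' || op == 'H';

    let read_len: u32 = ops
        .iter()
        .filter(|(_, op)| "MIS=XH".contains(*op))
        .map(|(n, _)| *n)
        .sum();
    if read_len == 0 {
        return None;
    }

    // Clips can only sit at the ends of a valid CIGAR, so whatever is not
    // in the leading run of clip operations belongs to the right side.
    let total: u32 = ops.iter().filter(|(_, op)| is_clip(*op)).map(|(n, _)| *n).sum();
    let left: u32 = ops.iter().take_while(|(_, op)| is_clip(*op)).map(|(n, _)| *n).sum();
    let right = total - left;

    let len = f64::from(read_len);
    Some((f64::from(left) / len, f64::from(right) / len, f64::from(total) / len))
}

fn main() {
    for cigar in ["100M", "10S80M10S", "5H20S75M"] {
        match clipped_fractions(cigar) {
            Some((l, r, t)) => println!("{cigar}: left={l:.2} right={r:.2} total={t:.2}"),
            None => println!("{cigar}: could not be parsed"),
        }
    }
}
```

With the default cut-offs of 0.1, a read such as `10S80M10S` would sit exactly on the per-side limits but exceed the combined one under these assumptions.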

Installation

$ git clone https://github.com/wckdouglas/filter-clipped
$ cd filter-clipped
$ cargo install --path .  # if compilation error, try CC=/usr/bin/gcc cargo install --path .
$ filter-clipped --help
filter-clipped 0.1.0
Remove alignments with high number of clipped base. Sometimes aligner has very loose scoring methods
and write alignments with high abundant of soft/hard-clipped base into alignment BAM files. This
program is for filtering these reads out by gating the number of clipped bases in relative to the
read sequence length

USAGE:
    filter-clipped [OPTIONS] --in-bam <IN_BAM>

OPTIONS:
    -b, --both-end <BOTH_END>        maximum fraction of total bases on the sequence being clipped
                                     [default: 0.1]
    -h, --help                       Print help information
    -i, --in-bam <IN_BAM>            input bam file path  ("-" for stdin)
        --inverse                    keeping the failed ones (high-clipped-fraction alignments)
    -l, --left-side <LEFT_SIDE>      maximum fraction of bases on the sequence being clipped from
                                     the left side (5' end) [default: 0.1]
    -o, --out-bam <OUT_BAM>          output bam file path ("-" for stdout) [default: -]
    -r, --right-side <RIGHT_SIDE>    maximum fraction of bases on the sequence being clipped from
                                     the right side (3' end) [default: 0.1]
    -V, --version                    Print version information
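
One plausible reading of how the three cut-offs and `--inverse` interact is sketched below. This is an assumption rather than something taken from the crate source: an alignment is treated as passing only when the left, right, and combined clipped fractions are all at or below their limits, and `--inverse` simply flips that decision. The `Thresholds` struct and `keep_alignment` function are hypothetical names; the defaults mirror the help text above.

```rust
/// Hypothetical container for the three cut-offs (-l, -r, -b).
#[derive(Debug, Clone, Copy)]
struct Thresholds {
    left_side: f64,
    right_side: f64,
    both_end: f64,
}

impl Default for Thresholds {
    /// Defaults as printed by `filter-clipped --help`.
    fn default() -> Self {
        Thresholds { left_side: 0.1, right_side: 0.1, both_end: 0.1 }
    }
}

/// Decide whether an alignment should be written to the output.
/// `left` and `right` are clipped base counts; `read_len` is the read length.
/// Assumption: comparisons are inclusive (fraction <= threshold passes).
fn keep_alignment(left: u32, right: u32, read_len: u32, t: Thresholds, inverse: bool) -> bool {
    if read_len == 0 {
        // Nothing to judge; this sketch drops such records either way.
        return false;
    }
    let len = f64::from(read_len);
    let left_frac = f64::from(left) / len;
    let right_frac = f64::from(right) / len;
    let total_frac = (f64::from(left) + f64::from(right)) / len;

    let passes = left_frac <= t.left_side
        && right_frac <= t.right_side
        && total_frac <= t.both_end;

    // With `inverse` set, keep exactly the alignments that failed the check.
    passes != inverse
}

fn main() {
    let t = Thresholds::default();
    // 4 + 4 clipped bases out of 100: every fraction is under its limit.
    println!("{}", keep_alignment(4, 4, 100, t, false));
    // 6 + 6 clipped bases: each side is fine, the combined fraction is not.
    println!("{}", keep_alignment(6, 6, 100, t, false));
    // The same read is retained when the filter is inverted.
    println!("{}", keep_alignment(6, 6, 100, t, true));
}
```

The second case shows why a separate combined limit is useful: a read can stay under both per-side limits and still have too many clipped bases overall.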

Testing

cargo test

Docker

docker pull ghcr.io/wckdouglas/filter-clipped:main
docker run --env RUST_LOG=info -v "$PWD":/root/ ghcr.io/wckdouglas/filter-clipped:main --in-bam /root/bamfile_in_current_path.bam | samtools view

Dependencies

~14–24MB
~382K SLoC