
nwr is a command-line tool for working with Newick files and taxonomy

17 versions

0.7.2 May 28, 2024
0.7.0 Sep 15, 2023
0.6.2 Jul 18, 2023
0.5.10 Jan 28, 2023
0.5.5 Mar 4, 2022

#49 in Biology

MIT/GPL-3.0 license

5MB
6K SLoC

Rust 5K SLoC // 0.1% comments Shell 1K SLoC // 0.1% comments

nwr


nwr is a command-line tool for working with NCBI taxonomy, Newick files, and assembly reports, written in Rust.

Install

Current release: 0.7.2

cargo install nwr

# or
brew install wang-q/tap/nwr

# local repo
cargo install --path . --force --offline

# build under WSL 2
export CARGO_TARGET_DIR=/tmp
cargo build

nwr help

`nwr` is a command line tool for working with NCBI taxonomy, Newick files and assembly reports

Usage: nwr [COMMAND]

Commands:
  append       Append fields of higher ranks to a TSV file
  ardb         Init the assembly database
  comment      Add comments to node(s) in a Newick file
  common       Output the common tree of terms
  distance     Output a TSV/phylip file with distances between all named nodes
  download     Download the latest releases of `taxdump` and assembly reports
  indent       Indent the Newick file
  info         Information of Taxonomy ID(s) or scientific name(s)
  kb           Prints docs (knowledge bases)
  label        Labels in the Newick file
  lineage      Output the lineage of the term
  member       List members (of certain ranks) under ancestral term(s)
  order        Order nodes in a Newick file
  pl-condense  Pipeline - condense subtrees based on taxonomy
  prune        Remove nodes from the Newick file
  rename       Rename named/unnamed nodes in a Newick file
  replace      Replace node names/comments in a Newick file
  reroot       Place the root in the middle of the desired node and its parent
  restrict     Restrict taxonomy terms to ancestral descendants
  subtree      Extract a subtree
  stat         Statistics about the Newick file
  template     Create dirs, data and scripts for a phylogenomic research
  tex          Visualize the Newick tree via LaTeX
  topo         Topological information of the Newick file
  txdb         Init the taxonomy database
  help         Print this message or the help of the given subcommand(s)

Options:
  -h, --help     Print help
  -V, --version  Print version


Subcommand groups:

* Database
    * download
    * txdb
    * ardb

* Taxonomy
    * info
    * lineage
    * member
    * append
    * restrict
    * common

* Newick
    * Information
        * label
        * stat
        * distance
    * Manipulation
        * order
        * rename
        * replace
        * topo
        * subtree
        * prune
        * reroot
        * pl-condense
    * Visualization
        * indent
        * comment
        * tex

* Assembly
    * template
    * kb

Examples

Usage of each command

For practical uses of nwr and other awesome companions, see this page.

nwr download

nwr txdb

nwr info "Homo sapiens" 4932

nwr lineage "Homo sapiens"
nwr lineage 4932

nwr restrict "Vertebrata" -c 2 -f tests/nwr/taxon.tsv
##sci_name       tax_id
#Human   9606

nwr member "Homo"

nwr append tests/nwr/taxon.tsv -c 2 -r species -r family --id

nwr ardb
nwr ardb --genbank

nwr common "Escherichia coli" 4932 Drosophila_melanogaster 9606 "Mus musculus"
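
The `restrict` and `append` examples above read taxonomy IDs from a column of a TSV file, selected with `-c`. The following is a minimal sketch of preparing such a file by hand. The temporary file and its two rows are hypothetical; the layout (a `#`-prefixed header, a name column, an ID column) is assumed from the sample output shown for `nwr restrict`. The final `nwr` call runs only if the tool is installed, and it still needs a taxonomy database built with `nwr download` and `nwr txdb`.

```shell
# Hypothetical input file; the layout mirrors the sample output shown above
tsv=$(mktemp "${TMPDIR:-/tmp}/taxon.XXXXXX")
printf '#sci_name\ttax_id\n' > "$tsv"
printf 'Human\t9606\n' >> "$tsv"
printf 'Yeast\t4932\n' >> "$tsv"

# Column 2 holds the taxonomy IDs, which is why the examples pass `-c 2`
ids=$(grep -v '^#' "$tsv" | cut -f 2 | paste -sd, -)
echo "IDs in column 2: $ids"

# Run nwr only when it is available; a missing database makes it fail
if command -v nwr >/dev/null 2>&1; then
    nwr restrict "Vertebrata" -c 2 -f "$tsv" ||
        echo "nwr restrict failed (is the taxonomy database initialized?)"
fi
```

Checking the ID column with `cut` first is a cheap way to confirm that the column number passed to `-c` points at the IDs rather than the names.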

Development

# Concurrent tests may trigger sqlite locking
cargo test -- --test-threads=1

cargo test --color=always --package nwr --test cli_nwr command_template -- --show-output

# debug mode has a slow connection
cargo run --release --bin nwr download

# tests/nwr/
cargo run --bin nwr txdb -d tests/nwr/

cargo run --bin nwr info -d tests/nwr/ --tsv Viruses "Actinophage JHJ-1" "Bacillus phage bg1"

cargo run --bin nwr common -d tests/nwr/ "Actinophage JHJ-1" "Bacillus phage bg1"

cargo run --bin nwr template tests/assembly/Trichoderma.assembly.tsv --ass -o stdout

Newick files and LaTeX

For more detailed usage, see this file.

Get information from the tree

# List all names
nwr label tests/newick/hg38.7way.nwk

# The intersection between the nodes in the tree and the names provided
nwr label tests/newick/hg38.7way.nwk -r "^ch" -n Mouse -n foo
nwr label tests/newick/catarrhini.nwk -n Homo -n Pan -n Gorilla -M
# Is Pongo the sibling of Homininae?
nwr label tests/newick/catarrhini.nwk -n Homininae -n Pongo -DM
# All leaves belong to Hominidae
nwr label tests/newick/catarrhini.nwk -t Hominidae -I

nwr label tests/newick/catarrhini.nwk -c dup
nwr label tests/newick/catarrhini.comment.nwk -c full

nwr stat tests/newick/hg38.7way.nwk

# Various distances
nwr distance -m root -I tests/newick/catarrhini.nwk
nwr distance -m parent -I tests/newick/catarrhini.nwk
nwr distance -m pairwise -I tests/newick/catarrhini.nwk
nwr distance -m lca -I tests/newick/catarrhini.nwk

nwr distance -m root -L tests/newick/catarrhini_topo.nwk

# Phylip distance matrix
nwr distance -m phylip tests/newick/catarrhini.nwk
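
When reading the output of the label and distance commands above, it can help to have numbers worked out by hand for a tiny tree. The sketch below does that with plain shell tools and does not call `nwr` at all. The inline tree is a hypothetical example rather than a file from `tests/newick/`, and the leaf count assumes no commas appear inside quoted labels or comments.

```shell
# Hypothetical inline tree; not one of the files under tests/newick/
tree='((A:1,B:1)D:1,C:1)E;'

# Siblings are comma-separated, so a tree has (commas + 1) leaves.
# Assumes no commas inside quoted labels or comments.
commas=$(printf '%s' "$tree" | tr -cd ',' | wc -c | tr -d ' ')
leaves=$((commas + 1))
echo "leaves: $leaves"

# Distances to the root E, read off the branch lengths by hand
root_A=$((1 + 1))   # A -> D -> E
root_C=1            # C -> E
root_lca=0          # the lowest common ancestor of A and C is the root itself

# Path length between two nodes:
# sum of their root distances minus twice the root distance of their LCA
pair_AC=$((root_A + root_C - 2 * root_lca))
echo "A-C: $pair_AC"
```

The same identity relates the root and pairwise distances of any two nodes, so it can serve as a quick consistency check on a larger tree.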

Manipulation of the tree

echo "((A,B),C);" | nwr order --ndr stdin
nwr order --nd tests/newick/hg38.7way.nwk

nwr rename tests/newick/abc.nwk -n C -r F -l A,B -r D

nwr replace tests/newick/abc.nwk tests/newick/abc.replace.tsv
nwr replace tests/newick/abc.nwk tests/newick/abc3.replace.tsv

nwr topo tests/newick/catarrhini.nwk

# The behavior is very similar to `nwr label`, but outputs a subtree instead of labels
nwr subtree tests/newick/hg38.7way.nwk -n Human -n Rhesus -r "^ch" -M

# Condense the subtree to a node
nwr subtree tests/newick/hg38.7way.nwk -n Human -n Rhesus -r "^ch" -M -c Primates

nwr subtree tests/newick/catarrhini.nwk -t Hominidae

nwr prune tests/newick/catarrhini.nwk -n Homo -n Pan

echo "((A:1,B:1)D:1,C:1)E;" |
    nwr reroot stdin -n B
nwr reroot tests/newick/catarrhini_wrong.nwk -n Cebus

cargo run --bin nwr pl-condense tests/newick/catarrhini.nwk -r family

Visualization of the tree

nwr indent tests/newick/hg38.7way.nwk --text ".   "

echo "((A,B),C);" |
    nwr comment stdin -n A -n C --color green |
    nwr comment stdin -l A,B --dot

tectonic doc/template.tex

nwr tex tests/newick/catarrhini.nwk -o output.tex
tectonic output.tex

nwr tex --bl tests/newick/hg38.7way.nwk

nwr tex --forest --bare tests/newick/test.forest

nwr common "Escherichia coli" 4932 Drosophila_melanogaster 9606 "Mus musculus" |
    nwr tex --bare stdin

Database schema

brew install k1LoW/tap/tbls

tbls doc sqlite://./tests/nwr/taxonomy.sqlite doc/txdb

tbls doc sqlite://./tests/nwr/ar_refseq.sqlite doc/ardb

txdb

ardb

Dependencies

~64MB
~1M SLoC