groan_rs — Rust库 // Lib.rs • Rust 包仓库

17个版本 (7个破坏性更新)

0.8.2	2024年8月8日
0.8.1	2024年6月23日
0.7.1	2024年3月15日
0.5.0	2023年11月24日

#403 在科学

每月下载136次
用于 gcenter

MIT 许可证

1.5MB
28K SLoC

groan_rs: 基于Rust的Gromacs分析库

Rust库，用于分析Gromacs模拟。目前处于开发早期阶段：任何内容都可能随时中断、更改或停止工作。

使用方法

运行

$ cargo add groan_rs

在Rust代码中导入该crate

use groan_rs::prelude::*;

许可证

此库根据MIT许可证发布。

完整文档

有关完整文档，请参阅相应的docs.rs页面。

lib.rs:

groan_rs: 基于Rust的Gromacs分析库

Rust库，用于分析Gromacs模拟。目前处于开发早期阶段：任何内容都可能随时中断、更改或停止工作。

它能做什么

读取和写入gro、pdb、pqr、ndx、xtc和trr文件。

从tpr文件中读取系统的拓扑和结构（使用minitprcrate）。

遍历原子并访问它们的属性，包括连接性（键）。

使用类似VMD的选择语言选择原子。

计算原子间的距离，同时遵守周期性边界条件。

根据几何条件选择原子。

分配元素到原子并猜测连接性。

计算任何原子组的几何中心或质量。

在一个模拟盒子中定位原子组。

通过提供实用数据结构（例如，GridMaps）帮助进行特定分析。

一些其他小功能...

它目前不能做什么

与非正交周期性边界条件一起工作。

无需额外工具即可进行结构和动力学的高级分析。（但是groan_rs库试图使实现自己的功能变得简单！)

使用方法

运行

$ cargo add groan_rs

在Rust代码中导入该crate

use groan_rs::prelude::*;

示例

分析结构文件

读取gro文件和ndx文件，并计算蛋白质的几何中心。

use groan_rs::prelude::*; use std::error::Error; fn main() -> Result<(), Box<dyn Error + Send + Sync>> { // read a gro file let mut system = System::from_file("system.gro")?; // `groan_rs` also supports pdb, pqr, and tpr files which // can be read using the same function as above // read an ndx file system.read_ndx("index.ndx")?; // calculate the center of geometry of a protein // note that the ndx file `index.ndx` must contain a group `Protein` for this to work let center = system.group_get_center("Protein")?; // print the result println!("{:?}", center); Ok(()) }

选择原子并创建组

读取gro文件，并选择属于系统特定部分的原子。groan_rs使用groan选择语言（GSL）来选择原子。GSL类似于VMD查询语言，见下文。

use groan_rs::prelude::*; use std::error::Error; fn main() -> Result<(), Box<dyn Error + Send + Sync>> { // read a gro file let mut system = System::from_file("system.gro")?; // select atoms using groan selection language creating a group 'My Group' system.group_create("My Group", "serial 25-28 or (resname POPC and name P)")?; // the group 'My Group' is now tightly associated with the system // we can now use the previously created group to construct another group // note that // a) each atom can be in any number of groups // b) the group names are case-sensitive system.group_create("my group", "'My Group' || resid 87 to 124")?; // we can then perform operations with the groups, e.g. write them into separate pdb files system.group_write_pdb("My Group", "My_Group.pdb", false)?; system.group_write_pdb("my group", "my_group.pdb", false)?; // each system also by default contains two groups consisting of all atoms in the system // these groups are called 'All' and 'all' assert!(system.group_exists("all")); assert!(system.group_exists("All")); // if you read an ndx file into the system like this: system.read_ndx("index.ndx")?; // the groups defined in the ndx file will be associated // with the system in the same way as the manually created groups Ok(()) }

分析轨迹文件

读取xtc文件，并计算从100纳秒开始到300纳秒结束的每一帧中两个原子组之间的距离。（groan_rs支持过程式和函数式方法。）

过程式方法

use groan_rs::prelude::*; use std::error::Error; fn main() -> Result<(), Box<dyn Error + Send + Sync>> { // read a gro file let mut system = System::from_file("structure.gro")?; // create the groups we are interested in // `groan_rs` uses VMD-like selection language for specifying groups of atoms system.group_create("group 1", "serial 1 to 5")?; system.group_create("group 2", "resid 45")?; // prepare a vector for the calculated distances let mut distances = Vec::new(); // read the trajectory calculating the distance between the groups for each frame for frame in system .xtc_iter("files/md_short.xtc")? // we are only interested in frames between 100 and 300 ns .with_range(100_000.0, 300_000.)? { // check that the xtc frame has been read correctly let frame = frame?; // calculate the distance and put it into the vector distances.push( frame .group_distance("group 1", "group 2", Dimension::XYZ) .expect("Groups do not exist but they should."), ); } // print the calculated distances println!("{:?}", distances); Ok(()) }

函数式方法

use groan_rs::prelude::*; use std::error::Error; fn main() -> Result<(), Box<dyn Error + Send + Sync>> { // read a gro file let mut system = System::from_file("structure.gro")?; // create the groups we are interested in // `groan_rs` uses VMD-like selection language for specifying groups of atoms system.group_create("group 1", "serial 1 to 5")?; system.group_create("group 2", "resid 45")?; // read the trajectory calculating distance between the groups // for each frame in the time range 100-300 ns // and collect the results into a vector let distances: Vec<f32> = system // read an xtc trajectory .xtc_iter("trajectory.xtc")? // we are only interested in frames between 100 and 300 ns .with_range(100_000.0, 300_000.)? // calculate distance between the groups for each frame .map(|frame| { // check that the xtc frame has been read correctly let frame = frame?; // calculate the distance Ok(frame .group_distance("group 1", "group 2", Dimension::XYZ) .expect("Groups do not exist but they should.")) }) // collect the calculated distances // if any error occured while reading the trajectory, propagate it .collect::<Result<Vec<f32>, Box<dyn Error + Send + Sync>>>()?; // print the calculated distances println!("{:?}", distances); Ok(()) }

请注意，with_range是一个非常有效的方法，它将跳过不在指定范围内的xtc帧，而无需实际读取这些帧中原子的属性。

您还可以让轨迹迭代器将轨迹读取信息打印到标准输出

use groan_rs::prelude::*; use std::error::Error; fn main() -> Result<(), Box<dyn Error + Send + Sync>> { let mut system = System::from_file("structure.gro")?; system.group_create("group 1", "serial 1 to 5")?; system.group_create("group 2", "resid 45")?; // create default progress printer let printer = ProgressPrinter::new(); let distances: Vec<f32> = system .xtc_iter("trajectory.xtc")? // attach progress printer to the iterator .print_progress(printer) .map(|frame| { let frame = frame?; Ok(frame .group_distance("group 1", "group 2", Dimension::XYZ) .expect("Groups do not exist but they should.")) }) .collect::<Result<Vec<f32>, Box<dyn Error + Send + Sync>>>()?; println!("{:?}", distances); Ok(()) }

在trr和xtc文件之间转换

读取trr文件，并将粒子的位置写入新的xtc文件。

use groan_rs::prelude::*; use std::error::Error; fn main() -> Result<(), Box<dyn Error + Send + Sync>> { // read a gro file let mut system = System::from_file("structure.gro")?; // create an output xtc file // the `XtcWriter` structure is tightly coupled with the corresponding `System` structure // if the `System` is updated, the `XtcWriter` reflects this change let mut writer = XtcWriter::new(&system, "output.xtc")?; // iterate through the trr trajectory // the `trr_iter` (as well as `xtc_iter`) function updates the `System` structure // with which it is associated with the properties of the system in the current frame for frame in system.trr_iter("input.trr")? { // check that the trr frame has been read correctly frame?; // write the current frame into the output xtc file writer.write_frame()?; } Ok(()) }

遍历原子

遍历两个不同组中的原子，收集它们的信息或修改它们。

过程式方法

use groan_rs::prelude::*; use std::error::Error; fn main() -> Result<(), Box<dyn Error + Send + Sync>> { // read a gro file and an ndx file let mut system = System::from_file("system.gro")?; system.read_ndx("index.ndx")?; // immutably iterate over atoms of the group 'Protein' collecting atom names let mut names = Vec::new(); for atom in system.group_iter("Protein")? { names.push(atom.get_atom_name().to_owned()); } // (you can also iterate over all the atoms in the system using `System::atoms_iter`) // print the names println!("{:?}", names); // mutably iterate over atoms of the group 'Membrane' changing their residue names for atom in system.group_iter_mut("Membrane")? { atom.set_residue_name("MEMB"); } // (you can also iterate over all the atoms in the system using `System::atoms_iter_mut`) Ok(()) }

函数式方法

use groan_rs::prelude::*; use std::error::Error; fn main() -> Result<(), Box<dyn Error + Send + Sync>> { // read a gro file and an ndx file let mut system = System::from_file("system.gro")?; system.read_ndx("index.ndx")?; // immutably iterate over atoms of the group 'Protein' collecting atom names let names: Vec<String> = system .group_iter("Protein")? .map(|atom| atom.get_atom_name().to_owned()) .collect(); // (you can also iterate over all the atoms in the system using `System::atoms_iter`) // print the names println!("{:?}", names); // mutably iterate over atoms of the group 'Membrane' changing their residue names system .group_iter_mut("Membrane")? .for_each(|atom| atom.set_residue_name("MEMB")); // (you can also iterate over all the atoms in the system using `System::atoms_iter_mut`) Ok(()) }

根据几何标准过滤和提取原子

过滤位于指定位置和尺寸的圆柱体内的原子，并将它们提取到单独的向量中。

过程式方法

use groan_rs::prelude::*; use std::error::Error; fn main() -> Result<(), Box<dyn Error + Send + Sync>> { // read a gro file let system = System::from_file("system.gro")?; // extract atoms that are located inside a specified cylinder let mut inside_cylinder = Vec::new(); // the cylinder has its center of the base at x = 1.5, y = 2.5, z = 3.5 (in nm) // the cylinder has radius of 2.1 nm and height of 4.3 nm // the cylinder has its main axis oriented along the z-axis of the system let cylinder = Cylinder::new([1.5, 2.5, 3.5].into(), 2.1, 4.3, Dimension::Z); for atom in system.atoms_iter() { if cylinder.inside(atom.get_position().unwrap(), system.get_box_as_ref().unwrap()) { inside_cylinder.push(atom.clone()); } } // atoms in the `inside_cylinder` vector are fully independent copies // of the original atoms in the `System` structure // print the number of extracted atoms println!("{:?}", inside_cylinder.len()); Ok(()) }

函数式方法

use groan_rs::prelude::*; use std::error::Error; fn main() -> Result<(), Box<dyn Error + Send + Sync>> { // read a gro file let system = System::from_file("system.gro")?; // extract atoms that are located inside a specified cylinder let inside_cylinder: Vec<Atom> = system .atoms_iter() // the cylinder has its center of the base at x = 1.5, y = 2.5, z = 3.5 (in nm) // the cylinder has radius of 2.1 nm and height of 4.3 nm // the cylinder has its main axis oriented along the z-axis of the system .filter_geometry(Cylinder::new( [1.5, 2.5, 3.5].into(), 2.1, 4.3, Dimension::Z, )) .cloned() .collect(); // atoms in the `inside_cylinder` vector are fully independent copies // of the original atoms in the `System` structure // print the number of extracted atoms println!("{}", inside_cylinder.len()); Ok(()) }

groan选择语言

groan选择语言（GSL）是用于在groan_rs库中选择原子的查询语言。一般来说，GSL与VMD使用的选择语言类似。

基本查询

您可以根据以下条件选择原子：

使用resname XYZ根据它们的残基名称。例如，resname POPE将选择所有命名为POPE的系统中的原子。
使用resid XYZ或resnum XYZ根据它们的残基编号。例如，resid 17将选择所有编号为17的系统中的原子。使用name XYZ或atomname XYZ根据它们的原子名称。例如，name P将选择所有名称为P的系统中的原子。使用serial XYZ根据它们的真实原子编号。例如，serial 256将选择编号为256的原子（这保证是单个原子）。请注意，在这种情况下，原子是根据gromacs理解的“真实”原子编号选择的，而不是系统起源于的gro或pdb文件中指定的原子编号。他们使用 atomid XYZ 或 atomnum XYZ 来指定原子序数。例如，atomid 124 将选择所有在系统来源的 gro 或 pdb 文件中原子序数为 124 的原子。这可能涉及多个原子。他们使用 chain X 指定链标识符。例如 chain A 将选择属于链 'A' 的所有原子。链的信息通常存在于 pdb 文件中。注意，如果没有链信息，使用此关键字将不会选择任何原子。他们使用 element name XYZ 或 elname XYZ 指定元素名称。例如，element name carbon 将选择所有碳原子。注意，原子必须被分配元素才能被选择，请参阅下文。他们使用 element symbol X 或 elsymbol X 指定元素符号。例如，element symbol C 将选择所有碳原子。注意，原子必须被分配元素才能被选择，请参阅下文。他们使用 label XYZ 指定标签，见下文。
多个标识符您可以在查询中指定多个标识符。例如，使用 resname POPE POPG，您将选择与名为 POPE 的残基以及与名为 POPG 的残基对应的系统中的所有原子。下面是类似查询的示例。 resid 13 15 16 17 将选择编号为 13、15、16 或 17 的残基对应的原子。 name P CA HA 将选择具有原子名称 P、CA 或 HA 的所有原子。 serial 245 267 269 271 将选择编号为 245、267、269 或 271 的原子。 chain A B C 将选择属于链 'A'、'B' 或 'C' 的原子。 elname carbon hydrogen 将选择所有碳和氢原子。 elsymbol C H 将选择所有碳和氢原子。使用组选择原子您还可以使用先前创建的原子组来选择原子。例如，如果您已创建了名为 "Protein" 和 "Membrane" 的组，您可以使用查询 group Protein Membrane 或仅 Protein Membrane 来选择这两个组的原子。如果任何组不存在，将引发错误。如果您使用 System::read_ndx() 加载了系统的 ndx 文件，您还可以使用 ndx 文件中的组。如果您的组由多个单词组成，您必须将其放在引号（' 或 "）中。例如，Protein 'My Group' 将选择组 "Protein" 以及组 "My Group" 的所有原子。自动检测选择原子您可以使用内部定义的宏来选择原子。目前，groan_rs 库提供了六个这样的宏。 @protein 将选择所有与氨基酸对应的原子（支持140多种不同的氨基酸）。 @water 将选择所有水分子的原子。 @ion 将选择所有离子的原子。 @membrane 将选择所有与膜脂质对应的原子（支持200多种膜脂质类型）。 @dna 将选择属于DNA分子的所有原子。 @rna 将选择属于RNA分子的所有原子。请注意，虽然groan宏通常很可靠，但不能保证它们一定能正确识别所有原子。使用时请谨慎。选择所有原子您可以使用 all 选择系统的所有原子。范围您可以使用关键字 to 或 - 来指定范围，而不是明确写出残基或原子编号。例如，您可以使用 resid 14 to 20 或 resid 14-20 来代替 resid 14 15 16 17 18 19 20。这将选择所有残基编号为14、15、16、17、18、19或20的原子。您还可以在一个查询中指定多个范围，或者将范围与明确提供的数字组合。例如，serial 1 3 to 6 10 12 - 14 17 将展开为 serial 1 3 4 5 6 10 12 13 14 17。您可以使用 <、>、<= 和 >= 运算符来指定开放式范围。例如，您可以使用 serial <= 180 来代替 serial 1 to 180。这将选择所有原子编号小于或等于180的原子。同样，resid > 33 将选择所有残基编号为34或更高的原子。开放式范围可以与from-to范围和明确提供的数字组合。例如，serial 1 3-6 >=20 将选择所有原子编号为1、3、4、5、6或20及以上的原子。否定在查询前使用关键词 not 或 ! 可以对其取反。例如，查询 not name CA 或 ! name CA 将选择所有名称不对应于 CA 的原子。同样地，not resname POPE POPG 将选择所有名称不是 POPE 或 POPG 的残基对应的原子。!Protein 将选择所有不属于名为 Protein 的组的原子。二元运算您可以通过使用 and （&&）和 or （||）运算符来组合基本查询。通过 and 联接两个查询将只选择两个查询都选择的原子。例如，resname POPE and name P 将选择所有属于名为 POPE 的残基并且具有名称 P 的原子。同样地，resid 17 18 && serial 256 to 271 将只选择对应于残基 17 或 18 并且原子编号在 256 到 271 之间的原子（包括 271）。通过 or 联接两个查询将选择由任一查询选择的原子（至少一个是）。例如，resname POPE or name P 将选择所有属于名为 POPE 的残基以及所有具有名称 P 的原子。同样地，resid 17 18 || serial 256 to 271 将选择所有对应于残基 17 或 18 的原子以及所有原子编号在 256 到 271 之间的原子。如果在一个查询中使用了多个 and 和/或 or 运算符，它们将按从左到右的顺序计算。例如，resname POPE or name CA and not Protein 将选择所有属于名为 POPE 的残基或具有原子名称 CA 的原子，但所有这些原子不能属于名为 Protein 的组。可以使用运算符将自检测宏与其他子查询组合，即 @membrane or group 'My Lipids' 将选择所有自动检测到的膜脂和组 "My Lipids" 的所有原子。括号您可以通过使用括号 ( 和 ) 来改变个别子查询和操作的评估顺序。括号内的表达式首先计算（想想数学）。例如，resname POPE or (name CA and not resid 18 to 21) 将选择所有属于名为 POPE 的残基以及所有具有原子名称 P 并且不对应于编号为 18 到 21 的残基的原子。。同时，(resname POPE 或名称 CA) 并且不是 resid 18 到 21 等同于 resname POPE 或名称 CA 并且不是 resid 18 到 21。这将选择所有名为 POPE 或具有原子名称 CA 的原子，但这些原子不能属于 18 到 21 号残基。您可以将括号表达式放入其他括号表达式中。例如，serial 1 到 6 或 (name CA 和 resname POPE || (resid 1 到 7 或 serial 123 到 128)) 和 Protein 是一个有效的查询，尽管可能过于复杂。您还可以在括号表达式前放置 not (!) 操作符。例如，!(serial 1 到 6 && name P) 将选择所有既不具有原子编号在 1 到 6 之间、同时具有原子名称 P 的原子。选择分子您可以使用 molecule with (或 mol with) 操作符选择所有属于特定分子的原子。例如，molecule with serial 15 将选择与原子编号为 15 的原子属于同一分子的所有原子（包括原子 15）。同样，molecule with resid 4 17 29 将选择所有包含 4、17 或 29 号残基原子中的任一原子的所有分子的所有原子。然后，molecule with name P 将选择所有包含具有名称 P 的原子的所有分子的所有原子。请注意，a) 我们所说的分子是由键连接的原子集合，b) 分子不是一个残基。 molecule with 操作符仅与它右侧的第一个子查询相关。例如，molecule with serial 15 或 name BB 将选择所有既属于原子 15 的同一分子，又具有名称 BB 的原子。同时，molecule with (serial 15 或 name BB) 将选择所有包含原子 15 或任何具有名称 BB 的原子的所有分子（即，如果一个分子包含一个名为 BB 的原子，则所有原子都将被选中）。请注意，为了能够选择分子，系统必须包含拓扑信息。否则，不选择任何原子。标记和选择原子 groan_rs 库允许您通过字符串标记特定的原子（见 System::label_atom）。这些标记可以用作 Groan 选择语言中的标识符。每个标记都与单个原子相关联，这意味着它可以用来选择这个原子。换句话说，标记类似于组，但保证只包含一个原子。例如，将特定的原子标记为 MyAtom，您可以使用查询 label MyAtom 在以后选择它。您也可以通过链式调用标签来选择多个标记原子（例如，label MyAtom AnotherAtom OneMoreAtom）。像组一样，标签可以由多个单词组成。在这种情况下，标签必须用引号（' 或 "）括起来，例如 label 'Very interesting atom'。正则表达式原子、残基和组名称，以及元素名称、元素符号和标签都可以使用正则表达式指定。例如，可以使用 name r'^[1-9]?H.*'（或使用 elname hydrogen）选择系统中的所有氢原子。类似地，可以使用 resname r'^.*PC' 选择所有磷脂酰胆碱脂质。然后，查询 group r'^P' 或仅仅是 r'^P' 将选择所有名称以 'P' 开头的组（区分大小写）。请注意，正则表达式必须用开始于 r' 并以 ' 结束的“正则表达式块”括起来。您可以在单个查询中指定多个正则表达式，并将它们与普通字符串一起使用。使用 regex crate 评估正则表达式。有关支持的正则表达式的更多信息，请访问 https://docs.rs/regex/latest/regex/。关于选择元素请注意，系统中的原子不是由 groan_rs 库隐式分配元素的。当使用 element name 或 element symbol 关键字时，只有在原子已分配元素的情况下，才会选择这些原子。在 groan_rs 库内部，您可以通过从 tpr 文件创建 System 结构或调用 System::guess_elements 函数来实现这一点。如果您是使用 groan 选择语言的程序的用户，在使用 element 关键字之前，请确保程序已将元素分配给原子。关于空格运算符和括号不需要与查询的其他部分用空格隔开，除非查询的意义会变得不清楚。例如，not(name CA)or(serial 1to45||Protein) 是一个有效的查询，而 not(name CA)or(serial 1to45orProtein) 是无效的，因为 orProtein 变得无法解释。但是，将 Protein 括在括号内，即 not(name CA)or(serial 1to45or(Protein))，将查询转换为有效查询。错误处理适当的错误处理和传播是 groan_rs 库的核心。然而，groan_rs 提供的个别错误类型并未导出到 prelude 模块。如果您想使用 groan_rs 库中的特定错误类型，您必须从 errors 模块显式包含它。例如，如果您想直接处理写入 pdb 文件时可能发生的错误，请使用 use groan_rs::errors::WritePdbError; 请注意，即使在没有显式包含错误类型的情况下，groan_rs 也能正常工作。特性序列化支持 (serde)：通过集成 serde 框架，使 groan_rs 数据结构能够进行序列化和反序列化。并发增强 (parallel)：通过为多线程执行设计的函数扩展 groan_rs 库。使用 cargo add groan_rs --features [FEATURE] 通过指定功能安装 groan_rs crate。限制目前，groan_rs 库无法正确处理非正交的周期性模拟盒。虽然它可以读取非正交盒的结构和轨迹，但计算出的距离和类似属性可能不正确！请务必小心处理！虽然 groan_rs 可以读取双精度 trr 和 tpr 文件，但它在其代码中所有地方都使用单精度浮点数。如果您需要双精度进行分析，请考虑其他选择。许可证此库根据MIT许可证发布。

依赖项 ~10–22MB ~286K SLoC 彩色 fancy-regex 0.13 float-cmp 0.9 hashbrown 0.14.5 indexmap 2.1+serde minitpr nalgebra 0.32.3 ndarray regex serde+derive serde_yaml 0.9 thiserror build build.rs build cc+parallel dev criterion 0.5.1 dev file_diff dev rand dev tempfile 其他功能 parallel

17个版本 (7个破坏性更新)

groan_rs: 基于Rust的Gromacs分析库

使用方法

许可证

完整文档

lib.rs:

groan_rs: 基于Rust的Gromacs分析库

它能做什么

它目前不能做什么

使用方法

示例

分析结构文件

选择原子并创建组

分析轨迹文件

在trr和xtc文件之间转换

遍历原子

根据几何标准过滤和提取原子

groan选择语言

错误处理

特性

限制

许可证

依赖项

`lib.rs`: