12 releases (5 breaking)

0.6.4	Dec 2, 2021
0.6.3	Dec 2, 2021
0.5.0	Nov 30, 2021
0.4.1	Nov 30, 2021
0.1.0	Nov 27, 2021

#1172 in Text processing

32 downloads per month

MIT/Apache

38KB
817 lines

GenEx

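GenEx is a text template expansion library.

`lib.rs`:

A Rust library implementing a custom text generation/templating system. GenEx is similar to Tracery, but adds some extra functionality for handling external data.

Usage

First create a grammar, then generate one or more expansions from it.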

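use std::collections::HashSet;
// `FromIterator` is in the prelude from the 2021 edition; earlier editions
// need this import for the `HashSet::from_iter` calls below.
use std::iter::FromIterator;
use std::str::FromStr;
use maplit::hashmap;
use genex::Grammar;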

let grammar = Grammar::from_str(
    r#"
      RULES:
      top = The <adj> <noun> #action|ed# #object|a#?:[ with gusto] in <place>.
      adj = [glistening|#adj#]
      noun = key
      place = [the #room#|#city#]

      WEIGHTS:
      room = 2
      city = 1
    "#,
)
.unwrap();

let data = hashmap! {
    "action".to_string() => "pick".to_string(),
    "object".to_string() => "lizard".to_string(),
    "room".to_string() => "kitchen".to_string(),
    "city".to_string() => "New York".to_string(),
};

// Now we find the top-scoring expansion. The score is the sum of the
// weights of all variables used in an expansion. We know that the top
// scoring expansion is going to end with "the kitchen" because we gave
// `room` a higher weight than `city`.

let best_expansion = grammar.generate("top", &data).unwrap().unwrap();

assert_eq!(
    best_expansion,
    "The glistening key picked a lizard in the kitchen.".to_string()
);

// Now get all possible expansions:

let all_expansions = grammar.generate_all("top", &data).unwrap();

assert_eq!(
    HashSet::<_>::from_iter(all_expansions),
    HashSet::<_>::from_iter(vec![
        "The glistening key picked a lizard in New York.".to_string(),
        "The glistening key picked a lizard with gusto in New York.".to_string(),
        "The glistening key picked a lizard with gusto in the kitchen.".to_string(),
        "The glistening key picked a lizard in the kitchen.".to_string(),
    ])
);

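Features

GenEx is designed to make it easy to generate text from varying amounts of external data. For example, you can write a single expansion grammar that works when you know only the name of an object, but that uses additional information if you also know the object's size, location, color, or other properties.

By default, GenEx tries to find the expansion that uses as much of the external data as possible. By changing the weights assigned to variables, you can choose which variables take priority, and even prioritize a single important variable over several less important ones.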
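
As a rough illustration of the weighting described above, the sketch below scores each candidate expansion by summing the weights of the variables it uses and keeps the highest scorer. It does not use or reproduce GenEx's implementation: `Candidate`, `score`, and `best` are hypothetical names, and treating a variable with no explicit weight as weight 1 is an assumption.

```rust
use std::collections::HashMap;

/// A candidate expansion together with the names of the variables it used.
/// Hypothetical type for illustration; not part of the genex API.
struct Candidate {
    text: String,
    variables: Vec<&'static str>,
}

/// Sum the weights of the variables used by a candidate.
/// Assumption: a variable without an explicit weight counts as 1.
fn score(candidate: &Candidate, weights: &HashMap<String, i64>) -> i64 {
    candidate
        .variables
        .iter()
        .map(|name| weights.get(*name).copied().unwrap_or(1))
        .sum()
}

/// Return the text of the highest-scoring candidate, if there is one.
fn best<'a>(candidates: &'a [Candidate], weights: &HashMap<String, i64>) -> Option<&'a str> {
    candidates
        .iter()
        .max_by_key(|candidate| score(candidate, weights))
        .map(|candidate| candidate.text.as_str())
}
```

Under these assumptions, giving `room` a weight of 2 and `city` a weight of 1 makes an expansion that uses `room` outscore an otherwise identical one that uses `city`, which matches the behavior shown in the usage example.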

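Grammar syntax

Rules

"RULES:" marks the rules section of the grammar. A rule is defined by a left-hand side (LHS) and a right-hand side (RHS). The LHS is the name of the rule. The RHS is a sequence of terms.

Terms

Sequence: [term1 term2 ...]
Choice: [term1|term2|...] (You can put a newline after the | character.)
Optional: ?:[term1 term2 ...]
Variable: #variable# or #variable|modifier#
Non-terminal: <rule-name>
Plain text: I am some plain text. I hope I get expanded.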
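
To make the term kinds above more concrete, here is a minimal sketch of how they could be modeled and expanded. It is not GenEx's internal representation: the `Term` enum and `expand_all` function are hypothetical, non-terminals and modifiers are left out for brevity, and dropping any branch whose variable is missing from the data is an assumption based on the usage example, where `#adj#` has no value and only "glistening" appears.

```rust
use std::collections::HashMap;

/// One node of a rule's right-hand side, mirroring the listed term kinds.
/// Hypothetical representation; genex's own types may differ.
enum Term {
    Sequence(Vec<Term>),
    Choice(Vec<Term>),
    Optional(Box<Term>),
    Variable(String),
    Text(String),
}

/// Enumerate every string a term can expand to, given the external data.
/// A variable missing from `data` yields no expansions, which prunes any
/// sequence or choice branch that depends on it.
fn expand_all(term: &Term, data: &HashMap<String, String>) -> Vec<String> {
    match term {
        Term::Text(text) => vec![text.clone()],
        Term::Variable(name) => data.get(name).cloned().into_iter().collect(),
        Term::Choice(options) => options
            .iter()
            .flat_map(|option| expand_all(option, data))
            .collect(),
        Term::Optional(inner) => {
            // Either skip the inner term entirely or expand it.
            let mut results = vec![String::new()];
            results.extend(expand_all(inner, data));
            results
        }
        Term::Sequence(parts) => {
            // Cartesian product of the expansions of each part, in order.
            let mut results = vec![String::new()];
            for part in parts {
                let expansions = expand_all(part, data);
                let mut next = Vec::new();
                for prefix in &results {
                    for suffix in &expansions {
                        next.push(format!("{}{}", prefix, suffix));
                    }
                }
                results = next;
            }
            results
        }
    }
}
```

With only `room` present in the data, a choice such as `[the #room#|#city#]` expands to a single string in this sketch, because the `#city#` branch produces nothing.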

Weights

"WEIGHTS:" marks the weights section of the grammar. Weights take the form <rule-name> = <number>.
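
The `name = number` form is simple enough to read line by line. The following sketch shows one way to do that; it is not GenEx's parser. The function name `parse_weights` is hypothetical, and restricting weights to integers is an assumption (the source only says "number", and the usage example uses 2 and 1).

```rust
use std::collections::HashMap;

/// Parse the body of a weights section, one `name = number` entry per line.
/// Blank lines are skipped. Assumption: weights are integers.
fn parse_weights(section: &str) -> Result<HashMap<String, i64>, String> {
    let mut weights = HashMap::new();
    for line in section.lines() {
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        // Split on the first '=' only.
        let mut parts = line.splitn(2, '=');
        let name = parts.next().unwrap_or("").trim();
        let value = parts
            .next()
            .ok_or_else(|| format!("missing '=' in weight line: {}", line))?
            .trim();
        if name.is_empty() {
            return Err(format!("missing name in weight line: {}", line));
        }
        let weight: i64 = value
            .parse()
            .map_err(|_| format!("invalid weight value: {}", value))?;
        weights.insert(name.to_string(), weight);
    }
    Ok(weights)
}
```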

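Modifiers

Modifiers transform variable values during expansion.

capitalize: Capitalizes the first letter of the value.
capitalizeAll: Capitalizes the first letter of every word in the value.
inQuotes: Surrounds the value with double quotes.
comma: Adds a comma after the value if it does not already end in punctuation.
s: Pluralizes the value.
a: Prefixes the value with the article "a" or "an", as appropriate.
ed: Changes the first word of the value to the past tense.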
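
The modifiers listed above could be approximated with plain string functions, as in the sketch below. This is not GenEx's code: every function name is hypothetical, and the English rules for `s`, `a`, and `ed` are deliberately naive (no irregular plurals or verbs, article chosen by first letter only), so the real modifiers may behave differently on harder inputs.

```rust
/// capitalize: uppercase the first letter of the value.
fn capitalize(value: &str) -> String {
    let mut chars = value.chars();
    match chars.next() {
        Some(first) => first.to_uppercase().collect::<String>() + chars.as_str(),
        None => String::new(),
    }
}

/// capitalizeAll: uppercase the first letter of every space-separated word.
fn capitalize_all(value: &str) -> String {
    value.split(' ').map(capitalize).collect::<Vec<_>>().join(" ")
}

/// inQuotes: surround the value with double quotes.
fn in_quotes(value: &str) -> String {
    format!("\"{}\"", value)
}

/// comma: append a comma unless the value already ends in ASCII punctuation.
fn comma(value: &str) -> String {
    match value.chars().last() {
        Some(last) if last.is_ascii_punctuation() => value.to_string(),
        _ => format!("{},", value),
    }
}

/// s: naive plural ("es" after s, x, ch, sh; otherwise "s").
fn pluralize(value: &str) -> String {
    let needs_es = ["s", "x", "ch", "sh"]
        .iter()
        .any(|suffix| value.ends_with(*suffix));
    if needs_es {
        format!("{}es", value)
    } else {
        format!("{}s", value)
    }
}

/// a: naive article choice based only on the first letter.
fn with_article(value: &str) -> String {
    let starts_with_vowel = value
        .chars()
        .next()
        .map_or(false, |c| "aeiouAEIOU".contains(c));
    if starts_with_vowel {
        format!("an {}", value)
    } else {
        format!("a {}", value)
    }
}

/// ed: naive past tense of the first word only (regular verbs).
fn past_tense(value: &str) -> String {
    let (first, rest) = match value.find(' ') {
        Some(index) => value.split_at(index),
        None => (value, ""),
    };
    let suffix = if first.ends_with('e') { "d" } else { "ed" };
    format!("{}{}{}", first, suffix, rest)
}

/// Apply a modifier by the name used in `#variable|modifier#`.
/// Returns `None` for a name this sketch does not know.
fn apply_modifier(name: &str, value: &str) -> Option<String> {
    match name {
        "capitalize" => Some(capitalize(value)),
        "capitalizeAll" => Some(capitalize_all(value)),
        "inQuotes" => Some(in_quotes(value)),
        "comma" => Some(comma(value)),
        "s" => Some(pluralize(value)),
        "a" => Some(with_article(value)),
        "ed" => Some(past_tense(value)),
        _ => None,
    }
}
```

In the usage example, `#action|ed#` with the value "pick" and `#object|a#` with the value "lizard" correspond to the `ed` and `a` cases here.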

Dependencies

~5.5–7.5MB
~143K SLoC