5 releases
0.0.6 | Nov 17, 2023 |
---|---|
0.0.5 | Nov 15, 2023 |
0.0.0 |
#380 in Text processing
52KB
1K SLoC
pink accents
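Allows defining a set of patterns to be replaced in a string. This is a glorified regex replace, or rather a sequence of them. The primary use case is simulating silly speech accents.
Originally based on the Python pink-accents and primarily developed for the ssnt game.
Currently unusable on its own because you cannot construct Accent using its internal structures, but there are plans to support programmatic definitions.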
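Types of replacements
An accent is a sequence of rules that are applied in order. Each rule consists of a regex pattern and a replacement. When the regex matches, the replacement is called. It then decides what to put in place of the match (if anything).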
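Because rules run in order, each rule sees the output of the previous one, so reordering rules can change the result. The sketch below illustrates only that ordering behaviour. `Rule` and `apply_rules` are hypothetical names, and literal substring replacement stands in for the regex matching the crate actually uses, to keep the example dependency-free.

```rust
/// A hypothetical rule: a literal pattern and its replacement text.
/// The real crate matches with regular expressions; plain substrings are
/// used here only so the sketch needs nothing beyond the standard library.
struct Rule {
    pattern: &'static str,
    replacement: &'static str,
}

/// Applies every rule in order. Each rule operates on the text produced
/// by the rule before it.
fn apply_rules(rules: &[Rule], input: &str) -> String {
    rules.iter().fold(input.to_string(), |text, rule| {
        text.replace(rule.pattern, rule.replacement)
    })
}
```

With the rules "windows" to "linux" followed by "linux" to "gentoo", the input "windows rocks" ends up as "gentoo rocks". With the two rules swapped, the same input stops at "linux rocks".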
Possible replacements are:
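Original: does not replace
Simple: puts the string as is (supports templating)
Any (recursive): selects a random replacement with equal weights
Weights (recursive): selects a replacement based on relative weights
Uppercase (recursive): converts the inner result to uppercase
Lowercase (recursive): converts the inner result to lowercase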
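The recursive variants can be read as a small tree that is evaluated for every match. The following is a minimal sketch of that idea and not the crate's actual internals. The `Replacement` enum, its `apply` method, and the caller-supplied `rng` closure are all assumptions made for illustration. Templating in `Simple` is left out, and falling back to the matched text for an empty `Any` or zero total weight is an arbitrary choice of this sketch.

```rust
/// Hypothetical mirror of the replacement kinds listed above.
enum Replacement {
    Original,
    Simple(String),
    Any(Vec<Replacement>),
    Weights(Vec<(u64, Replacement)>),
    Uppercase(Box<Replacement>),
    Lowercase(Box<Replacement>),
}

impl Replacement {
    /// Resolves the text that should replace `matched`.
    /// `rng` is any source of random numbers; it is passed in so the
    /// sketch stays free of external crates.
    fn apply(&self, matched: &str, rng: &mut dyn FnMut() -> u64) -> String {
        match self {
            // Keep the matched text untouched.
            Replacement::Original => matched.to_string(),
            // Emit the string as is (templating is omitted in this sketch).
            Replacement::Simple(text) => text.clone(),
            // Pick one option uniformly, then evaluate it recursively.
            Replacement::Any(options) => {
                if options.is_empty() {
                    return matched.to_string();
                }
                let index = (rng() % options.len() as u64) as usize;
                options[index].apply(matched, rng)
            }
            // Pick one option with probability weight / total.
            Replacement::Weights(options) => {
                let total: u64 = options.iter().map(|(weight, _)| *weight).sum();
                if total == 0 {
                    return matched.to_string();
                }
                let mut roll = rng() % total;
                for (weight, option) in options {
                    if roll < *weight {
                        return option.apply(matched, rng);
                    }
                    roll -= *weight;
                }
                unreachable!("roll is always below the total weight")
            }
            // Evaluate the inner replacement, then change its case.
            Replacement::Uppercase(inner) => inner.apply(matched, rng).to_uppercase(),
            Replacement::Lowercase(inner) => inner.apply(matched, rng).to_lowercase(),
        }
    }
}
```

Passing the random source as a closure keeps the selection logic deterministic under test: a closure that always returns 0 always selects the first option of `Any` or `Weights`.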
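Serialized format
The deserialize feature provides an opinionated way of defining rules, designed specifically for speech accents. Deserialization is developed primarily to support the ron format, which has its quirks, but it should also work with json and possibly other formats.
Full reference: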
(
    // on by default, tries to match input case with output after each rule
    // for example, if you replaced "HELLO" with "bye", it would use "BYE" instead
    normalize_case: true,
    // pairs of (regex, replacement)
    // this is the same as `patterns` except that each regex is surrounded with \b to avoid copypasting.
    // `words` are applied before `patterns`
    words: [
        // this is the simplest rule to replace all "windows" words (separated by regex \b)
        // occurrences with "linux", case sensitive
        ("windows", Simple("linux")),
        // this replaces word "OS" with one of replacements, with equal probability
        ("os", Any([
            Simple("Ubuntu"),
            Simple("Arch"),
            Simple("Gentoo"),
        ])),
        // `Simple` supports regex templating: https://docs.rs/regex/latest/regex/struct.Regex.html#example-9
        // this will swap "a" and "b": "ab" -> "ba"
        // the unnamed group is referenced by index ($1), the named one by name ($b_group)
        (r"(a)(?P<b_group>b)", Simple("$b_group$1")),
    ],
    // pairs of (regex, replacement)
    // this is the same as `words` except these are used as is, without \b
    patterns: [
        // inserts one of the honks. first value of `Weights` is relative weight. higher means more likely
        ("$", Weights([
            (32, Simple(" HONK!")),
            (16, Simple(" HONK HONK!")),
            (08, Simple(" HONK HONK HONK!")),
            // ultra rare sigma honk - 1 / 57 (total weight is 32 + 16 + 8 + 1)
            (01, Simple(" HONK HONK HONK HONK!!!!!!!!!!!!!!!")),
        ])),
        // lowercases all `p` letters (use "p" match from `Original`, then lowercase)
        ("p", Lowercase(Original)),
        // uppercases all `p` letters, undoing previous operation
        ("p", Uppercase(Original)),
    ],
    // accent can be used with intensity (non negative value). higher intensities can either extend
    // lower level or completely replace it.
    // default intensity is 0. higher ones are defined here
    intensities: {
        // extends previous intensity (level 0, base one in this case), adding additional rules
        // below existing ones. words and patterns keep their relative order though - words are
        // processed first
        1: Extend(
            (
                words: [
                    // even though we are extending, defining same rule will overwrite result.
                    // relative order of rules remains the same: "windows" will remain first
                    ("windows", Simple("windoos")),
                ],
                // extend patterns, adding 1 more rule
                patterns: [
                    // replacements can be nested arbitrarily
                    ("[A-Z]", Weights([
                        // 50% to replace capital letter with one of the Es
                        (1, Any([
                            Simple("E"),
                            Simple("Ē"),
                            Simple("Ê"),
                            Simple("Ë"),
                            Simple("È"),
                            Simple("É"),
                        ])),
                        // 50% to do nothing, no replacement
                        (1, Original),
                    ])),
                ],
            ),
        ),
        // replace intensity 1 entirely. in this case with nothing. remove all rules on intensity 2+
        2: Replace(()),
    },
)
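The `Extend` and `Replace` comments in the reference above suggest a simple merge procedure, sketched here in Rust. This is an illustration under stated assumptions rather than the crate's implementation. `Rules`, `Intensity`, `extend`, and `resolve` are hypothetical names, rules are reduced to plain string pairs, and `words` and `patterns` are collapsed into a single list. The sketch also assumes that a requested intensity with no definition of its own falls back to the highest defined level below it, which is how the "2+" comment reads.

```rust
use std::collections::BTreeMap;

/// A rule list as (pattern, replacement) pairs, kept as plain strings for brevity.
type Rules = Vec<(String, String)>;

/// Hypothetical model of one intensity level.
enum Intensity {
    /// Adds rules below the existing ones; a repeated pattern overwrites in place.
    Extend(Rules),
    /// Discards everything from lower levels and starts from these rules.
    Replace(Rules),
}

/// Merges `extra` into `base`. A pattern that already exists keeps its
/// position and only its replacement changes; new patterns are appended.
fn extend(base: &Rules, extra: &Rules) -> Rules {
    let mut merged = base.clone();
    for (pattern, replacement) in extra {
        match merged.iter().position(|(existing, _)| existing == pattern) {
            Some(index) => merged[index].1 = replacement.clone(),
            None => merged.push((pattern.clone(), replacement.clone())),
        }
    }
    merged
}

/// Builds the effective rules for `level` by starting from the base rules
/// (intensity 0) and folding in every defined level up to and including `level`.
fn resolve(base: &Rules, levels: &BTreeMap<u64, Intensity>, level: u64) -> Rules {
    let mut current = base.clone();
    for (_, intensity) in levels.range(..=level) {
        current = match intensity {
            Intensity::Extend(extra) => extend(&current, extra),
            Intensity::Replace(rules) => rules.clone(),
        };
    }
    current
}
```

Under this model, intensity 1 in the reference keeps "windows" as the first rule but maps it to "windoos", and appends the "[A-Z]" rule after the base rules. Intensity 2 and anything above it resolve to an empty rule list.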
See more examples in the examples folder.
Dependencies
~2.4–4MB
~69K SLoC