#unicode #confusable #unicode-characters #homoglyphs #security #moderation #binary-search

decancer

A library that removes common Unicode confusables/homoglyphs from strings

27 stable releases

3.2.4 Aug 4, 2024
3.2.3 Jul 1, 2024
3.2.2 Jun 18, 2024
3.1.2 Mar 30, 2024
1.4.1 Jul 16, 2022

#201 in Text processing

Download history 188/week @ 2024-05-03 124/week @ 2024-05-10 99/week @ 2024-05-17 86/week @ 2024-05-24 125/week @ 2024-05-31 141/week @ 2024-06-07 259/week @ 2024-06-14 205/week @ 2024-06-21 315/week @ 2024-06-28 224/week @ 2024-07-05 123/week @ 2024-07-12 75/week @ 2024-07-19 248/week @ 2024-07-26 279/week @ 2024-08-02 135/week @ 2024-08-09 118/week @ 2024-08-16

790 downloads per month
Used in 3 crates

MIT license

89KB
2.5K SLoC

A library that removes common Unicode confusables/homoglyphs from strings.

Installation

In your Cargo.toml:

decancer = "3.2.4"

Examples

For more information, please read the documentation.

let mut cured = decancer::cure!(r"vEⓡ𝔂 𝔽𝕌Ňℕy ţ乇𝕏𝓣 wWiIiIIttHh l133t5p3/-\|<").unwrap();

assert_eq!(cured, "very funny text with leetspeak");

// WARNING: it's NOT recommended to coerce this output to a Rust string
//          and process it manually from there, as decancer has its own
//          custom comparison measures, including leetspeak matching!
assert_ne!(cured.as_str(), "very funny text with leetspeak");

assert!(cured.contains("funny"));

cured.censor("funny", '*');
assert_eq!(cured, "very ***** text with leetspeak");

cured.censor_multiple(["very", "text"], '-');
assert_eq!(cured, "---- ***** ---- with leetspeak");
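
The example above relies on mapping each confusable character back to a plain letter. As a rough illustration of that idea only, the following sketch looks characters up in a small sorted table using binary search (the crate is tagged binary-search, but this is not decancer's actual implementation). The table `CONFUSABLES` and the functions `cure_char` and `cure_sketch` are hypothetical names invented for this sketch, and the table covers only a handful of the characters from the example string.

```rust
/// Hypothetical, hand-picked confusable table for illustration only.
/// Entries are sorted by code point so the table can be binary-searched.
const CONFUSABLES: &[(char, char)] = &[
    ('\u{0147}', 'n'),  // LATIN CAPITAL LETTER N WITH CARON
    ('\u{0163}', 't'),  // LATIN SMALL LETTER T WITH CEDILLA
    ('\u{2115}', 'n'),  // DOUBLE-STRUCK CAPITAL N
    ('\u{24E1}', 'r'),  // CIRCLED LATIN SMALL LETTER R
    ('\u{4E47}', 'e'),  // CJK ideograph that resembles "E"
    ('\u{FF25}', 'e'),  // FULLWIDTH LATIN CAPITAL LETTER E
    ('\u{1D4E3}', 't'), // MATHEMATICAL BOLD SCRIPT CAPITAL T
    ('\u{1D502}', 'y'), // MATHEMATICAL BOLD SCRIPT SMALL Y
    ('\u{1D53D}', 'f'), // MATHEMATICAL DOUBLE-STRUCK CAPITAL F
    ('\u{1D54C}', 'u'), // MATHEMATICAL DOUBLE-STRUCK CAPITAL U
    ('\u{1D54F}', 'x'), // MATHEMATICAL DOUBLE-STRUCK CAPITAL X
];

/// Maps one character to its plain lowercase form.
/// Characters missing from the table are only ASCII-lowercased.
fn cure_char(c: char) -> char {
    match CONFUSABLES.binary_search_by_key(&c, |&(from, _)| from) {
        Ok(index) => CONFUSABLES[index].1,
        Err(_) => c.to_ascii_lowercase(),
    }
}

/// Applies `cure_char` to every character of the input.
fn cure_sketch(input: &str) -> String {
    input.chars().map(cure_char).collect()
}
```

With this table, `cure_sketch("v\u{FF25}\u{24E1}\u{1D502}")` returns `"very"`. The sketch does no leetspeak matching and does not collapse repeated characters, so it cannot replace the comparison methods on decancer's own output type.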

Donations

If you want to support me in manually reviewing thousands of Unicode characters, please consider donating! ❤

ko-fi

Dependencies

~0–0.8MB
~14K SLoC