#nlp #cmudict #rhyme #alliteration #double-metaphone

ttaw

talking to a wall, a piecemeal natural language processing library

3 releases (all breaking)

Uses old Rust 2015

0.3.0 Feb 4, 2020
0.2.0 Nov 15, 2019
0.1.0 Nov 2, 2019

#731 in Text processing

30 downloads per month

MIT license

53KB
1.5K SLoC


ttaw

talking to a wall, a piecemeal natural language processing library.

A few caveats

  • My implementations will likely not be perfect 😄 If you encounter anything peculiar or unexpected, please raise an issue ❤️ ❤️

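Functionality

  • Determine if two words rhyme using the Double Metaphone phonetic encoding

  • Determine if two words rhyme using the CMUdict phonetic encoding

  • Determine if two words alliterate using the Double Metaphone phonetic encoding

  • Determine if two words alliterate using the CMUdict phonetic encoding

  • Get the CMUdict phonetic encoding of a word

  • Get the Double Metaphone phonetic encoding of a word (port of words/double-metaphone)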

Rhyme

extern crate ttaw;

// Initialize the CmuDict with a path to the existing serialized CMU dictionary
// or a directory containing it. If the dictionary doesn't exist, it will be
// downloaded and serialized at the location specified by the path parameter.
let cmudict = ttaw::cmu::CmuDict::new("cmudict.json").unwrap();

// The CMUdict lookup returns a Result; the Double Metaphone check returns a plain bool.
assert_eq!(Ok(true), cmudict.rhyme("far", "tar"));
assert_eq!(true, ttaw::metaphone::rhyme("far", "tar"));

assert_eq!(Ok(false), cmudict.rhyme("shopping", "cart"));
assert_eq!(false, ttaw::metaphone::rhyme("shopping", "cart"));

// Cases where the CMUdict and Double Metaphone results disagree
assert_eq!(true, ttaw::metaphone::rhyme("hear", "near"));
assert_eq!(Ok(false), cmudict.rhyme("hear", "near"));
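To make the disagreement on "hear" and "near" easier to follow, here is a minimal sketch of one common rhyme test over CMUdict-style phonemes: two words rhyme when their phonemes match from the last primary-stressed vowel to the end. This is an illustration only and is not taken from ttaw's source; the function names `is_vowel`, `rhyming_part`, and `rhymes` are hypothetical, and callers pass in phoneme lists by hand rather than loading a dictionary.

```rust
/// ARPABET vowels carry a trailing stress digit (0, 1 or 2); consonants do not.
fn is_vowel(phoneme: &str) -> bool {
    phoneme.ends_with(|c: char| c.is_ascii_digit())
}

/// Returns the phonemes from the last primary-stressed vowel onward,
/// falling back to the last vowel of any stress. None if there is no vowel.
fn rhyming_part<'a>(phonemes: &'a [&'a str]) -> Option<&'a [&'a str]> {
    let start = phonemes
        .iter()
        .rposition(|p| p.ends_with('1'))
        .or_else(|| phonemes.iter().rposition(|p| is_vowel(p)))?;
    Some(&phonemes[start..])
}

/// Hypothetical rhyme test: the rhyming parts must be identical.
fn rhymes(a: &[&str], b: &[&str]) -> bool {
    match (rhyming_part(a), rhyming_part(b)) {
        (Some(x), Some(y)) => x == y,
        _ => false,
    }
}
```

Assuming the CMUdict entries `F AA1 R` for "far" and `T AA1 R` for "tar", `rhymes` returns true because both end in `AA1 R`. Assuming `HH IY1 R` for "hear" and `N IH1 R` for "near", it returns false because the stressed vowels differ, which is one plausible reason a dictionary-based check can reject a pair that a coarser phonetic code accepts.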

Alliteration

extern crate ttaw;

// Initialize the CmuDict with a path to the existing serialized CMU dictionary
// or a directory containing it. If the dictionary doesn't exist, it will be
// downloaded and serialized at the location specified by the path parameter.
let cmudict = ttaw::cmu::CmuDict::new("cmudict.json").unwrap();

assert_eq!(Ok(true), cmudict.alliteration("bounding", "bears"));
assert_eq!(true, ttaw::metaphone::alliteration("bounding", "bears"));

assert_eq!(Ok(false), cmudict.alliteration("lazy", "dog"));
assert_eq!(false, ttaw::metaphone::alliteration("lazy", "dog"));
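As a rough illustration of what an alliteration check over phonemes can look like, the sketch below compares the first phoneme of each word, ignoring any stress digit. It is a simplified assumption about the idea, not ttaw's implementation; `strip_stress` and `alliterates` are hypothetical names.

```rust
/// Drops the trailing stress digit from an ARPABET vowel, e.g. "AE1" -> "AE".
fn strip_stress(phoneme: &str) -> &str {
    phoneme.trim_end_matches(|c: char| c.is_ascii_digit())
}

/// Hypothetical alliteration test: both words start with the same sound.
/// Returns false if either phoneme list is empty.
fn alliterates(a: &[&str], b: &[&str]) -> bool {
    match (a.first(), b.first()) {
        (Some(x), Some(y)) => strip_stress(x) == strip_stress(y),
        _ => false,
    }
}
```

Assuming the pronunciations `B AW1 N D IH0 NG` for "bounding" and `B EH1 R Z` for "bears", `alliterates` returns true since both begin with `B`; for `L EY1 Z IY0` ("lazy") and `D AO1 G` ("dog") it returns false.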

CMUdict

extern crate ttaw;

// Initialize the CmuDict with a path to the existing serialized CMU dictionary
// or a directory containing it. If the dictionary doesn't exist, it will be
// downloaded and serialized at the location specified by the path parameter.
let cmudict = ttaw::cmu::CmuDict::new("cmudict.json").unwrap();

// The result is a list of pronunciations, each a list of ARPABET phonemes.
assert_eq!(
    cmudict.encoding("unearthed"),
    Ok(Some(vec![vec![
        "AH0".to_string(),
        "N".to_string(),
        "ER1".to_string(),
        "TH".to_string(),
        "T".to_string()
    ]]))
);
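The nested `Vec<Vec<String>>` shape makes sense once you recall that CMUdict can list several pronunciations per word. The sketch below parses text in the classic CMUdict line format into that shape. It is an assumption-laden illustration rather than ttaw's loader: `parse_cmudict` is a hypothetical name, keys are lowercased by choice, and comment lines are assumed to start with `;;;`.

```rust
use std::collections::HashMap;

/// Parses raw CMUdict text into a map from lowercase word to its pronunciations.
fn parse_cmudict(raw: &str) -> HashMap<String, Vec<Vec<String>>> {
    let mut dict: HashMap<String, Vec<Vec<String>>> = HashMap::new();
    for line in raw.lines() {
        // Skip blank lines and comment lines.
        if line.trim().is_empty() || line.starts_with(";;;") {
            continue;
        }
        let mut tokens = line.split_whitespace();
        let head = match tokens.next() {
            Some(h) => h,
            None => continue,
        };
        // Alternate pronunciations appear as WORD(1), WORD(2), ...; strip that suffix.
        let word = match head.rfind('(') {
            Some(i) if i > 0 && head.ends_with(')') => &head[..i],
            _ => head,
        };
        // The remaining tokens are the phonemes of this pronunciation.
        let phonemes: Vec<String> = tokens.map(str::to_string).collect();
        dict.entry(word.to_lowercase()).or_default().push(phonemes);
    }
    dict
}
```

Given the single line `UNEARTHED  AH0 N ER1 TH T`, the map holds one pronunciation for "unearthed" with the five phonemes shown above; a word followed by a `WORD(1)` line ends up with two inner vectors.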

Double Metaphone

extern crate ttaw;

// Each encoding carries a primary and a secondary phonetic code.
assert_eq!(ttaw::metaphone::encoding("Arnow").primary, "ARN");
assert_eq!(ttaw::metaphone::encoding("Arnow").secondary, "ARNF");

assert_eq!(
    ttaw::metaphone::encoding("detestable").primary,
    "TTSTPL"
);
assert_eq!(
    ttaw::metaphone::encoding("detestable").secondary,
    "TTSTPL"
);
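A common way to use the two codes is to treat two words as a phonetic match when either code of one equals either code of the other. The sketch below shows that comparison with a hypothetical `Encoding` struct standing in for the value returned by `ttaw::metaphone::encoding`; the struct, its constructor, and `sounds_alike` are illustrative assumptions, not part of the crate.

```rust
/// Hypothetical stand-in for a Double Metaphone result: two codes per word.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Encoding {
    primary: String,
    secondary: String,
}

impl Encoding {
    fn new(primary: &str, secondary: &str) -> Self {
        Encoding {
            primary: primary.to_string(),
            secondary: secondary.to_string(),
        }
    }
}

/// Two encodings match if any pairing of their codes is equal.
fn sounds_alike(a: &Encoding, b: &Encoding) -> bool {
    a.primary == b.primary
        || a.primary == b.secondary
        || a.secondary == b.primary
        || a.secondary == b.secondary
}
```

With the codes shown above for "Arnow" (`ARN` and `ARNF`), `sounds_alike` would accept any other word whose primary or secondary code is `ARN` or `ARNF`, and reject "detestable" (`TTSTPL` for both codes).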

Dependencies

~21MB
~463K SLoC