5 releases

0.2.1 Nov 13, 2023
0.2.0 Nov 12, 2023
0.1.2 Nov 7, 2023
0.1.1 Nov 7, 2023
0.1.0 Oct 28, 2023

#503 in Algorithms

31 downloads per month

MIT license

56KB
876

simple_search

A simple library for searching objects.

Basic usage

use simple_search::search_engine::SearchEngine;
use simple_search::levenshtein::base::weighted_levenshtein_similarity;

fn main() {
    let engine = SearchEngine::new()
        .with_values(vec!["hello", "world", "foo", "bar"])
        .with(|v, q| weighted_levenshtein_similarity(v, q));

    let results = engine.search("hallo");

    println!("search for hallo: {:?}", results);
}
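
The scoring function in the example above is Levenshtein-based. As a rough illustration of the underlying idea, the sketch below computes the classic, unweighted Levenshtein distance and turns it into a similarity score. It is not the crate's `weighted_levenshtein_similarity`; the function names and the normalisation by the longer string's length are assumptions made for this sketch only.

```rust
/// Classic (unweighted) Levenshtein distance over Unicode scalar values,
/// computed with a two-row dynamic-programming table.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();

    // prev[j] = distance between the first i chars of `a` and the first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0; b.len() + 1];

    for (i, ca) in a.iter().enumerate() {
        curr[0] = i + 1;
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr[j + 1] = substitution.min(deletion).min(insertion);
        }
        // The row just filled becomes the "previous" row of the next iteration.
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Hypothetical normalisation (an assumption of this sketch):
/// 1.0 for identical strings, 0.0 when the distance equals the longer length.
fn similarity(a: &str, b: &str) -> f64 {
    let longest = a.chars().count().max(b.chars().count());
    if longest == 0 {
        return 1.0;
    }
    1.0 - levenshtein(a, b) as f64 / longest as f64
}

fn main() {
    for value in ["hello", "world", "foo", "bar"] {
        println!("{value}: {:.2}", similarity(value, "hallo"));
    }
}
```

With this normalisation, "hello" scores highest against "hallo" because the two strings differ by a single substitution.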

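Advanced usage

The following example shows how to use the library with a custom type. The SearchEngine is configured to search books by title, author, and description. Each field carries a different weight, and IncrementalLevenshtein computes the similarity.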

use simple_search::search_engine::SearchEngine;
use simple_search::levenshtein::incremental::IncrementalLevenshtein;

#[derive(Debug)]
struct Book {
    title: String,
    description: String,
    author: String,
}

fn main() {
    let book1 = Book {
        title: "The Winds of Winter".to_string(),
        description: "The sixth book in the A Song of Ice and Fire series.".to_string(),
        author: "George R. R. Martin".to_string(),
    };
    let book2 = Book {
        title: "The Great Gatsby".to_string(),
        description: "A classic novel of the roaring twenties.".to_string(),
        author: "F. Scott Fitzgerald".to_string(),
    };
    let book3 = Book {
        title: "Brave New World".to_string(),
        description: "A visionary and disturbing novel about a dystopian future.".to_string(),
        author: "Aldous Huxley".to_string(),
    };
    let book4 = Book {
        title: "To Kill a Mockingbird".to_string(),
        description: "A novel that deals with issues like injustice and moral growth.".to_string(),
        author: "Harper Lee".to_string(),
    };
    
    let engine = SearchEngine::new()
        .with_values(vec![book1, book2, book3, book4])
        // Title: no explicit weight is given here.
        .with_state(
            |book| IncrementalLevenshtein::new("", &book.title),
            |s, _, q| s.weighted_similarity(q),
        )
        // Author: weighted 0.8.
        .with_state_and_weight(
            0.8,
            |book| IncrementalLevenshtein::new("", &book.author),
            |s, _, q| s.weighted_similarity(q),
        )
        // Description: weighted 0.5.
        .with_state_and_weight(
            0.5,
            |book| IncrementalLevenshtein::new("", &book.description),
            |s, _, q| s.weighted_similarity(q),
        );

    let results = engine.similarities("Fire adn water");
    
    println!("search for Fire adn water:");
    for result in results {
        println!("{:?}", result);
    }
    
    println!();
    
    let results = engine.similarities("Fitzereld");
    
    println!("search for Fitzereld:");
    for result in results {
        println!("{:?}", result);
    }
    
    println!();
}

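Storing the engine

The SearchEngine usually has a very complex type that is hard to write out. For this purpose, the type_erasure module provides a way to store the engine in a Box as a trait object.
This solution is not optimal because it requires dynamic dispatch, but the overhead is small. Once RFC 2515 lands in stable Rust, it will be replaced by a more elegant solution. For more details, see the type_erasure module.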

use simple_search::search_engine::SearchEngine;
use simple_search::levenshtein::incremental::IncrementalLevenshtein;
use simple_search::type_erasure::non_cloneable::MutableSearchEngine;

fn main() {
    let engine = SearchEngine::new()
        .with_values(vec!["hello", "world", "foo", "bar"])
        .with_state(
            |v| IncrementalLevenshtein::new("", v),
            |s, _, q| s.weighted_similarity(q),
        );

    let mut engine: MutableSearchEngine<&str, str> = engine.erase_type();

    let results = engine.search("hallo");
    println!("search for hallo: {:?}", results);
}
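
To illustrate why type erasure helps, here is a minimal, standard-library-only sketch of the general pattern. It does not use the crate: the `Search` trait, the `Engine` struct, `App`, and the toy scoring closure are all hypothetical names invented for this sketch. The point is that a struct generic over a closure has a type that cannot be written down, while a `Box<dyn Trait>` can be named in a field or a return type at the cost of dynamic dispatch.

```rust
/// Hypothetical trait standing in for "something that can be searched".
trait Search {
    fn search(&mut self, query: &str) -> Vec<&'static str>;
}

/// A generic engine: its concrete type includes the closure type `F`,
/// which has no name that could be written in a struct field.
struct Engine<F> {
    values: Vec<&'static str>,
    score: F,
}

impl<F> Search for Engine<F>
where
    F: FnMut(&str, &str) -> f64,
{
    fn search(&mut self, query: &str) -> Vec<&'static str> {
        let mut scored = Vec::with_capacity(self.values.len());
        for &value in &self.values {
            scored.push(((self.score)(value, query), value));
        }
        // Highest score first; the sort is stable, so ties keep input order.
        scored.sort_by(|a, b| b.0.total_cmp(&a.0));
        scored.into_iter().map(|(_, value)| value).collect()
    }
}

/// The erased form can be stored: the field type is nameable.
struct App {
    engine: Box<dyn Search>,
}

fn build_engine() -> Box<dyn Search> {
    Box::new(Engine {
        values: vec!["hello", "world", "foo", "bar"],
        // Toy score (assumption of this sketch): number of positions
        // at which both strings hold the same character.
        score: |value: &str, query: &str| {
            value.chars().zip(query.chars()).filter(|(a, b)| a == b).count() as f64
        },
    })
}

fn main() {
    let mut app = App { engine: build_engine() };
    let results = app.engine.search("hallo");
    println!("search for hallo: {:?}", results);
}
```

Each call to `search` goes through a vtable lookup, which is the dynamic-dispatch cost mentioned above; the scoring work itself is unchanged.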

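Parallelism

The SearchEngine can be used in parallel through rayon iterators. This only requires calling the parallel version of the respective function
(as long as both the values and the query are Send + Sync).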

use simple_search::search_engine::SearchEngine;
use simple_search::levenshtein::base::weighted_levenshtein_similarity;

fn main() {
    let engine = SearchEngine::new()
        .with_values(vec!["hello", "world", "foo", "bar"])
        .with(|v, q| weighted_levenshtein_similarity(v, q));
   
    let results = engine.par_search("hallo");
   
    println!("search for hallo: {:?}", results);
}
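
For intuition about what a parallel search involves, the sketch below scores values on several threads using only the standard library. It swaps rayon for `std::thread::scope` so that it has no dependencies, and it is not how the crate implements `par_search`; `score`, `par_scores`, and the `workers` parameter are hypothetical names for this sketch. Because the values and the query are shared across threads by reference, they must be `Sync`, which mirrors the requirement stated above.

```rust
use std::thread;

/// Toy score used only for this sketch: the number of positions
/// at which both strings hold the same character.
fn score(value: &str, query: &str) -> f64 {
    value.chars().zip(query.chars()).filter(|(a, b)| a == b).count() as f64
}

/// Scores every value against `query`, splitting the slice into chunks
/// that are processed by up to `workers` scoped threads.
/// Results are returned in the same order as `values`.
fn par_scores<'a>(values: &[&'a str], query: &str, workers: usize) -> Vec<(&'a str, f64)> {
    let workers = workers.max(1);
    // Ceiling division; at least 1 so that `chunks` never receives 0.
    let chunk_size = ((values.len() + workers - 1) / workers).max(1);

    thread::scope(|s| {
        // Spawn all workers first so they run concurrently.
        let handles: Vec<_> = values
            .chunks(chunk_size)
            .map(|chunk| {
                s.spawn(move || {
                    chunk
                        .iter()
                        .map(|&value| (value, score(value, query)))
                        .collect::<Vec<_>>()
                })
            })
            .collect();

        // Join in spawn order, which preserves the input order.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}

fn main() {
    let values = ["hello", "world", "foo", "bar"];
    let mut results = par_scores(&values, "hallo", 2);
    results.sort_by(|a, b| b.1.total_cmp(&a.1));
    println!("search for hallo: {:?}", results);
}
```

For four short strings the thread overhead far outweighs the work; splitting only pays off when there are many values or the scoring function is expensive.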

Dependencies

~31–300KB