#double-array #trie #double #array #analysis #morphological

dary

This package provides double-array construction and search functions

1 unstable release

0.1.1 Nov 17, 2019

#2463 in Algorithms

MIT license

43KB
722

dary

Description

dary is a library that implements double-array construction and search.
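
For readers new to the data structure, here is a minimal sketch of the general double-array technique, using only the standard library. It is not dary's implementation or API: `DoubleArraySketch`, `build`, `place`, and `contains` are hypothetical names, and the first-fit placement is the simplest possible strategy. A transition from state `s` on byte `c` goes to `t = base[s] + c + 1` and is accepted only when `check[t] == s`.

```rust
use std::collections::BTreeMap;

/// Plain trie node, used only as input for the conversion.
#[derive(Default)]
struct Node {
    children: BTreeMap<u8, Node>,
    terminal: bool,
}

/// Hypothetical minimal double array (not dary's types).
struct DoubleArraySketch {
    base: Vec<usize>,
    check: Vec<Option<usize>>, // Some(parent state) if the slot is in use
    terminal: Vec<bool>,
}

impl DoubleArraySketch {
    fn build(keys: &[&str]) -> Self {
        // Step 1: build an ordinary trie from the keys.
        let mut root = Node::default();
        for key in keys {
            let mut node = &mut root;
            for &b in key.as_bytes() {
                node = node.children.entry(b).or_default();
            }
            node.terminal = true;
        }
        // Step 2: lay the trie out in the arrays. State 0 is the root and
        // is marked as occupied by pointing its check entry at itself.
        let mut da = DoubleArraySketch {
            base: vec![0],
            check: vec![Some(0)],
            terminal: vec![false],
        };
        da.place(&root, 0);
        da
    }

    fn place(&mut self, node: &Node, s: usize) {
        self.terminal[s] = node.terminal;
        if node.children.is_empty() {
            return;
        }
        // Byte codes are shifted by one so that no transition targets slot 0.
        let codes: Vec<usize> = node.children.keys().map(|&c| c as usize + 1).collect();
        // First fit: smallest base for which every child slot is still free.
        let mut b = 0;
        while codes.iter().any(|&c| matches!(self.check.get(b + c), Some(Some(_)))) {
            b += 1;
        }
        let needed = b + codes[codes.len() - 1] + 1;
        if self.check.len() < needed {
            self.base.resize(needed, 0);
            self.check.resize(needed, None);
            self.terminal.resize(needed, false);
        }
        self.base[s] = b;
        // Reserve all child slots before recursing so descendants cannot take them.
        for &c in &codes {
            self.check[b + c] = Some(s);
        }
        for (&c, child) in &node.children {
            self.place(child, b + c as usize + 1);
        }
    }

    /// Exact-match search: one array lookup per input byte.
    fn contains(&self, key: &str) -> bool {
        let mut s = 0;
        for &b in key.as_bytes() {
            let t = self.base[s] + b as usize + 1;
            if self.check.get(t) != Some(&Some(s)) {
                return false;
            }
            s = t;
        }
        self.terminal[s]
    }
}
```

With this sketch, `DoubleArraySketch::build(&["foo", "bar", "baz"])` accepts `"foo"` and rejects prefixes such as `"fo"`. A production implementation would typically store values, not just membership, and use a faster free-slot search.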

Benchmarks

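The benchmarks measure trie construction, double-array construction, and double-array search.
Benchmark 1 uses the following struct as data: MorphemeData { surface: String, cost: usize }
Benchmark 2 uses the following type as data: u32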

# [usage]
#   cargo run --release --example benchmarks -- <DATA_LENGTH_IN_BITS>
# 
#   <DATA_LENGTH_IN_BITS> Number of data items to test, given in bits. A value of 10 means 1024 items.

cargo run --release --example benchmarks -- 23
    Finished release [optimized] target(s) in 0.03s
     Running `target/release/examples/benchmarks 23`
data len: 8388608

benchmark 1 start
build trie: 14.3049744 sec
build double array: 12.821794241 sec
dump double array: 2.061486205 sec
get all data: 6.9113534 sec

benchmark 2 start
build trie: 19.967433678 sec
build double array: 20.440614366 sec
dump double array: 1.854162494 sec
get all data: 5.642468931 sec
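
The relation between the `<DATA_LENGTH_IN_BITS>` argument and the reported `data len`, and a rough per-item cost derived from the timings above, can be sketched as follows. The helper names `data_len` and `nanos_per_item` are hypothetical and are not part of the benchmark example.

```rust
/// Number of items for a given bit count: 2^bits.
fn data_len(bits: u32) -> usize {
    1usize << bits
}

/// Average time per item in nanoseconds, given a total in seconds.
fn nanos_per_item(total_secs: f64, bits: u32) -> f64 {
    total_secs * 1e9 / data_len(bits) as f64
}
```

For the run shown, `data_len(23)` equals the printed `data len: 8388608`, and `nanos_per_item(6.9113534, 23)` puts the "get all data" step of benchmark 1 at somewhat under one microsecond per item on the author's machine.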

Getting started

use std::fmt::Debug;
use dary::DoubleArray;
use dary::Trie;
use serde_derive::{Serialize, Deserialize};

fn main() {
  let key1 = String::from("foo");
  let key2 = String::from("bar");
  let key3 = String::from("baz");

  let sample1 = Sample { surface: key1.clone(), cost: 1 };
  let sample2 = Sample { surface: key1.clone(), cost: 2 };
  let sample3 = Sample { surface: key2.clone(), cost: 1 };
  let sample4 = Sample { surface: key3.clone(), cost: 1 };

  // Build a trie. key1 is set twice, and both values are kept under it.
  let mut trie: Trie<Sample> = Trie::new();
  trie.set(&key1, sample1.clone());
  trie.set(&key1, sample2.clone());
  trie.set(&key2, sample3.clone());
  trie.set(&key3, sample4.clone());

  // Convert the trie into a double array, then look up each key.
  // get returns all values stored under the key.
  let double_array = trie.to_double_array().ok().unwrap();
  assert_eq!(vec![sample1, sample2], double_array.get(&key1).unwrap());
  assert_eq!(vec![sample3]         , double_array.get(&key2).unwrap());
  assert_eq!(vec![sample4]         , double_array.get(&key3).unwrap());
}

#[derive(Serialize, Deserialize, Clone, Debug, PartialEq)]
struct Sample {
    surface: String,
    cost: usize,
}
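
The example above stores two values under the same key and gets both back. A minimal std-only sketch of that behaviour is shown below, assuming values are simply appended per key. `MultiTrie` is a hypothetical stand-in and does not reflect how dary's `Trie` is actually implemented.

```rust
use std::collections::BTreeMap;

/// Hypothetical trie that keeps every value set under a key
/// (an illustration of the semantics, not dary's Trie).
struct MultiTrie<T> {
    children: BTreeMap<u8, MultiTrie<T>>,
    values: Vec<T>,
}

impl<T> MultiTrie<T> {
    fn new() -> Self {
        MultiTrie { children: BTreeMap::new(), values: Vec::new() }
    }

    /// Walk (and create) one node per byte, then append the value.
    fn set(&mut self, key: &str, value: T) {
        let mut node = self;
        for &b in key.as_bytes() {
            node = node.children.entry(b).or_insert_with(MultiTrie::new);
        }
        node.values.push(value);
    }

    /// Return all values stored under the exact key, or None if there are none.
    fn get(&self, key: &str) -> Option<&[T]> {
        let mut node = self;
        for b in key.as_bytes() {
            node = node.children.get(b)?;
        }
        if node.values.is_empty() {
            None
        } else {
            Some(&node.values)
        }
    }
}
```

Calling `set("foo", 1)` and then `set("foo", 2)` on this sketch makes `get("foo")` return both values, while a prefix such as `"fo"` returns `None`.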

Dependencies

~1.1–2MB
~40K SLoC