8 versions

0.1.7	Sep 30, 2020
0.1.6	Sep 27, 2020
0.1.5	Mar 7, 2020
0.1.4	May 14, 2019
0.1.3	Jul 26, 2018

#480 in Data structures

355 downloads per month
Used in 5 crates (4 directly)

MIT/Apache

38KB
838 lines

FID

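This crate provides a succinct data structure for bit vectors that supports two kinds of bit operations in constant time:

rank(i) computes the number of 0s (or 1s) in [0..i).
select(r) locates the position of the (r+1)-th 0 (or 1).

A structure supporting these two operations is called an FID (fully indexable dictionary).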
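To make the two definitions concrete, here is a minimal reference sketch that computes them by linear scan over a plain slice. It is not how this crate works internally (the crate answers both queries in constant time); the free functions `rank` and `select` below are hypothetical helpers written only for illustration.

```rust
/// Counts the occurrences of `bit` in bits[0..i) by linear scan.
/// Illustrative only: O(i) time, unlike the constant-time FID operation.
fn rank(bits: &[bool], bit: bool, i: usize) -> usize {
    bits[..i].iter().filter(|&&b| b == bit).count()
}

/// Returns the position of the (r+1)-th occurrence of `bit`,
/// or None if there are fewer than r+1 occurrences.
fn select(bits: &[bool], bit: bool, r: usize) -> Option<usize> {
    bits.iter()
        .enumerate()
        .filter(|&(_, &b)| b == bit)
        .nth(r)
        .map(|(pos, _)| pos)
}
```

The two operations are inverses in the sense that, whenever `select(bits, bit, r)` returns `Some(p)`, `rank(bits, bit, p)` equals `r`.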

Usage

In your Cargo.toml:

[dependencies]
fid = "0.1"

Then:

extern crate fid;

use fid::{BitVector, FID};

let mut bv = BitVector::new();
// 01101101
bv.push(false); bv.push(true); bv.push(true); bv.push(false);
bv.push(true); bv.push(true); bv.push(false); bv.push(true);

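assert_eq!(bv.rank0(5), 2);   // two 0s in [0..5)
assert_eq!(bv.rank1(5), 3);   // three 1s in [0..5)
assert_eq!(bv.select0(2), 6); // the third 0 is at position 6
assert_eq!(bv.select1(2), 4); // the third 1 is at position 4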

Credits

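The basic compression and computation algorithms for BitVector originally come from [1], and the practical implementation techniques come from [2].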

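In BitVector, bits are divided into small and large blocks. Each small block is identified by a class (the number of 1s in the block) and an index within that class. Classes are stored in ceil(log(SBLOCK_WIDTH + 1)) bits. If the compressed size is less than MAX_CODE_SIZE, the index is stored in log(C(SBLOCK_WIDTH, class)) bits using an enumerative code. Otherwise, the bit pattern of the small block is stored explicitly as the index for the sake of efficiency. This idea originally comes from [2]. For each large block, we store the number of 1s before its beginning and a pointer to the index of its first small block.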
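The class/index scheme can be illustrated with a small enumerative coder. This is a sketch under stated assumptions, not the crate's code: the block width of 8, the most-significant-bit-first ordering, and the helpers `binomial`, `encode`, `decode`, and `index_bits` are all hypothetical, and the MAX_CODE_SIZE fallback is left out.

```rust
/// Hypothetical small-block width, kept tiny for readability.
/// The crate's actual SBLOCK_WIDTH may differ.
const SBLOCK_WIDTH: u32 = 8;

/// Binomial coefficient C(n, k); each step of the product is an exact integer.
fn binomial(n: u32, k: u32) -> u64 {
    if k > n {
        return 0;
    }
    let k = k.min(n - k);
    let mut result = 1u64;
    for i in 0..k {
        result = result * u64::from(n - i) / u64::from(i + 1);
    }
    result
}

/// Splits a block into (class, index): class is the number of 1s, index is
/// the block's lexicographic rank among all blocks of the same class.
fn encode(block: u64) -> (u32, u64) {
    let class = block.count_ones();
    let mut remaining = class; // 1s not yet consumed
    let mut index = 0u64;
    for pos in (0..SBLOCK_WIDTH).rev() {
        if (block >> pos) & 1 == 1 {
            // Skip every pattern that has a 0 here and all remaining 1s below.
            index += binomial(pos, remaining);
            remaining -= 1;
        }
    }
    (class, index)
}

/// Rebuilds the block from (class, index); the inverse of `encode`.
fn decode(class: u32, mut index: u64) -> u64 {
    let mut remaining = class;
    let mut block = 0u64;
    for pos in (0..SBLOCK_WIDTH).rev() {
        let with_zero_here = binomial(pos, remaining);
        if remaining > 0 && index >= with_zero_here {
            block |= 1u64 << pos;
            index -= with_zero_here;
            remaining -= 1;
        }
    }
    block
}

/// Bits needed for an index of the given class: ceil(log2(C(width, class))).
/// Assumes class <= SBLOCK_WIDTH.
fn index_bits(class: u32) -> u32 {
    let patterns = binomial(SBLOCK_WIDTH, class);
    64 - (patterns - 1).leading_zeros()
}
```

With these assumptions, the block 0b01101101 has class 5 and its index fits in 6 bits, because there are only C(8, 5) = 56 blocks with five 1s. Blocks whose class is near 0 or SBLOCK_WIDTH need very few index bits, which is where the compression comes from.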

[1] Gonzalo Navarro and Eliana Providel. 2012. Fast, small, simple rank/select on bitmaps. In Proceedings of the 11th international conference on Experimental Algorithms (SEA'12), Ralf Klasing (Ed.). Springer-Verlag, Berlin, Heidelberg, 295-306. DOI=http://dx.doi.org/10.1007/978-3-642-30850-5_26

[2] rsdic by Daisuke Okanohara. https://github.com/hillbig/rsdic

Benchmark

The benchmark performs 10,000 operations on bit vectors of two lengths (1,000,000 and 100,000,000 bits) and three densities of 1s (dense: 99%, normal: 50%, sparse: 1%).

$ rustup run nightly cargo bench
running 12 tests
test rank_100000000_dense    ... bench:     752,410 ns/iter (+/- 39,871)
test rank_100000000_normal   ... bench:     865,107 ns/iter (+/- 34,210)
test rank_100000000_sparse   ... bench:     714,583 ns/iter (+/- 17,977)
test rank_1000000_dense      ... bench:     670,544 ns/iter (+/- 18,139)
test rank_1000000_normal     ... bench:     376,054 ns/iter (+/- 8,969)
test rank_1000000_sparse     ... bench:     635,294 ns/iter (+/- 15,752)
test select_100000000_dense  ... bench:   1,026,957 ns/iter (+/- 740,011)
test select_100000000_normal ... bench:   2,193,391 ns/iter (+/- 63,561)
test select_100000000_sparse ... bench:   1,971,993 ns/iter (+/- 60,703)
test select_1000000_dense    ... bench:     805,135 ns/iter (+/- 20,085)
test select_1000000_normal   ... bench:   1,456,985 ns/iter (+/- 33,205)
test select_1000000_sparse   ... bench:   1,791,824 ns/iter (+/- 44,174)
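The figures above are per benchmark iteration. Assuming each iteration performs the 10,000 operations mentioned earlier (an assumption about the benchmark harness, not something the output itself states), a rough per-operation cost can be derived as follows. The helper `ns_per_op` is hypothetical.

```rust
/// Converts a `cargo bench` ns/iter figure into an approximate ns/operation,
/// assuming every iteration performs `ops_per_iter` operations.
fn ns_per_op(ns_per_iter: u64, ops_per_iter: u64) -> f64 {
    ns_per_iter as f64 / ops_per_iter as f64
}
```

Under that assumption, `ns_per_op(752_410, 10_000)` puts rank on the dense 100,000,000-bit vector at roughly 75 ns per operation, and `ns_per_op(2_193_391, 10_000)` puts select on the normal-density vector of the same length at roughly 219 ns.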

Dependencies

~0.4–1MB
~23K SLoC