8 releases

0.1.7	Jul 19, 2024
0.1.6	Jul 4, 2024
0.1.5	Jun 17, 2024

#146 in Compression

376 downloads per month
Used in 2 crates (via vortex-fastlanes)

Apache-2.0

36KB
615 lines

FastLanes Rust

A Rust implementation of the FastLanes compression library.

Azim Afroozeh and Peter Boncz. 2023. The FastLanes Compression Layout: Decoding > 100 Billion Integers per Second with Scalar Code. Proc. VLDB Endow. 16, 9 (May 2023), 2132–2144. https://doi.org/10.14778/3598581.3598587

FastLanes is a compression framework that leverages LLVM's auto-vectorization to achieve high-performance SIMD decoding without intrinsics or other explicit SIMD code.

Usage
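To illustrate the kind of code that relies on auto-vectorization, here is a minimal sketch of a scalar loop over a fixed-size block. It is not taken from this crate: the function name `for_decode` and its signature are hypothetical, chosen only for illustration. The loop has no cross-iteration dependency and a trip count known at compile time, which is the shape LLVM can typically turn into SIMD instructions in release builds. Whether it actually does depends on the target and compiler version, so check the generated assembly rather than assume it.

```rust
/// Hypothetical example, not part of the fastlanes crate.
/// Decodes a frame-of-reference block by adding `reference` back to each value.
/// Plain scalar code: no intrinsics, no explicit SIMD types.
pub fn for_decode(reference: u16, input: &[u16; 1024], output: &mut [u16; 1024]) {
    // Each output element depends only on the matching input element,
    // so iterations are independent of one another.
    for (out, &delta) in output.iter_mut().zip(input.iter()) {
        *out = delta.wrapping_add(reference);
    }
}
```

Calling `for_decode(100, &input, &mut output)` on a block of deltas restores the original values, with wrapping arithmetic on overflow.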

use std::mem::size_of;

use fastlanes::BitPacking;

fn pack_u16_into_u3() {
    const WIDTH: usize = 3;

    // Generate some values.
    let mut values: [u16; 1024] = [0; 1024];
    for i in 0..1024 {
        values[i] = (i % (1 << WIDTH)) as u16;
    }

    // Pack the values: 1024 values of WIDTH bits each occupy 128 * WIDTH bytes.
    let mut packed = [0u16; 128 * WIDTH / size_of::<u16>()];
    BitPacking::bitpack::<WIDTH>(&values, &mut packed);

    // Unpack the values.
    let mut unpacked = [0u16; 1024];
    BitPacking::bitunpack::<WIDTH>(&packed, &mut unpacked);
    assert_eq!(values, unpacked);

    // Unpack a single value at index 14 and compare it with the original input.
    // Note that for more than ~10 values, it can be faster to unpack all values and then
    // access the desired one.
    assert_eq!(BitPacking::bitunpack_single::<WIDTH>(&packed, 14), values[14]);
}
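For readers new to bit-packing, the following is a minimal sketch of what packing 3-bit values into `u16` words means. It uses a naive sequential bit layout and does not reproduce the layout or the API of this crate. The names `naive_pack` and `naive_unpack_single` are hypothetical.

```rust
/// Hypothetical helper, not part of the fastlanes crate.
/// Packs each value's low `width` bits into consecutive bits of u16 words.
pub fn naive_pack(values: &[u16], width: usize) -> Vec<u16> {
    assert!((1..=16).contains(&width));
    let total_bits = values.len() * width;
    let mut packed = vec![0u16; (total_bits + 15) / 16];
    let mask: u32 = (1u32 << width) - 1;
    for (i, &v) in values.iter().enumerate() {
        let bit = i * width;
        let (word, offset) = (bit / 16, bit % 16);
        // Shift inside a 32-bit window because a value may straddle two words.
        let shifted = ((v as u32) & mask) << offset;
        packed[word] |= shifted as u16;
        if offset + width > 16 {
            packed[word + 1] |= (shifted >> 16) as u16;
        }
    }
    packed
}

/// Hypothetical helper: reads back the value stored at `index`.
pub fn naive_unpack_single(packed: &[u16], width: usize, index: usize) -> u16 {
    assert!((1..=16).contains(&width));
    let bit = index * width;
    let (word, offset) = (bit / 16, bit % 16);
    let mut window = packed[word] as u32;
    if offset + width > 16 {
        window |= (packed[word + 1] as u32) << 16;
    }
    ((window >> offset) & ((1u32 << width) - 1)) as u16
}
```

With a width of 3, 1024 values occupy 3072 bits, which is 192 `u16` words. That matches the `128 * WIDTH / size_of::<u16>()` buffer length used in the usage example above.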

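Differences from the original FastLanes

[!WARNING] Rust FastLanes is not binary compatible with the original FastLanes

The BitPacking implementation in this library orders values differently from the original implementation. The reordering enables fused kernels for transposed encodings (such as Delta and RLE) in addition to linear kernels such as FoR.

Verifying the assembly

To validate the correctness of the generated assembly and ensure that it is vectorized, you can use the following command: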
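The sketch below illustrates why element order matters for an encoding like Delta. It assumes a simplified strided layout in which element `i` is a delta against element `i - LANES`. This is not the crate's actual transposed order, and `LANES`, `delta_encode`, and `delta_decode` are hypothetical names. A conventional delta decode is a single sequential prefix sum. Splitting it into independent chains lets the inner loop process many lanes at once.

```rust
/// Hypothetical lane count, chosen for illustration only.
pub const LANES: usize = 64;

/// Hypothetical example, not part of the fastlanes crate.
/// Stores each element as the difference from the element LANES positions earlier.
pub fn delta_encode(base: &[u16; LANES], input: &[u16; 1024], output: &mut [u16; 1024]) {
    for i in 0..1024 {
        let prev = if i < LANES { base[i] } else { input[i - LANES] };
        output[i] = input[i].wrapping_sub(prev);
    }
}

/// Reverses `delta_encode`. Each lane carries its own running sum,
/// so the inner loop has no dependency between lanes.
pub fn delta_decode(base: &[u16; LANES], input: &[u16; 1024], output: &mut [u16; 1024]) {
    let mut prev = *base;
    for row in 0..1024 / LANES {
        for lane in 0..LANES {
            let i = row * LANES + lane;
            prev[lane] = prev[lane].wrapping_add(input[i]);
            output[i] = prev[lane];
        }
    }
}
```

Because the deltas are usually small, they can then be bit-packed with a narrow width. A fused kernel would perform the unpacking and the running sum in a single pass instead of materializing the intermediate deltas.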

RUSTFLAGS='-C target-cpu=native' cargo asm --profile release --bench bitpacking --rust BitPacking

Note that this requires cargo install cargo-show-asm.

Benchmarking

RUSTFLAGS='-C target-cpu=native' cargo bench --profile release

License

Licensed under the Apache 2.0 license.

Dependencies

~195KB