3 releases (breaking changes)

0.2.0	May 23, 2023
0.1.0	May 22, 2023
0.0.0	May 19, 2023

#607 in Memory management

38 downloads per month

MIT/Apache

51KB
917 lines

ring-alloc

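Ring-based memory allocator for Rust, intended for short-lived allocations.

Because allocations carry no lifetime bound, it is more flexible than arena-based allocators. Users should still free memory promptly to avoid wasting it.

The allocator keeps a ring buffer of chunks and allocates from the front chunk; when that chunk is full, it is moved to the back. If the next chunk is still occupied by old allocations, the allocator allocates a new chunk. Once there are enough chunks that the next chunk is always free by the time the front chunk is exhausted, no more chunks are allocated.

A single memory block that is never freed prevents the ring allocator from reusing its chunk, so take care not to leak blocks or keep them alive for too long.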
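
The chunk rotation described above can be illustrated with a toy model. This is a minimal sketch and not the crate's implementation: `ToyRing`, `Chunk`, the integer chunk ids, and the choice to place a new chunk at the front are all assumptions made for illustration, and no real memory is managed. It only counts bytes and live blocks to show when a chunk can be reused and how one long-lived block forces a new chunk.

```rust
use std::collections::VecDeque;

// Toy model only (hypothetical types): a chunk tracks how many bytes were
// handed out and how many blocks are still live.
struct Chunk {
    id: usize,
    capacity: usize,
    used: usize,
    live: usize,
}

impl Chunk {
    // Bump-style allocation inside one chunk; fails when the chunk is full.
    fn try_alloc(&mut self, size: usize) -> bool {
        if self.capacity - self.used < size {
            return false;
        }
        self.used += size;
        self.live += 1;
        true
    }
}

struct ToyRing {
    chunks: VecDeque<Chunk>,
    chunk_size: usize,
    next_id: usize,
}

impl ToyRing {
    fn new(chunk_size: usize) -> Self {
        ToyRing { chunks: VecDeque::new(), chunk_size, next_id: 0 }
    }

    fn chunk_count(&self) -> usize {
        self.chunks.len()
    }

    // Returns the id of the chunk that served the request.
    // Oversized requests are simply rejected in this toy model.
    fn alloc(&mut self, size: usize) -> Option<usize> {
        if size > self.chunk_size {
            return None;
        }
        if let Some(front) = self.chunks.front_mut() {
            if front.try_alloc(size) {
                return Some(front.id);
            }
        }
        // The front chunk is full (or there is none yet): move it to the back.
        if !self.chunks.is_empty() {
            self.chunks.rotate_left(1);
        }
        // The next chunk can be reused only if all of its blocks were freed.
        let reusable = self.chunks.front().map_or(false, |c| c.live == 0);
        if reusable {
            self.chunks.front_mut().unwrap().used = 0;
        } else {
            let id = self.next_id;
            self.next_id += 1;
            self.chunks.push_front(Chunk { id, capacity: self.chunk_size, used: 0, live: 0 });
        }
        let front = self.chunks.front_mut().unwrap();
        let ok = front.try_alloc(size);
        debug_assert!(ok);
        Some(front.id)
    }

    // Marks one block of the given chunk as freed.
    fn dealloc(&mut self, chunk_id: usize) {
        if let Some(c) = self.chunks.iter_mut().find(|c| c.id == chunk_id) {
            c.live -= 1;
        }
    }
}
```

With 8-byte chunks and 4-byte blocks, four allocations fill two chunks. Freeing both blocks of the first chunk lets the next two allocations reuse it, so the ring stays at two chunks. If one block of the second chunk is then kept alive, the following allocation cannot reuse that chunk and the ring grows to three.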

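Usage

This crate provides two types of ring allocator. RingAlloc is a thread-local allocator that owns its ring of chunks and uses a user-provided underlying allocator to allocate chunks. RingAlloc is cheap to clone, and clones share internal state.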

#![cfg_attr(feature = "nightly", feature(allocator_api))]
use ring_alloc::RingAlloc;
use allocator_api2::{boxed::Box, vec::Vec};

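fn foo() -> Vec<Box<u32, RingAlloc>, RingAlloc> {
    // Clones are cheap and share the same ring of chunks.
    let alloc = RingAlloc::new();
    let b = Box::new_in(42, alloc.clone());

    // The vector allocates from the same ring as the box.
    let mut v = Vec::new_in(alloc);
    v.push(b);
    v
}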

fn main() {
    let v = foo();
    assert_eq!(*v[0], 42);
}

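OneRingAlloc is a ZST (zero-sized type) allocator that uses global state and thread-local storage. It can be used across threads, and it may transfer chunks between threads when a thread exits while its chunks are still in use.

OneRingAlloc always uses the global allocator to allocate chunks.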

#![cfg_attr(feature = "nightly", feature(allocator_api))]
use ring_alloc::OneRingAlloc;
use allocator_api2::{boxed::Box, vec::Vec};

fn foo() -> Vec<Box<u32, OneRingAlloc>, OneRingAlloc> {
    let b = Box::new_in(42, OneRingAlloc);

    let mut v = Vec::new_in(OneRingAlloc);
    v.push(b);
    v
}

fn main() {
    // The vector is allocated on the spawned thread and used on this one.
    let v = std::thread::spawn(foo).join().unwrap();
    assert_eq!(*v[0], 42);
}
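
To make the "ZST with global state and thread-local storage" idea concrete, here is a small standard-library-only sketch. It is not how OneRingAlloc is implemented: `ToyHandle`, `GLOBAL_ALLOCS`, and `LOCAL_ALLOCS` are hypothetical names, and the handle only counts calls instead of managing chunks. It shows that a handle can occupy zero bytes while still reaching per-thread and process-wide state.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical process-wide state, shared by all threads.
static GLOBAL_ALLOCS: AtomicUsize = AtomicUsize::new(0);

thread_local! {
    // Hypothetical per-thread state; each thread gets its own counter.
    static LOCAL_ALLOCS: Cell<usize> = Cell::new(0);
}

// A zero-sized handle: it carries no data, so copies cost nothing.
#[derive(Clone, Copy)]
struct ToyHandle;

impl ToyHandle {
    // Records one "allocation" in both the thread-local and global counters.
    fn record_alloc(self) {
        LOCAL_ALLOCS.with(|c| c.set(c.get() + 1));
        GLOBAL_ALLOCS.fetch_add(1, Ordering::Relaxed);
    }

    // Number of recorded allocations on the calling thread.
    fn local_count(self) -> usize {
        LOCAL_ALLOCS.with(|c| c.get())
    }

    // Number of recorded allocations across all threads.
    fn global_count(self) -> usize {
        GLOBAL_ALLOCS.load(Ordering::Relaxed)
    }
}
```

Because the handle holds no fields, `std::mem::size_of::<ToyHandle>()` is 0, and any thread can write `ToyHandle` to reach the same global state.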

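The allocators can be used on stable Rust via the allocator-api2 crate. The "nightly" feature enables support for the unstable Rust allocator_api, which is available on nightly compilers.

Benchmarks

warm-up

	`Global`	`ring_alloc::RingAlloc`	`ring_alloc::OneRingAlloc`	`bumpalo::Bump`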
`alloc4bytes x65535`	`2.73 ms` (✅ 1.00x)	`209.67 us` (🚀 13.02x faster)	`306.38 us` (🚀 8.91x faster)	`343.45 us` (🚀 7.95x faster)

alloc

	`Global`	`ring_alloc::RingAlloc`	`ring_alloc::OneRingAlloc`	`bumpalo::Bump`
`alloc`	`23.91 ns` (✅ 1.00x)	`5.17 ns` (🚀 4.62x faster)	`11.24 ns` (🚀 2.13x faster)	`7.39 ns` (🚀 3.24x faster)
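
The "x faster" and "x slower" figures in these tables are ratios against the Global column. The sketch below reproduces that arithmetic; the function names `speedup` and `label` are hypothetical and not part of criterion-table. Since it works from the rounded times printed in the tables, the last digit can occasionally differ from the published figure.

```rust
/// Ratio of baseline time to candidate time (both in the same unit).
/// Values above 1.0 mean the candidate is faster than the baseline.
fn speedup(baseline: f64, candidate: f64) -> f64 {
    baseline / candidate
}

/// Formats the ratio in the same style as the tables.
fn label(baseline: f64, candidate: f64) -> String {
    let ratio = speedup(baseline, candidate);
    if ratio >= 1.0 {
        format!("{:.2}x faster", ratio)
    } else {
        // Report how many times slower by inverting the ratio.
        format!("{:.2}x slower", 1.0 / ratio)
    }
}
```

For the `alloc` row, `label(23.91, 5.17)` gives "4.62x faster" for RingAlloc, and `label(23.91, 11.24)` gives "2.13x faster" for OneRingAlloc.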

vec

	`Global`	`ring_alloc::RingAlloc`	`ring_alloc::OneRingAlloc`	`bumpalo::Bump`
`push x10`	`97.21 ns` (✅ 1.00x)	`32.31 ns` (🚀 3.01x faster)	`41.95 ns` (🚀 2.32x faster)	`33.19 ns` (🚀 2.93x faster)
`reserve_exact(1) x10`	`212.99 ns` (✅ 1.00x)	`82.41 ns` (🚀 2.58x faster)	`119.27 ns` (✅ 1.79x faster)	`73.23 ns` (🚀 2.91x faster)
`push x146`	`480.62 ns` (✅ 1.00x)	`376.28 ns` (✅ 1.28x faster)	`379.11 ns` (✅ 1.27x faster)	`342.50 ns` (✅ 1.40x faster)
`reserve_exact(1) x146`	`4.02 us` (✅ 1.00x)	`2.01 us` (🚀 2.00x faster)	`2.57 us` (✅ 1.56x faster)	`1.90 us` (🚀 2.12x faster)
`push x2134`	`5.07 us` (✅ 1.00x)	`5.27 us` (✅ 1.04x slower)	`5.35 us` (✅ 1.06x slower)	`5.07 us` (✅ 1.00x slower)
`reserve_exact(1) x2134`	`49.59 us` (✅ 1.00x)	`207.60 us` (❌ 4.19x slower)	`222.35 us` (❌ 4.48x slower)	`212.09 us` (❌ 4.28x slower)
`push x17453`	`39.23 us` (✅ 1.00x)	`41.75 us` (✅ 1.06x slower)	`42.01 us` (✅ 1.07x slower)	`41.61 us` (✅ 1.06x slower)
`reserve_exact(1) x17453`	`425.45 us` (✅ 1.00x)	`13.41 ms` (❌ 31.51x slower)	`13.65 ms` (❌ 32.08x slower)	`21.14 ms` (❌ 49.70x slower)

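Made with criterion-table

Conclusion

RingAlloc is faster than bumpalo in most cases. OneRingAlloc is slower than RingAlloc and bumpalo, which is the cost of its multi-threading support.

The Global allocator shows better results in the reserve_exact(1) tests on large vectors (x2134 and x17453) because it provides an optimized Allocator::grow, which RingAlloc does not implement yet. For large vectors, the Global allocator is also slightly better on push. In these tests, RingAlloc routes large allocations directly to the underlying allocator, which is Global.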
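
To see why a missing optimized grow matters for reserve_exact(1), consider the fallback behavior of the Allocator trait's default grow: allocate a new block, copy the old contents, free the old block. The sketch below is a rough worst-case estimate under stated assumptions, not a measurement of ring-alloc: it assumes every call grows the buffer by exactly one element and every growth copies the whole buffer. The function name and the 4-byte element size used in the note are hypothetical, since the benchmark's element type is not stated here.

```rust
/// Total bytes copied while a buffer grows from 1 to `n` elements,
/// one element at a time, when each growth copies the existing contents.
fn bytes_copied_by_fallback_grow(n: usize, elem_size: usize) -> usize {
    // Growing from `len` to `len + 1` elements copies `len` elements.
    (1..n).map(|len| len * elem_size).sum()
}
```

Under these assumptions, 17453 four-byte elements would copy about 609 MB in total, which grows quadratically with the element count. An allocator that can extend a block in place avoids most of that copying.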

License

Licensed under either of:

Apache License, Version 2.0, (license/APACHE or https://apache.ac.cn/licenses/LICENSE-2.0)
MIT license (license/MIT or https://open-source.org.cn/licenses/MIT)

at your option.

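Contributions

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Dependencies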

~0.2–5.5MB