4 个版本

0.2.0	2021 年 6 月 28 日
0.1.2	2020 年 10 月 9 日
0.1.1	2020 年 10 月 9 日
0.1.0	2020 年 10 月 9 日

#915 在数据结构

在 3 个 crate 中使用 (通过 bellframe)

MIT 许可证

36KB
271 行

shortlist

一个数据结构，用于跟踪推送到它的最大项目，无需堆分配，每次推送的平均时间为 O(1)。

功能

推送的时间复杂度为 O(1) 平均和 O(log n) 最坏情况（如果输入已经排序）

除了在创建新的 Shortlist 时，没有堆分配无依赖项，仅约 150 行源代码无 unsafe

问题假设你正在对一个非常大的搜索空间进行暴力搜索，但想保留不仅仅是单个最佳项目 - 例如，你希望从十亿个选项中找到最佳的前 100 个项目。即你想实现以下函数 fn get_best<T: Ord>( big_computation: impl Iterator<Item = T>, n: usize ) -> Vec<T> { // Somehow get the `n` largest items produced by `big_computation` ... } 一个糟糕的解决方案对此的直观方法是存储我们搜索的每个项目。然后一旦搜索完成，对这个列表进行排序，然后从列表的末尾取我们需要的任何数量的项目。这大致对应以下代码 fn get_best<T: Ord>( big_computation: impl Iterator<Item = T>, n: usize ) -> Vec<T> { // Collect all the results into a big sorted vec let mut giant_vec: Vec<T> = big_computation.collect(); giant_vec.sort(); // Return the last and therefore biggest n items with some iterator magic giant_vec.drain(..).rev().take(n).rev().collect() } 但这在两个方面都非常低效对非常大的列表进行排序非常慢，而且我们正在对可能永远不会需要的数十亿个项目进行排序。对于任何相当大的搜索空间，存储这些项目可能会使计算机崩溃，因为它会耗尽内存。此 crate 使用的解决方案这就是使用 Shortlist 有用的地方。 Shortlist 是一种数据结构，它将动态保持迄今为止给定的最佳项目的“短名单”，每次推送新项目的时间平均为 O(1)。它也只会在创建 Shortlist 时执行一次堆分配，并且后续的所有操作都不会进行分配。因此，对于此库的用户，代码变为 use shortlist::Shortlist; fn get_best<T: Ord>( big_computation: impl Iterator<Item = T>, n: usize ) -> Vec<T> { // Create a new Shortlist that will take at most `n` items let mut shortlist = Shortlist::new(n); // Feed it all the results from `big_computation` for v in big_computation { shortlist.push(v); } // Return the shortlisted values as a sorted vec shortlist.into_sorted_vec() } 或作为一句话 use shortlist::Shortlist; fn get_best<T: Ord>(big_computation: impl Iterator<Item = T>, n: usize) -> Vec<T> { Shortlist::from_iter(n, big_computation).into_sorted_vec() } 在两种情况下，代码都只会进行一次堆分配（为 Shortlist 预留空间）。许可证：MIT

无运行时依赖项 dev rand 0.7.3