13 个版本
0.1.1 | 2022 年 1 月 15 日 |
---|---|
0.1.0 | 2022 年 1 月 15 日 |
0.0.10 | 2020 年 3 月 31 日 |
0.0.9 | 2020 年 2 月 19 日 |
0.0.6 | 2019 年 11 月 14 日 |
#430 in 异步
2,994 每月下载量
83KB
1K SLoC
::next_gen
在稳定 Rust 上的安全生成器。
示例
重实现 range
迭代器
use ::next_gen::prelude::*;
#[generator(yield(u8))]
fn range (start: u8, end: u8)
{
let mut current = start;
while current < end {
yield_!(current);
current += 1;
}
}
mk_gen!(let generator = range(3, 10));
assert_eq!(
generator.collect::<Vec<_>>(),
(3.. 10).collect::<Vec<_>>(),
);
使用埃拉托斯特尼筛法实现质数迭代器
use ::next_gen::prelude::*;
enum NeverSome {}
/// Generator over all the primes less or equal to `up_to`.
#[generator(yield(usize))]
fn primes_up_to (up_to: usize)
-> Option<NeverSome>
{
if up_to < 2 { return None; }
let mut sieve = vec![true; up_to.checked_add(1).expect("Overflow")];
let mut p: usize = 1;
loop {
p += 1 + sieve
.get(p + 1..)?
.iter()
.position(|&is_prime| is_prime)?
;
yield_!(p);
let p2 = if let Some(p2) = p.checked_mul(p) { p2 } else {
continue
};
if p2 >= up_to { continue; }
sieve[p2..]
.iter_mut()
.step_by(p)
.for_each(|is_prime| *is_prime = false)
;
}
}
mk_gen!(let primes = primes_up_to(10_000));
for prime in primes {
assert!(
(2_usize..)
.take_while(|&n| n.saturating_mul(n) <= prime)
.all(|n| prime % n != 0)
);
}
定义具有自借用状态的迭代器
这无疑是生成器最有用的功能。
例如,考虑以下问题
fn iter_locked (elems: &'_ Mutex<Set<i32>>)
-> impl '_ + Iterator<Item = i32>
没有生成器的痛苦尝试
无论你多么努力,如果没有使用 unsafe
或其他使用 unsafe
的自引用库/工具,你将无法使用这样的签名!
-
以下将失败
fn iter_locked (mutexed_elems: &'_ Mutex<Set<i32>>) -> impl '_ + Iterator<Item = i32> { ::std::iter::from_fn({ let locked_elems = mutexed_elems.lock().unwrap(); let mut elems = locked_elems.iter().copied(); move || { // let _ = locked_elems; elems.next() } // Error, borrowed `locked_elems` is not captured and is thus dropped! }) }
error[E0515]: cannot return value referencing local variable `locked_elems` --> src/lib.rs:122:5 | 11 | / ::std::iter::from_fn({ 12 | | let locked_elems = mutexed_elems.lock().unwrap(); 13 | | let mut elems = locked_elems.iter().copied(); | | ------------------- `locked_elems` is borrowed here 14 | | move || { ... | 17 | | } // Error, borrowed `locked_elems` is not captured and is thus dropped! 18 | | }) | |______^ returns a value referencing data owned by the current function | = help: use `.collect()` to allocate the iterator
-
以及这个
fn iter_locked (mutexed_elems: &'_ Mutex<Set<i32>>) -> impl '_ + Iterator<Item = i32> { ::std::iter::from_fn({ let locked_elems = mutexed_elems.lock().unwrap(); let mut elems = locked_elems.iter().copied(); move || { let _ = &locked_elems; // ensure `locked_elems` is captured (and thus moved) elems.next() // Error, can't use borrow of moved value! } }) }
error[E0515]: cannot return value referencing local variable `locked_elems` --> src/lib.rs:144:5 | 11 | / ::std::iter::from_fn({ 12 | | let locked_elems = mutexed_elems.lock().unwrap(); 13 | | let mut elems = locked_elems.iter().copied(); | | ------------------- `locked_elems` is borrowed here 14 | | move || { ... | 17 | | } 18 | | }) | |______^ returns a value referencing data owned by the current function | = help: use `.collect()` to allocate the iterator error[E0505]: cannot move out of `locked_elems` because it is borrowed --> src/lib.rs:147:9 | 8 | fn iter_locked (mutexed_elems: &'_ Mutex<Set<i32>>) | - let's call the lifetime of this reference `'1` ... 11 | / ::std::iter::from_fn({ 12 | | let locked_elems = mutexed_elems.lock().unwrap(); 13 | | let mut elems = locked_elems.iter().copied(); | | ------------------- borrow of `locked_elems` occurs here 14 | | move || { | | ^^^^^^^ move out of `locked_elems` occurs here 15 | | let _ = &locked_elems; // ensure `locked_elems` is captured (and thus moved) | | ------------ move occurs due to use in closure 16 | | elems.next() // Error, can't use borrow of moved value! 17 | | } 18 | | }) | |______- returning this value requires that `locked_elems` is borrowed for `'1` error: aborting due to 2 previous errors
-
在其他情况下,可能存在低效的解决方案
例如,如果那个
Set
是一个Vec
,在这种情况下,我们可以使用索引作为穷人的自引用,没有“官方”的生命周期,因此 Rust 不会抱怨fn iter_locked (mutexed_vec: &'_ Mutex<Vec<i32>>) -> impl '_ + Iterator<Item = i32> { ::std::iter::from_fn({ let locked_vec = mutexed_vec.lock().unwrap(); let mut indices = 0.. locked_vec.len(); move /* locked_vec, indices */ || { let i = indices.next()?; Some(locked_vec[i]) // copies, so OK. } }) } let mutexed_elems = Mutex::new(vec![27, 42]); let mut iter = iter_locked(&mutexed_elems); assert_eq!(iter.next(), Some(27)); assert_eq!(iter.next(), Some(42)); assert_eq!(iter.next(), None);
但使用生成器就简单多了
use ::next_gen::prelude::*;
#[generator(yield(i32))]
fn gen_iter_locked (mutexed_elems: &'_ Mutex<Set<i32>>)
{
let locked_elems = mutexed_elems.lock().unwrap();
for elem in locked_elems.iter().copied() {
yield_!(elem);
}
}
并且 voilà!
这个 #[generator] fn
是我们安全自引用迭代器的关键构造函数!
现在,从自引用生成器实例化迭代器有一个微妙的地方,就像轮询自引用 Future
一样(这就是为什么缺少 Unpin
约束的原因):我们需要在轮询之前将其固定!
关于“使用前固定”以及两种固定形式
-
获取一个
Future
let future = async { ... }; // or let future = some_async_fn(...);
-
将实例化的
Future
钩挂在堆中(Box
包装)// Pinning it in the heap (boxed): let mut pinned_future = Box::pin(future) // or, through an extension trait (`::futures::future::FutureExt`): let mut pinned_future = future.boxed() // this also incidentally `dyn`-erases the future.
-
现在我们可以 返回 它,或者轮询它
if true { pinned_future.as_mut().poll(...); } // and/or return it: return pinned_future;
-
-
将实例化的
Future
钩挂在栈上(钩挂在局部作用域)use ::some_lib::some_pinning_macro as stack_pinned; // Pinning it in the "stack" stack_pinned!(mut future); /* the above shadows `future`, thus acting as: let mut future = magic::Stack::pin(future); // */ // Let's rename it for clarity: let mut pinned_future = future;
-
现在我们可以在当前栈帧中轮询它/使用它,但不能返回它。
pinned_future.as_mut().poll(...)
-
好吧,结果对于生成器来说也是类似的
-
一旦你有了
#[generator] fn
"生成器构造函数"use ::next_gen::prelude::*; #[generator(yield(u8))] fn foo () { yield_!(42); }
-
实例化需要钩挂,因此
-
栈钩挂:成本低,与
no_std
兼容,可在同一作用域内使用。但不能返回。mk_gen!(let mut generator = foo()); // can be used within the same scope assert_eq!(generator.next(), Some(42)); assert_eq!(generator.next(), None); // but it can't be returned // return generator; /* Error, can't return borrow to local value */
-
堆钩挂:稍微贵一些,需要
::alloc
ator 或者不是no_std
,但这样钩挂的生成器可以返回。mk_gen!(let mut generator = box foo()); // can be used within the same scope if some_condition { assert_eq!(generator.next(), Some(42)); assert_eq!(generator.next(), None); } // and/or it can be returned return generator; // OK
-
所以,回到我们的例子,这是我们需要做的事情
use ::next_gen::prelude::*;
/// We already have:
#[generator(yield(i32))]
fn gen_iter_locked (mutexed_elems: &'_ Mutex<Set<i32>>)
...
/// Now let's wrap-it so that it yields a nice iterator:
fn iter_locked (mutexed_elems: &'_ Mutex<Set<i32>>)
-> impl '_ + Iterator<Item = i32>
{
if true {
// One possible syntax to instantiate the generator
mk_gen!(let generator = box gen_iter_locked(mutexed_elems));
generator
} else {
// or, since we are `box`-ing, we can directly do:
gen_iter_locked.call_boxed((mutexed_elems, ))
}
// : Pin<Box<impl '_ + Generator<Yield = i32>>>
// : impl '_ + Iterator<Item = i32>
}
let mutexed_elems = Mutex::new([27, 42].iter().copied().collect::<Set<_>>());
let mut iter = iter_locked(&mutexed_elems);
assert_eq!(iter.next(), Some(27));
assert_eq!(iter.next(), Some(42));
assert_eq!(iter.next(), None);
-
如果你正在尝试实现的
iter_locked()
函数是 trait 定义的一部分,因此需要命名类型,此时impl '_ + Iterator…
存在性语法可能有问题,你可以使用dyn
代替impl
,但这需要提及Pin<Box<>>
层// instead of -> impl '_ + Iterator<Item = i32> // write: -> Pin<Box<dyn '_ + Generator<Yield = i32, Return = ()>>>
一个例子
use ::next_gen::prelude::*; struct Once<T>(T); impl<T : 'static> IntoIterator for Once<T> { type Item = T; type IntoIter = Pin<Box<dyn Generator<(), Yield = T, Return = ()> + 'static>>; fn into_iter (self: Once<T>) -> Self::IntoIter { #[generator(yield(T))] fn once_generator<T> (value: T) { yield_!(value); } once_generator.call_boxed((self.0, )) } } assert_eq!(Once(42).into_iter().next(), Some(42));
恢复参数
这个 crate 已更新以支持恢复参数:现在 Generator
trait 对 ResumeArg
参数(默认为 ()
)是泛型的,并且其 .resume(…)
方法现在接受该类型的参数
let _: GeneratorState<Yield, Return> = generator.as_mut().resume(resume_arg);
这使得生成器内部的 yield_!(…)
表达式求值结果为 ResumeArg
而不是 ()
let _: ResumeArg = yield_!(value);
宏语法
为了使用 #[generator]
属性表达这个,向其中添加一个 resume(Type)
参数
use ::next_gen::prelude::*;
type ShouldContinue = bool;
#[generator(yield(i32), resume(ShouldContinue))]
fn g ()
{
for i in 0.. {
let should_continue = yield_!(i);
if should_continue.not() {
break;
}
}
}
mk_gen!(let mut generator = g());
assert!(matches!(
generator.as_mut().resume(bool::default()), // <- this resume arg is being ignored
GeneratorState::Yielded(0),
));
assert!(matches!(
generator.as_mut().resume(true),
GeneratorState::Yielded(1),
));
assert!(matches!(
generator.as_mut().resume(true),
GeneratorState::Yielded(2),
));
assert!(matches!(
generator.as_mut().resume(true),
GeneratorState::Yielded(3),
));
assert!(matches!(
generator.as_mut().resume(false),
GeneratorState::Complete,
));
如果你不想忽略/忽略第一个恢复参数(我们可以称之为“起始参数”),那么你可以在 resume(ResumeArgTy)
注解后附加一个 as <binding>
use ::next_gen::prelude::*;
type ShouldContinue = bool;
#[generator(
yield(i32),
resume(ShouldContinue) as mut should_continue,
)]
fn g ()
{
for i in 0.. {
if should_continue.not() {
break;
}
should_continue = yield_!(i);
}
}
-
一个关于“自动分段栈”的递归让人头疼的例子
use ::next_gen::prelude::*; /// A silly recursive function, computing the sum of integers up to `n`. /// /// If you know your math, you know this equals `n * (n + 1) / 2`. /// /// This result is quite "obvious" from the geometric representation: /// /// ```text /// # . . . . . <- Amount of #: 1 /// # # . . . . <- Amount of #: 2 /// # # # . . . <- Amount of #: 3 /// # # # # . . <- Amount of #: 4 /// ⋮ … ⋱ ⋮ /// # # # # # # <- Amount of #: N /// Height = N + 1 Total: 1 + 2 + … + N /// Width = N /// Half the area of the "square": (N + 1) * N / 2 /// ``` /// /// As you can see, computing the sum `1 + 2 + 3 + … + N` is the same as /// counting the number of `#` in that diagram. And those `#` fill half a /// "square". But it's actually not exactly a `N x N` square since we have /// from `1` to `N` rows, that is, `N + 1` rows, and a width of `N`. /// /// This results in `(n + 1) * n` for the area of the "square", followed by /// the `/ 2` halving operation. fn triangular (n: u64) -> u64 { #[generator(yield(u64), resume(u64))] fn triangular (n: u64) -> u64 { use yield_ as recurse; if n == 0 { 0 } else { n + recurse!(n - 1) } } drive_recursion(n, |n| triangular.call_boxed((n, ))) } const N: u64 = 10_000; assert_eq!( triangular(N), N * (N + 1) / 2 ); // where the `drive_recursion` "runtime" is defined as: /// A recursive computation can be seen as a "suspensible coroutine", /// whereby, when needing to "compute-recurse" into new inputs, /// that current computation just suspends and yields the new input /// for which it requests a computation. /// /// The driver / "executor", thus starts with the initial input, and /// polls the suspensible coroutine until reaching a suspension point. /// /// Such suspension point gives the driver a new computation it needs to /// perform (updates `input`), and a new "customer" waiting for that new /// result: that suspended computation. These stack onto each other as /// we recurse, and when the innermost computation _completes_ / _returns_ /// rather than yield-enqueuing a new one, we can then feed that result to /// the top-most suspended computation, _resuming_ it. fn drive_recursion<Input, SuspendedComputation, Result> ( input: Input, mut start_computing: impl FnMut(Input) -> Pin<Box<SuspendedComputation>>, ) -> Result where SuspendedComputation : Generator< /* ResumedWith = */ Result, // recursive result Yield = Input, // recursive "query" Return = Result, > , Result : Default, // to feed the initial dummies. { // This is the "recursive state stack", when you think about this, // and with this approach we automagically get it heap-allocated // (the `Pin<Box<GeneratorFn…>>` state machines are the main things // heap-allocating the "recursively captured local state". // This vec is just storing these `Pin<Box<…>>` things, to avoid // stack-allocating those (which naively recursing within this very body // would achieve). let mut suspended_computations = Vec::new(); let mut last_suspended_computation = start_computing(input); let mut computation_result = Result::default(); // start with a dummy one loop { match last_suspended_computation.resume_unpin(computation_result) { // We reached `return`: completion of the current computation. | GeneratorState::Returned(computation_result_) => { match suspended_computations.pop() { // If it was the outer-most computation, we've finished. | None => return computation_result_, // Otherwise, feed the current result to the outer // computation that had previously yield-requested the // current computation. | Some(suspended_computation) => { last_suspended_computation = suspended_computation; computation_result = computation_result_; }, } }, // We need to "compute-recurse" ourselves with this new `arg` | GeneratorState::Yielded(arg) => { suspended_computations.push(last_suspended_computation); last_suspended_computation = start_computing(arg); computation_result = Result::default(); }, } } }
因此,“栈”(用于存储非终端/非尾递归调用中捕获的局部变量)实际上是在
yield_!()
点之间跨越的状态,从而导致了Generator
捕获的状态。由于Generator
实例被Box
包装,这意味着这样的栈最终位于堆中,背后是一个指针。这发生在每次递归步骤中。这意味着栈(在每个
Generator
实例中)已成功分段到堆中;否则,这是一个繁琐的手动过程,但对于非平凡的递归函数来说却是必需的。
功能
性能
该软件包通过使用堆栈固定,实现了无分配生成器。当这样使用时,它应该接近零成本。
人体工程学/糖
在处理这些生成器时,尽管涉及到的内部结构复杂/微妙,如堆栈固定,但已经投入了大量精力到宏和属性宏,以提供最舒适的使用体验。
安全
几乎未使用 unsafe
,例外的是
-
堆栈固定,其中它使用了官方的
::pin_utils::pin_mut
实现; -
使用固定保证来延长生命周期;
no_std
支持
此软件包支持 #![no_std]
。为此,只需禁用默认的 "std"
功能即可
<!-- AUTOGENERATED FILE -->
```toml
[dependencies]
next-gen.version = "..."
next-gen.default-features = false # <- ADD THIS to disable `std`&`alloc` for `no_std` compat
next-gen.features = [
"alloc", # If your no_std platform has access to allocators.
# "std", # `default-features` bundles this.
]
想法
由于生成器和协程依赖于相同的内部结构,因此可以使用 async
/ await
机制推导出生成器的安全实现,该机制已在稳定版 Rust 中实现。
类似的想法已在 https://docs.rs/genawaiter 中实现。
依赖项
~1-11MB
~114K SLoC