10 个版本
0.0.10 | 2020 年 3 月 31 日 |
---|---|
0.0.9 | 2020 年 2 月 19 日 |
0.0.6 | 2019 年 11 月 14 日 |
在 #next-gen 中排名 11
8KB
179 行
::next_gen
稳定 Rust 上的安全生成器。
示例
重新实现 range
迭代器
use ::next_gen::prelude::*;
#[generator(yield(u8))]
fn range (start: u8, end: u8)
{
let mut current = start;
while current < end {
yield_!(current);
current += 1;
}
}
mk_gen!(let generator = range(3, 10));
assert_eq!(
generator.collect::<Vec<_>>(),
(3.. 10).collect::<Vec<_>>(),
);
使用埃拉托斯特尼筛法实现质数迭代器
use ::next_gen::prelude::*;
enum NeverSome {}
/// Generator over all the primes less or equal to `up_to`.
#[generator(yield(usize))]
fn primes_up_to (up_to: usize)
-> Option<NeverSome>
{
if up_to < 2 { return None; }
let mut sieve = vec![true; up_to.checked_add(1).expect("Overflow")];
let mut p: usize = 1;
loop {
p += 1 + sieve
.get(p + 1..)?
.iter()
.position(|&is_prime| is_prime)?
;
yield_!(p);
let p2 = if let Some(p2) = p.checked_mul(p) { p2 } else {
continue
};
if p2 >= up_to { continue; }
sieve[p2..]
.iter_mut()
.step_by(p)
.for_each(|is_prime| *is_prime = false)
;
}
}
mk_gen!(let primes = primes_up_to(10_000));
for prime in primes {
assert!(
(2_usize..)
.take_while(|&n| n.saturating_mul(n) <= prime)
.all(|n| prime % n != 0)
);
}
定义具有自我借用状态的迭代器
这无疑是生成器的最有用特性。
例如,考虑以下问题
fn iter_locked (elems: &'_ Mutex<Set<i32>>)
-> impl '_ + Iterator<Item = i32>
没有生成器的悲惨尝试
无论你多么努力,如果不使用 unsafe
或其他使用 unsafe
的自我引用库/工具,你将无法实现这样的签名!
-
以下将失败
fn iter_locked (mutexed_elems: &'_ Mutex<Set<i32>>) -> impl '_ + Iterator<Item = i32> { ::std::iter::from_fn({ let locked_elems = mutexed_elems.lock().unwrap(); let mut elems = locked_elems.iter().copied(); move || { // let _ = locked_elems; elems.next() } // Error, borrowed `locked_elems` is not captured and is thus dropped! }) }
error[E0515]: cannot return value referencing local variable `locked_elems` --> src/lib.rs:122:5 | 11 | / ::std::iter::from_fn({ 12 | | let locked_elems = mutexed_elems.lock().unwrap(); 13 | | let mut elems = locked_elems.iter().copied(); | | ------------------- `locked_elems` is borrowed here 14 | | move || { ... | 17 | | } // Error, borrowed `locked_elems` is not captured and is thus dropped! 18 | | }) | |______^ returns a value referencing data owned by the current function | = help: use `.collect()` to allocate the iterator
-
以及这个
fn iter_locked (mutexed_elems: &'_ Mutex<Set<i32>>) -> impl '_ + Iterator<Item = i32> { ::std::iter::from_fn({ let locked_elems = mutexed_elems.lock().unwrap(); let mut elems = locked_elems.iter().copied(); move || { let _ = &locked_elems; // ensure `locked_elems` is captured (and thus moved) elems.next() // Error, can't use borrow of moved value! } }) }
error[E0515]: cannot return value referencing local variable `locked_elems` --> src/lib.rs:144:5 | 11 | / ::std::iter::from_fn({ 12 | | let locked_elems = mutexed_elems.lock().unwrap(); 13 | | let mut elems = locked_elems.iter().copied(); | | ------------------- `locked_elems` is borrowed here 14 | | move || { ... | 17 | | } 18 | | }) | |______^ returns a value referencing data owned by the current function | = help: use `.collect()` to allocate the iterator error[E0505]: cannot move out of `locked_elems` because it is borrowed --> src/lib.rs:147:9 | 8 | fn iter_locked (mutexed_elems: &'_ Mutex<Set<i32>>) | - let's call the lifetime of this reference `'1` ... 11 | / ::std::iter::from_fn({ 12 | | let locked_elems = mutexed_elems.lock().unwrap(); 13 | | let mut elems = locked_elems.iter().copied(); | | ------------------- borrow of `locked_elems` occurs here 14 | | move || { | | ^^^^^^^ move out of `locked_elems` occurs here 15 | | let _ = &locked_elems; // ensure `locked_elems` is captured (and thus moved) | | ------------ move occurs due to use in closure 16 | | elems.next() // Error, can't use borrow of moved value! 17 | | } 18 | | }) | |______- returning this value requires that `locked_elems` is borrowed for `'1` error: aborting due to 2 previous errors
-
在其他情况下,可能存在效率较低的解决方案
例如,当那个
Set
将是一个Vec
时。在这种情况下,我们可以使用索引作为穷人的自我引用,没有“官方”的生命周期,因此 Rust 不会抱怨fn iter_locked (mutexed_vec: &'_ Mutex<Vec<i32>>) -> impl '_ + Iterator<Item = i32> { ::std::iter::from_fn({ let locked_vec = mutexed_vec.lock().unwrap(); let mut indices = 0.. locked_vec.len(); move /* locked_vec, indices */ || { let i = indices.next()?; Some(locked_vec[i]) // copies, so OK. } }) } let mutexed_elems = Mutex::new(vec![27, 42]); let mut iter = iter_locked(&mutexed_elems); assert_eq!(iter.next(), Some(27)); assert_eq!(iter.next(), Some(42)); assert_eq!(iter.next(), None);
但是,使用生成器,这很容易
use ::next_gen::prelude::*;
#[generator(yield(i32))]
fn gen_iter_locked (mutexed_elems: &'_ Mutex<Set<i32>>)
{
let locked_elems = mutexed_elems.lock().unwrap();
for elem in locked_elems.iter().copied() {
yield_!(elem);
}
}
并且,看吧!
这个 #[generator] fn
是我们安全自我引用迭代器的关键构造函数!
现在,从自我引用生成器中实例化迭代器有一个微妙的地方,就像轮询自我引用的 Future
一样(这就是缺少 Unpin
约束的含义):在轮询之前,我们需要将其固定!
关于“使用前固定”和两种固定形式
-
获取
Future
let future = async { ... }; // or let future = some_async_fn(...);
-
在堆中固定已实例化的
Future
(boxed)// Pinning it in the heap (boxed): let mut pinned_future = Box::pin(future) // or, through an extension trait (`::futures::future::FutureExt`): let mut pinned_future = future.boxed() // this also incidentally `dyn`-erases the future.
-
现在我们可以 返回 它或轮询它
if true { pinned_future.as_mut().poll(...); } // and/or return it: return pinned_future;
-
-
在栈中固定已实例化的
Future
(固定到局部作用域)use ::some_lib::some_pinning_macro as stack_pinned; // Pinning it in the "stack" stack_pinned!(mut future); /* the above shadows `future`, thus acting as: let mut future = magic::Stack::pin(future); // */ // Let's rename it for clarity: let mut pinned_future = future;
-
现在我们可以在当前栈帧中轮询它/使用它,但不能返回它。
pinned_future.as_mut().poll(...)
-
事实上,对于生成器来说也是类似的
-
一旦你有了这个
#[generator] fn
"生成器构造函数"use ::next_gen::prelude::*; #[generator(yield(u8))] fn foo () { yield_!(42); }
-
实例化需要固定,因此
-
栈固定:便宜,
no_std
兼容,可以在同一作用域内使用。但不能返回。mk_gen!(let mut generator = foo()); // can be used within the same scope assert_eq!(generator.next(), Some(42)); assert_eq!(generator.next(), None); // but it can't be returned // return generator; /* Error, can't return borrow to local value */
-
堆固定:稍微昂贵一些,需要一个
::alloc
分配器或者不是no_std
,但这样固定的生成器可以被返回。mk_gen!(let mut generator = box foo()); // can be used within the same scope if some_condition { assert_eq!(generator.next(), Some(42)); assert_eq!(generator.next(), None); } // and/or it can be returned return generator; // OK
-
所以,回到我们的例子,我们需要这样做
use ::next_gen::prelude::*;
/// We already have:
#[generator(yield(i32))]
fn gen_iter_locked (mutexed_elems: &'_ Mutex<Set<i32>>)
...
/// Now let's wrap-it so that it yields a nice iterator:
fn iter_locked (mutexed_elems: &'_ Mutex<Set<i32>>)
-> impl '_ + Iterator<Item = i32>
{
if true {
// One possible syntax to instantiate the generator
mk_gen!(let generator = box gen_iter_locked(mutexed_elems));
generator
} else {
// or, since we are `box`-ing, we can directly do:
gen_iter_locked.call_boxed((mutexed_elems, ))
}
// : Pin<Box<impl '_ + Generator<Yield = i32>>>
// : impl '_ + Iterator<Item = i32>
}
let mutexed_elems = Mutex::new([27, 42].iter().copied().collect::<Set<_>>());
let mut iter = iter_locked(&mutexed_elems);
assert_eq!(iter.next(), Some(27));
assert_eq!(iter.next(), Some(42));
assert_eq!(iter.next(), None);
-
如果你正在尝试实现的
iter_locked
函数是 trait 定义的一部分,因此需要命名类型,此时,使用dyn
而不是impl
,代价是必须提到Pin<Box<...>>
层// instead of -> impl '_ + Iterator<Item = i32> // write: -> Pin<Box<dyn '_ + Generator<Yield = i32, Return = ()>>>
示例
use ::next_gen::prelude::*; struct Once<T>(T); impl<T : 'static> IntoIterator for Once<T> { type Item = T; type IntoIter = Pin<Box<dyn Generator<(), Yield = T, Return = ()> + 'static>>; fn into_iter (self: Once<T>) -> Self::IntoIter { #[generator(yield(T))] fn once_generator<T> (value: T) { yield_!(value); } once_generator.call_boxed((self.0, )) } } assert_eq!(Once(42).into_iter().next(), Some(42));
恢复参数
这个 crate 已经更新以支持恢复参数:现在 Generator
trait 对 ResumeArg
参数是泛型的(默认为 ()
),它的 .resume
方法现在接受该类型的参数
let _: GeneratorState<Yield, Return> = generator.as_mut().resume(resume_arg);
这使得生成器内部的 yield_!
表达式求值结果为 ResumeArg
而不是 ()
let _: ResumeArg = yield_!(value);
宏语法
为了使用 #[generator]
属性表达这一点,向其中添加一个 resume
(Type 参数
use ::next_gen::prelude::*;
type ShouldContinue = bool;
#[generator(yield(i32), resume(ShouldContinue))]
fn g ()
{
for i in 0.. {
let should_continue = yield_!(i);
if should_continue.not() {
break;
}
}
}
mk_gen!(let mut generator = g());
assert!(matches!(
generator.as_mut().resume(bool::default()), // <- this resume arg is being ignored
GeneratorState::Yielded(0),
));
assert!(matches!(
generator.as_mut().resume(true),
GeneratorState::Yielded(1),
));
assert!(matches!(
generator.as_mut().resume(true),
GeneratorState::Yielded(2),
));
assert!(matches!(
generator.as_mut().resume(true),
GeneratorState::Yielded(3),
));
assert!(matches!(
generator.as_mut().resume(false),
GeneratorState::Complete,
));
如果你不想忽略/忽视第一个恢复参数(我们可以称之为“起始参数”),那么你可以在 resume
(ResumeArgTy 注释后附加一个 as
<binding
use ::next_gen::prelude::*;
type ShouldContinue = bool;
#[generator(
yield(i32),
resume(ShouldContinue) as mut should_continue,
)]
fn g ()
{
for i in 0.. {
if should_continue.not() {
break;
}
should_continue = yield_!(i);
}
}
-
一个具有“自动分段栈”的递归的令人费解的例子
use ::next_gen::prelude::*; /// A silly recursive function, computing the sum of integers up to `n`. /// /// If you know your math, you know this equals `n * (n + 1) / 2`. /// /// This result is quite "obvious" from the geometric representation: /// /// ```text /// # . . . . . <- Amount of #: 1 /// # # . . . . <- Amount of #: 2 /// # # # . . . <- Amount of #: 3 /// # # # # . . <- Amount of #: 4 /// ⋮ … ⋱ ⋮ /// # # # # # # <- Amount of #: N /// Height = N + 1 Total: 1 + 2 + … + N /// Width = N /// Half the area of the "square": (N + 1) * N / 2 /// ``` /// /// As you can see, computing the sum `1 + 2 + 3 + … + N` is the same as /// counting the number of `#` in that diagram. And those `#` fill half a /// "square". But it's actually not exactly a `N x N` square since we have /// from `1` to `N` rows, that is, `N + 1` rows, and a width of `N`. /// /// This results in `(n + 1) * n` for the area of the "square", followed by /// the `/ 2` halving operation. fn triangular (n: u64) -> u64 { #[generator(yield(u64), resume(u64))] fn triangular (n: u64) -> u64 { use yield_ as recurse; if n == 0 { 0 } else { n + recurse!(n - 1) } } drive_recursion(n, |n| triangular.call_boxed((n, ))) } const N: u64 = 10_000; assert_eq!( triangular(N), N * (N + 1) / 2 ); // where the `drive_recursion` "runtime" is defined as: /// A recursive computation can be seen as a "suspensible coroutine", /// whereby, when needing to "compute-recurse" into new inputs, /// that current computation just suspends and yields the new input /// for which it requests a computation. /// /// The driver / "executor", thus starts with the initial input, and /// polls the suspensible coroutine until reaching a suspension point. /// /// Such suspension point gives the driver a new computation it needs to /// perform (updates `input`), and a new "customer" waiting for that new /// result: that suspended computation. These stack onto each other as /// we recurse, and when the innermost computation _completes_ / _returns_ /// rather than yield-enqueuing a new one, we can then feed that result to /// the top-most suspended computation, _resuming_ it. fn drive_recursion<Input, SuspendedComputation, Result> ( input: Input, mut start_computing: impl FnMut(Input) -> Pin<Box<SuspendedComputation>>, ) -> Result where SuspendedComputation : Generator< /* ResumedWith = */ Result, // recursive result Yield = Input, // recursive "query" Return = Result, > , Result : Default, // to feed the initial dummies. { // This is the "recursive state stack", when you think about this, // and with this approach we automagically get it heap-allocated // (the `Pin<Box<GeneratorFn…>>` state machines are the main things // heap-allocating the "recursively captured local state". // This vec is just storing these `Pin<Box<…>>` things, to avoid // stack-allocating those (which naively recursing within this very body // would achieve). let mut suspended_computations = Vec::new(); let mut last_suspended_computation = start_computing(input); let mut computation_result = Result::default(); // start with a dummy one loop { match last_suspended_computation.resume_unpin(computation_result) { // We reached `return`: completion of the current computation. | GeneratorState::Returned(computation_result_) => { match suspended_computations.pop() { // If it was the outer-most computation, we've finished. | None => return computation_result_, // Otherwise, feed the current result to the outer // computation that had previously yield-requested the // current computation. | Some(suspended_computation) => { last_suspended_computation = suspended_computation; computation_result = computation_result_; }, } }, // We need to "compute-recurse" ourselves with this new `arg` | GeneratorState::Yielded(arg) => { suspended_computations.push(last_suspended_computation); last_suspended_computation = start_computing(arg); computation_result = Result::default(); }, } } }
因此,“栈”(非终端/非尾递归调用中捕获的局部变量的存储)实际上是在
yield_!
点之间跨越的状态,从而产生了Generator
捕获的状态。由于Generator
实例被Box
化,这意味着这样的栈最终位于堆中,在指针之后。这是在每次递归步骤中发生的。这意味着栈已经在每个
Generator
实例中被成功分段到堆中;否则,这是一个繁琐的手动过程,但对于非平凡递归函数来说却是必要的。
特性
性能
该 crate 通过使用栈固定实现了无分配生成器,因此,当以这种方式使用时,它应该接近零成本。
人体工程学/糖
在处理这些生成器时,已经投入了大量精力来优化宏以及提供一个最符合人体工程学的体验,尽管涉及到的内部结构复杂/微妙,例如栈固定。
安全
几乎未使用任何 unsafe
代码,例外是
-
栈固定,其中使用了官方的
::pin_utils::pin_mut
实现; -
使用固定保证来扩展生命周期;
no_std
支持
此 crate 支持 #![no_std]
。为此,只需禁用默认的 "std"
功能
[dependencies]
next-gen.version = "..."
next-gen.default-features = false # <- ADD THIS to disable `std`&`alloc` for `no_std` compat
next-gen.features = [
"alloc", # If your no_std platform has access to allocators.
# "std", # `default-features` bundles this.
]
想法
由于生成器和协程依赖于相同的内部机制,可以借助 async
/ await
机制实现生成器的安全实现,该机制仅存在于稳定的 Rust 中。
类似的思路也已在 https://docs.rs/genawaiter 中实现。
依赖项
~1.5MB
~35K SLoC