11 releases

0.0.14	February 26, 2024
0.0.13	July 21, 2023
0.0.11	March 18, 2023
0.0.9	April 30, 2022
0.0.4	August 18, 2019


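But I thought Rust doesn't have reflection...?

This crate explores what it could look like to tackle the 80% use case of custom derive macros through a programming model that resembles compile-time reflection.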

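Motivation

My existing syn and quote libraries approach the problem space of procedural macros in a very general way and are a good fit for roughly 95% of use cases. However, that generality comes at the cost of operating at a relatively low level of abstraction. The macro author is responsible for the placement of every single angle bracket, lifetime, type parameter, trait bound, and phantom data. A large amount of domain knowledge is involved, and very few people can reliably produce robust macros with this approach.

The design explored here focuses on eliminating all the edge cases, so that if your macro works for the most basic case, it also works in every tricky case.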

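Programming model

The idea is to expose what looks like a boring, straightforward runtime reflection API, such as you might recognize if you have used reflection in Java or reflection in Go.

The macro author expresses the logic of their macro in terms of this API, using types like reflect::Value to retrieve function arguments, access fields of data structures, invoke functions, and so forth. Importantly, there is no such thing as a generic type or phantom data in this model. Everything is just a reflect::Value whose type is conceptually its monomorphized type at runtime.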

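Meanwhile, the library tracks the control flow and function invocations to build up a fully general and robust procedural implementation of the author's macro. The resulting code will have all the angle brackets, lifetimes, bounds, and phantom types in the right places without the macro author having to think about any of that.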
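To make the idea of tracking invocations concrete, here is a toy sketch of the general technique: opaque value handles whose operations are recorded and later printed as source code. It is not the reflect crate's implementation. The names Tracer, Node, Value, and trace_debug_fmt are all hypothetical, and the sketch ignores types, borrowing, and control flow entirely.

```rust
/// Opaque handle to a recorded value. Hypothetical; not `reflect::Value`.
#[derive(Clone, Copy)]
struct Value(usize);

/// One recorded step of the traced function body.
enum Node {
    Arg(usize),
    Str(String),
    Call { path: &'static str, args: Vec<Value> },
}

/// Records every operation performed on `Value` handles, in order.
#[derive(Default)]
struct Tracer {
    nodes: Vec<Node>,
}

impl Tracer {
    fn push(&mut self, node: Node) -> Value {
        self.nodes.push(node);
        Value(self.nodes.len() - 1)
    }

    fn arg(&mut self, index: usize) -> Value {
        self.push(Node::Arg(index))
    }

    fn string(&mut self, text: &str) -> Value {
        self.push(Node::Str(text.to_owned()))
    }

    fn invoke(&mut self, path: &'static str, args: &[Value]) -> Value {
        self.push(Node::Call { path, args: args.to_vec() })
    }

    /// How a recorded value is spelled when it appears as an argument.
    fn name(&self, value: Value) -> String {
        match &self.nodes[value.0] {
            Node::Arg(index) => format!("_arg{}", index),
            Node::Str(text) => format!("{:?}", text),
            Node::Call { .. } => format!("_v{}", value.0),
        }
    }

    /// Print one `let` statement per recorded call.
    fn emit(&self) -> String {
        let mut out = String::new();
        for (index, node) in self.nodes.iter().enumerate() {
            if let Node::Call { path, args } = node {
                let args: Vec<String> = args.iter().map(|arg| self.name(*arg)).collect();
                out.push_str(&format!("let _v{} = {}({});\n", index, path, args.join(", ")));
            }
        }
        out
    }
}

/// "Macro logic" written against the handles; nothing is formatted here,
/// the calls are only recorded.
fn trace_debug_fmt() -> String {
    let mut tracer = Tracer::default();
    let formatter = tracer.arg(1);
    let type_name = tracer.string("Point");
    let builder = tracer.invoke("::std::fmt::Formatter::debug_struct", &[formatter, type_name]);
    tracer.invoke("::std::fmt::DebugStruct::finish", &[builder]);
    tracer.emit()
}
```

Calling trace_debug_fmt() returns two let statements as text. A real implementation would additionally have to infer types, insert references, and declare generics, which is exactly the part this sketch leaves out.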

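The reflection API is just a means for defining a procedural macro. The library boils it all away and emits clean Rust source code free of any actual runtime reflection. Note that this is not a statement about compiler optimizations: we are not relying on the Rust compiler to perform heroic optimizations on poorly generated code. The source code authored through the reflection API will literally be what a seasoned macro author would have produced using syn and quote directly.

From the perspective of someone calling the macro, everything about how it is called is the same as if it had been written the traditional way without reflection, and their code compiles equally fast and performs equally well. The advantage goes to the macro author, for whom developing and maintaining a robust macro is greatly simplified.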

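Demo

This project contains a proof of concept of a compile-time reflection API for defining custom derives.

The tests/debug/ directory demonstrates a working, compilable implementation of #[derive(Debug)] for structs with named fields. The corresponding test case shows the code generated when we derive Debug for a struct Point with two fields; it is equivalent to what a handwritten derive(Debug) macro without reflection would generate for the same data structure.

The macro implementation begins with a DSL declaration of the types and functions that will be required at runtime.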

reflect::library! {
    extern crate std {
        mod fmt {
            type Formatter;
            type Result;
            type DebugStruct;

            trait Debug {
                fn fmt(&self, &mut Formatter) -> Result;
            }

            impl Formatter {
                fn debug_struct(&mut self, &str) -> DebugStruct;
            }

            impl DebugStruct {
                fn field(&mut self, &str, &Debug) -> &mut DebugStruct;
                fn finish(&mut self) -> Result;
            }
        }
    }
}

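There may be additional extern crate blocks here if the macro needs to use types from outside the standard library. For example, Serde's #[derive(Serialize)] macro would list the serde crate, the Serialize and Serializer types, and whichever of their methods might be invoked at runtime.

Throughout the rest of the macro implementation, all type information is statically inferred from the signatures given in this library declaration.

Next, the macro entry point is an ordinary proc_macro_derive function, just as for a derive macro defined any other way.

Once again, the reflection API is just a means for defining a procedural macro. Despite what it may look like below, everything written here executes at compile time. The reflect library emits the generated code as an output TokenStream, which is compiled into the macro user's crate. This token stream contains no trace of runtime reflection.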

use proc_macro::TokenStream;

// Macro that is called when someone writes derive(MyDebug) on a data structure.
// It returns a fragment of Rust source code (TokenStream) containing an
// implementation of Debug for the input data structure. The macro uses
// compile-time reflection internally, but the generated Debug impl is exactly
// as if this macro were handwritten without reflection.
#[proc_macro_derive(MyDebug)]
pub fn derive(input: TokenStream) -> TokenStream {
    // Feed the tokens describing the data structure into the reflection library
    // for parsing and analysis. We provide a callback that describes what trait
    // impl(s) the reflection library will need to generate code for.
    reflect::derive(input, |ex| {
        // Instruct the library to generate an impl of Debug for the derive
        // macro's target type / Self type.
        ex.make_trait_impl(RUNTIME::std::fmt::Debug, ex.target_type(), |block| {
            // Instruct the library to compile debug_fmt (a function shown
            // below) into the source code for the impl's Debug::fmt method.
            block.make_function(RUNTIME::std::fmt::Debug::fmt, debug_fmt);
        });
    })
}

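The following looks like a function that does runtime reflection. It receives function arguments of type reflect::Value and can pass them around, pull out their fields, inspect attributes, invoke methods, and so forth.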

use reflect::*;

// This function will get compiled into Debug::fmt, which has this signature:
//
//     fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result
//
fn debug_fmt(f: MakeFunction) -> Value {
    let receiver: reflect::Value = f.arg(0);  // this is `self`
    let formatter: reflect::Value = f.arg(1);

    // The input value may be any of unit struct, tuple struct, ordinary braced
    // struct, or enum.
    match receiver.data() {
        Data::Struct(receiver) => match receiver {
            Struct::Unit(receiver) => unimplemented!(),
            Struct::Tuple(receiver) => unimplemented!(),
            Struct::Struct(receiver) => {
                /* implemented below */
            }
        },
        // For an enum, the active variant of the enum may be any of unit
        // variant, tuple variant, or struct variant.
        Data::Enum(receiver) => receiver.match_variant(|variant| match variant {
            Variant::Unit(variant) => unimplemented!(),
            Variant::Tuple(variant) => unimplemented!(),
            Variant::Struct(variant) => unimplemented!(),
        }),
    }
}

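In the case of a struct with named fields, we use reflection to loop over the fields of the struct and invoke methods of the standard library Formatter API to append each field value to the debug output.

For details on how this functionality executes at runtime, refer to the DebugStruct example code in the standard library API documentation.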
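As a point of reference, this is roughly what that builder looks like when used by hand in ordinary Rust, outside of any macro. The Point type matches the one used later in this demo; the render helper is a hypothetical name added only for illustration.

```rust
use std::fmt;

struct Point {
    x: i32,
    y: i32,
}

// Handwritten Debug impl using the standard library's DebugStruct builder.
impl fmt::Debug for Point {
    fn fmt(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter
            .debug_struct("Point") // start the `Point { ... }` output
            .field("x", &self.x)   // append one `name: value` pair per field
            .field("y", &self.y)
            .finish()              // close the braces and return the Result
    }
}

// Hypothetical helper that formats a point with `{:?}`.
fn render(point: &Point) -> String {
    format!("{:?}", point)
}
```

For example, render(&Point { x: 1, y: 2 }) yields the string Point { x: 1, y: 2 }.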

Paths beginning with RUNTIME:: refer to the library signatures declared in the library! { ... } snippet above.

let builder = RUNTIME::std::fmt::Formatter::debug_struct
    .INVOKE(formatter, type_name)
    .reference_mut();

for field in receiver.fields() {
    RUNTIME::std::fmt::DebugStruct::field.INVOKE(
        builder,
        field.get_name(),
        field.get_value(),
    );
}

RUNTIME::std::fmt::DebugStruct::finish.INVOKE(builder)

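The reflection library is able to track the flow of reflect::Value objects from one INVOKE to the next, and it contains a compiler that turns this data flow into strongly typed Rust source code in a robust way. When the Debug derive macro in this demo is invoked on a braced struct with two fields,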

#[derive(MyDebug)]
struct Point {
    x: i32,
    y: i32,
}

the reflection library emits a trait impl that looks like this:

// expands to:
impl ::std::fmt::Debug for Point {
    fn fmt(&self, _arg1: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
        match *self {
            Point { x: ref _v0, y: ref _v1 } => {
                let mut _v2 = ::std::fmt::Formatter::debug_struct(_arg1, "Point");
                let _ = ::std::fmt::DebugStruct::field(&mut _v2, "x", _v0);
                let _ = ::std::fmt::DebugStruct::field(&mut _v2, "y", _v1);
                let _v3 = ::std::fmt::DebugStruct::finish(&mut _v2);
                _v3
            }
        }
    }
}

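This generated code is what executes at runtime. Notice that no reflection is involved. In fact, it is almost exactly what the standard library's built-in derive(Debug) macro generates for the same data structure.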
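One way to check that claim is to compile the expansion next to the built-in derive and compare their output. The sketch below copies the generated impl into a module and derives Debug on an identically named struct in another module. The module names and the outputs_match helper are hypothetical, and the comparison only covers formatted output, not the exact tokens the built-in derive emits.

```rust
mod expanded {
    pub struct Point {
        pub x: i32,
        pub y: i32,
    }

    // The impl as emitted in the demo, using fully qualified calls.
    impl ::std::fmt::Debug for Point {
        fn fmt(&self, _arg1: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
            match *self {
                Point { x: ref _v0, y: ref _v1 } => {
                    let mut _v2 = ::std::fmt::Formatter::debug_struct(_arg1, "Point");
                    let _ = ::std::fmt::DebugStruct::field(&mut _v2, "x", _v0);
                    let _ = ::std::fmt::DebugStruct::field(&mut _v2, "y", _v1);
                    let _v3 = ::std::fmt::DebugStruct::finish(&mut _v2);
                    _v3
                }
            }
        }
    }
}

mod builtin {
    // Same shape, but using the standard library's derive.
    #[derive(Debug)]
    pub struct Point {
        pub x: i32,
        pub y: i32,
    }
}

// Compare both the compact and the pretty-printed Debug output.
fn outputs_match() -> bool {
    let a = expanded::Point { x: 3, y: -4 };
    let b = builtin::Point { x: 3, y: -4 };
    format!("{:?}", a) == format!("{:?}", b) && format!("{:#?}", a) == format!("{:#?}", b)
}
```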

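Robustness and what can go wrong

I mentioned above that implementing robust macros using only syn and quote is quite challenging.

My go-to example is temporarily wrapping a single struct field in a new struct. This is a real-life use case drawn from how serde_derive handles serialize_with attributes.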

let input: DeriveInput = syn::parse(...).unwrap();

// Pull out one of the field types.
let type_of_field_x: syn::Type = /* ... */;

quote! {
    // Very not robust.
    struct Wrapper<'a> {
        x: &'a #type_of_field_x,
    }

    Wrapper { x: &self.x }
}

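Making the quote! part generate compilable code for every possible value of type_of_field_x is extremely involved. To do this reliably, the macro author needs to consider and handle all of the following:

lifetime parameters used by type_of_field_x;
type parameters used by type_of_field_x;
associated types used by type_of_field_x;
where-clauses on input that constrain any of the above;
similarly, trait bounds on the type parameters of input;
where-clauses or bounds that affect any other fields of input;
type parameter defaults on input, which need to be stripped.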
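To illustrate the first few items, the following sketch shows what a correct wrapper has to look like once the field type mentions a lifetime and a bounded type parameter of the input. All names here (Labeled, Input, Wrapper, describe, demo) are hypothetical and chosen only for this example; it is plain Rust written by hand, not macro output.

```rust
use std::fmt::Debug;

// A field type that places its own bound on the type parameter.
struct Labeled<T: Debug> {
    value: T,
}

// Hypothetical macro input: the type of `x` uses both `'i` and `T`.
struct Input<'i, T: Debug> {
    x: &'i Labeled<T>,
}

// The naive `struct Wrapper<'a> { x: &'a &'i Labeled<T> }` does not compile:
// `'i` and `T` are undeclared. The wrapper must redeclare every generic
// parameter the field type uses, and repeat the `Debug` bound, because
// `Labeled<T>` is only well-formed when `T: Debug`.
struct Wrapper<'a, 'i, T: Debug> {
    x: &'a &'i Labeled<T>,
}

impl<'i, T: Debug> Input<'i, T> {
    // Temporarily wrap the field, as generated code might do in a method body.
    fn describe(&self) -> String {
        let wrapper = Wrapper { x: &self.x };
        format!("{:?}", wrapper.x.value)
    }
}

// Hypothetical usage with concrete types.
fn demo() -> String {
    let inner = Labeled { value: 7 };
    Input { x: &inner }.describe()
}
```

Even in this small case the wrapper's generics depend on both the field type and the bounds declared on the input, which is the bookkeeping a macro author otherwise has to do by hand for every possible input.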

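By contrast, the reflect library should be able to get all of this right with the macro author rarely having to think about it. Possibly as simply as this: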

let wrapper: reflect::Type = reflect::new_struct_type();

wrapper.instantiate(vec![input.get_field("x").reference()])

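Remaining work

In its current state, this proof of concept generates just barely working code for our simple Debug derive. The reflect library needs more work to produce robust code in the presence of lifetimes and generic parameters, and for library signatures involving more complicated types.

Critically, all of the remaining work should happen without touching the code of our Debug derive. The promise of reflect is that if the macro works for the most basic cases (which the code above already does), then it also works in all the edge cases. From here on, it is reflect's responsibility to compile the simple, reflection-like reflect::Value object manipulations into a fully general and robust procedural macro.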

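License

Licensed under either of Apache License, Version 2.0 or MIT license at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this crate by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.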
