2 versions

| Version | Released |
|---|---|
| 0.0.2 | Feb 28, 2024 |
| 0.0.1 | Nov 9, 2023 |
cfi_types
CFI types for cross-language LLVM CFI support.
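Installation
To install the cfi_types crate:
- In a command prompt or terminal, with the package root as the current working directory, run:
    cargo add cfi-types
Or:
- Add the cfi_types crate to the Cargo.toml file in the package root:
    [dependencies]
    cfi-types = "0.0.1"
- In a command prompt or terminal, with the package root as the current working directory, run:
    cargo fetch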
Usage
To use the cfi_types crate:
- Import the CFI types from the cfi_types crate. For example:
    use cfi_types::c_long;
- Replace any uses of C type aliases with CFI types. For example:
    extern "C" {
        fn func(arg: c_long);
    }

    fn main() {
        unsafe { func(c_long(5)) };
    }
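Existing code usually holds plain Rust integers, so values have to be wrapped when they cross into a function that takes a CFI type and unwrapped on the way back. The following is a minimal sketch of that pattern, not the crate's own code. CfiLong is a hypothetical local stand-in for cfi_types::c_long, assumed to be a tuple struct with a public field (which the c_long(5) constructor call above suggests), and twice is a hypothetical Rust-defined function used in place of a real C function so that nothing needs to be linked.

```rust
use std::convert::TryFrom; // already in the prelude from the 2021 edition on
use std::ffi::c_long;

/// Hypothetical stand-in for `cfi_types::c_long`: a transparent wrapper
/// around the target's C `long`. Assumption: the real type exposes its
/// value the same way.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CfiLong(pub c_long);

/// Hypothetical Rust-defined stand-in for a C function `long twice(long)`.
extern "C" fn twice(arg: CfiLong) -> CfiLong {
    CfiLong(arg.0.wrapping_mul(2))
}

/// Wraps a plain integer at the call boundary and unwraps the result.
/// Returns `None` when `value` does not fit in the target's C `long`.
fn call_twice(value: i64) -> Option<i64> {
    let arg = CfiLong(c_long::try_from(value).ok()?);
    Some(i64::from(twice(arg).0))
}
```

The checked conversion matters because C long is 32 bits wide in some data models, so an i64 value is not guaranteed to fit.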
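Background
Type metadata
LLVM uses type metadata to allow IR modules to aggregate pointers by their types. LLVM CFI uses this type metadata to test whether a given pointer is associated with a type identifier (i.e., to test type membership).
Clang uses the Itanium C++ ABI's virtual tables and RTTI typeinfo structure name as type metadata identifiers for function pointers.
Cross-language LLVM CFI support requires a compatible encoding. The encoding chosen is the Itanium C++ ABI mangling, with vendor extended type qualifiers and types for Rust types that are not used across the FFI boundary (see Type metadata in the design document).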
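To make the encoding concrete, the following simplified sketch builds type ids for function types whose parameters are C integer types or vendor extended Rust integer types. It is an illustration only, not the compiler's implementation: CType, vendor_extended, and function_type_id are hypothetical names, and pointers, qualifiers, and substitutions (such as S_) are left out.

```rust
/// C types covered by this sketch.
#[derive(Clone, Copy)]
enum CType {
    Void,
    Char,
    Short,
    Int,
    Long,
    LongLong,
}

impl CType {
    /// Itanium C++ ABI builtin type code for this C type.
    fn code(self) -> &'static str {
        match self {
            CType::Void => "v",
            CType::Char => "c",
            CType::Short => "s",
            CType::Int => "i",
            CType::Long => "l",
            CType::LongLong => "x",
        }
    }
}

/// Vendor extended type encoding, `u<length><type-name>`.
fn vendor_extended(type_name: &str) -> String {
    format!("u{}{}", type_name.len(), type_name)
}

/// Type id of a function type: "_ZTS" (typeinfo name) followed by
/// "F" <return type> <parameter types> "E". An empty parameter list is
/// encoded as "v".
fn function_type_id(ret: &str, params: &[&str]) -> String {
    let mut id = String::from("_ZTSF");
    id.push_str(ret);
    if params.is_empty() {
        id.push('v');
    } else {
        for param in params {
            id.push_str(param);
        }
    }
    id.push('E');
    id
}
```

With these helpers, void(long) encodes as "_ZTSFvlE", while the same signature written with a Rust i64 encodes as "_ZTSFvu3i64E". These are the two ids that the examples further down compare.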
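Encoding C integer types
Rust defines char as a Unicode scalar value, while C defines char as an integer type. Rust also defines explicitly sized integer types (e.g., i8, i16, i32), while C defines abstract integer types (e.g., char, short, long) whose actual sizes are implementation defined and may vary across data models. This causes ambiguity when Rust integer types are used in extern "C" function types that represent C functions, because the Itanium C++ ABI specifies encodings for C integer types (e.g., char, short, long), not for their defined representations (e.g., 8-bit signed integer, 16-bit signed integer, 32-bit signed integer).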
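For example, the Rust compiler currently cannot determine whether an
extern "C" {
    fn func(arg: i64);
}
Fig. 1. Example extern "C" function using a Rust integer type.
represents void func(long arg) or void func(long long arg) in an LP64 or equivalent data model.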
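For cross-language LLVM CFI support, the Rust compiler must be able to identify and correctly encode C types in extern "C" function types indirectly called across the FFI boundary when CFI is enabled.
For convenience, Rust provides some C-like type aliases for use when interoperating with foreign code written in C, and these C type aliases could be used for disambiguation. However, at the time types are encoded, all type aliases have already been resolved to their respective ty::Ty type representations (i.e., their respective Rust aliased types), which currently makes it impossible to identify C type alias uses from the resolved types.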
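For example, the Rust compiler currently also cannot determine that an
extern "C" {
    fn func(arg: c_long);
}
Fig. 2. Example extern "C" function using a C type alias.
used the c_long type alias, and cannot distinguish it from an extern "C" fn func(arg: c_longlong) in an LP64 or equivalent data model.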
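The alias problem can be observed from ordinary Rust code. The sketch below uses only the standard library and checks whether the c_long and c_longlong aliases name the same Rust type on the current target. The helper names are hypothetical.

```rust
use std::any::TypeId;
use std::ffi::{c_long, c_longlong};
use std::mem::size_of;

/// True when the `c_long` and `c_longlong` aliases resolve to the same Rust
/// type, which leaves nothing in the resolved type to tell them apart.
fn c_long_is_c_longlong() -> bool {
    TypeId::of::<c_long>() == TypeId::of::<c_longlong>()
}

/// Width in bits of the C `long` type on the current target.
fn c_long_bits() -> usize {
    size_of::<c_long>() * 8
}
```

On an LP64 target c_long_bits() is 64 and c_long_is_c_longlong() is true: both aliases are the same type as far as the type system is concerned. On a target where C long is 32 bits wide the two aliases differ, but c_long is then indistinguishable from c_int instead.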
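As a result, the Rust compiler cannot identify and correctly encode C types in extern "C" function types indirectly called across the FFI boundary when CFI is enabled: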
#include <stdio.h>
#include <stdlib.h>
// This definition has the type id "_ZTSFvlE".
void
hello_from_c(long arg)
{
    printf("Hello from C!\n");
}
// This definition has the type id "_ZTSFvPFvlElE"--this can be ignored for the
// purposes of this example.
void
indirect_call_from_c(void (*fn)(long), long arg)
{
    // This call site tests whether the destination pointer is a member of the
    // group derived from the same type id of the fn declaration, which has the
    // type id "_ZTSFvlE".
    //
    // Notice that since the test is at the call site and is generated by Clang,
    // the type id used in the test is encoded by Clang.
    fn(arg);
}
Fig. 3. Example C library using C integer types and Clang encoding.
use std::ffi::c_long;
#[link(name = "foo")]
extern "C" {
    // This declaration would have the type id "_ZTSFvlE", but at the time types
    // are encoded, all type aliases are already resolved to their respective
    // Rust aliased types, so this is encoded either as "_ZTSFvu3i32E" or
    // "_ZTSFvu3i64E", depending on which type the c_long type alias resolves
    // to. The Rust integer types currently use the u<length><type-name> vendor
    // extended type encoding--this is the problem demonstrated in this example.
    fn hello_from_c(_: c_long);
    // This declaration would have the type id "_ZTSFvPFvlElE", but is encoded
    // either as "_ZTSFvPFvu3i32ES_E" (compressed) or "_ZTSFvPFvu3i64ES_E"
    // (compressed), similarly to the hello_from_c declaration above--this can
    // be ignored for the purposes of this example.
    fn indirect_call_from_c(f: unsafe extern "C" fn(c_long), arg: c_long);
}
// This definition would have the type id "_ZTSFvlE", but is encoded either as
// "_ZTSFvu3i32E" or "_ZTSFvu3i64E", similarly to the hello_from_c declaration
// above.
unsafe extern "C" fn hello_from_rust(_: c_long) {
    println!("Hello, world!");
}
// This definition would have the type id "_ZTSFvlE", but is encoded either as
// "_ZTSFvu3i32E" or "_ZTSFvu3i64E", similarly to the hello_from_c declaration
// above.
unsafe extern "C" fn hello_from_rust_again(_: c_long) {
    println!("Hello from Rust again!");
}
// This definition would also have the type id "_ZTSFvPFvlElE", but is encoded
// either as "_ZTSFvPFvu3i32ES_E" (compressed) or "_ZTSFvPFvu3i64ES_E"
// (compressed), similarly to the hello_from_c declaration above--this can be
// ignored for the purposes of this example.
fn indirect_call(f: unsafe extern "C" fn(c_long), arg: c_long) {
    // This indirect call site tests whether the destination pointer is a member
    // of the group derived from the same type id of the f declaration, which
    // would have the type id "_ZTSFvlE", but is encoded either as
    // "_ZTSFvu3i32E" or "_ZTSFvu3i64E", similarly to the hello_from_c
    // declaration above.
    //
    // Notice that since the test is at the call site and is generated by the
    // Rust compiler, the type id used in the test is encoded by the Rust
    // compiler.
    unsafe { f(arg) }
}
// This definition has the type id "_ZTSFvvE"--this can be ignored for the
// purposes of this example.
fn main() {
    // This demonstrates an indirect call within Rust-only code using the same
    // encoding for hello_from_rust and the test at the indirect call site at
    // indirect_call (i.e., "_ZTSFvu3i32E" or "_ZTSFvu3i64E").
    indirect_call(hello_from_rust, 5);
    // This demonstrates an indirect call across the FFI boundary with the Rust
    // compiler and Clang using different encodings for hello_from_c and the
    // test at the indirect call site at indirect_call (i.e., "_ZTSFvu3i32E" or
    // "_ZTSFvu3i64E" vs "_ZTSFvlE").
    //
    // When using rustc LTO (i.e., -Clto), this works because the type id used
    // is from the Rust-declared hello_from_c, which is encoded by the Rust
    // compiler (i.e., "_ZTSFvu3i32E" or "_ZTSFvu3i64E").
    //
    // When using (proper) LTO (i.e., -Clinker-plugin-lto), this does not work
    // because the type id used is from the C-defined hello_from_c, which is
    // encoded by Clang (i.e., "_ZTSFvlE").
    indirect_call(hello_from_c, 5);
    // This demonstrates an indirect call to a function passed as a callback
    // across the FFI boundary with the Rust compiler and Clang using different
    // encodings for the hello_from_rust_again and the test at the indirect call
    // site at indirect_call_from_c (i.e., "_ZTSFvu3i32E" or "_ZTSFvu3i64E" vs
    // "_ZTSFvlE").
    //
    // When Rust functions are passed as callbacks across the FFI boundary to be
    // called back from C code, the tests are also at the call site but
    // generated by Clang instead, so the type ids used in the tests are encoded
    // by Clang, which do not match the type ids of declarations encoded by the
    // Rust compiler (e.g., hello_from_rust_again). (The same happens the other
    // way around for C functions passed as callbacks across the FFI boundary to
    // be called back from Rust code.)
    unsafe {
        indirect_call_from_c(hello_from_rust_again, 5);
    }
}
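Fig. 4. Example Rust program using Rust integer types and the Rust compiler encoding.
Whenever there is an indirect call across the FFI boundary, or an indirect call to a function passed as a callback across the FFI boundary, the Rust compiler and Clang use different encodings for C integer types in function definitions and declarations, and at indirect call sites, when CFI is enabled (see Figs. 3-4).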
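The effect of the mismatch can be modeled with a toy registry that groups function addresses by type id and answers membership queries, much as a CFI check does at an indirect call site. This is a conceptual sketch only and not how LLVM implements the check. TypeGroups, the address 0x1000, and mismatch_scenario are hypothetical, and the Rust-side id assumes a target where c_long resolves to i64.

```rust
use std::collections::{HashMap, HashSet};

/// Toy model of CFI type metadata: each type id maps to the set of function
/// addresses (plain numbers here) that were encoded with that type id.
#[derive(Default)]
struct TypeGroups {
    members: HashMap<String, HashSet<usize>>,
}

impl TypeGroups {
    /// Records that the function at `addr` was encoded with `type_id`.
    fn register(&mut self, type_id: &str, addr: usize) {
        self.members
            .entry(type_id.to_string())
            .or_default()
            .insert(addr);
    }

    /// Models the call-site test: is `addr` a member of the `type_id` group?
    fn type_test(&self, type_id: &str, addr: usize) -> bool {
        self.members
            .get(type_id)
            .map_or(false, |group| group.contains(&addr))
    }
}

/// Models passing hello_from_rust_again as a callback: the Rust compiler
/// encodes the definition, while each call site tests with its own encoding.
/// Returns (result at the Clang call site, result at the Rust call site).
fn mismatch_scenario() -> (bool, bool) {
    const HELLO_FROM_RUST_AGAIN: usize = 0x1000; // hypothetical address
    let mut groups = TypeGroups::default();
    groups.register("_ZTSFvu3i64E", HELLO_FROM_RUST_AGAIN);
    let clang_site = groups.type_test("_ZTSFvlE", HELLO_FROM_RUST_AGAIN);
    let rust_site = groups.type_test("_ZTSFvu3i64E", HELLO_FROM_RUST_AGAIN);
    (clang_site, rust_site)
}
```

In this model the Rust-only call site passes the test, while the Clang-generated call site rejects the same function because it looks for it under a different type id.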
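The cfi_types crate
To solve the problem of encoding C integer types, this crate provides a new set of C types as user-defined types, using the cfi_encoding attribute and repr(transparent), to be used for cross-language LLVM CFI support.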
use cfi_types::c_long;
#[link(name = "foo")]
extern "C" {
    // This declaration has the type id "_ZTSFvlE" because it uses the CFI types
    // for cross-language LLVM CFI support. The cfi_types crate provides a new
    // set of C types as user-defined types using the cfi_encoding attribute and
    // repr(transparent) to be used for cross-language LLVM CFI support. This
    // new set of C types allows the Rust compiler to identify and correctly
    // encode C types in extern "C" function types indirectly called across the
    // FFI boundary when CFI is enabled.
    fn hello_from_c(_: c_long);
    // This declaration has the type id "_ZTSFvPFvlElE" because it uses the CFI
    // types for cross-language LLVM CFI support--this can be ignored for the
    // purposes of this example.
    fn indirect_call_from_c(f: unsafe extern "C" fn(c_long), arg: c_long);
}
// This definition has the type id "_ZTSFvlE" because it uses the CFI types for
// cross-language LLVM CFI support, similarly to the hello_from_c declaration
// above.
unsafe extern "C" fn hello_from_rust(_: c_long) {
    println!("Hello, world!");
}
// This definition has the type id "_ZTSFvlE" because it uses the CFI types for
// cross-language LLVM CFI support, similarly to the hello_from_c declaration
// above.
unsafe extern "C" fn hello_from_rust_again(_: c_long) {
    println!("Hello from Rust again!");
}
// This definition also has the type id "_ZTSFvPFvlElE" because it uses the CFI
// types for cross-language LLVM CFI support, similarly to the hello_from_c
// declaration above--this can be ignored for the purposes of this example.
fn indirect_call(f: unsafe extern "C" fn(c_long), arg: c_long) {
    // This indirect call site tests whether the destination pointer is a member
    // of the group derived from the same type id of the f declaration, which
    // has the type id "_ZTSFvlE" because it uses the CFI types for
    // cross-language LLVM CFI support, similarly to the hello_from_c
    // declaration above.
    unsafe { f(arg) }
}
// This definition has the type id "_ZTSFvvE"--this can be ignored for the
// purposes of this example.
fn main() {
    // This demonstrates an indirect call within Rust-only code using the same
    // encoding for hello_from_rust and the test at the indirect call site at
    // indirect_call (i.e., "_ZTSFvlE").
    indirect_call(hello_from_rust, c_long(5));
    // This demonstrates an indirect call across the FFI boundary with the Rust
    // compiler and Clang using the same encoding for hello_from_c and the test
    // at the indirect call site at indirect_call (i.e., "_ZTSFvlE").
    indirect_call(hello_from_c, c_long(5));
    // This demonstrates an indirect call to a function passed as a callback
    // across the FFI boundary with the Rust compiler and Clang using the same
    // encoding for hello_from_rust_again and the test at the indirect call
    // site at indirect_call_from_c (i.e., "_ZTSFvlE").
    unsafe {
        indirect_call_from_c(hello_from_rust_again, c_long(5));
    }
}
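Fig. 5. Example Rust program using Rust integer types and the Rust compiler encoding with the cfi_types crate types.
This new set of C types allows the Rust compiler to identify and correctly encode C types in extern "C" function types indirectly called across the FFI boundary when CFI is enabled (see Fig. 5).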
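The reason user-defined types help can be sketched without the crate itself. A repr(transparent) wrapper keeps the layout of the integer it wraps, yet remains a distinct named type that is not resolved away like a type alias. CfiLong and CfiLongLong below are hypothetical stand-ins, not the crate's definitions, and the cfi_encoding attribute is left out so that the sketch does not depend on compiler support for it.

```rust
use std::any::TypeId;
use std::ffi::{c_long, c_longlong};
use std::mem::{align_of, size_of};

/// Hypothetical stand-in for a CFI type wrapping C `long`.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CfiLong(pub c_long);

/// Hypothetical stand-in for a CFI type wrapping C `long long`.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CfiLongLong(pub c_longlong);

/// The wrapper has the same size and alignment as the integer it wraps.
fn same_layout_as_c_long() -> bool {
    size_of::<CfiLong>() == size_of::<c_long>()
        && align_of::<CfiLong>() == align_of::<c_long>()
}

/// Unlike the aliases, the two wrappers are different types on every target,
/// so a compiler can tell them apart when it encodes a function type.
fn wrappers_stay_distinct() -> bool {
    TypeId::of::<CfiLong>() != TypeId::of::<CfiLongLong>()
}
```

Both helpers return true regardless of the data model, including LP64, where the underlying c_long and c_longlong aliases are the same type.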
Contributing
See CONTRIBUTING.md.
License
Licensed under the Apache License, Version 2.0 or the MIT License. See LICENSE-APACHE or LICENSE-MIT for license text and copyright information.