2 versions
0.0.2 | Feb 28, 2024 |
---|---|
0.0.1 | Nov 9, 2023 |
cfi_types
CFI types for cross-language LLVM CFI support.
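Installation
To install the cfi_types crate:
- On a command prompt or terminal with your package's root directory as the current working directory, run the following command:
cargo add cfi-types
Or:
- Add the cfi_types crate to your package's root Cargo.toml file:
[dependencies]
cfi-types = "0.0.1"
- On a command prompt or terminal with your package's root directory as the current working directory, run the following command:
cargo fetch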
Usage
To use the cfi_types crate:
- Import the CFI types from the cfi_types crate. E.g.:
use cfi_types::c_long;
- Use the CFI types in place of the C type aliases. E.g.:
extern "C" {
    fn func(arg: c_long);
}

fn main() {
    unsafe { func(c_long(5)) };
}
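Background
Type metadata
LLVM uses type metadata to allow IR modules to aggregate pointers by their types. LLVM CFI uses this type metadata to test whether a given pointer is associated with a type identifier (i.e., to test type membership).
Clang uses the Itanium C++ ABI's virtual tables and RTTI typeinfo structure name as type metadata identifiers for function pointers.
For cross-language LLVM CFI support, a compatible encoding must be used. The compatible encoding chosen for cross-language LLVM CFI support is the Itanium C++ ABI mangling, with vendor extended type qualifiers and types for Rust types that are not used across the FFI boundary (see Type metadata in the design document).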
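As a purely conceptual illustration of the type membership test described above, the sketch below groups function addresses by type identifier and checks whether an address belongs to the group for an expected identifier. The `TypeGroups` type, its methods, and the addresses are hypothetical and exist only for this illustration; LLVM does not implement the check with a hash map.

```rust
use std::collections::{HashMap, HashSet};

/// Conceptual model only: maps a type identifier to the set of function
/// addresses associated with it. This is not how LLVM stores type metadata.
#[derive(Default)]
struct TypeGroups {
    groups: HashMap<String, HashSet<usize>>,
}

impl TypeGroups {
    /// Associates a (hypothetical) function address with a type identifier.
    fn register(&mut self, type_id: &str, addr: usize) {
        self.groups
            .entry(type_id.to_string())
            .or_default()
            .insert(addr);
    }

    /// Models the membership test done at an indirect call site: is `addr`
    /// a member of the group derived from `type_id`?
    fn type_test(&self, type_id: &str, addr: usize) -> bool {
        self.groups
            .get(type_id)
            .map_or(false, |group| group.contains(&addr))
    }
}

fn main() {
    let mut groups = TypeGroups::default();
    // Made-up addresses standing in for two functions.
    groups.register("_ZTSFvlE", 0x1000); // void (long)
    groups.register("_ZTSFvvE", 0x2000); // void (void)

    // A call site that expects void (long) accepts the first address and
    // rejects the second.
    println!("0x1000 allowed: {}", groups.type_test("_ZTSFvlE", 0x1000));
    println!("0x2000 allowed: {}", groups.type_test("_ZTSFvlE", 0x2000));
}
```

The point of the model is that the check succeeds only when both sides agree on the identifier string, which is why the encoding must be compatible across compilers.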
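Encoding C integer types
Rust defines char as a Unicode scalar value, while C defines char as an integer type. Rust also defines explicitly-sized integer types (e.g., i8, i16, i32), while C defines abstract integer types (e.g., char, short, long), whose actual sizes are implementation defined and may vary across data models. This causes ambiguity when Rust integer types are used in extern "C" function types that represent C functions, because the Itanium C++ ABI specifies encodings for the C integer types (e.g., char, short, long), not for their defined representations (e.g., 8-bit signed integer, 16-bit signed integer, 32-bit signed integer).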
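For example, the Rust compiler currently cannot determine whether the following declaration represents void func(long arg) or void func(long long arg) in an LP64 or equivalent data model:
extern "C" {
    fn func(arg: i64);
}
Fig. 1. Example extern "C" function using a Rust integer type.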
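To make the ambiguity concrete, the sketch below builds Itanium-style type ids for C functions that return void and take only builtin integer parameters. The helper names `itanium_builtin` and `type_id_for_void_fn` are hypothetical, only a handful of builtin codes are covered, and pointers, qualifiers, and substitutions are not handled.

```rust
/// Returns the Itanium C++ ABI builtin-type code for a C integer type name.
/// Only a small subset is covered in this sketch.
fn itanium_builtin(c_type: &str) -> Option<&'static str> {
    match c_type {
        "char" => Some("c"),
        "short" => Some("s"),
        "int" => Some("i"),
        "long" => Some("l"),
        "long long" => Some("x"),
        _ => None,
    }
}

/// Builds the type id for `void f(params...)`, in the same "_ZTSF...E" form
/// used by the type ids shown in this document. Returns None for any
/// parameter type this sketch does not know.
fn type_id_for_void_fn(params: &[&str]) -> Option<String> {
    let mut id = String::from("_ZTSFv"); // typeinfo name, function, returns void
    if params.is_empty() {
        id.push('v'); // an empty parameter list is encoded as void
    }
    for &param in params {
        id.push_str(itanium_builtin(param)?);
    }
    id.push('E');
    Some(id)
}

fn main() {
    // Both C types are 64 bits wide under LP64, yet their type ids differ.
    println!("{:?}", type_id_for_void_fn(&["long"]));
    println!("{:?}", type_id_for_void_fn(&["long long"]));
}
```

Since long and long long get different codes (l and x), an i64 parameter alone does not say which type id the C side will use.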
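To support cross-language LLVM CFI, the Rust compiler must be able to identify and correctly encode C types in extern "C" function types indirectly called across the FFI boundary when CFI is enabled.
For convenience, Rust provides some C-like type aliases for use when interoperating with foreign code written in C, and these C type aliases could be used for disambiguation. However, by the time types are encoded, all type aliases have already been resolved to their respective ty::Ty type representations (i.e., their respective Rust aliased types), which currently makes it impossible to identify C type alias use from the resolved types.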
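For example, the Rust compiler also currently cannot tell that the following declaration uses the c_long type alias, and cannot distinguish it from extern "C" fn func(arg: c_longlong) in an LP64 or equivalent data model:
extern "C" {
    fn func(arg: c_long);
}
Fig. 2. Example extern "C" function using a C type alias.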
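The alias resolution can be observed from stable Rust: a type alias has the same TypeId as the type it aliases. The sketch below is an illustration only; `resolved_name` is a hypothetical helper, and the output depends on the target's data model.

```rust
use std::any::TypeId;
use std::ffi::{c_long, c_longlong};

/// Reports which Rust integer type `T` is identical to.
/// Only i32 and i64 are checked in this sketch.
fn resolved_name<T: 'static>() -> &'static str {
    let id = TypeId::of::<T>();
    if id == TypeId::of::<i32>() {
        "i32"
    } else if id == TypeId::of::<i64>() {
        "i64"
    } else {
        "other"
    }
}

fn main() {
    // The aliases are not distinct types, so only the underlying Rust
    // integer type is visible here.
    println!("c_long     resolves to {}", resolved_name::<c_long>());
    println!("c_longlong resolves to {}", resolved_name::<c_longlong>());
    println!(
        "c_long and c_longlong are the same type: {}",
        TypeId::of::<c_long>() == TypeId::of::<c_longlong>()
    );
}
```

On a target where c_long is 64 bits wide, the last line reports that the two aliases are the same type, which is the situation described above.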
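Therefore, the Rust compiler is unable to identify and correctly encode C types in extern "C" function types indirectly called across the FFI boundary when CFI is enabled: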
#include <stdio.h>
#include <stdlib.h>

// This definition has the type id "_ZTSFvlE".
void
hello_from_c(long arg)
{
    printf("Hello from C!\n");
}

// This definition has the type id "_ZTSFvPFvlElE"--this can be ignored for the
// purposes of this example.
void
indirect_call_from_c(void (*fn)(long), long arg)
{
    // This call site tests whether the destination pointer is a member of the
    // group derived from the same type id of the fn declaration, which has the
    // type id "_ZTSFvlE".
    //
    // Notice that since the test is at the call site and is generated by Clang,
    // the type id used in the test is encoded by Clang.
    fn(arg);
}
Fig. 3. Example C library using C integer types and Clang encoding.
use std::ffi::c_long;

#[link(name = "foo")]
extern "C" {
    // This declaration would have the type id "_ZTSFvlE", but at the time types
    // are encoded, all type aliases are already resolved to their respective
    // Rust aliased types, so this is encoded either as "_ZTSFvu3i32E" or
    // "_ZTSFvu3i64E", depending on which type the c_long type alias resolves
    // to, which currently uses the u<length><type-name> vendor extended type
    // encoding for the Rust integer types--this is the problem demonstrated in
    // this example.
    fn hello_from_c(_: c_long);

    // This declaration would have the type id "_ZTSFvPFvlElE", but is encoded
    // either as "_ZTSFvPFvu3i32ES_E" (compressed) or "_ZTSFvPFvu3i64ES_E"
    // (compressed), similarly to the hello_from_c declaration above--this can
    // be ignored for the purposes of this example.
    fn indirect_call_from_c(f: unsafe extern "C" fn(c_long), arg: c_long);
}

// This definition would have the type id "_ZTSFvlE", but is encoded either as
// "_ZTSFvu3i32E" or "_ZTSFvu3i64E", similarly to the hello_from_c declaration
// above.
unsafe extern "C" fn hello_from_rust(_: c_long) {
    println!("Hello, world!");
}

// This definition would have the type id "_ZTSFvlE", but is encoded either as
// "_ZTSFvu3i32E" or "_ZTSFvu3i64E", similarly to the hello_from_c declaration
// above.
unsafe extern "C" fn hello_from_rust_again(_: c_long) {
    println!("Hello from Rust again!");
}

// This definition would also have the type id "_ZTSFvPFvlElE", but is encoded
// either as "_ZTSFvPFvu3i32ES_E" (compressed) or "_ZTSFvPFvu3i64ES_E"
// (compressed), similarly to the hello_from_c declaration above--this can be
// ignored for the purposes of this example.
fn indirect_call(f: unsafe extern "C" fn(c_long), arg: c_long) {
    // This indirect call site tests whether the destination pointer is a member
    // of the group derived from the same type id of the f declaration, which
    // would have the type id "_ZTSFvlE", but is encoded either as
    // "_ZTSFvu3i32E" or "_ZTSFvu3i64E", similarly to the hello_from_c
    // declaration above.
    //
    // Notice that since the test is at the call site and is generated by the
    // Rust compiler, the type id used in the test is encoded by the Rust
    // compiler.
    unsafe { f(arg) }
}

// This definition has the type id "_ZTSFvvE"--this can be ignored for the
// purposes of this example.
fn main() {
    // This demonstrates an indirect call within Rust-only code using the same
    // encoding for hello_from_rust and the test at the indirect call site at
    // indirect_call (i.e., "_ZTSFvu3i32E" or "_ZTSFvu3i64E").
    indirect_call(hello_from_rust, 5);

    // This demonstrates an indirect call across the FFI boundary with the Rust
    // compiler and Clang using different encodings for hello_from_c and the
    // test at the indirect call site at indirect_call (i.e., "_ZTSFvu3i32E" or
    // "_ZTSFvu3i64E" vs "_ZTSFvlE").
    //
    // When using rustc LTO (i.e., -Clto), this works because the type id used
    // is from the Rust-declared hello_from_c, which is encoded by the Rust
    // compiler (i.e., "_ZTSFvu3i32E" or "_ZTSFvu3i64E").
    //
    // When using (proper) LTO (i.e., -Clinker-plugin-lto), this does not work
    // because the type id used is from the C-defined hello_from_c, which is
    // encoded by Clang (i.e., "_ZTSFvlE").
    indirect_call(hello_from_c, 5);

    // This demonstrates an indirect call to a function passed as a callback
    // across the FFI boundary with the Rust compiler and Clang using different
    // encodings for hello_from_rust_again and the test at the indirect call
    // site at indirect_call_from_c (i.e., "_ZTSFvu3i32E" or "_ZTSFvu3i64E" vs
    // "_ZTSFvlE").
    //
    // When Rust functions are passed as callbacks across the FFI boundary to be
    // called back from C code, the tests are also at the call site but
    // generated by Clang instead, so the type ids used in the tests are encoded
    // by Clang, which do not match the type ids of declarations encoded by the
    // Rust compiler (e.g., hello_from_rust_again). (The same happens the other
    // way around for C functions passed as callbacks across the FFI boundary to
    // be called back from Rust code.)
    unsafe {
        indirect_call_from_c(hello_from_rust_again, 5);
    }
}
Fig. 4. Example Rust program using Rust integer types and the Rust compiler encoding.
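Whenever there is an indirect call across the FFI boundary, or an indirect call to a function passed as a callback across the FFI boundary, the Rust compiler and Clang use different encodings for C integer types when CFI is enabled, both in function definitions and declarations and at indirect call sites (see Figs. 3 and 4).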
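The mismatch can be reproduced as plain string construction. The sketch below applies the u<length><type-name> vendor extended type encoding mentioned in Fig. 4 to a Rust integer type name and compares the result with the type id Clang produces for void (long). The helper names `vendor_extended` and `rustc_style_type_id` are hypothetical, and only a single-parameter void function is modeled.

```rust
/// Encodes a Rust integer type name using the u<length><type-name> vendor
/// extended type form (e.g., "i64" becomes "u3i64").
fn vendor_extended(rust_type: &str) -> String {
    format!("u{}{}", rust_type.len(), rust_type)
}

/// Builds the type id for a function returning void and taking a single
/// parameter of the given Rust integer type.
fn rustc_style_type_id(param: &str) -> String {
    format!("_ZTSFv{}E", vendor_extended(param))
}

fn main() {
    // The type id Clang uses for `void (long)`, as shown in Fig. 3.
    let clang_id = "_ZTSFvlE";
    for ty in ["i32", "i64"] {
        let rust_id = rustc_style_type_id(ty);
        println!("{} matches {}: {}", rust_id, clang_id, rust_id == clang_id);
    }
}
```

Neither resolved form equals the Clang id, so a membership test generated by one compiler rejects functions whose ids were encoded by the other.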
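The cfi_types crate
To solve the problem of encoding C integer types, this crate provides a new set of C types as user-defined types, using the cfi_encoding attribute and repr(transparent), to be used for cross-language LLVM CFI support.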
use cfi_types::c_long;

#[link(name = "foo")]
extern "C" {
    // This declaration has the type id "_ZTSFvlE" because it uses the CFI types
    // for cross-language LLVM CFI support. The cfi_types crate provides a new
    // set of C types as user-defined types using the cfi_encoding attribute and
    // repr(transparent) to be used for cross-language LLVM CFI support. This
    // new set of C types allows the Rust compiler to identify and correctly
    // encode C types in extern "C" function types indirectly called across the
    // FFI boundary when CFI is enabled.
    fn hello_from_c(_: c_long);

    // This declaration has the type id "_ZTSFvPFvlElE" because it uses the CFI
    // types for cross-language LLVM CFI support--this can be ignored for the
    // purposes of this example.
    fn indirect_call_from_c(f: unsafe extern "C" fn(c_long), arg: c_long);
}

// This definition has the type id "_ZTSFvlE" because it uses the CFI types for
// cross-language LLVM CFI support, similarly to the hello_from_c declaration
// above.
unsafe extern "C" fn hello_from_rust(_: c_long) {
    println!("Hello, world!");
}

// This definition has the type id "_ZTSFvlE" because it uses the CFI types for
// cross-language LLVM CFI support, similarly to the hello_from_c declaration
// above.
unsafe extern "C" fn hello_from_rust_again(_: c_long) {
    println!("Hello from Rust again!");
}

// This definition also has the type id "_ZTSFvPFvlElE" because it uses the CFI
// types for cross-language LLVM CFI support, similarly to the hello_from_c
// declaration above--this can be ignored for the purposes of this example.
fn indirect_call(f: unsafe extern "C" fn(c_long), arg: c_long) {
    // This indirect call site tests whether the destination pointer is a member
    // of the group derived from the same type id of the f declaration, which
    // has the type id "_ZTSFvlE" because it uses the CFI types for
    // cross-language LLVM CFI support, similarly to the hello_from_c
    // declaration above.
    unsafe { f(arg) }
}

// This definition has the type id "_ZTSFvvE"--this can be ignored for the
// purposes of this example.
fn main() {
    // This demonstrates an indirect call within Rust-only code using the same
    // encoding for hello_from_rust and the test at the indirect call site at
    // indirect_call (i.e., "_ZTSFvlE").
    indirect_call(hello_from_rust, c_long(5));

    // This demonstrates an indirect call across the FFI boundary with the Rust
    // compiler and Clang using the same encoding for hello_from_c and the test
    // at the indirect call site at indirect_call (i.e., "_ZTSFvlE").
    indirect_call(hello_from_c, c_long(5));

    // This demonstrates an indirect call to a function passed as a callback
    // across the FFI boundary with the Rust compiler and Clang using the same
    // encoding for hello_from_rust_again and the test at the indirect call
    // site at indirect_call_from_c (i.e., "_ZTSFvlE").
    unsafe {
        indirect_call_from_c(hello_from_rust_again, c_long(5));
    }
}
Fig. 5. Example Rust program using Rust integer types and the Rust compiler encoding with the cfi_types crate types.
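This new set of C types allows the Rust compiler to identify and correctly encode C types in extern "C" function types indirectly called across the FFI boundary when CFI is enabled (see Fig. 5).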
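The shape of such a user-defined type can be sketched on stable Rust. In the sketch below, `CLong` and `double_it` are hypothetical stand-ins, not the crate's definitions. The cfi_encoding attribute that the crate is described as using is left out because it needs a nightly compiler, so this only shows what repr(transparent) contributes: a distinct type with the same layout as the wrapped integer.

```rust
use std::ffi::c_long;
use std::mem::{align_of, size_of};

/// Hypothetical stand-in for a CFI type. Because it is a distinct
/// user-defined type, it is not resolved away like a type alias, while
/// repr(transparent) keeps its layout identical to the wrapped type.
/// (Assumption: the real crate additionally attaches cfi_encoding so the
/// compiler encodes the type as C's long; that is not shown here.)
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct CLong(pub c_long);

/// A Rust function with the C ABI, standing in for a function that would
/// normally be defined in C.
extern "C" fn double_it(arg: CLong) -> CLong {
    CLong(arg.0 * 2)
}

fn main() {
    // The wrapper adds no size or alignment overhead.
    println!(
        "same size: {}, same alignment: {}",
        size_of::<CLong>() == size_of::<c_long>(),
        align_of::<CLong>() == align_of::<c_long>()
    );

    // Values are constructed explicitly, as with c_long(5) in Fig. 5.
    let f: extern "C" fn(CLong) -> CLong = double_it;
    println!("{:?}", f(CLong(5)));
}
```

The explicit constructor is the visible cost of the approach: call sites wrap their arguments, as the calls in Fig. 5 do.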
Contributing
See CONTRIBUTING.md.
License
Licensed under the Apache License, Version 2.0 or the MIT License. See LICENSE-APACHE or LICENSE-MIT for license text and copyright information.