
nightly cfi-types

CFI types for cross-language LLVM CFI support

2 releases

0.0.2 Feb 28, 2024
0.0.1 Nov 9, 2023


MIT/Apache


cfi_types


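CFI types for cross-language LLVM CFI support.

Installation

To install cfi_types:

  1. On a command prompt or terminal with your package root's directory as the current working directory, run the following command: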

    cargo add cfi-types
    

or

  1. Add the cfi_types crate to your package root's Cargo.toml file:

    [dependencies]
    cfi-types = "0.0.1"
    
  2. On a command prompt or terminal with your package root's directory as the current working directory, run the following command:

    cargo fetch
    

Usage

To use cfi_types:

  1. Import the CFI types from the cfi_types crate. For example:

    use cfi_types::c_long;
    
  2. Replace uses of C type aliases with the CFI types. For example:

    extern "C" {
        fn func(arg: c_long);
    }
    
    fn main() {
        unsafe { func(c_long(5)) };
    }
    

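Background

Type metadata

LLVM uses type metadata to allow IR modules to aggregate pointers by their types. LLVM CFI uses this type metadata to test whether a given pointer is associated with a type identifier (i.e., to test type membership).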
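As a rough mental model only, type membership can be pictured as a lookup from a type identifier to the set of functions that carry it. The sketch below is a toy illustration, not how LLVM implements the check: the `TypeMetadata` struct and its methods are hypothetical names, and functions are represented by strings rather than addresses. The identifier "_ZTSFvlE" is the one this document later gives for a C function of type void(long).

```rust
use std::collections::{HashMap, HashSet};

/// Toy model of type metadata (hypothetical, for illustration only): each
/// type identifier maps to the set of functions associated with it.
struct TypeMetadata {
    members: HashMap<String, HashSet<String>>,
}

impl TypeMetadata {
    fn new() -> Self {
        TypeMetadata { members: HashMap::new() }
    }

    /// Associates a function (named by a string here) with a type identifier.
    fn add(&mut self, type_id: &str, function: &str) {
        self.members
            .entry(type_id.to_string())
            .or_default()
            .insert(function.to_string());
    }

    /// Models the membership test made at an indirect call site: is the
    /// destination associated with the type identifier the call site expects?
    fn is_member(&self, type_id: &str, function: &str) -> bool {
        self.members
            .get(type_id)
            .map_or(false, |set| set.contains(function))
    }
}

fn main() {
    let mut metadata = TypeMetadata::new();
    metadata.add("_ZTSFvlE", "hello_from_c");

    // The call site expects "_ZTSFvlE" and the destination carries it.
    println!("match: {}", metadata.is_member("_ZTSFvlE", "hello_from_c"));

    // A call site expecting a different identifier rejects the same function.
    println!("mismatch: {}", metadata.is_member("_ZTSFvu3i64E", "hello_from_c"));
}
```

The second lookup fails even though the function is the same, which is the shape of the cross-language problem described in the rest of this section: both sides must agree on the identifier.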

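Clang uses the Itanium C++ ABI's virtual tables and RTTI typeinfo structure name as type metadata identifiers for function pointers.

For cross-language LLVM CFI support, a compatible encoding must be used. The compatible encoding chosen for cross-language LLVM CFI support is the Itanium C++ ABI mangling with vendor extended type qualifiers and types for Rust types that are not used across the FFI boundary (see Type metadata in the design document).

Encoding C integer types

Rust defines char as a Unicode scalar value, while C defines char as an integer type. Rust also defines explicitly sized integer types (e.g., i8, i16, i32), while C defines abstract integer types (e.g., char, short, long) whose actual sizes are implementation defined and may vary across data models. This causes ambiguity when Rust integer types are used in extern "C" function types that represent C functions, because the Itanium C++ ABI specifies encodings for C integer types (e.g., char, short, long), not for their defined representations (e.g., 8-bit signed integer, 16-bit signed integer, 32-bit signed integer).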

For example, the Rust compiler currently cannot tell whether an

extern "C" {
    fn func(arg: i64);
}

Fig. 1. Example extern "C" function using a Rust integer type.

represents a void func(long arg) or a void func(long long arg) in an LP64 or equivalent data model.
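The ambiguity can be observed directly by comparing sizes. The following sketch uses only the standard library's `std::ffi::c_long` and `std::ffi::c_longlong` aliases; the printed sizes depend on the target's data model, so no particular output is assumed.

```rust
use std::ffi::{c_long, c_longlong};
use std::mem::size_of;

fn main() {
    // Sizes are target dependent: in an LP64 data model both C types are
    // 8 bytes wide, so an i64 parameter could stand for either of them.
    println!("c_long:     {} bytes", size_of::<c_long>());
    println!("c_longlong: {} bytes", size_of::<c_longlong>());
    println!("i64:        {} bytes", size_of::<i64>());

    if size_of::<c_long>() == size_of::<c_longlong>() {
        println!("long and long long have the same width on this target");
    } else {
        println!("long and long long differ in width on this target");
    }
}
```

Where both types have the same width, the size of the Rust type alone carries no information about which C type was meant, which is why the encoding cannot be derived from i64.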

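For cross-language LLVM CFI support, the Rust compiler must be able to identify and correctly encode C types in extern "C" function types indirectly called across the FFI boundary when CFI is enabled.

For convenience, Rust provides some C-like type aliases for use when interoperating with foreign code written in C, and these C type aliases could be used for disambiguation. However, by the time types are encoded, all type aliases have already been resolved to their respective ty::Ty type representations (i.e., their respective Rust aliased types), so it is currently impossible to identify the use of a C type alias from the resolved type.

For example, the Rust compiler currently also cannot tell that an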

extern "C" {
    fn func(arg: c_long);
}

Fig. 2. Example extern "C" function using a C type alias.

used the c_long type alias, and cannot disambiguate between it and an extern "C" fn func(arg: c_longlong) in an LP64 or equivalent data model.
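That aliases leave no trace can also be seen from ordinary Rust code. This is a small sketch using `std::any::TypeId` and `std::any::type_name`; the helper `c_long_is_plain_rust_integer` is a hypothetical name introduced for illustration. It relies on the standard library defining `c_long` as an alias of either i32 or i64, depending on the target.

```rust
use std::any::{type_name, TypeId};
use std::ffi::c_long;

/// Returns true if the c_long alias is the very same type as i32 or i64,
/// i.e., the alias is indistinguishable from the type it resolves to.
fn c_long_is_plain_rust_integer() -> bool {
    let id = TypeId::of::<c_long>();
    id == TypeId::of::<i32>() || id == TypeId::of::<i64>()
}

fn main() {
    // type_name reports the aliased Rust type, not the name "c_long".
    println!("c_long resolves to {}", type_name::<c_long>());
    println!(
        "same type as a Rust integer: {}",
        c_long_is_plain_rust_integer()
    );
}
```

Because the alias and the aliased type share one type identity, nothing downstream of alias resolution can recover which C type the programmer wrote.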

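Consequently, the Rust compiler is unable to identify and correctly encode C types in extern "C" function types indirectly called across the FFI boundary when CFI is enabled: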

#include <stdio.h>
#include <stdlib.h>

// This definition has the type id "_ZTSFvlE".
void
hello_from_c(long arg)
{
    printf("Hello from C!\n");
}

// This definition has the type id "_ZTSFvPFvlElE"--this can be ignored for the
// purposes of this example.
void
indirect_call_from_c(void (*fn)(long), long arg)
{
    // This call site tests whether the destination pointer is a member of the
    // group derived from the same type id of the fn declaration, which has the
    // type id "_ZTSFvlE".
    //
    // Notice that since the test is at the call site and is generated by Clang,
    // the type id used in the test is encoded by Clang.
    fn(arg);
}

Fig. 3. Example C library using C integer types and Clang encodings.

use std::ffi::c_long;

#[link(name = "foo")]
extern "C" {
    // This declaration would have the type id "_ZTSFvlE", but at the time types
    // are encoded, all type aliases are already resolved to their respective
    // Rust aliased types, so this is encoded either as "_ZTSFvu3i32E" or
    // "_ZTSFvu3i64E", depending on what type the c_long type alias is resolved
    // to, which currently uses the u<length><type-name> vendor extended type
    // encoding for the Rust integer types--this is the problem demonstrated in
    // this example.
    fn hello_from_c(_: c_long);

    // This declaration would have the type id "_ZTSFvPFvlElE", but is encoded
    // either as "_ZTSFvPFvu3i32ES_E" (compressed) or "_ZTSFvPFvu3i64ES_E"
    // (compressed), similarly to the hello_from_c declaration above--this can
    // be ignored for the purposes of this example.
    fn indirect_call_from_c(f: unsafe extern "C" fn(c_long), arg: c_long);
}

// This definition would have the type id "_ZTSFvlE", but is encoded either as
// "_ZTSFvu3i32E" or "_ZTSFvu3i64E", similarly to the hello_from_c declaration
// above.
unsafe extern "C" fn hello_from_rust(_: c_long) {
    println!("Hello, world!");
}

// This definition would have the type id "_ZTSFvlE", but is encoded either as
// "_ZTSFvu3i32E" or "_ZTSFvu3i64E", similarly to the hello_from_c declaration
// above.
unsafe extern "C" fn hello_from_rust_again(_: c_long) {
    println!("Hello from Rust again!");
}

// This definition would also have the type id "_ZTSFvPFvlElE", but is encoded
// either as "_ZTSFvPFvu3i32ES_E" (compressed) or "_ZTSFvPFvu3i64ES_E"
// (compressed), similarly to the hello_from_c declaration above--this can be
// ignored for the purposes of this example.
fn indirect_call(f: unsafe extern "C" fn(c_long), arg: c_long) {
    // This indirect call site tests whether the destination pointer is a member
    // of the group derived from the same type id of the f declaration, which
    // would have the type id "_ZTSFvlE", but is encoded either as
    // "_ZTSFvu3i32E" or "_ZTSFvu3i64E", similarly to the hello_from_c
    // declaration above.
    //
    // Notice that since the test is at the call site and is generated by the
    // Rust compiler, the type id used in the test is encoded by the Rust
    // compiler.
    unsafe { f(arg) }
}

// This definition has the type id "_ZTSFvvE"--this can be ignored for the
// purposes of this example.
fn main() {
    // This demonstrates an indirect call within Rust-only code using the same
    // encoding for hello_from_rust and the test at the indirect call site at
    // indirect_call (i.e., "_ZTSFvu3i32E" or "_ZTSFvu3i64E").
    indirect_call(hello_from_rust, 5);

    // This demonstrates an indirect call across the FFI boundary with the Rust
    // compiler and Clang using different encodings for hello_from_c and the
    // test at the indirect call site at indirect_call (i.e., "_ZTSFvu3i32E" or
    // "_ZTSFvu3i64E" vs "_ZTSFvlE").
    //
    // When using rustc LTO (i.e., -Clto), this works because the type id used
    // is from the Rust-declared hello_from_c, which is encoded by the Rust
    // compiler (i.e., "_ZTSFvu3i32E" or "_ZTSFvu3i64E").
    //
    // When using (proper) LTO (i.e., -Clinker-plugin-lto), this does not work
    // because the type id used is from the C-defined hello_from_c, which is
    // encoded by Clang (i.e., "_ZTSFvlE").
    indirect_call(hello_from_c, 5);

    // This demonstrates an indirect call to a function passed as a callback
    // across the FFI boundary with the Rust compiler and Clang using different
    // encodings for the hello_from_rust_again and the test at the indirect call
    // site at indirect_call_from_c (i.e., "_ZTSFvu3i32E" or "_ZTSFvu3i64E" vs
    // "_ZTSFvlE").
    //
    // When Rust functions are passed as callbacks across the FFI boundary to be
    // called back from C code, the tests are also at the call site but
    // generated by Clang instead, so the type ids used in the tests are encoded
    // by Clang, which do not match the type ids of declarations encoded by the
    // Rust compiler (e.g., hello_from_rust_again). (The same happens the other
    // way around for C functions passed as callbacks across the FFI boundary to
    // be called back from Rust code.)
    unsafe {
        indirect_call_from_c(hello_from_rust_again, 5);
    }
}

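Fig. 4. Example Rust program using Rust integer types and the Rust compiler encodings.

Whenever there is an indirect call across the FFI boundary, or an indirect call to a function passed as a callback across the FFI boundary, the Rust compiler and Clang use different encodings for C integer types in function definitions and declarations, and at indirect call sites, when CFI is enabled (see Figs. 3 and 4).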
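To make the mismatch concrete, the type ids quoted in the comments of Figs. 3 and 4 can be rebuilt by hand. The sketch below is a deliberately simplified illustration, not the compiler's encoder: `vendor_extended` and `function_type_id` are hypothetical helpers, they take already-encoded type strings, and they apply no substitution compression (so they cannot produce ids such as "_ZTSFvPFvu3i32ES_E").

```rust
/// Encodes a Rust integer type name with the u<length><type-name> vendor
/// extended type encoding (e.g., "i64" becomes "u3i64").
fn vendor_extended(type_name: &str) -> String {
    format!("u{}{}", type_name.len(), type_name)
}

/// Builds a type id for a function type from an encoded return type and
/// encoded parameter types. Simplified: no substitution compression.
fn function_type_id(ret: &str, params: &[&str]) -> String {
    // An empty parameter list is encoded as "v" (void).
    let params = if params.is_empty() {
        "v".to_string()
    } else {
        params.concat()
    };
    format!("_ZTSF{}{}E", ret, params)
}

fn main() {
    // Clang encodes the C type long as "l".
    let clang_id = function_type_id("v", &["l"]);

    // The Rust compiler sees only the resolved integer type, here i64.
    let i64_encoding = vendor_extended("i64");
    let rustc_id = function_type_id("v", &[i64_encoding.as_str()]);

    println!("Clang: {clang_id}");
    println!("rustc: {rustc_id}");
    println!("ids match: {}", clang_id == rustc_id);
}
```

The two identifiers describe the same machine-level signature yet differ as strings, so a membership test generated by one compiler rejects functions encoded by the other.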

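The cfi_types crate

To solve the problem of encoding C integer types, this crate provides a new set of C types as user-defined types, using the cfi_encoding attribute and repr(transparent), to be used for cross-language LLVM CFI support.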

use cfi_types::c_long;

#[link(name = "foo")]
extern "C" {
    // This declaration has the type id "_ZTSFvlE" because it uses the CFI types
    // for cross-language LLVM CFI support. The cfi_types crate provides a new
    // set of C types as user-defined types using the cfi_encoding attribute and
    // repr(transparent) to be used for cross-language LLVM CFI support. This
    // new set of C types allows the Rust compiler to identify and correctly
    // encode C types in extern "C" function types indirectly called across the
    // FFI boundary when CFI is enabled.
    fn hello_from_c(_: c_long);

    // This declaration has the type id "_ZTSFvPFvlElE" because it uses the CFI
    // types for cross-language LLVM CFI support--this can be ignored for the
    // purposes of this example.
    fn indirect_call_from_c(f: unsafe extern "C" fn(c_long), arg: c_long);
}

// This definition has the type id "_ZTSFvlE" because it uses the CFI types for
// cross-language LLVM CFI support, similarly to the hello_from_c declaration
// above.
unsafe extern "C" fn hello_from_rust(_: c_long) {
    println!("Hello, world!");
}

// This definition has the type id "_ZTSFvlE" because it uses the CFI types for
// cross-language LLVM CFI support, similarly to the hello_from_c declaration
// above.
unsafe extern "C" fn hello_from_rust_again(_: c_long) {
    println!("Hello from Rust again!");
}

// This definition also has the type id "_ZTSFvPFvlElE" because it uses the CFI
// types for cross-language LLVM CFI support, similarly to the hello_from_c
// declaration above--this can be ignored for the purposes of this example.
fn indirect_call(f: unsafe extern "C" fn(c_long), arg: c_long) {
    // This indirect call site tests whether the destination pointer is a member
    // of the group derived from the same type id of the f declaration, which
    // has the type id "_ZTSFvlE" because it uses the CFI types for
    // cross-language LLVM CFI support, similarly to the hello_from_c
    // declaration above.
    unsafe { f(arg) }
}

// This definition has the type id "_ZTSFvvE"--this can be ignored for the
// purposes of this example.
fn main() {
    // This demonstrates an indirect call within Rust-only code using the same
    // encoding for hello_from_rust and the test at the indirect call site at
    // indirect_call (i.e., "_ZTSFvlE").
    indirect_call(hello_from_rust, c_long(5));

    // This demonstrates an indirect call across the FFI boundary with the Rust
    // compiler and Clang using the same encoding for hello_from_c and the test
    // at the indirect call site at indirect_call (i.e., "_ZTSFvlE").
    indirect_call(hello_from_c, c_long(5));

    // This demonstrates an indirect call to a function passed as a callback
    // across the FFI boundary with the Rust compiler and Clang using the same
    // encoding for hello_from_rust_again and the test at the indirect call
    // site at indirect_call_from_c (i.e., "_ZTSFvlE").
    unsafe {
        indirect_call_from_c(hello_from_rust_again, c_long(5));
    }
}

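Fig. 5. Example Rust program using Rust integer types and the Rust compiler encodings with the cfi_types crate types.

This new set of C types allows the Rust compiler to identify and correctly encode C types in extern "C" function types indirectly called across the FFI boundary when CFI is enabled (see Fig. 5).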
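The shape of such a type can be approximated on stable Rust. The following is a hypothetical stand-in, not the crate's actual definition: it omits the cfi_encoding attribute (which requires a nightly compiler) and keeps only the repr(transparent) wrapper, and `double_it` is an illustrative function defined in Rust so the sketch runs without a C library.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical stand-in for a CFI type. The real type would also carry a
/// cfi_encoding attribute (assumed here to look like cfi_encoding = "l"),
/// which is unstable and therefore left out of this sketch.
#[allow(non_camel_case_types)]
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct c_long(pub std::ffi::c_long);

/// Illustrative extern "C" function taking and returning the wrapper type.
extern "C" fn double_it(arg: c_long) -> c_long {
    c_long(arg.0 * 2)
}

fn main() {
    // repr(transparent) gives the wrapper the same layout and ABI as its
    // single field, so it can stand in for a C long at the FFI boundary.
    assert_eq!(size_of::<c_long>(), size_of::<std::ffi::c_long>());
    assert_eq!(align_of::<c_long>(), align_of::<std::ffi::c_long>());

    // Unlike a type alias, the wrapper is a distinct type, so values must be
    // wrapped explicitly and unwrapped through the public field.
    let f: extern "C" fn(c_long) -> c_long = double_it;
    let result = f(c_long(5));
    println!("{}", result.0);
}
```

Because the wrapper is a distinct nominal type rather than an alias, it survives alias resolution, which is what gives the compiler something to attach a C-compatible encoding to.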

Contributing

See CONTRIBUTING.md.

License

Licensed under the Apache License, Version 2.0 or the MIT License. See LICENSE-APACHE or LICENSE-MIT for license text and copyright information.
