6 versions

new 0.3.0	Aug 23, 2024
0.2.0	Aug 9, 2024
0.1.0	Aug 3, 2024

#65 in Machine learning

406 downloads per month

Apache-2.0

225KB
2.5K SLoC

ExecuTorch-rs

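executorch is a Rust library for executing PyTorch models in Rust. It is a Rust wrapper around the ExecuTorch C++ API. It depends on version 0.3.0 of the C++ API and will advance as that API does. The underlying C++ library is still in alpha, and its API is subject to change together with the Rust API.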

Usage

Create a model in Python and export it:

import torch
from executorch.exir import to_edge
from torch.export import export

class Add(torch.nn.Module):
    def __init__(self):
        super(Add, self).__init__()

    def forward(self, x: torch.Tensor, y: torch.Tensor):
        return x + y


aten_dialect = export(Add(), (torch.ones(1), torch.ones(1)))
edge_program = to_edge(aten_dialect)
executorch_program = edge_program.to_executorch()
with open("model.pte", "wb") as file:
    file.write(executorch_program.buffer)

Execute the model in Rust:

use executorch::evalue::{EValue, Tag};
use executorch::module::Module;
use executorch::tensor::{Array, Tensor};
use ndarray::array;

let mut module = Module::new("model.pte", None);

let input_array1 = Array::new(array![1.0_f32]);
let input_tensor1 = input_array1.to_tensor_impl();
let input_evalue1 = EValue::new(Tensor::new(&input_tensor1));

let input_array2 = Array::new(array![1.0_f32]);
let input_tensor2 = input_array2.to_tensor_impl();
let input_evalue2 = EValue::new(Tensor::new(&input_tensor2));

let outputs = module.forward(&[input_evalue1, input_evalue2]).unwrap();
assert_eq!(outputs.len(), 1);
let output = outputs.into_iter().next().unwrap();
assert_eq!(output.tag(), Some(Tag::Tensor));
let output = output.as_tensor();

println!("Output tensor computed: {:?}", output);
assert_eq!(array![2.0_f32], output.as_array());

See examples/hello_world_add and examples/hello_world_add_no_std for the complete examples.

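Build

To build the library, you need to build the C++ library first. The C++ library is highly configurable: many flags control which modules, kernels, and extensions are built. The build produces multiple static libraries, and the Rust library links against them. In the following example, we build the C++ library with the flags needed to run the hello_world_add example: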

# Clone the C++ library
cd ${TEMP_DIR}
git clone --depth 1 --branch v0.3.0 https://github.com/pytorch/executorch.git
cd executorch
git submodule sync --recursive
git submodule update --init --recursive

# Install requirements
./install_requirements.sh

# Build C++ library
mkdir cmake-out && cd cmake-out
cmake \
    -DEXECUTORCH_SELECT_OPS_LIST=aten::add.out \
    -DEXECUTORCH_BUILD_EXECUTOR_RUNNER=OFF \
    -DEXECUTORCH_BUILD_EXTENSION_RUNNER_UTIL=OFF \
    -DBUILD_EXECUTORCH_PORTABLE_OPS=ON \
    -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
    -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
    -DEXECUTORCH_ENABLE_PROGRAM_VERIFICATION=ON \
    -DEXECUTORCH_ENABLE_LOGGING=ON \
    ..
make -j

# Static libraries are in cmake-out/
# core:
#   cmake-out/libexecutorch.a
#   cmake-out/libexecutorch_no_prim_ops.a
# kernels implementations:
#   cmake-out/kernels/portable/libportable_ops_lib.a
#   cmake-out/kernels/portable/libportable_kernels.a
# extension data loader, enabled with EXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON:
#   cmake-out/extension/data_loader/libextension_data_loader.a
# extension module, enabled with EXECUTORCH_BUILD_EXTENSION_MODULE=ON:
#   cmake-out/extension/module/libextension_module_static.a

# Run example
# We set EXECUTORCH_RS_EXECUTORCH_LIB_DIR to the path of the C++ build output
cd ${EXECUTORCH_RS_DIR}/examples/hello_world_add
python export_model.py
EXECUTORCH_RS_EXECUTORCH_LIB_DIR=${TEMP_DIR}/executorch/cmake-out cargo run

The executorch crate will always look for the following static libraries:

libexecutorch.a
libexecutorch_no_prim_ops.a

Additional libraries are required if the corresponding feature flags are enabled (see the next section):

libextension_data_loader.a
libextension_module_static.a
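
The library requirements above can be sketched as a small pre-flight check. This is only an illustration, not part of the executorch crate: the function names expected_libs and missing_libs are hypothetical, and the relative paths are copied from the build output listing earlier in this README, so they assume the same CMake layout.

```rust
use std::path::{Path, PathBuf};

/// Library paths relative to the C++ build output directory (cmake-out),
/// as listed in the build section of this README.
/// `data_loader` and `module` mirror the Cargo features of the same purpose.
fn expected_libs(data_loader: bool, module: bool) -> Vec<&'static str> {
    // These two are always looked up by the executorch crate.
    let mut libs = vec!["libexecutorch.a", "libexecutorch_no_prim_ops.a"];
    if data_loader {
        libs.push("extension/data_loader/libextension_data_loader.a");
    }
    if module {
        libs.push("extension/module/libextension_module_static.a");
    }
    libs
}

/// Returns every expected library that is not present under `lib_dir`.
fn missing_libs(lib_dir: &Path, data_loader: bool, module: bool) -> Vec<PathBuf> {
    expected_libs(data_loader, module)
        .into_iter()
        .map(|rel| lib_dir.join(rel))
        .filter(|path| !path.is_file())
        .collect()
}
```

Calling missing_libs with the directory stored in EXECUTORCH_RS_EXECUTORCH_LIB_DIR before running cargo build gives a readable list of what still has to be compiled, instead of a linker error.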

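The static libraries of the kernel implementations are required only if your model uses them, and they should be linked manually by the binary that uses the executorch crate. For example, the hello_world_add example uses a model with a single addition operation, so it compiles the C++ library with EXECUTORCH_SELECT_OPS_LIST=aten::add.out and contains the following lines in its build.rs: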

println!("cargo::rustc-link-lib=static:+whole-archive=portable_kernels");
println!("cargo::rustc-link-lib=static:+whole-archive=portable_ops_lib");

let libs_dir = std::env::var("EXECUTORCH_RS_EXECUTORCH_LIB_DIR").unwrap();
println!("cargo::rustc-link-search={}/kernels/portable/", libs_dir);

Note that the ops and kernels libraries are linked with +whole-archive to ensure that all symbols are included in the binary.
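
For a build.rs that fails with a clearer message when the environment variable is missing, the lines above could be wrapped roughly as follows. This is a sketch under assumptions: portable_kernel_directives and emit_link_directives are hypothetical helper names, and the added cargo::rerun-if-env-changed directive is a standard Cargo instruction that the example's own build.rs may or may not use.

```rust
use std::env;

/// Environment variable used throughout this README.
const LIB_DIR_VAR: &str = "EXECUTORCH_RS_EXECUTORCH_LIB_DIR";

/// Builds the Cargo directives that link the portable kernels
/// found under `<libs_dir>/kernels/portable/`.
fn portable_kernel_directives(libs_dir: &str) -> Vec<String> {
    // Avoid a doubled slash when the variable ends with '/'.
    let dir = libs_dir.trim_end_matches('/');
    vec![
        // Re-run the build script when the library location changes.
        format!("cargo::rerun-if-env-changed={}", LIB_DIR_VAR),
        format!("cargo::rustc-link-search={}/kernels/portable/", dir),
        // +whole-archive keeps all symbols of both archives in the binary.
        "cargo::rustc-link-lib=static:+whole-archive=portable_kernels".to_string(),
        "cargo::rustc-link-lib=static:+whole-archive=portable_ops_lib".to_string(),
    ]
}

/// Hypothetical entry point to call from `main` in build.rs.
fn emit_link_directives() -> Result<(), String> {
    let libs_dir = env::var(LIB_DIR_VAR).map_err(|_| {
        format!("{} must point at the C++ build output (cmake-out)", LIB_DIR_VAR)
    })?;
    for line in portable_kernel_directives(&libs_dir) {
        println!("{}", line);
    }
    Ok(())
}
```

Keeping the directive construction in a pure function makes it possible to unit-test the paths without touching the environment.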

The build (and the library) is tested on Ubuntu and macOS, but not on Windows.

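Cargo Features

data-loader

Includes the FileDataLoader and MmapDataLoader structs. Without this feature, the only available data loader is BufferDataLoader. Requires the libextension_data_loader.a static library; compile the C++ executorch with EXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON.

module

Includes the Module struct. Requires the libextension_module_static.a static library; compile the C++ executorch with EXECUTORCH_BUILD_EXTENSION_MODULE=ON. Also enables the std feature.

f16

Supports half-precision floating-point numbers using the half crate. Models that take or produce tensors with the f16 data type can use this feature.

complex

Supports complex numbers using the num-complex crate. Models that take or produce tensors of complex 32-bit or 64-bit floating-point numbers can use this feature. If the f16 feature is also enabled, half-precision complex numbers are available as well.

std

Enables the standard library. This feature is enabled by default, but can be disabled to build executorch in a no_std environment. See the hello_world_add_no_std example. Also enables the alloc feature. NOTE: no_std is still WIP, see https://github.com/pytorch/executorch/issues/4561

alloc

Enables allocations. When this feature is disabled, all methods that require allocations will not be compiled. This feature is enabled by the std feature, which is enabled by default. It is possible to enable this feature without the std feature; allocations then go through the alloc crate, which requires a global allocator to be set.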

By default, the std feature is enabled.
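
To illustrate what "a global allocator to be set" means for the alloc feature, here is a minimal sketch of registering one with #[global_allocator]. It is an assumption-laden stand-in: CountingAllocator is a hypothetical type, and it forwards to std's System allocator only so the snippet can run on a desktop; a real no_std target would plug in an allocator suited to that platform instead.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of allocation requests seen so far.
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical allocator: counts requests, then forwards to the system allocator.
struct CountingAllocator;

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        // Delegate the actual allocation (stand-in for a platform allocator).
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

/// Registers the allocator for the whole binary; every heap allocation,
/// including those made through the alloc crate, goes through it.
#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

/// Returns how many allocations have been requested so far.
fn allocation_count() -> usize {
    ALLOCATIONS.load(Ordering::Relaxed)
}
```

Only one #[global_allocator] may exist in a final binary, so it belongs in the application crate rather than in a library.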

Dependencies

~1.4–3.5MB
~72K SLoC

cfg-if
executorch-sys
f16? half 2.4
log
ndarray 0.16
complex? num-complex
build bindgen 0.69.4
build cc

Other features: alloc, data-loader, module, std