55 versions

0.4.19	May 4, 2024
0.4.14	Apr 17, 2024
0.4.12	Mar 12, 2024
0.4.6	Dec 21, 2023
0.3.26	Nov 30, 2023

#178 in Encoding

944 downloads per month
Used in 7 crates (3 directly)

MIT license

27KB
369 lines
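Native model

Adds interoperability on top of serialization formats such as bincode and postcard.

See the Concepts section for more details.

Goals

Interoperability: allows different applications to work together, even if they use different versions of the data model.
Data consistency: ensures that the data we process is the data model we expect.
Flexibility: you can use any serialization format you want. See the Serialization format section for details.
Performance: minimal overhead (encode: ~20 ns, decode: ~40 ps). See the Performance section for details.

Usage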

       Application 1 (DotV1)        Application 2 (DotV1 and DotV2)
                |                                  |
   Encode DotV1 |--------------------------------> | Decode DotV1 to DotV2
                |                                  | Modify DotV2
   Decode DotV1 | <--------------------------------| Encode DotV2 back to DotV1
                |                                  |

// Application 1
let dot = DotV1(1, 2);
let bytes = native_model::encode(&dot).unwrap();

// Application 1 sends bytes to Application 2.

// Application 2
// We are able to decode the bytes directly into a new type DotV2 (upgrade).
let (mut dot, source_version) = native_model::decode::<DotV2>(bytes).unwrap();
assert_eq!(dot, DotV2 { 
    name: "".to_string(), 
    x: 1, 
    y: 2 
});
dot.name = "Dot".to_string();
dot.x = 5;
// For interoperability, we encode the data with the version compatible with Application 1 (downgrade).
let bytes = native_model::encode_downgrade(dot, source_version).unwrap();

// Application 2 sends bytes to Application 1.

// Application 1
let (dot, _) = native_model::decode::<DotV1>(bytes).unwrap();
assert_eq!(dot, DotV1(5, 2));

Full example here.

Serialization format

You can select one of the default serialization formats with a feature flag, for example:

[dependencies]
native_model = { version = "0.1", features = ["bincode_2_rc"] }

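Each feature flag corresponds to a specific minor version of the serialization format. To avoid breaking changes, the default serialization format is the oldest version.

bincode_1_3: bincode v1.3 (default)
bincode_2_rc: bincode v2.0.0-rc3
postcard_1_0: postcard v1.0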

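Custom serialization format

Define a struct with any name you like. This struct must implement the native_model::Encode and native_model::Decode traits.

Full example

For other examples, see the default implementations.

Data model

Define your model using the native_model macro.

Attributes

id = u32: the unique identifier of the model.
version = u32: the version of the model.
with = type: the serialization format used for the Encode/Decode implementation. See the Serialization format section for setup.
from = type: optional, the previous version of the model.
- type: the previous version of the model, used for the From implementation.
try_from = (type, error): optional, the previous version of the model with error handling.
- type: the previous version of the model, used for the TryFrom implementation.
- error: the error type used for the TryFrom implementation.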
use native_model::native_model;
use serde::{Deserialize, Serialize};

#[derive(Deserialize, Serialize, PartialEq, Debug)]
#[native_model(id = 1, version = 1)]
struct DotV1(u32, u32);

#[derive(Deserialize, Serialize, PartialEq, Debug)]
#[native_model(id = 1, version = 2, from = DotV1)]
struct DotV2 {
    name: String,
    x: u64,
    y: u64,
}

// Implement the conversion between versions From<DotV1> for DotV2 and From<DotV2> for DotV1.

#[derive(Deserialize, Serialize, PartialEq, Debug)]
#[native_model(id = 1, version = 3, try_from = (DotV2, anyhow::Error))]
struct DotV3 {
    name: String,
    cord: Cord,
}

#[derive(Deserialize, Serialize, PartialEq, Debug)]
struct Cord {
    x: u64,
    y: u64,
}

// Implement the conversion between versions TryFrom<DotV2> for DotV3 and TryFrom<DotV3> for DotV2.
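The conversions mentioned in the comments above could look roughly like the following. This is a minimal, std-only sketch: the structs are redeclared without the native_model and serde attributes so that it compiles on its own, the error type is a plain String instead of anyhow::Error, and the truncating cast in the downgrade is an assumption about how out-of-range values should be handled.

```rust
use std::convert::TryFrom;

#[derive(Debug, PartialEq)]
struct DotV1(u32, u32);

#[derive(Debug, PartialEq)]
struct DotV2 {
    name: String,
    x: u64,
    y: u64,
}

#[derive(Debug, PartialEq)]
struct Cord {
    x: u64,
    y: u64,
}

#[derive(Debug, PartialEq)]
struct DotV3 {
    name: String,
    cord: Cord,
}

// Upgrade V1 -> V2: widen the coordinates and start with an empty name.
impl From<DotV1> for DotV2 {
    fn from(dot: DotV1) -> Self {
        DotV2 { name: String::new(), x: u64::from(dot.0), y: u64::from(dot.1) }
    }
}

// Downgrade V2 -> V1: the name is dropped and the coordinates are truncated
// to 32 bits (assumption: silent truncation is acceptable here).
impl From<DotV2> for DotV1 {
    fn from(dot: DotV2) -> Self {
        DotV1(dot.x as u32, dot.y as u32)
    }
}

// Upgrade V2 -> V3 through TryFrom. This sketch never fails; a real
// conversion would return Err when the data cannot be represented.
impl TryFrom<DotV2> for DotV3 {
    type Error = String;
    fn try_from(dot: DotV2) -> Result<Self, Self::Error> {
        Ok(DotV3 { name: dot.name, cord: Cord { x: dot.x, y: dot.y } })
    }
}

// Downgrade V3 -> V2 through TryFrom.
impl TryFrom<DotV3> for DotV2 {
    type Error = String;
    fn try_from(dot: DotV3) -> Result<Self, Self::Error> {
        Ok(DotV2 { name: dot.name, x: dot.cord.x, y: dot.cord.y })
    }
}

fn main() {
    let v2 = DotV2::from(DotV1(1, 2));
    assert_eq!(v2, DotV2 { name: String::new(), x: 1, y: 2 });

    let v3 = DotV3::try_from(v2).unwrap();
    assert_eq!(v3.cord, Cord { x: 1, y: 2 });

    let back = DotV1::from(DotV2::try_from(v3).unwrap());
    assert_eq!(back, DotV1(1, 2));
}
```

Both directions are implemented for each pair of versions because decoding into a newer type relies on the upgrade path, while encode_downgrade relies on the downgrade path.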

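Status

Early development. Not ready for production.

Concepts

To understand how native model works, you need to understand the following concepts.

Identity (id): the unique identifier of the model. It is used to identify the model and to prevent decoding a model into the wrong Rust type.
Version (version): the version of the model. It is used to check compatibility between two models.
Encode: the process of converting a model into a byte array.
Decode: the process of converting a byte array into a model.
Downgrade: the process of converting a model into a previous version of the model.
Upgrade: the process of converting a model into a newer version of the model.

Under the hood, native model is a thin wrapper around the serialized data. The `id` and the `version` are each encoded as a little_endian::U32. Together they take 8 bytes, which are added at the beginning of the data.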

+------------------+------------------+------------------------------------+
|     ID (4 bytes) | Version (4 bytes)| Data (indeterminate-length bytes)  |
+------------------+------------------+------------------------------------+

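Full example here.

Performance

Native model has been designed to have a minimal and constant overhead. This means the overhead is the same regardless of the size of the data. Under the hood, we use the zerocopy crate to avoid unnecessary copies.

👉 To know the total encode/decode time, you need to add the time taken by your serialization format.

Summary

Encode: ~20 ns
Decode: ~40 ps

Data size	Encode time (ns)	Decode time (ps)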
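To make the 8-byte prefix concrete, here is a minimal std-only sketch of the layout described above. It is not the crate's implementation (which relies on zerocopy); the helper names wrap and unwrap_header are hypothetical and exist only for this illustration.

```rust
/// Size of the prefix: 4 bytes of id followed by 4 bytes of version.
const HEADER_LEN: usize = 8;

/// Hypothetical helper: prepend `id` and `version`, each as a
/// little-endian u32, to an already serialized payload.
fn wrap(id: u32, version: u32, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_LEN + payload.len());
    out.extend_from_slice(&id.to_le_bytes());
    out.extend_from_slice(&version.to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Hypothetical helper: split a framed buffer into (id, version, payload).
/// Returns None when the buffer is shorter than the header.
fn unwrap_header(bytes: &[u8]) -> Option<(u32, u32, &[u8])> {
    if bytes.len() < HEADER_LEN {
        return None;
    }
    let id = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    let version = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
    Some((id, version, &bytes[HEADER_LEN..]))
}

fn main() {
    let framed = wrap(1, 2, b"payload");

    // The overhead is always 8 bytes, whatever the payload size.
    assert_eq!(framed.len(), HEADER_LEN + 7);
    assert_eq!(&framed[..HEADER_LEN], &[1u8, 0, 0, 0, 2, 0, 0, 0][..]);

    let (id, version, payload) = unwrap_header(&framed).unwrap();
    assert_eq!((id, version), (1, 2));
    assert_eq!(payload, &b"payload"[..]);
}
```

Because the header has a fixed size and sits in front of the payload, reading the id and version never requires touching the serialized data itself, which is consistent with the constant overhead reported in the Performance section.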
1 B	19.769 ns - 20.154 ns	40.526 ps - 40.617 ps
1 KiB	19.597 ns - 19.971 ns	40.534 ps - 40.633 ps
1 MiB	19.662 ns - 19.910 ns	40.508 ps - 40.632 ps
10 MiB	19.591 ns - 19.980 ns	40.504 ps - 40.605 ps
100 MiB	19.669 ns - 19.867 ns	40.520 ps - 40.644 ps

The native model overhead benchmark is available here.

Dependencies

~1.4–3.5MB
~56K SLoC