12 unstable releases (5 breaking)

0.10.5	Aug 1, 2024
0.10.4	Jun 21, 2024
0.10.3	May 17, 2024
0.9.1	Dec 15, 2023
0.6.0	Jul 4, 2023

#29 in Caching

198 downloads per month

Apache-2.0 OR MIT

255KB
4.5K SLoC

redhac

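redhac is derived from Rust Embedded Distributed Highly Available Cache

It may seem odd to see the keywords "embedded" and "distributed" together.
The idea of redhac is to provide a caching library that can be embedded into any Rust application, while still providing the ability to build a distributed HA caching layer.
It can of course be used as a cache for a single instance too, but then it will not be the most performant option, since it needs to clone most values to be able to send them over the network concurrently. If you need a distributed cache, however, you might give redhac a try.

The underlying system works with a quorum and elects a leader dynamically on startup. Each node starts a server and multiple client parts for bidirectional gRPC streaming. Each client connects to one of the other servers, so every node is connected to every other node.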
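As a rough illustration of what "quorum" means here, the following sketch checks whether the reachable members form a majority. It assumes a simple strict-majority rule; the function name `has_quorum` is hypothetical and not part of the redhac API, and the crate's internal rule may differ.

```rust
/// Returns true when `reachable` members (counting this node itself)
/// form a strict majority of a cluster with `total` members.
/// Assumption: simple majority quorum, not taken from redhac's source.
fn has_quorum(total: usize, reachable: usize) -> bool {
    total > 0 && reachable <= total && reachable > total / 2
}
```

With 3 nodes, 2 reachable members are enough; with 4 nodes, 2 are not, which is one reason odd cluster sizes are the usual choice.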

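Release

This crate had already been used and tested before going open source.
By now, it has gone through thousands of HA cluster startups, in which I produced leader election conflicts on purpose to make sure they are handled and resolved properly.

However, some benchmarks and possibly some performance fine-tuning are still missing, which is why this crate has not reached v1.0.0 yet. After the benchmarks, the API may change if another approach turns out to be faster.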

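Consistency and guarantees

Important: redhac is meant for caching, not for persistent data!

During normal operation, all instances forward their cache modifications to the other cache members.
If a node goes down or has temporary network issues and therefore loses its connection to the cluster, it invalidates its own cache to make sure it never serves possibly outdated data. Once the node rejoins the cluster, it starts receiving and saving updates from the remotes again.
There is an option in the cache_get! macro (the final boolean argument) to fetch cached values from remote instances after such an event. This is useful if you have values that only live inside the cache and cannot be refreshed from the database.

The other situation in which the cache will not save any values is when there is no quorum and no cluster leader.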
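The guarantees above can be pictured with a toy, single-process model. This is only a sketch under stated assumptions: `ClusterState` and `LocalCache` are hypothetical names that do not exist in redhac, and the real crate handles this over channels and gRPC streams.

```rust
use std::collections::HashMap;

/// Cluster state as seen by the local cache (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ClusterState {
    /// Quorum reached and a leader is known.
    Healthy,
    /// Connection lost, no quorum, or no leader.
    Degraded,
}

/// Toy local cache mirroring the documented behavior.
struct LocalCache {
    state: ClusterState,
    values: HashMap<String, Vec<u8>>,
}

impl LocalCache {
    /// A fresh node has not joined a cluster yet, so it starts degraded.
    fn new() -> Self {
        Self { state: ClusterState::Degraded, values: HashMap::new() }
    }

    /// Losing the cluster wipes all local values, so possibly
    /// outdated data can never be served afterwards.
    fn set_state(&mut self, state: ClusterState) {
        if state == ClusterState::Degraded {
            self.values.clear();
        }
        self.state = state;
    }

    /// Values are only saved while the cluster is healthy.
    /// Returns whether the value was stored.
    fn put(&mut self, key: &str, value: Vec<u8>) -> bool {
        if self.state != ClusterState::Healthy {
            return false;
        }
        self.values.insert(key.to_string(), value);
        true
    }

    fn get(&self, key: &str) -> Option<&Vec<u8>> {
        self.values.get(key)
    }
}
```

After a transition to `Degraded`, every `get` returns `None` until fresh updates arrive, which is the situation where a remote lookup via cache_get! can help.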

If you need more consistency / guarantees / syncing after a late join, you may want to take a look at openraft or similar projects.

Versions and dependencies

MSRV is Rust 1.70
Everything is checked and working with -Zminimal-versions
No issues reported by cargo audit as of 2024-04-09

Single Instance Example

// These 3 are needed for the `cache_get` macro
use redhac::{cache_get, cache_get_from, cache_get_value};
use redhac::{cache_put, cache_recv, CacheConfig, SizedCache};

#[tokio::main]
async fn main() {
    let (_, mut cache_config) = CacheConfig::new();
    
    // The cache name is used to reference the cache later on
    let cache_name = "my_cache";
    // We need to spawn a global handler for each cache instance.
    // Communication is done over channels.
    cache_config.spawn_cache(
        cache_name.to_string(), SizedCache::with_size(16), None
    );
    
    // Cache keys can only be `String`s at the time of writing.
    let key = "myKey";
    // The value you want to cache must implement `serde::Serialize`.
    // The serialization of the values is done with `bincode`.
    let value = "myCacheValue".to_string();
    
    // At this point, we need cloned values to make everything work
    // nicely with networked connections. If you only ever need a 
    // local cache, you might be better off with using the `cached`
    // crate directly and use references whenever possible.
    cache_put(
        cache_name.to_string(), key.to_string(), &cache_config, &value
    )
        .await
        .unwrap();
    
    let res = cache_get!(
        // The type of the value we want to deserialize the value into
        String,
        // The cache name from above. We can start as many for our
        // application as we like
        cache_name.to_string(),
        // For retrieving values, the same as above is true - we
        // need real `String`s
        key.to_string(),
        // All our caches have the necessary information added to
        // the cache config. Here a
        // reference is totally fine.
        &cache_config,
        // This does not really apply to this single instance 
        // example. If we would have started a HA cache layer we
        // could do remote lookups for a value we cannot find locally,
        // if this would be set to `true`
        false
    )
        .await
        .unwrap();
    
    assert!(res.is_some());
    assert_eq!(res.unwrap(), value);
}

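High Availability Setup

High Availability (HA) works in such a way that each cache member connects to every other member. When quorum is eventually reached, a leader is elected, which is then responsible for all cache modifications to prevent collisions (unless you opt out of this by using a direct cache_put). Since each node connects to every other node, you cannot scale up the cache layer infinitely. The ideal number of nodes is 3. You can scale this up to 5 or 7 if you like, but this has not been tested in greater detail so far.

Write performance will degrade the more nodes you add to the cluster, since you need to wait for more Acks from the other members.

Read performance, however, should stay the same.

Each node keeps a local copy of each value inside the cache (unless it has lost its connection or joined the cluster at some later point), which means that in most cases reads do not require any remote network access.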
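To put rough numbers on the scaling limit, here is a small arithmetic sketch. It assumes that every node opens one client connection to each other member and that a write may have to wait for an Ack from every other member; the function names are hypothetical.

```rust
/// Client connections a single node opens: one per other member.
fn clients_per_node(nodes: usize) -> usize {
    nodes.saturating_sub(1)
}

/// Client connections in the whole, fully connected cluster.
fn total_connections(nodes: usize) -> usize {
    nodes * clients_per_node(nodes)
}

/// Upper bound of Acks a single write may wait for.
/// Assumption: every other member acknowledges the modification.
fn max_acks_per_write(nodes: usize) -> usize {
    clients_per_node(nodes)
}
```

Going from 3 to 7 nodes raises the connection count from 6 to 42 and the Acks per write from 2 to 6, while reads still hit the local copy.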

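Configuration

The way HA_MODE is configured is optimized for a Kubernetes deployment, but it may seem a bit odd if you deploy somewhere else. You can either provide a .env file and use it as a config file, or set these variables in the environment directly. You need to set the following values:

`HA_MODE`

The first one is easy, just set HA_MODE=true

`HA_HOSTS`

NOTE
In a few of the examples below, the name of the deployment may be rauthy. The reason is that this crate was originally written to complement another project of mine, Rauthy (link will follow), which is an OIDC provider and Single Sign-On solution written in Rust.

HA_HOSTS works in a way that makes it really easy to configure inside Kubernetes, as long as a StatefulSet is used for the deployment. A cache node finds its members through HA_HOSTS and its own HOSTNAME. Add every cache member to HA_HOSTS. For instance, if you want to run 3 replicas in HA mode, deployed as a StatefulSet with the name rauthy: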

HA_HOSTS="http://rauthy-0:8000, http://rauthy-1:8000 ,http://rauthy-2:8000"

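The way it works:

1. A node gets its own hostname from the OS
   This is the reason why you use a StatefulSet for the deployment, even without any volumes attached. For a StatefulSet called rauthy, the replicas will always have the names rauthy-0, rauthy-1, ..., which are at the same time the hostnames inside the pods.
2. Find "me" inside the HA_HOSTS variable
   If the hostname cannot be found in HA_HOSTS, the application will panic and exit because of a misconfiguration.
3. Use the port from the "me" entry for the server part
   This means you do not need to specify the port in another variable, which eliminates the risk of inconsistencies or a bad config.
4. Extract "me" from HA_HOSTS
   Then take the leftover nodes as all cache members and connect to them.
5. Once a quorum has been reached, a leader is elected
   From that point on, the cache starts accepting requests.
6. If the leader is lost, a new one is elected. No values will be lost.
7. If quorum is lost, the cache is invalidated
   This happens for security reasons, to prevent cache inconsistencies. It is better to invalidate the cache and fetch the values fresh from the DB or from other cache members than to work with possibly invalid values, which is especially true in an authn / authz situation.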
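The member discovery steps above (find "me", take its port, treat the rest as peers) can be sketched with the standard library only. This is not redhac's actual implementation: `Membership` and `resolve_membership` are hypothetical names, the sketch matches on the hostname alone, and it returns an error instead of panicking.

```rust
/// Result of resolving this node inside `HA_HOSTS` (hypothetical type).
#[derive(Debug, PartialEq)]
struct Membership {
    /// Port taken from the "me" entry, used for the server part.
    server_port: u16,
    /// All remaining entries: the remote cache members to connect to.
    peers: Vec<String>,
}

/// Splits `ha_hosts` (CSV of `scheme://hostname:port`), finds the entry
/// for `hostname` and returns its port plus the leftover peers.
fn resolve_membership(ha_hosts: &str, hostname: &str) -> Result<Membership, String> {
    let mut server_port = None;
    let mut peers = Vec::new();

    // Entries may carry stray whitespace around the commas, so trim them.
    for entry in ha_hosts.split(',').map(str::trim).filter(|e| !e.is_empty()) {
        // Strip the scheme, then split `hostname:port` at the last colon.
        let host_port = entry.split_once("://").map(|(_, rest)| rest).unwrap_or(entry);
        let (host, port) = host_port
            .rsplit_once(':')
            .ok_or_else(|| format!("missing port in '{entry}'"))?;

        if host == hostname && server_port.is_none() {
            let port = port
                .parse::<u16>()
                .map_err(|e| format!("bad port in '{entry}': {e}"))?;
            server_port = Some(port);
        } else {
            peers.push(entry.to_string());
        }
    }

    match server_port {
        Some(server_port) => Ok(Membership { server_port, peers }),
        None => Err(format!("hostname '{hostname}' not found in HA_HOSTS")),
    }
}
```

For the rauthy example, resolving `rauthy-1` yields port 8000 and the two other entries as peers. The hostname itself could come from the `HOSTNAME` environment variable, for example.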

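NOTE
If you are in an environment where the described mechanism of extracting the hostname does not work, you can set HOSTNAME_OVERWRITE for each instance to match one of the HA_HOSTS entries, or you can overwrite the name when using redhac::start_cluster.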

`CACHE_AUTH_TOKEN`

You need to set a secret for CACHE_AUTH_TOKEN, which is then used for authenticating cache members.

TLS

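For the sake of this example, we will not dig into TLS and will disable it instead, which can be done with CACHE_TLS=false.
You can add your TLS certificates in PEM format and an optional Root CA. This applies to the server part and the client part separately, which means you can configure the cache layer to use mTLS connections.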

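Generating TLS certificates (optional)

You can of course provide your own set of certificates if you already have some, or use your preferred way of creating them. However, I want to show the simplest way of doing this with another tool of mine called Nioca.
If you have docker or something similar available, this is the easiest option. If not, you can grab one of the binaries from the "out" folder, which are available for Linux amd64 and arm64.

I suggest using docker for this task. Otherwise, you can use the nioca binary directly on any Linux machine. If you want a permanent way of generating certificates for yourself, take a look at Rauthy's justfile and copy and adjust the recipes create-root-ca and create-end-entity-tls to your liking.
If you just want to get everything started quickly, follow these steps: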

Folder for your certificates

Let's create a folder for our certificates:

mkdir ca

Create an alias for the `docker` command

If you use one of the binaries directly, you can skip this step.

alias nioca='docker run --rm -it -v ./ca:/ca -u $(id -u ${USER}):$(id -g ${USER}) ghcr.io/sebadob/nioca'

To see the full feature set for more customization:

nioca x509 -h

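Generate full certificate chain

We can create a fully functioning, production-ready Root Certificate Authority (CA) with just a single command. Make sure that at least one of your --alt-name-dns values matches CACHE_TLS_CLIENT_VALIDATE_DOMAIN from the redhac config.
To keep things simple, we will use the same certificate for the server and the client. You can of course create separate ones if you like: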

nioca x509 \
    --cn 'redhac.local' \
    --alt-name-dns redhac.local \
    --usages-ext server-auth \
    --usages-ext client-auth \
    --stage full \
    --clean

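You will be asked 6 times for a password of at least 16 characters:

The first 3 times, you need to provide the encryption password for your Root CA
The last 3 times, you should provide a different password for your Intermediate CA

If everything was successful, you will have a new folder named x509 with the sub folders root, intermediate and end_entity (inside ./ca when you use the docker alias from above).

From these, you will need the following files: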

cp ca/x509/intermediate/ca-chain.pem ./redhac.ca-chain.pem && \
cp ca/x509/end_entity/$(cat ca/x509/end_entity/serial)/cert-chain.pem ./redhac.cert-chain.pem && \
cp ca/x509/end_entity/$(cat ca/x509/end_entity/serial)/key.pem ./redhac.key.pem

ls -l should now show these 3 files:

redhac.ca-chain.pem
redhac.cert-chain.pem
redhac.key.pem

Create Kubernetes Secrets

kubectl create secret tls redhac-tls-server --key="redhac.key.pem" --cert="redhac.cert-chain.pem" && \
kubectl create secret tls redhac-tls-client --key="redhac.key.pem" --cert="redhac.cert-chain.pem" && \
kubectl create secret generic redhac-server-ca --from-file redhac.ca-chain.pem && \
kubectl create secret generic redhac-client-ca --from-file redhac.ca-chain.pem

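Reference Config

The following variables can be used to configure redhac.
At the time of writing, the configuration can only be done via environment variables.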

# If the cache should start in HA mode or standalone
# accepts 'true|false', defaults to 'false'
HA_MODE=true

# The connection strings (with hostnames) of the HA instances
# as a CSV. Format: 'scheme://hostname:port'
HA_HOSTS="http://redhac.redhac:8080, http://redhac.redhac:8180, http://redhac.redhac:8280"

# This can overwrite the hostname which is used to identify each
# cache member. Useful in scenarios, where all members are on the
# same host or for testing. You need to add the port, since `redhac`
# will do an exact match to find "me".
#HOSTNAME_OVERWRITE="127.0.0.1:8080"

# Secret token, which is used to authenticate the cache members
CACHE_AUTH_TOKEN=SuperSafeSecretToken1337

# Enable / disable TLS for the cache communication (default: true)
CACHE_TLS=true

# The path to the server TLS certificate PEM file
# default: tls/redhac.cert-chain.pem
CACHE_TLS_SERVER_CERT=tls/redhac.cert-chain.pem
# The path to the server TLS key PEM file
# default: tls/redhac.key.pem
CACHE_TLS_SERVER_KEY=tls/redhac.key.pem

# The path to the client mTLS certificate PEM file. This is optional.
CACHE_TLS_CLIENT_CERT=tls/redhac.cert-chain.pem
# The path to the client mTLS key PEM file. This is optional.
CACHE_TLS_CLIENT_KEY=tls/redhac.key.pem

# If not empty, the PEM file from the specified location will be
# added as the CA certificate chain for validating
# the server's TLS certificate. This is optional.
CACHE_TLS_CA_SERVER=tls/redhac.ca-chain.pem
# If not empty, the PEM file from the specified location will
# be added as the CA certificate chain for validating
# the client's mTLS certificate. This is optional.
CACHE_TLS_CA_CLIENT=tls/redhac.ca-chain.pem

# The domain / CN the client should validate the certificate
# against. This domain MUST be inside the
# 'X509v3 Subject Alternative Name' when you take a look at 
# the server's certificate with the openssl tool.
# default: redhac.local
CACHE_TLS_CLIENT_VALIDATE_DOMAIN=redhac.local

# Can be used if you need to overwrite the SNI when the 
# client connects to the server, for instance if you are 
# behind a loadbalancer which combines multiple certificates. 
# default: ""
#CACHE_TLS_SNI_OVERWRITE=

# Define different buffer sizes for channels between the 
# components. Buffer for client request on the incoming 
# stream - server side (default: 128)
# Makes sense to have the CACHE_BUF_SERVER roughly set to:
# `(number of total HA cache hosts - 1) * CACHE_BUF_CLIENT`
CACHE_BUF_SERVER=128
# Buffer for client requests to remote servers for all cache 
# operations (default: 64)
CACHE_BUF_CLIENT=64

# Connection timeouts
# The Server sends out keepalive pings with configured timeouts
# The keepalive ping interval in seconds (default: 5)
CACHE_KEEPALIVE_INTERVAL=5
# The keepalive ping timeout in seconds (default: 5)
CACHE_KEEPALIVE_TIMEOUT=5

# The timeout for the leader election. If a newly saved leader
# request has not reached quorum after the timeout, the leader
# will be reset and a new request will be sent out.
# CAUTION: This should not be below 
# CACHE_RECONNECT_TIMEOUT_UPPER, since cold starts and
# elections will be problematic in that case.
# value in seconds, default: 2
CACHE_ELECTION_TIMEOUT=2

# These 2 values define the reconnect timeout for the HA Cache
# Clients. The values are in ms, and a random value between these 2
# is chosen each time to avoid conflicts and race conditions.
# (default: 500)
CACHE_RECONNECT_TIMEOUT_LOWER=500
# (default: 2000)
CACHE_RECONNECT_TIMEOUT_UPPER=2000
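
A few of the rules of thumb from the comments above can be checked before starting a cluster. The following is a minimal sketch using only the standard library; the helper names `env_or`, `suggested_buf_server` and `election_timeout_ok` are hypothetical and not part of redhac, which reads these variables on its own.

```rust
use std::env;
use std::str::FromStr;

/// Reads `key` from the environment and parses it, falling back to
/// `default` when the variable is unset or cannot be parsed.
fn env_or<T: FromStr>(key: &str, default: T) -> T {
    env::var(key)
        .ok()
        .and_then(|v| v.trim().parse().ok())
        .unwrap_or(default)
}

/// Rule of thumb from the config comments:
/// `(number of total HA cache hosts - 1) * CACHE_BUF_CLIENT`.
fn suggested_buf_server(total_hosts: usize, buf_client: usize) -> usize {
    total_hosts.saturating_sub(1) * buf_client
}

/// CAUTION note from the config comments: the election timeout (seconds)
/// should not be below the upper reconnect timeout (milliseconds).
fn election_timeout_ok(election_secs: u64, reconnect_upper_ms: u64) -> bool {
    election_secs * 1000 >= reconnect_upper_ms
}
```

For example, `env_or("CACHE_ELECTION_TIMEOUT", 2u64)` and `env_or("CACHE_RECONNECT_TIMEOUT_UPPER", 2000u64)` can be fed into `election_timeout_ok`. With 3 hosts and a client buffer of 64, the suggested server buffer is 128, which matches the defaults shown above.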

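Example Code

For example code, please take a look at ./examples.

The conflict_resolution_tests produce conflicts on purpose by starting multiple nodes from the same code at the exact same time, which is probably not too helpful if you want to see how a setup is done.

A better example is ha_setup, which can be used to start 3 nodes in 3 different terminals to observe the behavior.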

Dependencies

~20–32MB
~580K SLoC