no-std typed-path

Provides typed variants of Path and PathBuf for Unix and Windows

16 releases (8 breaking)

0.9.1 Jul 16, 2024
0.8.0 Feb 25, 2024
0.7.0 Nov 4, 2023
0.3.2 Mar 27, 2023
0.1.0 Aug 25, 2022

MIT/Apache


Typed Path

Crates.io Docs.rs CI RustC 1.58.1+

Provides typed variants of Path and PathBuf for Unix and Windows.

Install

[dependencies]
typed-path = "0.9"

As of version 0.7, this library also supports no_std environments that depend on alloc. To build in this manner, remove the default std feature:

[dependencies]
typed-path = { version = "...", default-features = false }

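Why?

Some applications need to manipulate Windows or UNIX paths on different platforms, for a variety of reasons: constructing portable file formats, parsing files from other platforms, handling archive formats, working with certain network protocols, and so on.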

-- Josh Triplett

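Check out this issue for the discussion. The functionality actually exists in the standard library, but is not exposed!

This means that parsing a path like C:\path\to\file.txt will differ depending on which platform you are on!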

use std::path::Path;

// On Windows, this prints out:
//
// * Prefix(PrefixComponent { raw: "C:", parsed: Disk(67) })
// * RootDir
// * Normal("path")
// * Normal("to")
// * Normal("file.txt")
//
// But on Unix, this prints out:
//
// * Normal("C:\\path\\to\\file.txt")
let path = Path::new(r"C:\path\to\file.txt");
for component in path.components() {
    println!("* {:?}", component);
}

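Usage

Byte paths

The library provides a generic Path<T> and PathBuf<T> that use [u8] and Vec<u8> underneath instead of OsStr and OsString. An encoding generic type is provided to dictate how the underlying bytes are parsed, so that path functionality stays consistent no matter which operating system you are compiling against!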

use typed_path::WindowsPath;

// On all platforms, this prints out:
//
// * Prefix(PrefixComponent { raw: "C:", parsed: Disk(67) })
// * RootDir
// * Normal("path")
// * Normal("to")
// * Normal("file.txt")
//
let path = WindowsPath::new(r"C:\path\to\file.txt");
for component in path.components() {
    println!("* {:?}", component);
}
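
To illustrate why parsing raw bytes under an explicit encoding gives the same result on every platform, here is a minimal std-only sketch. It is not the crate's implementation: the `Component` enum and the `windows_components` function are hypothetical names introduced for illustration, and the sketch only recognizes a drive-letter prefix (no UNC or verbatim prefixes, no special handling of `.` or `..`).

```rust
/// Hypothetical, simplified component type (not the crate's own).
#[derive(Debug, PartialEq)]
enum Component<'a> {
    /// Drive letter stored as an uppercase ASCII byte, e.g. b'C' == 67.
    Disk(u8),
    RootDir,
    Normal(&'a [u8]),
}

/// Both separators are accepted, as Windows itself does.
fn is_sep(b: &u8) -> bool {
    *b == b'\\' || *b == b'/'
}

/// Splits a byte slice using Windows rules, regardless of the host OS.
fn windows_components(path: &[u8]) -> Vec<Component<'_>> {
    let mut out = Vec::new();
    let mut rest = path;

    // Drive-letter prefix such as `C:`.
    if rest.len() >= 2 && rest[0].is_ascii_alphabetic() && rest[1] == b':' {
        out.push(Component::Disk(rest[0].to_ascii_uppercase()));
        rest = &rest[2..];
    }

    // A leading separator after the optional prefix marks the root.
    if rest.first().map_or(false, is_sep) {
        out.push(Component::RootDir);
    }

    // Empty parts (from leading or repeated separators) are skipped.
    for part in rest.split(is_sep) {
        if !part.is_empty() {
            out.push(Component::Normal(part));
        }
    }
    out
}
```

Because the function never consults `std::path` or the host platform, calling it on `br"C:\path\to\file.txt"` yields a disk prefix, a root, and three normal components on Unix and Windows alike.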

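UTF8-enforced paths

Alongside the byte paths, this library also supports UTF8-enforced paths through Utf8Path<T> and Utf8PathBuf<T>, which internally use str and String. An encoding generic type is provided to dictate how the underlying characters are parsed, so that path functionality stays consistent no matter which operating system you are compiling against!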

use typed_path::Utf8WindowsPath;

// On all platforms, this prints out:
//
// * Prefix(Utf8WindowsPrefixComponent { raw: "C:", parsed: Disk(67) })
// * RootDir
// * Normal("path")
// * Normal("to")
// * Normal("file.txt")
//
let path = Utf8WindowsPath::new(r"C:\path\to\file.txt");
for component in path.components() {
    println!("* {:?}", component);
}

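Checking paths

When working with user-defined paths, an additional layer of defense is needed to prevent abuse, such as path traversal attacks and other risks.

To that end, you can use PathBuf::push_checked and Path::join_checked (and their equivalents) to ensure that the paths being created do not alter pre-existing paths in unexpected ways.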

use typed_path::{CheckedPathError, Path, PathBuf, UnixEncoding};

let path = Path::<UnixEncoding>::new("/etc");

// A valid path can be joined onto the existing one
assert_eq!(path.join_checked("passwd"), Ok(PathBuf::from("/etc/passwd")));

// An invalid path will result in an error
assert_eq!(
    path.join_checked("/sneaky/replacement"), 
    Err(CheckedPathError::UnexpectedRoot)
);

let mut path = PathBuf::<UnixEncoding>::from("/etc");

// Pushing a relative path that contains parent directory references that cannot be
// resolved within the path is considered an error as this is considered a path
// traversal attack!
assert_eq!(
    path.push_checked(".."), 
    Err(CheckedPathError::PathTraversalAttack)
);
assert_eq!(path, PathBuf::from("/etc"));

// Pushing an absolute path will fail with an error
assert_eq!(
    path.push_checked("/sneaky/replacement"), 
    Err(CheckedPathError::UnexpectedRoot)
);
assert_eq!(path, PathBuf::from("/etc"));

// Pushing a relative path that is safe will succeed
assert!(path.push_checked("abc/../def").is_ok());
assert_eq!(path, PathBuf::from("/etc/abc/../def"));
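
The kind of check described above can be sketched with the standard library alone. This is an illustration of the idea, not the crate's implementation: `CheckError` and `join_checked_unix` are hypothetical names, and the sketch handles Unix-style separators only. It rejects a child path that starts at the root, and it tracks how deep the child has descended so that a `..` which would climb above the base is reported as an error.

```rust
/// Hypothetical error type for this sketch (not the crate's CheckedPathError).
#[derive(Debug, PartialEq)]
enum CheckError {
    UnexpectedRoot,
    PathTraversal,
}

/// Appends `child` to `base` only if `child` is relative and never
/// climbs above `base`. Unix-style separators only.
fn join_checked_unix(base: &str, child: &str) -> Result<String, CheckError> {
    // An absolute child would replace the base entirely.
    if child.starts_with('/') {
        return Err(CheckError::UnexpectedRoot);
    }

    // Depth of the child relative to the base; it must never go negative.
    let mut depth: usize = 0;
    for part in child.split('/') {
        match part {
            "" | "." => {}
            ".." => {
                if depth == 0 {
                    return Err(CheckError::PathTraversal);
                }
                depth -= 1;
            }
            _ => depth += 1,
        }
    }

    if base.is_empty() {
        return Ok(child.to_string());
    }
    let mut joined = String::from(base);
    if !joined.ends_with('/') {
        joined.push('/');
    }
    // The child is appended verbatim; no normalization is performed.
    joined.push_str(child);
    Ok(joined)
}
```

With this sketch, joining `"abc/../def"` onto `"/etc"` succeeds and yields `/etc/abc/../def`, while `".."` and `"/sneaky/replacement"` are both rejected.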

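Converting between encodings

Sometimes you need to convert between encodings, for example when you want to load a native path and convert it into another format. In that case, you can use the with_encoding method (or specific variants such as with_unix_encoding and with_windows_encoding) to convert a Path or Utf8Path into its respective PathBuf or Utf8PathBuf with an explicit encoding: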

use typed_path::{Utf8Path, Utf8UnixEncoding, Utf8WindowsEncoding};

// Convert from Unix to Windows
let unix_path = Utf8Path::<Utf8UnixEncoding>::new("/tmp/foo.txt");
let windows_path = unix_path.with_encoding::<Utf8WindowsEncoding>();
assert_eq!(windows_path, Utf8Path::<Utf8WindowsEncoding>::new(r"\tmp\foo.txt"));

// Converting from Windows to Unix will drop any prefix
let windows_path = Utf8Path::<Utf8WindowsEncoding>::new(r"C:\tmp\foo.txt");
let unix_path = windows_path.with_encoding::<Utf8UnixEncoding>();
assert_eq!(unix_path, Utf8Path::<Utf8UnixEncoding>::new(r"/tmp/foo.txt"));

// Converting to itself should retain everything
let path = Utf8Path::<Utf8WindowsEncoding>::new(r"C:\tmp\foo.txt");
assert_eq!(
    path.with_encoding::<Utf8WindowsEncoding>(),
    Utf8Path::<Utf8WindowsEncoding>::new(r"C:\tmp\foo.txt"),
);
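
As a rough mental model of what the conversions above do to the text of a path, the following std-only sketch works directly on strings. It is deliberately naive and is not how the crate is implemented: `unix_to_windows` and `windows_to_unix` are hypothetical helpers, only a drive-letter prefix is recognized, and separators are swapped character by character rather than component by component.

```rust
/// Hypothetical helper: rewrites Unix separators as Windows separators.
fn unix_to_windows(path: &str) -> String {
    path.replace('/', "\\")
}

/// Hypothetical helper: drops a drive-letter prefix such as `C:` (if any)
/// and rewrites Windows separators as Unix separators.
fn windows_to_unix(path: &str) -> String {
    let bytes = path.as_bytes();
    let has_drive =
        bytes.len() >= 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':';
    // Slicing at 2 is safe: both leading bytes were just checked to be ASCII.
    let rest = if has_drive { &path[2..] } else { path };
    rest.replace('\\', "/")
}
```

Under these assumptions, `unix_to_windows("/tmp/foo.txt")` produces `\tmp\foo.txt`, and `windows_to_unix(r"C:\tmp\foo.txt")` produces `/tmp/foo.txt`, mirroring the prefix being dropped in the example above.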

As with pushing and joining paths using the checked variants, we can also ensure that paths created by changing encodings are still valid:

use typed_path::{CheckedPathError, Utf8Path, Utf8UnixEncoding, Utf8WindowsEncoding};

// Convert from Unix to Windows
let unix_path = Utf8Path::<Utf8UnixEncoding>::new("/tmp/foo.txt");
let windows_path = unix_path.with_encoding_checked::<Utf8WindowsEncoding>().unwrap();
assert_eq!(windows_path, Utf8Path::<Utf8WindowsEncoding>::new(r"\tmp\foo.txt"));

// Convert from Unix to Windows will fail if there are characters that are valid in Unix but not in Windows
let unix_path = Utf8Path::<Utf8UnixEncoding>::new("/tmp/|invalid|/foo.txt");
assert_eq!(
    unix_path.with_encoding_checked::<Utf8WindowsEncoding>(),
    Err(CheckedPathError::InvalidFilename),
);

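Typed Paths

In the examples above, we used paths whose encoding (Unix or Windows) was known at compile time. There may be situations where runtime support is needed to decide on and switch between encodings. For that, this crate provides the TypedPath and TypedPathBuf enums (and their Utf8TypedPath and Utf8TypedPathBuf variations):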

use typed_path::Utf8TypedPath;

// Derive the path by determining if it is Unix or Windows
let path = Utf8TypedPath::derive(r"C:\path\to\file.txt");
assert!(path.is_windows());

// Change the encoding to Unix
let path = path.with_unix_encoding();
assert_eq!(path, "/path/to/file.txt");

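Normalization

Alongside implementing the standard methods associated with Path and PathBuf from the standard library, this crate also implements several additional methods, including the ability to normalize a path by resolving . and .. without requiring the path to exist.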

use typed_path::Utf8UnixPath;

assert_eq!(
    Utf8UnixPath::new("foo/bar//baz/./asdf/quux/..").normalize(),
    Utf8UnixPath::new("foo/bar/baz/asdf"),
);
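
Normalizing without touching the filesystem is a purely lexical operation, which the following std-only sketch illustrates. It is not the crate's implementation: `normalize_unix` is a hypothetical helper that handles Unix-style separators only. How to treat a `..` that has nothing left to cancel is a design choice; this sketch drops it for absolute paths and keeps it for relative ones, which may differ from what the crate does.

```rust
/// Hypothetical helper: lexically normalizes a Unix-style path string
/// without checking whether the path exists.
fn normalize_unix(path: &str) -> String {
    let is_absolute = path.starts_with('/');
    let mut stack: Vec<&str> = Vec::new();

    for part in path.split('/') {
        match part {
            // Empty parts come from leading, trailing, or repeated separators.
            "" | "." => {}
            ".." => {
                // A preceding normal component is cancelled out by "..".
                let can_pop = matches!(stack.last(), Some(&last) if last != "..");
                if can_pop {
                    stack.pop();
                } else if !is_absolute {
                    // Sketch-specific choice: keep an unresolvable ".."
                    // in a relative path; at the root it is dropped.
                    stack.push("..");
                }
            }
            name => stack.push(name),
        }
    }

    let joined = stack.join("/");
    if is_absolute {
        format!("/{}", joined)
    } else {
        joined
    }
}
```

For the input used above, `normalize_unix("foo/bar//baz/./asdf/quux/..")` returns `foo/bar/baz/asdf`.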

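In addition, you can use absolutize to convert a path to its absolute form: if the path is relative, the current working directory is prepended, and the result is then normalized (requires the std feature):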

use typed_path::{utils, Utf8UnixPath};

// With an absolute path, it is just normalized
// NOTE: This requires `std` feature, otherwise `absolutize` is missing!
let path = Utf8UnixPath::new("/a/b/../c/./d");
assert_eq!(path.absolutize().unwrap(), Utf8UnixPath::new("/a/c/d"));

// With a relative path, it is first joined with the current working directory
// and then normalized
// NOTE: This requires `std` feature, otherwise `utf8_current_dir` and
//       `absolutize` are missing!
let cwd = utils::utf8_current_dir().unwrap().with_unix_encoding();
let path = cwd.join(Utf8UnixPath::new("a/b/../c/./d"));
assert_eq!(path.absolutize().unwrap(), cwd.join(Utf8UnixPath::new("a/c/d")));

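Utility functions

Helper functions are available in the utils module (requires the std feature).

Currently there are three functions, mirroring those found in std::env.

Each has one implementation that produces a NativePathBuf and one that produces a Utf8NativePathBuf.

Current directory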

// Retrieves the current directory as a NativePathBuf:
//
// * For Unix family, this would be PathBuf<UnixEncoding>
// * For Windows family, this would be PathBuf<WindowsEncoding>
//
// NOTE: This requires `std` feature, otherwise `current_dir` is missing!
let _cwd = typed_path::utils::current_dir().unwrap();

// Retrieves the current directory as a Utf8NativePathBuf:
//
// * For Unix family, this would be Utf8PathBuf<Utf8UnixEncoding>
// * For Windows family, this would be Utf8PathBuf<Utf8WindowsEncoding>
//
// NOTE: This requires `std` feature, otherwise `utf8_current_dir` is missing!
let _utf8_cwd = typed_path::utils::utf8_current_dir().unwrap();

Current exe

// Returns the full filesystem path of the current running executable as a NativePathBuf:
//
// * For Unix family, this would be PathBuf<UnixEncoding>
// * For Windows family, this would be PathBuf<WindowsEncoding>
//
// NOTE: This requires `std` feature, otherwise `current_exe` is missing!
let _exe = typed_path::utils::current_exe().unwrap();

// Returns the full filesystem path of the current running executable as a Utf8NativePathBuf:
//
// * For Unix family, this would be Utf8PathBuf<Utf8UnixEncoding>
// * For Windows family, this would be Utf8PathBuf<Utf8WindowsEncoding>
//
// NOTE: This requires `std` feature, otherwise `utf8_current_exe` is missing!
let _utf8_exe = typed_path::utils::utf8_current_exe().unwrap();

Temporary directory

// Returns the path of a temporary directory as a NativePathBuf:
//
// * For Unix family, this would be PathBuf<UnixEncoding>
// * For Windows family, this would be PathBuf<WindowsEncoding>
//
// NOTE: This requires `std` feature, otherwise `temp_dir` is missing!
let _temp_dir = typed_path::utils::temp_dir().unwrap();

// Returns the path of a temporary directory as a Utf8NativePathBuf:
//
// * For Unix family, this would be Utf8PathBuf<Utf8UnixEncoding>
// * For Windows family, this would be Utf8PathBuf<Utf8WindowsEncoding>
//
// NOTE: This requires `std` feature, otherwise `utf8_temp_dir` is missing!
let _utf8_temp_dir = typed_path::utils::utf8_temp_dir().unwrap();

License

This project is licensed under either of the following, at your option:

Apache License, Version 2.0 (LICENSE-APACHE or apache-license) or MIT license (LICENSE-MIT or mit-license).
