7 releases

0.1.2	Mar 6, 2024
0.1.1	Feb 8, 2024
0.1.0	Feb 5, 2024
0.1.0-beta.3	Feb 1, 2024
0.1.0-beta.2	Jan 29, 2024

#196 in HTTP client

797 downloads per month
Used in bed-reader

MIT/Apache

45KB
387 lines

cloud-file

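Simple reading of cloud files in Rust

Highlights

HTTP, AWS S3, Azure, Google, or local
Sequential or random access
Simplifies use of the object_store crate, focusing on a useful subset of its features
Access files via URLs and string-based options
Read binary or text
Fully async
Used by the genomics crate BedReader, which is used by other Rust and Python projects
Also see Nine Rules for Accessing Cloud Files from Your Rust Code: Practical Lessons from Upgrading Bed-Reader, a Bioinformatics Library, published in Towards Data Science.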
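To make the "access files via URLs" idea concrete, here is a minimal standard-library sketch that guesses a storage backend from a URL's scheme. It is an illustration only, not how cloud-file or object_store resolves URLs: the `Backend` enum, the `backend_for_url` function, and the scheme list (`s3`, `az`, `gs`, `file`) are assumptions made for this example, and real URL handling covers more cases.

```rust
/// Storage backends named in the highlights list (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum Backend {
    Http,
    AwsS3,
    Azure,
    Google,
    Local,
}

/// Guesses a backend from the scheme of `url` (hypothetical helper).
/// Returns `None` when the string has no recognized scheme.
fn backend_for_url(url: &str) -> Option<Backend> {
    // Everything before "://" is treated as the scheme.
    let (scheme, _rest) = url.split_once("://")?;
    match scheme.to_ascii_lowercase().as_str() {
        "http" | "https" => Some(Backend::Http),
        "s3" => Some(Backend::AwsS3),
        "az" => Some(Backend::Azure),
        "gs" => Some(Backend::Google),
        "file" => Some(Backend::Local),
        _ => None,
    }
}
```

For instance, `backend_for_url("s3://bucket/key")` yields `Some(Backend::AwsS3)`, while a string without `://` yields `None`.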

Install

cargo add cloud-file

Examples

Find the size of a cloud file.

use cloud_file::CloudFile;
use tokio::runtime::Runtime;

fn main() {
    // Drive the async code to completion on a Tokio runtime.
    Runtime::new().unwrap().block_on(async {
        let url = "https://raw.githubusercontent.com/fastlmm/bed-sample-files/main/toydata.5chrom.fam";
        let cloud_file = CloudFile::new(url)?;
        let file_size = cloud_file.read_file_size().await?;
        assert_eq!(file_size, 14_361);
        Ok::<(), Box<dyn std::error::Error>>(())
    }).unwrap();
}

Find the number of lines in a cloud file.

use cloud_file::CloudFile;
use futures::StreamExt; // Enables `.next()` on streams.
use tokio::runtime::Runtime;

fn main() {
    // Drive the async code to completion on a Tokio runtime.
    Runtime::new().unwrap().block_on(async {
        let url = "https://raw.githubusercontent.com/fastlmm/bed-sample-files/main/toydata.5chrom.fam";
        // Options are passed as string key-value pairs; here, a 30-second timeout.
        let cloud_file = CloudFile::new_with_options(url, [("timeout", "30s")])?;
        // Stream the file as binary chunks and count the newlines in each one.
        let mut chunks = cloud_file.stream_chunks().await?;
        let mut newline_count: usize = 0;
        while let Some(chunk) = chunks.next().await {
            let chunk = chunk?;
            newline_count += bytecount::count(&chunk, b'\n');
        }
        assert_eq!(newline_count, 500);
        Ok::<(), Box<dyn std::error::Error>>(())
    }).unwrap();
}
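Counting newlines one chunk at a time gives the same total as counting over the whole file, because every byte lands in exactly one chunk regardless of where the chunk boundaries fall. The sketch below shows this with the standard library only, using in-memory slices in place of a network stream. `count_newlines` and `split_into_chunks` are hypothetical helper names for this illustration, not part of cloud-file.

```rust
/// Counts `b'\n'` bytes across a sequence of byte chunks
/// (hypothetical helper, standard library only).
fn count_newlines<'a, I>(chunks: I) -> usize
where
    I: IntoIterator<Item = &'a [u8]>,
{
    chunks
        .into_iter()
        .map(|chunk| chunk.iter().filter(|&&b| b == b'\n').count())
        .sum()
}

/// Splits `data` into chunks of at most `chunk_size` bytes, standing in
/// for the chunks a stream might deliver. `chunk_size` must be nonzero.
fn split_into_chunks(data: &[u8], chunk_size: usize) -> Vec<&[u8]> {
    data.chunks(chunk_size).collect()
}
```

Splitting the same bytes with different chunk sizes and passing the result to `count_newlines` should always produce the same count.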

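More examples

Example	Demonstrates
`line_count`	Read a file as binary chunks.
`nth_line`	Read a file as text lines.
`bigram_counts`	Read random regions of a file, without regard to order.
`aws_file_size`	Find the size of a file stored on AWS.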
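The idea behind reading random regions, as in the `bigram_counts` example, can be sketched with the standard library alone. The code below is not the crate's example: it reads byte ranges from any seekable source (a local file or an in-memory cursor) and tallies adjacent byte pairs within each region. The names `read_range` and `bigram_counts` and their signatures are assumptions made for this illustration.

```rust
use std::collections::HashMap;
use std::io::{self, Read, Seek, SeekFrom};
use std::ops::Range;

/// Reads the bytes in `range` from any seekable source (hypothetical helper).
fn read_range<R: Read + Seek>(source: &mut R, range: Range<u64>) -> io::Result<Vec<u8>> {
    // An empty or inverted range yields an empty buffer.
    let len = range.end.saturating_sub(range.start) as usize;
    let mut buf = vec![0u8; len];
    source.seek(SeekFrom::Start(range.start))?;
    source.read_exact(&mut buf)?;
    Ok(buf)
}

/// Counts adjacent byte pairs (bigrams) inside each requested region.
/// Regions may be given in any order; pairs spanning two regions are not counted.
fn bigram_counts<R: Read + Seek>(
    source: &mut R,
    regions: &[Range<u64>],
) -> io::Result<HashMap<(u8, u8), usize>> {
    let mut counts = HashMap::new();
    for region in regions {
        let bytes = read_range(source, region.clone())?;
        for pair in bytes.windows(2) {
            *counts.entry((pair[0], pair[1])).or_insert(0) += 1;
        }
    }
    Ok(counts)
}
```

Wrapping a byte vector in `std::io::Cursor` gives a quick way to try this without touching the file system. Because the regions are independent, the order in which they are read does not affect the totals.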

Project Links

Dependencies

~8–18MB
~253K SLoC
