Version 0.5.0
Rust object store
lib.rs:
object_store
This crate provides a uniform API for interacting with object storage services and local files via the ObjectStore trait.
Create an ObjectStore implementation:
- Google Cloud Storage: GoogleCloudStorageBuilder
- Amazon S3: AmazonS3Builder
- Azure Blob Storage: MicrosoftAzureBuilder
- In memory: InMemory
- Local filesystem: LocalFileSystem
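Adapters
ObjectStore instances can be composed with various adapters that add functionality:
- Rate throttling: ThrottleConfig
- Concurrent request limit: LimitStore
Listing objects
Use the ObjectStore::list method to iterate over objects in remote storage or files in the local filesystem: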
use std::sync::Arc;
use object_store::{path::Path, ObjectStore};
use futures::stream::StreamExt;
// create an ObjectStore
let object_store: Arc<dyn ObjectStore> = Arc::new(get_object_store());
// Recursively list all files below the 'data' path.
// 1. On AWS S3 this would be the 'data/' prefix
// 2. On a local filesystem, this would be the 'data' directory
let prefix: Path = "data".try_into().unwrap();
// Get an `async` stream of Metadata objects:
let list_stream = object_store
    .list(Some(&prefix))
    .await
    .expect("Error listing files");
// Print a line about each object based on its metadata
// using for_each from `StreamExt` trait.
list_stream
    .for_each(move |meta| {
        async {
            let meta = meta.expect("Error listing");
            println!("Name: {}, size: {}", meta.location, meta.size);
        }
    })
    .await;
This would print something like:
Name: data/file01.parquet, size: 112832
Name: data/file02.parquet, size: 143119
Name: data/child/file03.parquet, size: 100
...
Fetching objects
Use the ObjectStore::get method to fetch the data bytes from remote storage or files in the local filesystem as a stream.
use std::sync::Arc;
use object_store::{path::Path, ObjectStore};
use futures::stream::StreamExt;
// create an ObjectStore
let object_store: Arc<dyn ObjectStore> = Arc::new(get_object_store());
// Retrieve a specific file
let path: Path = "data/file01.parquet".try_into().unwrap();
// fetch the bytes from object store
let stream = object_store
    .get(&path)
    .await
    .unwrap()
    .into_stream();
// Count the '0's using `map` from `StreamExt` trait
let num_zeros = stream
    .map(|bytes| {
        let bytes = bytes.unwrap();
        bytes.iter().filter(|b| **b == 0).count()
    })
    .collect::<Vec<usize>>()
    .await
    .into_iter()
    .sum::<usize>();
println!("Num zeros in {} is {}", path, num_zeros);
This would print something like:
Num zeros in data/file01.parquet is 657