#storage-api #storage #big-query #google

bigquery-storage

A small wrapper around the Google BigQuery Storage API

3 releases

0.1.2 Feb 4, 2021
0.1.1 May 2, 2020
0.1.0 May 2, 2020

#3 in #bigquery


115 downloads per month

Apache-2.0

4MB
54K SLoC

Bazel 53K SLoC // 0.2% comments
Rust 386 SLoC // 0.0% comments
Go 151 SLoC // 0.2% comments
Shell 105 SLoC // 0.3% comments
Forge Config 2 SLoC // 0.5% comments

bigquery-storage

A small wrapper around the Google BigQuery Storage API.

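The BigQuery Storage API reads BigQuery tables by serializing their contents into efficient, concurrent streams. The official API supports both the Arrow and Avro binary serialization formats, but this crate currently supports only Arrow RecordBatch output.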

See the documentation for more information.

Example

use bigquery_storage::{Table, Client};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // 1. Load the desired secret (here, a service account key)
    let sa_key = yup_oauth2::read_service_account_key("clientsecret.json")
        .await?;

    // 2. Create an Authenticator
    let auth = yup_oauth2::ServiceAccountAuthenticator::builder(sa_key)
        .build()
        .await?;

    // 3. Create a Client
    let mut client = Client::new(auth).await?;

    // Read the contents of the table `bigquery-public-data:london_bicycles.cycle_stations`
    let test_table = Table::new(
        "bigquery-public-data",
        "london_bicycles",
        "cycle_stations"
    );

    // Create a new ReadSession; the `parent_project_id` is the ID of the GCP project
    // that owns the read job. This does not download any data.
    let mut read_session = client
        .read_session_builder(test_table)
        .parent_project_id("openquery-dev".to_string())
        .build()
        .await?;

    // Take the first stream in the queue for this ReadSession.
    let stream_reader = read_session
        .next_stream()
        .await?
        .expect("did not get any stream");

    // The stream is consumed to yield an Arrow StreamReader, which does download the
    // data.
    let mut arrow_stream_reader = stream_reader
        .into_arrow_reader()
        .await?;

    let arrow_record_batch = arrow_stream_reader
        .next()
        .expect("no record batch")?;

    Ok(())
}
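
The example above reads only the first record batch of the first stream. A natural next step is to drain every stream of a read session. The following is a minimal sketch of that loop, assuming the session keeps handing out streams until its queue is empty, as the `Option` returned by `next_stream()` in the example suggests. To stay self-contained and runnable without credentials, it uses `MockSession` and `MockStream`, which are hypothetical stand-ins and not types from this crate, and it models each batch as a row count rather than an Arrow RecordBatch.

```rust
// Hypothetical stand-in for a single read stream. Each item models one
// record batch as a plain row count (not an Arrow RecordBatch).
struct MockStream {
    batches: std::vec::IntoIter<usize>,
}

impl Iterator for MockStream {
    type Item = Result<usize, String>;

    fn next(&mut self) -> Option<Self::Item> {
        self.batches.next().map(Ok)
    }
}

// Hypothetical stand-in for a read session holding a queue of streams.
struct MockSession {
    streams: std::collections::VecDeque<Vec<usize>>,
}

impl MockSession {
    // Mirrors the shape of `next_stream()` in the example above:
    // `None` once the queue is exhausted.
    fn next_stream(&mut self) -> Option<MockStream> {
        self.streams
            .pop_front()
            .map(|batches| MockStream { batches: batches.into_iter() })
    }
}

// Drain every stream in the session and sum the rows of all batches,
// stopping at the first batch that fails to decode.
fn total_rows(session: &mut MockSession) -> Result<usize, String> {
    let mut total = 0;
    while let Some(stream) = session.next_stream() {
        for batch in stream {
            total += batch?;
        }
    }
    Ok(total)
}

fn main() {
    let mut session = MockSession {
        streams: vec![vec![100, 50], vec![], vec![25]].into(),
    };
    println!("{:?}", total_rows(&mut session)); // prints Ok(175)
}
```

With the real crate, the same two nested loops would wrap the `next_stream()`, `into_arrow_reader()` and `next()` calls from the example, with `.await?` where the example uses it. Because the streams are independent, they could also be consumed concurrently instead of one after another.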

License

This project is licensed under the Apache-2.0 license.

Dependencies

~23–40MB
~693K SLoC