6 个版本
0.18.1 | 2023 年 11 月 13 日 |
---|---|
0.17.2 | 2023 年 9 月 20 日 |
0.17.0-dev1 | 2023 年 8 月 31 日 |
0.16.5 | 2023 年 8 月 29 日 |
#16 in #distributed-computing
每月下载量 47 次
265KB
6K SLoC
Vineyard Rust SDK
[!NOTE] 需要Rust nightly版本。Vineyard Rust SDK还在开发中。API可能在将来发生变化。
连接到 Vineyard
-
从环境变量
VINEYARD_IPC_SOCKET
解析 UNIX-domain 套接字use vineyard::client::*; let mut client = vineyard::default().unwrap();
-
或者,使用显式参数
use vineyard::client::*; let mut client = vineyard::connect("/var/run/vineyard.sock").unwrap();
与 Vineyard 交互
-
创建 blob
let mut blob_writer = client.create_blob(N)?;
-
获取对象
let mut meta_writer = client.get::<DataFrame>(object_id)?;
与 Python 的互操作:numpy.ndarray
-
Python
import numpy as np import vineyard client = vineyard.connect() np_array = np.random.rand(10, 20).astype(np.int32) object_id = int(client.put(np_array))
-
Rust
let mut client = IPCClient::default()?; let tensor = client.get::<Int32Tensor>(object_id)?; assert_that!(tensor.shape().to_vec()).is_equal_to(vec![10, 20]);
与 Python 的互操作:pandas.DataFrame
-
Python
import pandas as pd import vineyard client = vineyard.connect() df = pd.DataFrame({'a': ["1", "2", "3", "4"], 'b': ["5", "6", "7", "8"]}) object_id = int(client.put(df))
-
Rust
let mut client = IPCClient::default()?; let dataframe = client.get::<DataFrame>(object_id)?; assert_that!(dataframe.num_columns()).is_equal_to(2); assert_that!(dataframe.names().to_vec()).is_equal_to(vec!["a".into(), "b".into()]); for index in 0..dataframe.num_columns() { let column = dataframe.column(index); assert_that!(column.len()).is_equal_to(4); }
与 Python 的互操作:pyarrow.RecordBatch
-
Python
import pandas as pd import pyarrow as pa import vineyard client = vineyard.connect() arrays = [ pa.array([1, 2, 3, 4]), pa.array(["foo", "bar", "baz", "qux"]), pa.array([3.0, 5.0, 7.0, 9.0]), ] batch = pa.RecordBatch.from_arrays(arrays, ["f0", "f1", "f2"]) object_id = int(client.put(batch))
-
Rust
let batch = client.get::<RecordBatch>(object_id)?; assert_that!(batch.num_columns()).is_equal_to(3); assert_that!(batch.num_rows()).is_equal_to(4); let schema = batch.schema(); let names = ["f0", "f1", "f2"]; let recordbatch = batch.as_ref().as_ref();
与 Python 的互操作:pyarrow.Table
-
Python
batches = [batch] * 5 table = pa.Table.from_batches(batches) object_id = int(client.put(table))
-
Rust
let mut client = IPCClient::default()?; let table = client.get::<Table>(object_id)?; assert_that!(table.num_batches()).is_equal_to(5); for batch in table.batches().iter() { // ... }
与 Python 的互操作:polars.DataFrame
-
Python
import polars dataframe = polars.DataFrame(table) object_id = int(client.put(dataframe))
-
Rust
use vineyard_polars::ds::dataframe::DataFrame; let mut client = IPCClient::default()?; let dataframe = client.get::<DataFrame>(object_id)?; let dataframe = dataframe.as_ref().as_ref(); assert_that!(dataframe.width()).is_equal_to(3); for column in dataframe.get_columns() { // ... }
与 Python 的互操作:polars.DataFrame
-
Python
batches = [batch] * 5 table = pa.Table.from_batches(batches) object_id = int(client.put(table))
-
Rust
use vineyard_datafusion::ds::dataframe::DataFrame; let mut client = IPCClient::default()?; let dataframe = client.get::<DataFrame>(object_id)?; let ctx = SessionContext::new(); let table = ctx.read_table(dataframe.table_provider()).unwrap(); assert_that!(block_on(table.count()).unwrap()).is_equal_to(1000);
依赖
~21–28MB
~464K SLoC