15 versions
0.6.0-beta.1 | Jan 25, 2024 |
---|---|
0.5.2 | Dec 22, 2023 |
0.5.1 | Jun 16, 2022 |
0.4.0 | Mar 24, 2022 |
#282 in Text processing
43 downloads per month
Used in 2 crates
46KB
983 lines
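Cindex, a CSV indexer
Cindex is an easy-to-use CSV indexer that supports simple SQL-like queries.
Cindex is not meant for heavyweight database indexing; it targets simple in-memory queries. If you work with large volumes of CSV data, use another database interaction layer.
A binary is planned.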
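Features
Index CSV tables with SQL-like queries
Sometimes you just want to index a table as-is and get the raw values out of it. Simply pipe the CSV table into another program, optionally with a CSV header.
Human-friendly CSV reading
Writing CSV values by hand is not that easy, but sometimes you have to. cindex tolerates missing values, and can even tolerate missing columns through a specific FLAG syntax. Missing commas are not allowed.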
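The "missing values are fine, missing commas are not" rule can be sketched in plain Rust. This is only an illustration of the idea, not cindex's actual reader: `parse_row` is a hypothetical helper, and it deliberately ignores quoting and escaping.

```rust
/// Split one CSV line into exactly `columns` fields.
/// Empty fields such as "a,,c" are accepted as missing values;
/// a line with too few or too many commas is rejected.
/// Hypothetical helper: quoted fields are not handled.
fn parse_row(line: &str, columns: usize) -> Result<Vec<String>, String> {
    let fields: Vec<String> = line.split(',').map(str::to_string).collect();
    if fields.len() != columns {
        return Err(format!(
            "expected {} fields but found {} in {:?}",
            columns,
            fields.len(),
            line
        ));
    }
    Ok(fields)
}
```

With three expected columns, `parse_row("1,,3", 3)` succeeds with an empty middle field, while `parse_row("1,3", 3)` returns an error because a comma is missing.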
Usage
[dependencies]
# Use the latest version if possible.
# Enable the rayon feature if you want parallel iteration of rows.
cindex = { version = "*", features = ["rayon"] }
use std::fs::File;
use std::io::BufReader;
use cindex::{Indexer, CsvType, Predicate, Query, OutOption, Operator};
let mut indexer = Indexer::new();
// Add table from file
indexer
.add_table(
"table1",
BufReader::new(File::open("test.csv").expect("Failed to open a file")),
)
.expect("Failed to add table");
// Add table from stdin
let stdin = std::io::stdin();
indexer
.add_table("table2", stdin.lock())
.expect("Failed to add table");
// Indexing
// Create query object and print queried output to terminal
use std::str::FromStr;
let query = Query::from_str("SELECT a,b,c FROM table1 WHERE a = 10")
.expect("Failed to create a query from str");
indexer
.index(query, OutOption::Term)
.expect("Failed to index a table");
// Use raw query and yield output to a file
indexer
.index_raw(
"SELECT * FROM table2 WHERE id = 10",
OutOption::File(std::fs::File::create("out.csv").expect("Failed to create a file")),
)
.expect("Failed to index a table");
// Use builder pattern to construct query and index a table
let query = Query::build()
.table("table1")
.columns(vec!["id", "address"])
.predicate(Predicate::new("id", Operator::Equal).args(vec!["10"]))
.predicate(
Predicate::build()
.column("address")
.operator(Operator::NotEqual)
.raw_args("111-2222"),
);
let mut acc = String::new();
indexer
.index(query, OutOption::Value(&mut acc))
.expect("Failed to index a table");
// Always use unix newline for formatting
indexer.always_use_unix_newline(true);
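Query syntax
Cindex's query syntax resembles SQL, with a few subtle differences.
In a WHERE clause, the comparison operator must come after the column name.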
/* Select everything from the given table */
SELECT * FROM table1
/* Select everything from given table and order by column with descending
order*/
SELECT * FROM table1 ORDER BY col1 DESC
/* Same as the previous command, but map the headers to a different array */
SELECT * FROM table1 ORDER BY col1 DESC HMAP 'new h','new h2','new h3'
/* You can use OFFSET and LIMIT to control how many lines are printed */
/* Keep in mind that this does not return early from indexing; it works as
final_records[offset..offset+limit] */
/* e.g. the next line gets records[1..3] */
SELECT * FROM table1 OFFSET 1 LIMIT 2
/* Select the given columns from a table where one column's value equals the
given condition and another column's value matches a regular expression */
SELECT col1,col2 FROM table1 WHERE col1 = 10 AND col2 LIKE ^start
/* There is a flag syntax which changes query behaviour */
SELECT * FROM table_name FLAG PHD SUP
/* Each flag performs the following operation
- PHD (PRINT-HEADER) : Print a header in the result output
- SUP (SUPPLEMENT) : Allow selecting non-existent columns, filled with empty values
- TP (TRANSPOSE) : Transpose the CSV values, as in linear algebra
*/
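Two of the behaviours described in the comments above, the OFFSET/LIMIT window and the TP flag, can be sketched in plain Rust. This is a minimal illustration under stated assumptions, not cindex's internals: `offset_limit` and `transpose` are hypothetical names, the window is clamped to the available records (the source only states `final_records[offset..offset+limit]`), and `transpose` assumes a rectangular table.

```rust
/// Return `records[offset..offset + limit]`, clamped to the slice bounds
/// so that an out-of-range OFFSET yields an empty result instead of a panic.
fn offset_limit<T>(records: &[T], offset: usize, limit: usize) -> &[T] {
    let start = offset.min(records.len());
    let end = start.saturating_add(limit).min(records.len());
    &records[start..end]
}

/// Transpose a table so that rows become columns.
/// Assumes every row is as long as the first one; a shorter row would panic.
fn transpose(rows: &[Vec<String>]) -> Vec<Vec<String>> {
    let width = rows.first().map_or(0, |row| row.len());
    (0..width)
        .map(|col| rows.iter().map(|row| row[col].clone()).collect())
        .collect()
}
```

Note that the window is applied to the already computed records, which mirrors the remark above that OFFSET and LIMIT do not return early from indexing.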
Supported WHERE operations are
>=
>
<=
<
=
!=
IN
BETWEEN
LIKE ( with regular expression )
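How such operators might be evaluated against a single cell can be sketched as follows. This is not cindex's implementation: the `Op` enum, `compare`, and `matches` are hypothetical, the numeric-then-string comparison and the inclusive BETWEEN bounds are assumptions borrowed from common SQL behaviour, and LIKE is left out because it would need a regular expression crate.

```rust
use std::cmp::Ordering;

/// Hypothetical operator set mirroring the list above (LIKE omitted).
enum Op<'a> {
    Ge(&'a str),
    Gt(&'a str),
    Le(&'a str),
    Lt(&'a str),
    Eq(&'a str),
    Ne(&'a str),
    In(&'a [&'a str]),
    Between(&'a str, &'a str),
}

/// Compare numerically when both sides parse as numbers and are comparable,
/// otherwise fall back to plain string ordering (an assumption).
fn compare(a: &str, b: &str) -> Ordering {
    if let (Ok(x), Ok(y)) = (a.parse::<f64>(), b.parse::<f64>()) {
        if let Some(ordering) = x.partial_cmp(&y) {
            return ordering;
        }
    }
    a.cmp(b)
}

/// Check whether one cell value satisfies one operator.
fn matches(value: &str, op: &Op<'_>) -> bool {
    match op {
        Op::Ge(rhs) => compare(value, rhs) != Ordering::Less,
        Op::Gt(rhs) => compare(value, rhs) == Ordering::Greater,
        Op::Le(rhs) => compare(value, rhs) != Ordering::Greater,
        Op::Lt(rhs) => compare(value, rhs) == Ordering::Less,
        Op::Eq(rhs) => compare(value, rhs) == Ordering::Equal,
        Op::Ne(rhs) => compare(value, rhs) != Ordering::Equal,
        Op::In(set) => set.iter().any(|s| compare(value, s) == Ordering::Equal),
        // Both bounds are treated as inclusive here.
        Op::Between(lo, hi) => {
            compare(value, lo) != Ordering::Less && compare(value, hi) != Ordering::Greater
        }
    }
}
```

Comparing numerically first means `9 < 10` holds even though the string "9" sorts after "10"; whether cindex makes the same choice is not stated in the source.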
TODO
- Support multiple WHERE clauses
- Join tables
Dependencies
~6MB
~62K SLoC