5 releases
Version | Release date |
---|---|
0.1.4 | July 14, 2023 |
0.1.3 | October 3, 2022 |
0.1.2 | September 24, 2022 |
0.1.1 | September 20, 2022 |
0.1.0 | September 20, 2022 |
#11 in #elasticsearch
140KB
3K SLoC
search-query-parser
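What this library does
search-query-parser is designed to parse complex search queries into layered search conditions, which makes it easy to build Elasticsearch query DSL or queries for other backends.
For example, the following complex search query: ↓↓↓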
(word1 and -word2) or (("phrase word 1" or -"phrase word 2") and -(" a long phrase word " or word3))
will be parsed into the following layered search conditions: ↓↓↓
Condition::Operator(
    Operator::Or,
    vec![
        Condition::Operator(
            Operator::And,
            vec![
                Condition::Keyword("word1".into()),
                Condition::Not(Box::new(Condition::Keyword("word2".into()))),
            ]
        ),
        Condition::Operator(
            Operator::And,
            vec![
                Condition::Operator(
                    Operator::Or,
                    vec![
                        Condition::PhraseKeyword("phrase word 1".into()),
                        Condition::Not(Box::new(Condition::PhraseKeyword(
                            "phrase word 2".into()
                        )))
                    ]
                ),
                Condition::Not(Box::new(Condition::Operator(
                    Operator::Or,
                    vec![
                        Condition::PhraseKeyword(" a long phrase word ".into()),
                        Condition::Keyword("word3".into())
                    ]
                )))
            ]
        ),
    ]
)
Conditions are built from enum Condition and enum Operator.
#[derive(Debug, Clone, Eq, PartialEq)]
pub enum Condition {
    None,
    Keyword(String),
    PhraseKeyword(String),
    Not(Box<Condition>),
    Operator(Operator, Vec<Condition>),
}

#[derive(Debug, Clone, Eq, PartialEq)]
pub enum Operator {
    And,
    Or,
}
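Since the stated goal is to make building Elasticsearch query DSL easy, here is a minimal sketch of how a Condition tree could be turned into a bool query. It is not part of the crate: the function names to_es_query and escape_json, the field parameter, and the mapping itself (Keyword to match, PhraseKeyword to match_phrase, Not to must_not, And to must, Or to should) are assumptions made for illustration. The enums are copied from above so the block compiles on its own, and the JSON is assembled by hand to avoid extra dependencies.

```rust
#[derive(Debug, Clone, Eq, PartialEq)]
pub enum Condition {
    None,
    Keyword(String),
    PhraseKeyword(String),
    Not(Box<Condition>),
    Operator(Operator, Vec<Condition>),
}

#[derive(Debug, Clone, Eq, PartialEq)]
pub enum Operator {
    And,
    Or,
}

/// Escapes backslashes and double quotes only; a real implementation
/// would also need to handle control characters.
fn escape_json(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Hypothetical helper: renders a condition as an Elasticsearch query
/// clause against a single field (`field` is an assumed index field name).
pub fn to_es_query(cond: &Condition, field: &str) -> String {
    match cond {
        // No condition at all: match every document.
        Condition::None => r#"{"match_all":{}}"#.to_string(),
        Condition::Keyword(k) => {
            format!(r#"{{"match":{{"{}":"{}"}}}}"#, field, escape_json(k))
        }
        Condition::PhraseKeyword(p) => {
            format!(r#"{{"match_phrase":{{"{}":"{}"}}}}"#, field, escape_json(p))
        }
        Condition::Not(inner) => {
            format!(r#"{{"bool":{{"must_not":[{}]}}}}"#, to_es_query(inner, field))
        }
        Condition::Operator(op, children) => {
            let clauses: Vec<String> =
                children.iter().map(|c| to_es_query(c, field)).collect();
            // AND maps to `must`; OR maps to `should`.
            let key = match op {
                Operator::And => "must",
                Operator::Or => "should",
            };
            format!(r#"{{"bool":{{"{}":[{}]}}}}"#, key, clauses.join(","))
        }
    }
}
```

Calling to_es_query(&condition, "content") on a parsed condition returns a JSON string that can be placed under the "query" key of a search request. A bool query that contains only should clauses requires at least one of them to match by default, which is why should can stand in for OR here.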
Usage
1. In a Rust project
[dependencies]
search-query-parser = "0.1.4"
use search_query_parser::parse_query_to_condition;
let condition = parse_query_to_condition("any query string you like")?;
2. As a REST API
3. For JVM languages via JNI
See the search-query-parser-cdylib repository.
Parsing rules
1. A space {\u0020} or a full-width space {\u3000} is recognized as the AND operator
fn test_keywords_concat_with_spaces() {
    let actual = parse_query_to_condition("word1 word2").unwrap();
    assert_eq!(
        actual,
        Condition::Operator(
            Operator::And,
            vec![
                Condition::Keyword("word1".into()),
                Condition::Keyword("word2".into())
            ]
        )
    )
}
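To illustrate rule 1 in isolation, the sketch below splits a query on the two separator characters named above. The function split_on_spaces is hypothetical and is not how the crate tokenizes input; it ignores quotes, brackets and operators, and only shows that both the ASCII space and the full-width space act as separators between AND-ed keywords.

```rust
/// Hypothetical helper: splits a query on U+0020 (space) and
/// U+3000 (full-width space), dropping the empty pieces produced
/// by consecutive separators.
fn split_on_spaces(query: &str) -> Vec<&str> {
    query
        .split(|c: char| c == '\u{0020}' || c == '\u{3000}')
        .filter(|piece| !piece.is_empty())
        .collect()
}
```

Each returned piece would become one Condition::Keyword under a single Operator::And in the parsed result.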
2. The AND operator takes precedence over the OR operator
fn test_keywords_concat_with_and_or() {
    let actual =
        parse_query_to_condition("word1 OR word2 AND word3").unwrap();
    assert_eq!(
        actual,
        Condition::Operator(
            Operator::Or,
            vec![
                Condition::Keyword("word1".into()),
                Condition::Operator(
                    Operator::And,
                    vec![
                        Condition::Keyword("word2".into()),
                        Condition::Keyword("word3".into()),
                    ]
                )
            ]
        )
    )
}
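The precedence rule can be reproduced for bracket-free queries with a two-level split: first cut the token list at every OR, then treat whatever remains in each piece as AND-ed keywords. The sketch below does exactly that. parse_flat and group are hypothetical names, the enums are trimmed-down copies of the ones above, and the code assumes well-formed input without brackets, quotes or negation; it is not the crate's parser.

```rust
#[derive(Debug, Clone, Eq, PartialEq)]
enum Condition {
    Keyword(String),
    Operator(Operator, Vec<Condition>),
}

#[derive(Debug, Clone, Eq, PartialEq)]
enum Operator {
    And,
    Or,
}

/// Wraps `items` in an operator node unless there is exactly one item.
fn group(op: Operator, mut items: Vec<Condition>) -> Condition {
    if items.len() == 1 {
        items.remove(0)
    } else {
        Condition::Operator(op, items)
    }
}

/// Hypothetical parser for bracket-free queries: OR binds loosest,
/// so the tokens are split on OR first and each piece becomes an AND group.
fn parse_flat(query: &str) -> Condition {
    let tokens: Vec<&str> = query.split_whitespace().collect();
    let or_groups: Vec<Condition> = tokens
        .split(|t| *t == "OR")
        .map(|and_tokens| {
            // Explicit AND tokens carry no extra information here,
            // because adjacency already means AND.
            let keywords: Vec<Condition> = and_tokens
                .iter()
                .filter(|t| **t != "AND")
                .map(|t| Condition::Keyword(t.to_string()))
                .collect();
            group(Operator::And, keywords)
        })
        .collect();
    group(Operator::Or, or_groups)
}
```

Splitting on the lowest-precedence operator first is what places OR at the root of the tree and pushes AND groups one level down, matching the shape asserted in the test above.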
3. Conditions inside brackets take higher precedence
fn test_brackets() {
    let actual =
        parse_query_to_condition("word1 AND (word2 OR word3)")
            .unwrap();
    assert_eq!(
        actual,
        Condition::Operator(
            Operator::And,
            vec![
                Condition::Keyword("word1".into()),
                Condition::Operator(
                    Operator::Or,
                    vec![
                        Condition::Keyword("word2".into()),
                        Condition::Keyword("word3".into()),
                    ]
                )
            ]
        )
    )
}
4. Double quotes are used to parse phrase keywords
fn test_double_quote() {
    let actual = parse_query_to_condition(
        "\"word1 AND (word2 OR word3)\" word4",
    )
    .unwrap();
    assert_eq!(
        actual,
        Condition::Operator(
            Operator::And,
            vec![
                Condition::PhraseKeyword(
                    "word1 AND (word2 OR word3)".into()
                ),
                Condition::Keyword("word4".into()),
            ]
        )
    )
}
5. A minus sign (hyphen) is used to parse negative conditions
※ It can be placed before a keyword, a phrase keyword, or a bracket
fn test_minus() {
    let actual = parse_query_to_condition(
        "-word1 -\"word2\" -(word3 OR word4)",
    )
    .unwrap();
    assert_eq!(
        actual,
        Condition::Operator(
            Operator::And,
            vec![
                Condition::Not(Box::new(Condition::Keyword("word1".into()))),
                Condition::Not(Box::new(Condition::PhraseKeyword("word2".into()))),
                Condition::Not(Box::new(Condition::Operator(
                    Operator::Or,
                    vec![
                        Condition::Keyword("word3".into()),
                        Condition::Keyword("word4".into())
                    ]
                ))),
            ]
        )
    )
}
6. Correcting malformed search queries
- Empty brackets
fn test_empty_brackets() {
    let actual = parse_query_to_condition("A AND () AND B").unwrap();
    assert_eq!(
        actual,
        Condition::Operator(
            Operator::And,
            vec![
                Condition::Keyword("A".into()),
                Condition::Keyword("B".into()),
            ]
        )
    )
}
- Reversed brackets
fn test_reverse_brackets() {
    let actual = parse_query_to_condition("A OR B) AND (C OR D").unwrap();
    assert_eq!(
        actual,
        Condition::Operator(
            Operator::Or,
            vec![
                Condition::Keyword("A".into()),
                Condition::Operator(
                    Operator::And,
                    vec![
                        Condition::Keyword("B".into()),
                        Condition::Keyword("C".into()),
                    ]
                ),
                Condition::Keyword("D".into()),
            ]
        )
    )
}
- Mismatched number of brackets
fn test_missing_brackets() {
    let actual = parse_query_to_condition("(A OR B) AND (C").unwrap();
    assert_eq!(
        actual,
        Condition::Operator(
            Operator::And,
            vec![
                Condition::Operator(
                    Operator::Or,
                    vec![
                        Condition::Keyword("A".into()),
                        Condition::Keyword("B".into()),
                    ]
                ),
                Condition::Keyword("C".into()),
            ]
        )
    )
}
- Empty phrase keywords
fn test_empty_phrase_keywords() {
    let actual = parse_query_to_condition("A AND \"\" AND B").unwrap();
    assert_eq!(
        actual,
        Condition::Operator(
            Operator::And,
            vec![
                Condition::Keyword("A".into()),
                Condition::Keyword("B".into()),
            ]
        )
    )
}
- Mismatched number of quotes
fn test_invalid_double_quote() {
    let actual = parse_query_to_condition("\"A\" OR \"B OR C").unwrap();
    assert_eq!(
        actual,
        Condition::Operator(
            Operator::Or,
            vec![
                Condition::PhraseKeyword("A".into()),
                Condition::Keyword("B".into()),
                Condition::Keyword("C".into()),
            ]
        )
    )
}
- AND and OR next to each other
fn test_invalid_and_or() {
    let actual = parse_query_to_condition("A AND OR B").unwrap();
    assert_eq!(
        actual,
        Condition::Operator(
            Operator::Or,
            vec![
                Condition::Keyword("A".into()),
                Condition::Keyword("B".into()),
            ]
        )
    )
}
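The results for reversed and mismatched brackets above are consistent with simply discarding every bracket that has no partner before parsing. The sketch below shows one way to do that with a stack of open-bracket positions. strip_unmatched_brackets is a hypothetical helper, not the crate's implementation, and it does not treat brackets inside double-quoted phrases specially.

```rust
/// Hypothetical pre-processing step: removes every ')' that has no
/// earlier '(' to close, and every '(' that is never closed.
fn strip_unmatched_brackets(query: &str) -> String {
    let chars: Vec<char> = query.chars().collect();
    let mut keep = vec![true; chars.len()];
    // Positions of '(' that are still waiting for a ')'.
    let mut open_positions: Vec<usize> = Vec::new();

    for (i, c) in chars.iter().enumerate() {
        match *c {
            '(' => open_positions.push(i),
            ')' => {
                if open_positions.pop().is_none() {
                    keep[i] = false; // closing bracket without an opener
                }
            }
            _ => {}
        }
    }
    // Anything left on the stack was opened but never closed.
    for i in open_positions {
        keep[i] = false;
    }

    chars
        .iter()
        .zip(keep)
        .filter(|(_, k)| *k)
        .map(|(c, _)| *c)
        .collect()
}
```

For "A OR B) AND (C OR D" this yields "A OR B AND C OR D", which under rule 2 groups as A OR (B AND C) OR D, the same shape as the reversed-brackets test. For "(A OR B) AND (C" it yields "(A OR B) AND C".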
7. Search query optimization
fn test_unnecessary_nest_brackets() {
    let actual = parse_query_to_condition("(A OR (B OR C)) AND D").unwrap();
    assert_eq!(
        actual,
        Condition::Operator(
            Operator::And,
            vec![
                Condition::Operator(
                    Operator::Or,
                    vec![
                        Condition::Keyword("A".into()),
                        Condition::Keyword("B".into()),
                        Condition::Keyword("C".into()),
                    ]
                ),
                Condition::Keyword("D".into()),
            ]
        )
    )
}
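The optimization in this test amounts to merging a nested group into its parent when both use the same operator. A minimal sketch of such a pass over the Condition tree follows. The flatten function is hypothetical and only illustrates the idea; the crate may implement the optimization differently or during parsing. The enums are copied from above so the block compiles on its own.

```rust
#[derive(Debug, Clone, Eq, PartialEq)]
pub enum Condition {
    None,
    Keyword(String),
    PhraseKeyword(String),
    Not(Box<Condition>),
    Operator(Operator, Vec<Condition>),
}

#[derive(Debug, Clone, Eq, PartialEq)]
pub enum Operator {
    And,
    Or,
}

/// Hypothetical optimization pass: merges children that use the same
/// operator as their parent, drops `Condition::None`, and unwraps
/// groups that end up with a single child.
pub fn flatten(cond: Condition) -> Condition {
    match cond {
        Condition::Operator(op, children) => {
            let mut merged = Vec::new();
            for child in children {
                match flatten(child) {
                    // Same operator as the parent: lift the grandchildren.
                    Condition::Operator(child_op, grandchildren) if child_op == op => {
                        merged.extend(grandchildren)
                    }
                    Condition::None => {}
                    other => merged.push(other),
                }
            }
            match merged.len() {
                0 => Condition::None,
                1 => merged.remove(0),
                _ => Condition::Operator(op, merged),
            }
        }
        Condition::Not(inner) => Condition::Not(Box::new(flatten(*inner))),
        other => other,
    }
}
```

Merging is safe because AND and OR are associative, so (A OR (B OR C)) and (A OR B OR C) select the same documents. Groups with different operators are left nested, which keeps the precedence of the original query intact.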
Dependencies
~2.7–4.5MB
~78K SLoC