2 unstable releases

0.1.0	Nov 29, 2023
0.0.0	Aug 8, 2023

#1103 in Text processing

MIT license

16KB
212 lines of code

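webreg (web regex)

A command-line tool for testing regular expressions against web pages.

Tests whether a list of websites matches a given regular expression.

Installation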

cargo install webreg

Usage

webreg [OPTIONS] <REGEX>

Arguments:
<REGEX> A regular expression to match against the site content

Options:
-u, --urls <URLS>       Comma separated list of urls
-i, --file <FILE>       A file containing a list of urls
-c, --case-insensitive  Case insensitive search
-f, --fix-urls          Fix urls that don't start with http:// or https://
-r, --retry             Retry failed urls
-s, --save              Saves the output to the results folder (./results/<regex>)
-h, --help              Print help

Examples

Basic usage

webreg -u "https://example.com" "Hello World"

This checks whether the string "Hello World" appears in the content of https://example.com. If it does, webreg prints it to standard output.

Multiple URLs

webreg -u "https://example.com,https://example.org" "Hello World"

Domains

webreg -f -u "example.com,example.org" "Hello World"

The -f flag fixes URLs that don't start with http:// or https://.
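The README does not say how a URL gets fixed. A minimal sketch of one plausible approach, assuming a missing scheme is filled in with https://, is shown here. The names `fix_url` and `fix_url_list` are hypothetical and not taken from webreg's source, and the choice of https:// over http:// is an assumption.

```rust
/// Hypothetical helper: prepend a scheme to a URL that lacks one.
/// Defaulting to `https://` is an assumption, not webreg's documented behavior.
fn fix_url(url: &str) -> String {
    let trimmed = url.trim();
    let lower = trimmed.to_ascii_lowercase();
    if lower.starts_with("http://") || lower.starts_with("https://") {
        // Already has a scheme: leave the URL as it is.
        trimmed.to_string()
    } else {
        format!("https://{}", trimmed)
    }
}

/// Hypothetical helper: split a comma-separated list (the shape `-u` accepts)
/// and fix each non-empty entry.
fn fix_url_list(list: &str) -> Vec<String> {
    list.split(',')
        .map(str::trim)
        .filter(|entry| !entry.is_empty())
        .map(fix_url)
        .collect()
}
```

Under these assumptions, `fix_url_list("example.com,example.org")` yields the two URLs with https:// prepended, while entries that already carry a scheme pass through unchanged.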

Case insensitive

webreg -c -u "https://example.com" "hello world"

The -c flag makes the search case-insensitive.

File input

webreg -i urls.txt "Hello World"

urls.txt:

https://example.com
https://example.org

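The -i flag reads URLs from a file. The file should contain one URL per line. Empty lines are ignored and surrounding whitespace is trimmed.

Piped input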
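The file rules above (one URL per line, empty lines ignored, whitespace trimmed) can be sketched as a small function over the file's text. This is an illustration of the described behavior, not webreg's actual code, and the name `parse_url_file` is hypothetical.

```rust
/// Hypothetical helper: turn the contents of a URL list file into URLs.
/// Each line is trimmed, and lines that are empty after trimming are skipped.
fn parse_url_file(contents: &str) -> Vec<String> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(String::from)
        .collect()
}
```

The text would typically come from `std::fs::read_to_string("urls.txt")`. Keeping the parsing separate from the file read makes the same function usable for input that arrives from a pipe.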

cat urls.txt | webreg -i "Hello World"

urls.txt:

https://example.com
https://example.org

Saving output

webreg -s -u "https://example.com" "Hello World"

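The -s flag saves the output to the results folder (./results/<regex>). It also outputs the URLs that could not be fetched and the URLs that did not match the regular expression.

Dependencies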

~31–46MB
~830K SLoC