1 unstable release

0.4.0  May 17, 2024

#669 in Parser implementations

237 downloads per month
Used in 4 crates (via topiary-core)

Apache-2.0 WITH LLVM-exception

145KB
2.5K SLoC

Topiary

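Topiary aims to be a uniform formatter for simple languages, as part of the Tree-sitter ecosystem. It is named after the art of clipping or trimming trees into fantastic shapes.

Topiary is designed for formatter authors and formatter users. Authors can create a formatter for a language without having to write their own formatting engine or even their own parser. Users benefit from uniform code style and, potentially, the convenience of using a single formatter tool across the multiple languages in their codebases, each with comparable styles applied.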

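Motivation

The style in which code is written has, historically, been mostly left to personal choice. Of course, this is subjective by definition and has led to many wasted hours reviewing formatting choices, rather than the code itself. Prescribed style guides were an early solution to this, spawning tools that lint a developer's formatting and ultimately leading to automatic formatters. The latter were popularised by gofmt, whose developers had the insight that "good enough" uniform formatting, imposed on a codebase, largely resolves these problems.

Topiary follows this trend by aspiring to be a "universal formatter engine", which allows developers not only to format their codebases automatically with a uniform style, but also to define that style for new languages using a simple DSL. This allows for the fast development of formatters, provided a Tree-sitter grammar is defined for that language.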

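Design Principles

Topiary has been created with the following goals in mind:

- Use Tree-sitter for parsing, to avoid writing yet another grammar for a formatter.
- Expect idempotency. That is, formatting already-formatted code doesn't change anything.
- For bundled formatting styles to meet the following constraints:
  - Be compatible with attested formatting styles used for that language in the wild.
  - Be faithful to the author's intent: if code has been written such that it spans multiple lines, that decision is preserved.
  - Minimise changes between commits so that diffs focus mainly on the code that has changed, rather than superficial artefacts. That is, a change on one line won't influence others, and the formatting won't force you to make later, cosmetic changes when you modify your code.
  - Be well-tested and robust, so that the formatter can be trusted in large projects.
- For end users (i.e., not formatting style authors), the formatter should:
  - Prescribe a formatting style that, while customisable, is uniform and "good enough" for their codebase.
  - Run efficiently.
  - Afford simple integration with other developer tools, such as editors and language servers.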

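Language Support

For now, the Tree-sitter grammars for the languages that Topiary targets are statically linked. The formatting styles for these languages come in two levels of maturity: supported and experimental.

Supported

These formatting styles cover their target language and fulfil Topiary's stated design goals. They are exposed, in Topiary, through a command line flag.

OCaml (both implementations and interfaces)
OCamllex
Nickel
JSON
TOML

Experimental

These languages' formatting styles are subject to change and/or not yet considered production-ready. They can be accessed in Topiary by specifying the path to their query files.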

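Getting Started

Installing

The project can be built and installed with Cargo from the repository directory:

cargo install --path topiary-cli

Topiary needs to find the language query files (.scm) to function properly. By default, topiary looks for a directory named languages in the current working directory.

This won't work if you are running Topiary from a directory other than this repository. To use Topiary without restriction, you must set the environment variable TOPIARY_LANGUAGE_DIR to point to the directory where Topiary's language query files (.scm) are located. By default, you should set it to <local path of the topiary repository>/topiary-queries/queries, for example:

export TOPIARY_LANGUAGE_DIR=/home/me/tools/topiary/topiary-queries/queries
topiary fmt ./projects/helloworld/hello.ml

TOPIARY_LANGUAGE_DIR can alternatively be set at build time. Topiary will pick up the corresponding path and embed it into the topiary binary. In that case, you no longer need to provide TOPIARY_LANGUAGE_DIR at runtime. When TOPIARY_LANGUAGE_DIR has been set at build time and is also set at runtime, the runtime value takes precedence.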
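The lookup order described above can be sketched as follows. This is only an illustration in Python, not Topiary's own code: the function name resolve_language_dir and its parameters are hypothetical, and the ordering of the two fallbacks (build-time value first, then a languages directory in the working directory) is an assumption based on the text.

```python
from pathlib import Path
from typing import Mapping, Optional


def resolve_language_dir(
    env: Mapping[str, str],
    build_time_dir: Optional[str] = None,
    cwd: Path = Path("."),
) -> Path:
    """Choose the query directory (hypothetical helper, for illustration).

    Precedence assumed here: runtime environment variable, then the value
    embedded at build time, then a `languages` directory under `cwd`.
    """
    runtime_dir = env.get("TOPIARY_LANGUAGE_DIR")
    if runtime_dir:
        # The runtime value wins over anything embedded at build time.
        return Path(runtime_dir)
    if build_time_dir:
        return Path(build_time_dir)
    return cwd / "languages"
```

Passing the environment as a plain mapping (for instance os.environ) keeps the sketch easy to exercise without touching the real process environment.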

For details on setting up a development environment, see CONTRIBUTING.md.

Setting Up a Pre-Commit Hook

Topiary integrates seamlessly with pre-commit-hooks.nix: add Topiary as an input to your flake and, in the pre-commit-hooks.nix setup, use:

pre-commit-check = nix-pre-commit-hooks.run {
  hooks = {
    nixfmt.enable = true; ## keep your normal hooks
    ...
    ## Add the following:
    topiary = topiary.lib.${system}.pre-commit-hook;
  };
};

Usage

The Topiary CLI uses a number of subcommands to delineate functionality. These can be listed with topiary --help; each subcommand then has its own dedicated help text.

CLI app for Topiary, the universal code formatter.

Usage: topiary [OPTIONS] <COMMAND>

Commands:
  format      Format inputs
  visualise   Visualise the input's Tree-sitter parse tree
  config      Print the current configuration
  completion  Generate shell completion script
  help        Print this message or the help of the given subcommand(s)

Options:
  -C, --configuration <CONFIGURATION>
          Configuration file

          [env: TOPIARY_CONFIG_FILE]

      --configuration-collation <CONFIGURATION_COLLATION>
          Configuration collation mode

          [env: TOPIARY_CONFIG_COLLATION]
          [default: merge]

          Possible values:
          - merge:    When multiple sources of configuration are available, matching items are updated from the higher priority source, with collections merged as the union of sets
          - revise:   When multiple sources of configuration are available, matching items (including collections) are superseded from the higher priority source
          - override: When multiple sources of configuration are available, the highest priority source is taken. All values from lower priority sources are discarded

  -v, --verbose...
          Logging verbosity (increased per occurrence)

  -h, --help
          Print help (see a summary with '-h')

  -V, --version
          Print version

Format

Format inputs

Usage: topiary format [OPTIONS] <--language <LANGUAGE>|FILES>

Arguments:
  [FILES]...
          Input files and directories (omit to read from stdin)

          Language detection and query selection is automatic, mapped from file extensions defined in the Topiary configuration.

Options:
  -t, --tolerate-parsing-errors
          Consume as much as possible in the presence of parsing errors

  -s, --skip-idempotence
          Do not check that formatting twice gives the same output

  -l, --language <LANGUAGE>
          Topiary language identifier (for formatting stdin)

  -q, --query <QUERY>
          Topiary query file override (when formatting stdin)

  -C, --configuration <CONFIGURATION>
          Configuration file

          [env: TOPIARY_CONFIG_FILE]

      --configuration-collation <CONFIGURATION_COLLATION>
          Configuration collation mode

          [env: TOPIARY_CONFIG_COLLATION]
          [default: merge]

          Possible values:
          - merge:    When multiple sources of configuration are available, matching items are updated from the higher priority source, with collections merged as the union of sets
          - revise:   When multiple sources of configuration are available, matching items (including collections) are superseded from the higher priority source
          - override: When multiple sources of configuration are available, the highest priority source is taken. All values from lower priority sources are discarded

  -v, --verbose...
          Logging verbosity (increased per occurrence)

  -h, --help
          Print help (see a summary with '-h')

When formatting inputs from disk, language selection is detected from the input files' extensions. To format standard input, you must specify the --language argument and, optionally, the --query argument, omitting any input files.

Note: fmt is a recognised alias of the format subcommand.

Visualise

Visualise the input's Tree-sitter parse tree

Usage: topiary visualise [OPTIONS] <--language <LANGUAGE>|FILE>

Arguments:
  [FILE]
          Input file (omit to read from stdin)

          Language detection and query selection is automatic, mapped from file extensions defined in the Topiary configuration.

Options:
  -f, --format <FORMAT>
          Visualisation format

          [default: dot]

          Possible values:
          - dot:  GraphViz DOT serialisation
          - json: JSON serialisation

  -l, --language <LANGUAGE>
          Topiary language identifier (for formatting stdin)

  -q, --query <QUERY>
          Topiary query file override (when formatting stdin)

  -C, --configuration <CONFIGURATION>
          Configuration file

          [env: TOPIARY_CONFIG_FILE]

      --configuration-collation <CONFIGURATION_COLLATION>
          Configuration collation mode

          [env: TOPIARY_CONFIG_COLLATION]
          [default: merge]

          Possible values:
          - merge:    When multiple sources of configuration are available, matching items are updated from the higher priority source, with collections merged as the union of sets
          - revise:   When multiple sources of configuration are available, matching items (including collections) are superseded from the higher priority source
          - override: When multiple sources of configuration are available, the highest priority source is taken. All values from lower priority sources are discarded

  -v, --verbose...
          Logging verbosity (increased per occurrence)

  -h, --help
          Print help (see a summary with '-h')

When visualising inputs from disk, language selection is detected from the input file's extension. To visualise standard input, you must specify the --language argument and, optionally, the --query argument, omitting the input file. The visualisation output is written to standard out.

Note: vis, visualize and view are recognised aliases of the visualise subcommand.

Configuration

Print the current configuration

Usage: topiary config [OPTIONS]

Options:
  -C, --configuration <CONFIGURATION>
          Configuration file

          [env: TOPIARY_CONFIG_FILE]

      --configuration-collation <CONFIGURATION_COLLATION>
          Configuration collation mode

          [env: TOPIARY_CONFIG_COLLATION]
          [default: merge]

          Possible values:
          - merge:    When multiple sources of configuration are available, matching items are updated from the higher priority source, with collections merged as the union of sets
          - revise:   When multiple sources of configuration are available, matching items (including collections) are superseded from the higher priority source
          - override: When multiple sources of configuration are available, the highest priority source is taken. All values from lower priority sources are discarded

  -v, --verbose...
          Logging verbosity (increased per occurrence)

  -h, --help
          Print help (see a summary with '-h')

Please refer to the Configuration section below to understand the different sources of configuration and the collation modes.

Note: cfg is a recognised alias of the config subcommand.

Shell Completion

Shell completion scripts for Topiary can be generated with the completion subcommand. The output can then be sourced into your shell session or profile, as required.

Generate shell completion script

Usage: topiary completion [OPTIONS] [SHELL]

Arguments:
  [SHELL]
          Shell (omit to detect from the environment)

          [possible values: bash, elvish, fish, powershell, zsh]

Options:
  -C, --configuration <CONFIGURATION>
          Configuration file

          [env: TOPIARY_CONFIG_FILE]

      --configuration-collation <CONFIGURATION_COLLATION>
          Configuration collation mode

          [env: TOPIARY_CONFIG_COLLATION]
          [default: merge]

          Possible values:
          - merge:    When multiple sources of configuration are available, matching items are updated from the higher priority source, with collections merged as the union of sets
          - revise:   When multiple sources of configuration are available, matching items (including collections) are superseded from the higher priority source
          - override: When multiple sources of configuration are available, the highest priority source is taken. All values from lower priority sources are discarded

  -v, --verbose...
          Logging verbosity (increased per occurrence)

  -h, --help
          Print help (see a summary with '-h')

For example, in Bash:

source <(topiary completion)

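Logging

By default, the Topiary CLI will only output error messages. You can increase the logging verbosity with a respective number of -v/--verbose flags:

Verbosity Flag   Logging Level
None             Errors
-v               ...and warnings
-vv              ...and information
-vvv             ...and debugging output
-vvvv            ...and tracing output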
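As a small illustration of the table above, the mapping from the number of -v flags to a logging level might look like the sketch below. The level names mirror the table; the function name log_level and the clamping of counts above four are assumptions, not documented Topiary behaviour.

```python
# Levels in increasing verbosity, mirroring the table above.
LOG_LEVELS = ["error", "warn", "info", "debug", "trace"]


def log_level(verbosity: int) -> str:
    """Map the number of -v occurrences to a level name (illustrative)."""
    if verbosity < 0:
        raise ValueError("verbosity cannot be negative")
    # Assumption: extra occurrences beyond -vvvv stay at the most verbose level.
    return LOG_LEVELS[min(verbosity, len(LOG_LEVELS) - 1)]
```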

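Exit Codes

The Topiary process will exit with a zero exit code upon successful formatting. Otherwise, the following exit codes are defined:

Reason                        Code
Unspecified error             1
CLI argument parsing error    2
I/O error                     3
Topiary query error           4
Source parsing error          5
Language detection error      6
Idempotency error             7
Unspecified formatting error  8
Multiple errors               9

When given multiple inputs, Topiary will do its best to process them all, even in the presence of errors. Should any errors occur, Topiary will return a non-zero exit code. For more details on the nature of these errors, run Topiary at the warn logging level (with -v).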
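A wrapper script that calls Topiary may want to turn these codes into readable messages. The following is a minimal sketch of that idea: the numeric values come from the table above, while the enum and the describe_exit helper are hypothetical names introduced here for illustration.

```python
from enum import IntEnum


class TopiaryExit(IntEnum):
    """Exit codes from the table above (the member names are our own)."""

    OK = 0
    UNSPECIFIED_ERROR = 1
    CLI_ARGUMENT_PARSING_ERROR = 2
    IO_ERROR = 3
    TOPIARY_QUERY_ERROR = 4
    SOURCE_PARSING_ERROR = 5
    LANGUAGE_DETECTION_ERROR = 6
    IDEMPOTENCY_ERROR = 7
    UNSPECIFIED_FORMATTING_ERROR = 8
    MULTIPLE_ERRORS = 9


def describe_exit(code: int) -> str:
    """Return a human-readable description of an exit code."""
    try:
        return TopiaryExit(code).name.replace("_", " ").lower()
    except ValueError:
        # Codes outside the documented table are reported as unknown.
        return f"unknown exit code {code}"
```

A caller would typically pass the returncode of the finished process, for example the one reported by subprocess.run.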

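Example

Once built, the program can be run like this:

echo '{"foo":"bar"}' | topiary fmt --language json

topiary can also be built and run from source via either Cargo or Nix, if you have those installed:

echo '{"foo":"bar"}' | cargo run -- fmt --language json
echo '{"foo":"bar"}' | nix run . -- fmt --language json

It will output the following formatted code:

{ "foo": "bar" }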

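Configuration

Topiary is configured using languages.toml files. There are up to four sources where Topiary checks for such a file.

Configuration Sources

At build time, the languages.toml in the root of this repository is embedded into Topiary. This file is parsed at runtime. The purpose of this languages.toml file is to provide sensible defaults for users of Topiary (both the library and the binary).

The next two are read by the Topiary binary at runtime and allow the user to configure Topiary to their needs. The first is intended to be user-specific, and can thus be found in the configuration directory of the OS:

OS       Typical Configuration Path
Unix     /home/alice/.config/topiary/languages.toml
Windows  C:\Users\Alice\AppData\Roaming\Topiary\config\languages.toml
macOS    /Users/Alice/Library/Application Support/Topiary/languages.toml

This file is not automatically created by Topiary.

The next source is intended to be a project-specific settings file for Topiary. When running Topiary in some directory, it will ascend the file tree until it finds a .topiary directory. It will then read any languages.toml file present in that directory.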
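The project-specific lookup can be pictured with the following sketch. It is an approximation written for this document rather than Topiary's implementation: find_project_config is a hypothetical name, and stopping at the first .topiary directory (even if it holds no languages.toml) is an assumption drawn from the description above.

```python
from pathlib import Path
from typing import Optional


def find_project_config(start: Path) -> Optional[Path]:
    """Ascend from `start` to the nearest `.topiary` directory.

    Returns the path of its `languages.toml`, or None when no `.topiary`
    directory exists or the nearest one contains no such file.
    """
    start = start.resolve()
    for directory in [start, *start.parents]:
        candidate = directory / ".topiary"
        if candidate.is_dir():
            config = candidate / "languages.toml"
            # Assumption: the search stops at the first .topiary directory.
            return config if config.is_file() else None
    return None
```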

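Finally, an explicit configuration file may be specified using the -C/--configuration command line argument (or the TOPIARY_CONFIG_FILE environment variable). This is intended for driving Topiary under very specific use cases.

The Topiary binary parses these sources in the following order. The action taken when collating matching items depends on the collation mode.

1. The built-in configuration file.
2. The user configuration file in the OS's configuration directory.
3. The project-specific Topiary configuration.
4. The explicit configuration file specified as a CLI argument.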

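Configuration Options

The configuration file contains a list of languages, with each language configuration headed by [[language]]. For instance, the one for Nickel is defined as follows:

[[language]]
name = "nickel"
extensions = ["ncl"]

The name field is used by Topiary to associate the language entry with the query file and Tree-sitter grammar. This value should be written in lowercase. The name field is mandatory for every [[language]] block in every configuration file.

The list of extensions is mandatory for every language, but does not necessarily need to exist in every configuration file. It is sufficient if, for every language, a single configuration file defines the list of extensions for that language.

A final, optional field called indent exists to define the indentation method for that language. Topiary defaults to two spaces ("  ") if it cannot find the indent field in any configuration file for a specific language.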

Configuration Collation

When parsing configuration from multiple sources, Topiary can collate matching configuration items (matched on language name) in various ways. The collation mode is set by the --configuration-collation command line argument (or the TOPIARY_CONFIG_COLLATION environment variable).

The different modes are best explained by example. Consider the following two configurations, in priority order from lowest to highest (the comments are for reference only):

# Lowest priority configuration
[[language]]
name = "example"
extensions = ["eg"]

[[language]]
name = "demo"
extensions = ["demo"]

# Highest priority configuration
[[language]]
name = "example"
extensions = ["example"]
indent = " "

Merge Mode (Default)

Matching items are updated from the higher priority source, with collections merged as the union of sets.

# For the "example" language:
# * The collated extensions is the union of the source extensions
# * The indentation is taken from the highest priority source
[[language]]
name = "example"
extensions = ["eg", "example"]
indent = " "

# The "demo" language is unchanged
[[language]]
name = "demo"
extensions = ["demo"]

Revise Mode

Matching items (including collections) are superseded from the higher priority source.

# The "example" language's values are taken from the highest priority source
[[language]]
name = "example"
extensions = ["example"]
indent = " "

# The "demo" language is unchanged
[[language]]
name = "demo"
extensions = ["demo"]

Override Mode

The highest priority source is taken. All values from lower priority sources are discarded.

# The "example" language's values are taken from the highest priority source
[[language]]
name = "example"
extensions = ["example"]
indent = " "

# The "demo" language does not exist in the highest priority source, so is omitted
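The three modes can be reproduced with a short Python sketch that operates on already-parsed configuration, with each source given as a list of dictionaries ordered from lowest to highest priority. This is a model of the behaviour described above, not Topiary's code; the collate function is a hypothetical name, and keeping a lower priority field that the higher priority source does not mention (in merge and revise modes) is an assumption.

```python
from typing import Dict, List

Language = Dict[str, object]
Config = List[Language]


def collate(sources: List[Config], mode: str = "merge") -> Config:
    """Collate language entries by name; `sources` run from lowest to highest priority."""
    if mode not in {"merge", "revise", "override"}:
        raise ValueError(f"unknown collation mode: {mode}")
    if mode == "override":
        # Only the highest priority source survives.
        return [dict(lang) for lang in sources[-1]] if sources else []

    collated: Dict[str, Language] = {}
    for source in sources:
        for lang in source:
            name = str(lang["name"])
            current = collated.setdefault(name, {"name": name})
            for key, value in lang.items():
                existing = current.get(key)
                if mode == "merge" and isinstance(value, list) and isinstance(existing, list):
                    # Merge mode: collections become an order-preserving union.
                    current[key] = existing + [v for v in value if v not in existing]
                else:
                    # Scalars, and every field in revise mode, are superseded.
                    current[key] = list(value) if isinstance(value, list) else value
    return list(collated.values())
```

Applied to the two example configurations above, merge yields extensions ["eg", "example"] for the "example" language, revise yields ["example"], and override drops the "demo" language altogether.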

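Design

As long as there is a Tree-sitter grammar defined for a language, Tree-sitter can parse it and provide a concrete syntax tree (CST). Tree-sitter will also allow us to run queries against this tree. We can make use of that to define how a language should be formatted. Here's an example query:

[
  (infix_operator)
  "if"
  ":"
] @append_space

This will match any node that the grammar has identified as an infix_operator, as well as any anonymous node containing if or :. The match will be captured with the name @append_space. Our formatter runs through all matches and captures, and when we process any capture called @append_space, we append a space after the matched node.

The formatter goes through the CST nodes and detects all that span more than one line. This is interpreted as an indication from the programmer who wrote the input that the node in question should be formatted as multi-line. Any other nodes will be formatted as single-line. Whenever a query match has inserted a softline, it will be expanded to a newline if the node is multi-line, or to a space or nothing if the node is single-line, depending on whether @append_spaced_softline or @append_empty_softline was used.

Before rendering the output, the formatter will do a number of clean-up operations, such as reducing consecutive spaces and newlines to one, trimming spaces at the end of lines as well as leading and trailing blank lines, and ordering indenting and newline instructions consistently.

This means that you can, for example, prepend and append spaces to if and true, and we will still output if true with just one space between the words.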
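To make the clean-up step more concrete, here is a toy model of it in Python. It works on a flat list of "atoms" (leaf text plus SPACE and HARDLINE markers); that representation, the names tidy and render, and the exact set of rules are assumptions made for this sketch and cover only part of what the real formatter does.

```python
from typing import List

# Sentinel atoms; every other atom is leaf text from the source.
SPACE = object()
HARDLINE = object()


def tidy(atoms: List[object]) -> List[object]:
    """Collapse repeated whitespace atoms and trim them at the edges."""
    out: List[object] = []
    for atom in atoms:
        if atom is SPACE or atom is HARDLINE:
            if not out:
                continue  # drop leading whitespace
            if out[-1] is atom:
                continue  # reduce consecutive spaces or newlines to one
            if out[-1] is SPACE and atom is HARDLINE:
                out[-1] = HARDLINE  # trim the space at the end of a line
                continue
            if out[-1] is HARDLINE and atom is SPACE:
                continue  # no stray space at the start of a line
        out.append(atom)
    while out and (out[-1] is SPACE or out[-1] is HARDLINE):
        out.pop()  # drop trailing whitespace
    return out


def render(atoms: List[object]) -> str:
    """Turn atoms into text."""
    return "".join(" " if a is SPACE else "\n" if a is HARDLINE else str(a) for a in atoms)
```

With this model, spaces appended to if and prepended to true collapse into one: render(tidy(["if", SPACE, SPACE, "true"])) gives "if true".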

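Supported Capture Instructions

This assumes you are already familiar with the Tree-sitter query language.

Note that a capture is put after the node it is associated with. If you want to put a space in front of a node, you do so like this:

(infix_operator) @prepend_space

This, on the other hand, will not work:

@append_space (infix_operator)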

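@allow_blank_line_before

The matched nodes will be allowed to have a blank line before them, if specified in the input. For any other nodes, blank lines will be removed.

Example

; Allow comments and type definitions to have a blank line above them
[
  (comment)
  (type_definition)
] @allow_blank_line_before

@append_delimiter / @prepend_delimiter

The matched nodes will have a delimiter appended (or prepended) to them. The delimiter must be specified using the predicate #delimiter!.

Example

; Put a semicolon delimiter after field declarations, unless they already have
; one, in which case we do nothing.
(
  (field_declaration) @append_delimiter
  .
  ";"* @do_nothing
  (#delimiter! ";")
)

If there is already a semicolon, the @do_nothing instruction will be activated and prevent the other instructions in the query (here, @append_delimiter) from applying. Otherwise, ";"* captures nothing, in which case the associated instruction (@do_nothing) does not activate.

Note that @append_delimiter is the same as @append_space when the delimiter is set to " " (i.e., a space).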

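@append_multiline_delimiter / @prepend_multiline_delimiter

The matched nodes will have a multi-line-only delimiter appended (or prepended) to them. It will be printed only in multi-line nodes and omitted in single-line nodes. The delimiter must be specified using the predicate #delimiter!.

Example

; Add a semicolon at the end of lists only if they are multi-line, to avoid [1; 2; 3;].
(list_expression
  (#delimiter! ";")
  (_) @append_multiline_delimiter
  .
  ";"? @do_nothing
  .
  "]"
  .
)

If there is already a semicolon, the @do_nothing instruction will be activated and prevent the other instructions in the query (here, @append_multiline_delimiter) from applying. Likewise, if the node is single-line, the delimiter will not be appended either.

@append_empty_softline / @prepend_empty_softline

The matched nodes will have an empty softline appended or prepended to them. This will be expanded to a newline for multi-line nodes and to nothing for single-line nodes.

Example

; Put an empty softline before dots, so that in multi-line constructs we start
; new lines for each dot.
(_
  "." @prepend_empty_softline
)

@append_hardline / @prepend_hardline

The matched nodes will have a line break appended or prepended to them.

Example

; Consecutive definitions must be separated by line breaks
(
  (value_definition) @append_hardline
  .
  (value_definition)
)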

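@append_indent_start / @prepend_indent_start

The matched nodes will trigger indentation before or after them. This only applies to the lines that follow, until an indentation end is signalled. If indentation is started and ended on the same line, nothing will happen. This is useful, because we get the correct behaviour whether a node is formatted as single-line or multi-line. It is important that all indentation starts and ends are balanced.

Example

; Start an indented block after these
[
  "begin"
  "else"
  "then"
  "{"
] @append_indent_start

@append_indent_end / @prepend_indent_end

The matched nodes will trigger the end of indentation before or after them.

Example

; End the indented block before these
[
  "end"
  "}"
] @prepend_indent_end

; End the indented block after these
[
  (else_clause)
  (then_clause)
] @append_indent_end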

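@append_input_softline / @prepend_input_softline

The matched nodes will have an input softline appended or prepended to them. An input softline is a line break if the node has a line break in front of it in the input document; otherwise it is a space.

Example

; Input softlines before and after all comments. This means that the input
; decides if a comment should have line breaks before or after. But don't put a
; softline directly in front of commas or semicolons.

(comment) @prepend_input_softline

(
  (comment) @append_input_softline
  .
  [ "," ";" ]* @do_nothing
)

@append_space / @prepend_space

The matched nodes will have a space appended or prepended to them. Note that this is the same as @append_delimiter / @prepend_delimiter, with a space as the delimiter.

Example

[
  (infix_operator)
  "if"
  ":"
] @append_space

@append_antispace / @prepend_antispace

It is often the case that tokens need to be juxtaposed with spaces, except in a few isolated contexts. Rather than writing complicated rules that enumerate every exception, an "antispace" can be inserted with @append_antispace / @prepend_antispace; this will destroy any spaces (not newlines) from that node, including those added by other formatting rules.

Example

[
  ","
  ";"
  ":"
  "."
] @prepend_antispace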
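One way to picture what an antispace does is the toy model below, which works on a flat list of atoms where SPACE and ANTISPACE are markers and everything else is leaf text. The representation and the name apply_antispaces are assumptions made for illustration; Topiary's internal handling may differ.

```python
from typing import List

SPACE = object()
ANTISPACE = object()


def apply_antispaces(atoms: List[object]) -> List[object]:
    """Remove every SPACE adjacent to an ANTISPACE, then drop the ANTISPACE."""
    out: List[object] = []
    swallow_spaces = False
    for atom in atoms:
        if atom is ANTISPACE:
            # Destroy spaces already emitted just before the antispace...
            while out and out[-1] is SPACE:
                out.pop()
            # ...and any that directly follow it.
            swallow_spaces = True
        elif atom is SPACE and swallow_spaces:
            continue
        else:
            swallow_spaces = False
            out.append(atom)
    return out
```

For instance, if another rule put a space after a and the rule above prepends an antispace to the comma, ["a", SPACE, ANTISPACE, ",", SPACE, "b"] reduces to ["a", ",", SPACE, "b"], which renders as "a, b".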

@append_spaced_softline / @prepend_spaced_softline

The matched nodes will have a spaced softline appended or prepended to them. This will be expanded to a newline for multi-line nodes and to a space for single-line nodes.

Example

; Append spaced softlines, unless there is a comment following.
(
  [
    "begin"
    "else"
    "then"
    "->"
    "{"
    ";"
  ] @append_spaced_softline
  .
  (comment)* @do_nothing
)

@delete

Remove the matched node from the output.

Example

; Move semicolon after comments.
(
  ";" @delete
  .
  (comment)+ @append_delimiter
  (#delimiter! ";")
)

@do_nothing

If any of the captures in a query match are @do_nothing, then the match will be ignored.

Example

; Put a semicolon delimiter after field declarations, unless they already have
; one, in which case we do nothing.
(
  (field_declaration) @append_delimiter
  .
  ";"* @do_nothing
  (#delimiter! ";")
)
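The cancelling effect of @do_nothing can be summarised in a few lines of Python. Here a match is modelled as a list of (capture name, node text) pairs for the nodes that were actually captured; that model and the name effective_captures are hypothetical, chosen only to illustrate the rule.

```python
from typing import List, Tuple

Capture = Tuple[str, str]  # (capture name, text of the captured node)


def effective_captures(match: List[Capture]) -> List[Capture]:
    """Ignore the whole match if any of its captures is do_nothing."""
    if any(name == "do_nothing" for name, _ in match):
        return []
    return match
```

When the field declaration is already followed by a semicolon, the match contains a do_nothing capture and nothing is applied; when ";"* captures nothing, only append_delimiter is present and it takes effect.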

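@multi_line_indent_all

To be used on comments or other leaf nodes, to indicate that all of their lines should be indented, not just the first.

Example

(#language! ocaml)
(comment) @multi_line_indent_all

@single_line_no_indent

The matched node will be printed alone, on a single line, with no indentation.

Example

(#language! ocaml)
; line number directives must be alone on their line, and can't be indented
(line_number_directive) @single_line_no_indent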

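Understanding the Different Line Break Captures

Type             Single-Line Context  Multi-Line Context
Hardline         Line break           Line break
Empty softline   Nothing              Line break
Spaced softline  Space                Line break
Input softline   Input-dependent      Input-dependent

An "input softline" is rendered as a line break whenever the targeted node follows a line break in the input. Otherwise, it is rendered as a space.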
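The table can also be read as a small decision function, sketched here in Python. The kind names and the function render_line_break are made up for this illustration, and the line_break_in_input flag is a simplification standing in for "the input has a line break next to the targeted node".

```python
def render_line_break(kind: str, multi_line: bool, line_break_in_input: bool = False) -> str:
    """Return the text that a line break capture expands to (illustrative)."""
    if kind == "hardline":
        return "\n"
    if kind == "empty_softline":
        return "\n" if multi_line else ""
    if kind == "spaced_softline":
        return "\n" if multi_line else " "
    if kind == "input_softline":
        # Independent of the single-line or multi-line context.
        return "\n" if line_break_in_input else " "
    raise ValueError(f"unknown line break kind: {kind}")
```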

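Example

Consider the following JSON, which has been hand-formatted to exhibit every context under which the different line break capture names operate:

{
  "single-line": [1, 2, 3, 4],
  "multi-line": [
    1, 2,
    3
    , 4
  ]
}

We'll apply a simplified set of JSON format queries that:

1. Open (and close) an indented block for objects;
2. Give each key-value pair its own line, with the value split onto a second;
3. Apply the different line break capture names to array delimiters.

That is, iterating over each @NEWLINE type, we apply the following:

(#language! json)

(object . "{" @append_hardline @append_indent_start)
(object "}" @prepend_hardline @prepend_indent_end .)
(object (pair) @prepend_hardline)
(pair . _ ":" @append_hardline)

(array "," @NEWLINE)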

The first two formatting rules are just for clarity. The last rule is what's important; its results are demonstrated below:

@append_hardline

{
  "single-line":
  [1,
  2,
  3,
  4],
  "multi-line":
  [1,
  2,
  3,
  4]
}

@prepend_hardline

{
  "single-line":
  [1
  ,2
  ,3
  ,4],
  "multi-line":
  [1
  ,2
  ,3
  ,4]
}

@append_empty_softline

{
  "single-line":
  [1,2,3,4],
  "multi-line":
  [1,
  2,
  3,
  4]
}

@prepend_empty_softline

{
  "single-line":
  [1,2,3,4],
  "multi-line":
  [1
  ,2
  ,3
  ,4]
}

@append_spaced_softline

{
  "single-line":
  [1, 2, 3, 4],
  "multi-line":
  [1,
  2,
  3,
  4]
}

@prepend_spaced_softline

{
  "single-line":
  [1 ,2 ,3 ,4],
  "multi-line":
  [1
  ,2
  ,3
  ,4]
}

@append_input_softline

{
  "single-line":
  [1, 2, 3, 4],
  "multi-line":
  [1, 2,
  3, 4]
}

@prepend_input_softline

{
  "single-line":
  [1 ,2 ,3 ,4],
  "multi-line":
  [1 ,2 ,3
  ,4]
}

Custom Scopes and Softlines

So far, we've expanded softlines into line breaks depending on whether the CST node they are associated with is multi-line. Sometimes, CST nodes define scopes that are either too big or too small for our needs. For instance, consider this piece of OCaml code:

(1,2,
3)

Its CST is the following:

{Node parenthesized_expression (0, 0) - (1, 2)} - Named: true
  {Node ( (0, 0) - (0, 1)} - Named: false
  {Node product_expression (0, 1) - (1, 1)} - Named: true
    {Node product_expression (0, 1) - (0, 4)} - Named: true
      {Node number (0, 1) - (0, 2)} - Named: true
      {Node , (0, 2) - (0, 3)} - Named: false
      {Node number (0, 3) - (0, 4)} - Named: true
    {Node , (0, 4) - (0, 5)} - Named: false
    {Node number (1, 0) - (1, 1)} - Named: true
  {Node ) (1, 1) - (1, 2)} - Named: false

We would like to add a line break after the first comma, but because the CST structure is nested, the node containing this comma (product_expression (0, 1) - (0, 4)) is not multi-line. Only the top-level node product_expression (0, 1) - (1, 1) is multi-line.
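The multi-line test itself is simple to state in code. The sketch below, in Python, uses the positions from the CST above; the Node type and the is_multi_line helper are hypothetical names, and the test shown (start row differs from end row) is our reading of "spans more than one line".

```python
from typing import NamedTuple, Tuple


class Node(NamedTuple):
    """A CST node with 0-based (row, column) start and end positions."""

    kind: str
    start: Tuple[int, int]
    end: Tuple[int, int]


def is_multi_line(node: Node) -> bool:
    """A node is multi-line when it starts and ends on different rows."""
    return node.start[0] != node.end[0]


# Positions taken from the CST shown above.
outer = Node("product_expression", (0, 1), (1, 1))
inner = Node("product_expression", (0, 1), (0, 4))
```

Here is_multi_line(outer) is true while is_multi_line(inner) is false, which is why a softline attached to the first comma, whose parent is the inner node, does not expand into a line break.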

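To solve this issue, we introduce user-defined scopes and softlines.

@prepend_begin_scope / @append_begin_scope / @prepend_end_scope / @append_end_scope

These tags are used to define custom scopes. In conjunction with the #scope_id! predicate, they define scopes that can span multiple CST nodes, or only part of one. For instance, this scope matches anything between the parentheses in a parenthesized_expression:

(parenthesized_expression
  "(" @append_begin_scope
  ")" @prepend_end_scope
  (#scope_id! "tuple")
)

Scoped Softlines

There are four capture names that insert softlines in custom scopes, used in conjunction with the #scope_id! predicate:

@prepend_empty_scoped_softline
@prepend_spaced_scoped_softline
@append_empty_scoped_softline
@append_spaced_scoped_softline

When one of these scoped softlines is used, its behaviour depends on the innermost encompassing scope with the corresponding scope_id. If that scope is multi-line, the softline expands into a line break. In every other regard, they behave as their non-scoped counterparts.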

Example

This Tree-sitter query:

(#language! ocaml)

(parenthesized_expression
  "(" @append_begin_scope @append_empty_softline @append_indent_start
  ")" @prepend_end_scope @prepend_empty_softline @prepend_indent_end
  (#scope_id! "tuple")
)

(product_expression
  "," @append_spaced_scoped_softline
  (#scope_id! "tuple")
)

...formats this piece of code:

(1,2,
3)

...as:

(
  1,
  2,
  3
)

...while the single-line (1, 2, 3) is kept as is.
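If we used @append_spaced_softline rather than @append_spaced_scoped_softline, the 1, would be followed by a space rather than a newline, because it is inside a single-line product_expression.

Testing Context with Predicates

Sometimes, similarly to what happens with softlines, we want a query to match only if the context is single-line, or multi-line. Topiary has several predicates that achieve this result.

#single_line_only! / #multi_line_only!

These predicates allow the query to trigger only if the matched nodes are in a single-line (or, respectively, multi-line) context.

Example

; Allow (and enforce) the optional "|" before the first match case
; in OCaml if and only if the context is multi-line
(
  "with"
  .
  "|" @delete
  .
  (match_case)
  (#single_line_only!)
)

(
  "with"
  .
  "|"? @do_nothing
  .
  (match_case) @prepend_delimiter
  (#delimiter! "| ")
  (#multi_line_only!)
)

#single_line_scope_only! / #multi_line_scope_only!

These predicates allow the query to trigger only if the associated custom scope containing the matched nodes is single-line (or, respectively, multi-line).

Example

; Allow (and enforce) the optional "|" before the first match case
; in function expressions in OCaml if and only if the scope is multi-line
(function_expression
  (match_case)? @do_nothing
  .
  "|" @delete
  .
  (match_case)
  (#single_line_scope_only! "function_definition")
)
(function_expression
  "|"? @do_nothing
  .
  (match_case) @prepend_delimiter
  (#multi_line_scope_only! "function_definition")
  (#delimiter! "| ") ; sic
)

Suggested Workflow

In order to work productively on query files, the following is one suggested way to work:

1. Add a sample file to topiary-cli/tests/samples/input.

2. Copy the same file to topiary-cli/tests/samples/expected, and change it to the way you want the output to be formatted.

3. If this is a new language, add its Tree-sitter grammar, extend crate::language::Language and process it everywhere, then make a mostly empty query file with just the (#language!) configuration.

4. Run:

   RUST_LOG=debug \
     cargo test -p topiary-cli \
     input_output_tester \
     -- --nocapture

   Provided it works, it should output a lot of log messages. Copy that output to a text editor. You are particularly interested in the CST output that starts with a line like this: CST node: {Node compilation_unit (0, 0) - (5942, 0)} - Named: true.

   💡 As an alternative to using the debugging output, the vis visualisation subcommand can output the Tree-sitter syntax tree in a variety of formats.

5. The test run will output all the differences between the actual output and the expected output, e.g. missing spaces between tokens. Pick a difference you would like to fix, and find the line number and column in the input file.

   💡 Keep in mind that the CST output uses 0-based line and column numbers, so if your editor reports line 40, column 37, you probably want line 39, column 36.

6. In the CST debug or visualisation output, find the nodes in this region, such as the following:

   [DEBUG atom_collection] CST node: {Node constructed_type (39, 15) - (39, 42)} - Named: true
   [DEBUG atom_collection] CST node: {Node type_constructor_path (39, 15) - (39, 35)} - Named: true
   [DEBUG atom_collection] CST node: {Node type_constructor (39, 15) - (39, 35)} - Named: true
   [DEBUG atom_collection] CST node: {Node type_constructor_path (39, 36) - (39, 42)} - Named: true
   [DEBUG atom_collection] CST node: {Node type_constructor (39, 36) - (39, 42)} - Named: true

7. This may indicate that you would like spaces after all type_constructor_path nodes:

   (type_constructor_path) @append_space

   Or, more likely, you just want spaces between pairs of them:

   (
     (type_constructor_path) @append_space
     .
     (type_constructor_path)
   )

   Or maybe you want spaces between all children of constructed_type:

   (constructed_type
     (_) @append_space
     .
     (_)
   )

8. Run cargo test again to see if the output has improved, then return to step 5.

Syntax Tree Visualisation

To support the development of formatting queries, the Tree-sitter syntax tree for a given input can be produced using the visualise subcommand.

This currently supports JSON output, covering the same information as the debugging output, as well as GraphViz DOT output, which is useful for generating syntax diagrams. (Note that the text position serialisation in the visualisation output is 1-based, in contrast to the debugging output's 0-based positions.)

Terminal-Based Playground

Nix users may also find the playground.sh script helpful for the interactive development of query files. When run in a terminal, it will format the given source input with the requested query file, updating the output on any inotify event against those files.

Usage: ${PROGNAME} LANGUAGE [QUERY_FILE] [INPUT_SOURCE]

LANGUAGE can be one of the supported languages (e.g., "ocaml", "rust",
etc.). The packaged formatting queries for this language can be
overridden by specifying a QUERY_FILE.

The INPUT_SOURCE is optional. If not specified, it defaults to trying
to find the bundled integration test input file for the given language.

For example, the playground can be run in a tmux pane, with your editor of choice open in another.

Related Tools

Tree-Sitter Specific

- Syntax Tree Playground: An interactive, online playground for experimenting with Tree-sitter and its query language.
- Neovim Treesitter Playground: A Tree-sitter playground plugin for Neovim.
- Difftastic: A tool that utilises Tree-sitter to perform syntactic diffing.

Meta and Multi-Language Formatters

- format-all: A formatter orchestrator for Emacs.
- null-ls.nvim: An LSP framework for Neovim that facilitates formatter orchestration.
- prettier: A formatter with support for multiple (web-development related) languages.
- treefmt: A general formatter orchestrator, which unifies formatters under a common interface.

Related Formatters

- gofmt: The de facto standard formatter for Go, and a major source of inspiration for the style of our formatters.
- ocamlformat: A formatter for OCaml.
- ocp-indent: A tool to indent OCaml code.
- Ormolu: Our formatter for Haskell, which follows similar design principles to Topiary.
- rustfmt: The de facto standard formatter for Rust.
- shfmt: A parser, formatter and interpreter for Bash et al.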

Dependencies ~0–2.9MB ~54K SLoC js-sys wasm32 topiary-web-tree-sitter-sys wasm32 wasm-bindgen =0.2.91+strict-macro wasm32 web-sys wasm32 tree-sitter =0.20.10 not wasm32 dev wasm-bindgen-futures wasm32 dev wasm-bindgen-test wasm32
