2 versions

0.4.1	Jun 6, 2024
0.4.0	May 17, 2024

#256 in Development tools

137 downloads per month
Used in 3 crates (via nickel-lang-core)

MIT license

120KB
2.5K SLoC

Topiary

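Topiary aims to be a uniform formatter for simple languages, as part of the Tree-sitter ecosystem. It is named after the art of clipping or trimming trees into fantastic shapes.

Topiary is designed for formatter authors and formatter users. Authors can create a formatter for a language without having to write their own formatting engine or even their own parser. Users benefit from a uniform code style and, potentially, the convenience of using a single formatter tool across multiple languages in their codebases, each with comparable styles applied.

Motivation

The style in which code is written has, historically, been mostly left to personal choice. Of course, this is subjective by definition and has led to many wasted hours reviewing formatting choices rather than the code itself. Prescribed style guides were an early solution to this, spawning tools that lint a developer's formatting and ultimately leading to automatic formatters. The latter were popularised by gofmt, whose developers had the insight that "good enough" uniform formatting, imposed on a codebase, largely resolves these problems.

Topiary follows this trend by aspiring to be a "universal formatter engine", which allows developers not only to format their codebases automatically with a uniform style, but also to define that style for new languages using a simple domain-specific language (DSL). This allows for the fast development of formatters, provided a Tree-sitter grammar is defined for that language.

Design Principles

Topiary has been created with the following goals in mind:

- Use Tree-sitter for parsing, to avoid writing yet another grammar for a formatter.
- Expect idempotency. That is, formatting already-formatted code does not change anything.
- For bundled formatting styles to meet the following constraints:
  - Be compatible with attested formatting styles used for that language in the wild.
  - Be faithful to the author's intent: if code has been written such that it spans multiple lines, that decision is preserved.
  - Minimise changes between commits, so that diffs focus mainly on the code that has changed rather than on superficial artefacts. That is, a change on one line will not influence others, and the formatting will not force you to make later, cosmetic changes when you modify your code.
  - Be well-tested and robust, so that the formatter can be trusted in large projects.
- For end users (i.e., not formatting style authors), the formatter should:
  - Prescribe a formatting style that, while customisable, is uniform and "good enough" for their codebase.
  - Run efficiently.
  - Afford simple integration with other developer tools, such as editors and language servers.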

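Language Support

For now, the Tree-sitter grammars for the languages that Topiary targets are statically linked. The formatting styles for these languages come in two levels of maturity: supported and experimental.

Supported

These formatting styles cover their target language and fulfil Topiary's stated design goals. They are exposed, in Topiary, through a command line flag.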

OCaml（包括实现和接口）
OCamllex
Nickel
JSON
TOML

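Experimental

These languages' formatting styles are subject to change and/or not yet considered production-ready. They can be accessed in Topiary by specifying the path to their query files.

Getting Started

Installing

The project can be built and installed with Cargo from the repository directory: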

cargo install --path topiary-cli

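Topiary needs to find the language query files (.scm) to function properly. By default, topiary looks for a directory named languages in the current working directory.

This will not work if you run Topiary from a directory other than this repository. To use Topiary without this restriction, you must set the environment variable TOPIARY_LANGUAGE_DIR to point to the directory where Topiary's language query files (.scm) are located. By default, you should set it to <local path of the topiary repository>/topiary-queries/queries, for example: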

export TOPIARY_LANGUAGE_DIR=/home/me/tools/topiary/topiary-queries/queries
topiary fmt ./projects/helloworld/hello.ml

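TOPIARY_LANGUAGE_DIR can alternatively be set at build time. Topiary will pick up the corresponding path and embed it into the topiary binary. In that case, you no longer need to provide TOPIARY_LANGUAGE_DIR at run time. If TOPIARY_LANGUAGE_DIR is set at build time and also at run time, the run-time value takes precedence.

See CONTRIBUTING.md for details on setting up a development environment.

Setting up as pre-commit hook

Topiary integrates seamlessly with pre-commit-hooks.nix: add Topiary as an input to your flake and, in the pre-commit-hooks.nix setup, use: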

pre-commit-check = nix-pre-commit-hooks.run {
  hooks = {
    nixfmt.enable = true; ## keep your normal hooks
    ...
    ## Add the following:
    topiary = topiary.lib.${system}.pre-commit-hook;
  };
};

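Usage

The Topiary CLI uses a number of subcommands to delineate functionality. These can be listed with topiary --help; each subcommand then has its own dedicated help text.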

CLI app for Topiary, the universal code formatter.

Usage: topiary [OPTIONS] <COMMAND>

Commands:
  format      Format inputs
  visualise   Visualise the input's Tree-sitter parse tree
  config      Print the current configuration
  completion  Generate shell completion script
  help        Print this message or the help of the given subcommand(s)

Options:
  -C, --configuration <CONFIGURATION>
          Configuration file

          [env: TOPIARY_CONFIG_FILE]

      --configuration-collation <CONFIGURATION_COLLATION>
          Configuration collation mode

          [env: TOPIARY_CONFIG_COLLATION]
          [default: merge]

          Possible values:
          - merge:    When multiple sources of configuration are available, matching items are
            updated from the higher priority source, with collections merged as the union of sets
          - revise:   When multiple sources of configuration are available, matching items
            (including collections) are superseded from the higher priority source
          - override: When multiple sources of configuration are available, the highest priority
            source is taken. All values from lower priority sources are discarded

  -v, --verbose...
          Logging verbosity (increased per occurrence)

  -h, --help
          Print help (see a summary with '-h')

  -V, --version
          Print version

Format

Format inputs

Usage: topiary format [OPTIONS] <--language <LANGUAGE>|FILES>

Arguments:
  [FILES]...
          Input files and directories (omit to read from stdin)

          Language detection and query selection is automatic, mapped from file extensions defined
          in the Topiary configuration.

Options:
  -t, --tolerate-parsing-errors
          Consume as much as possible in the presence of parsing errors

  -s, --skip-idempotence
          Do not check that formatting twice gives the same output

  -l, --language <LANGUAGE>
          Topiary language identifier (for formatting stdin)

  -q, --query <QUERY>
          Topiary query file override (when formatting stdin)

  -C, --configuration <CONFIGURATION>
          Configuration file

          [env: TOPIARY_CONFIG_FILE]

      --configuration-collation <CONFIGURATION_COLLATION>
          Configuration collation mode

          [env: TOPIARY_CONFIG_COLLATION]
          [default: merge]

          Possible values:
          - merge:    When multiple sources of configuration are available, matching items are
            updated from the higher priority source, with collections merged as the union of sets
          - revise:   When multiple sources of configuration are available, matching items
            (including collections) are superseded from the higher priority source
          - override: When multiple sources of configuration are available, the highest priority
            source is taken. All values from lower priority sources are discarded

  -v, --verbose...
          Logging verbosity (increased per occurrence)

  -h, --help
          Print help (see a summary with '-h')

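When formatting inputs from disk, language selection is detected from the input files' extensions. To format standard input, you must specify the --language and, optionally, --query arguments, omitting any input files.

Note: fmt is a recognised alias of the format subcommand.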
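
The rule above (files on disk, or `--language` for standard input) can be sketched as a small Python wrapper. This is only an illustration: `build_format_command` and `format_stdin` are hypothetical helper names, and the sketch assumes a `topiary` binary is available on `PATH`. Only the `format` subcommand and the `--language` and `--query` flags are taken from the help text.

```python
import subprocess


def build_format_command(language=None, query=None, files=()):
    """Build an argv list for `topiary format` (hypothetical helper)."""
    args = ["topiary", "format"]
    if files:
        # Disk inputs: the language is detected from the file extensions,
        # so the stdin-only flags are rejected here.
        if language is not None or query is not None:
            raise ValueError("--language/--query are for formatting stdin")
        return args + list(files)
    if language is None:
        raise ValueError("formatting stdin requires --language")
    args += ["--language", language]
    if query is not None:
        args += ["--query", query]
    return args


def format_stdin(source, language, query=None):
    """Pipe `source` through Topiary and return (exit code, stdout).

    Assumes `topiary` is installed and on PATH.
    """
    result = subprocess.run(
        build_format_command(language=language, query=query),
        input=source,
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout
```

For instance, `format_stdin('{"foo":"bar"}', "json")` would run the same pipeline as piping that JSON into `topiary format --language json`.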

Visualise

Visualise the input's Tree-sitter parse tree

Usage: topiary visualise [OPTIONS] <--language <LANGUAGE>|FILE>

Arguments:
  [FILE]
          Input file (omit to read from stdin)

          Language detection and query selection is automatic, mapped from file extensions defined
          in the Topiary configuration.

Options:
  -f, --format <FORMAT>
          Visualisation format

          [default: dot]

          Possible values:
          - dot:  GraphViz DOT serialisation
          - json: JSON serialisation

  -l, --language <LANGUAGE>
          Topiary language identifier (for formatting stdin)

  -q, --query <QUERY>
          Topiary query file override (when formatting stdin)

  -C, --configuration <CONFIGURATION>
          Configuration file

          [env: TOPIARY_CONFIG_FILE]

      --configuration-collation <CONFIGURATION_COLLATION>
          Configuration collation mode

          [env: TOPIARY_CONFIG_COLLATION]
          [default: merge]

          Possible values:
          - merge:    When multiple sources of configuration are available, matching items are
            updated from the higher priority source, with collections merged as the union of sets
          - revise:   When multiple sources of configuration are available, matching items
            (including collections) are superseded from the higher priority source
          - override: When multiple sources of configuration are available, the highest priority
            source is taken. All values from lower priority sources are discarded

  -v, --verbose...
          Logging verbosity (increased per occurrence)

  -h, --help
          Print help (see a summary with '-h')

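When visualising inputs from disk, language selection is detected from the input file's extension. To visualise standard input, you must specify the --language and, optionally, --query arguments, omitting the input file. The visualisation output is written to standard output.

Note: vis, visualize and view are recognised aliases of the visualise subcommand.

Configuration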

Print the current configuration

Usage: topiary config [OPTIONS]

Options:
  -C, --configuration <CONFIGURATION>
          Configuration file

          [env: TOPIARY_CONFIG_FILE]

      --configuration-collation <CONFIGURATION_COLLATION>
          Configuration collation mode

          [env: TOPIARY_CONFIG_COLLATION]
          [default: merge]

          Possible values:
          - merge:    When multiple sources of configuration are available, matching items are
            updated from the higher priority source, with collections merged as the union of sets
          - revise:   When multiple sources of configuration are available, matching items
            (including collections) are superseded from the higher priority source
          - override: When multiple sources of configuration are available, the highest priority
            source is taken. All values from lower priority sources are discarded

  -v, --verbose...
          Logging verbosity (increased per occurrence)

  -h, --help
          Print help (see a summary with '-h')

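See the Configuration section below for more information on the different configuration sources and collation modes.

Note: cfg is a recognised alias of the config subcommand.

Shell Completion

Shell completion scripts for Topiary can be generated with the completion subcommand. The output can be sourced into your shell session or profile, as required.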

Generate shell completion script

Usage: topiary completion [OPTIONS] [SHELL]

Arguments:
  [SHELL]
          Shell (omit to detect from the environment)

          [possible values: bash, elvish, fish, powershell, zsh]

Options:
  -C, --configuration <CONFIGURATION>
          Configuration file

          [env: TOPIARY_CONFIG_FILE]

      --configuration-collation <CONFIGURATION_COLLATION>
          Configuration collation mode

          [env: TOPIARY_CONFIG_COLLATION]
          [default: merge]

          Possible values:
          - merge:    When multiple sources of configuration are available, matching items are
            updated from the higher priority source, with collections merged as the union of sets
          - revise:   When multiple sources of configuration are available, matching items
            (including collections) are superseded from the higher priority source
          - override: When multiple sources of configuration are available, the highest priority
            source is taken. All values from lower priority sources are discarded

  -v, --verbose...
          Logging verbosity (increased per occurrence)

  -h, --help
          Print help (see a summary with '-h')

For example, in Bash:

source <(topiary completion)

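Logging

By default, the Topiary CLI will only output error messages. You can increase the logging verbosity with a respective number of -v/--verbose flags:

Verbosity Flag	Logging Level
None	Errors
`-v`	...and warnings
`-vv`	...and information
`-vvv`	...and debugging output
`-vvvv`	...and tracing output

Exit Codes

The Topiary process will exit with a zero exit code upon successful formatting. Otherwise, the following exit codes are defined:

Reason	Code
Unspecified error	1
CLI argument parsing error	2
I/O error	3
Topiary query error	4
Source parsing error	5
Language detection error	6
Idempotency error	7
Unspecified formatting error	8
Multiple errors	9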

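When given multiple inputs, Topiary will do its best to process them all, even in the presence of errors. Should any errors occur, Topiary will return a non-zero exit code. For more details on the nature of these errors, run Topiary at the warn logging level (with -v).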
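
A script that drives Topiary may want to turn these exit codes back into readable messages. The following Python sketch simply transcribes the table above; `EXIT_CODES` and `describe_exit_code` are hypothetical names chosen for illustration and are not part of Topiary.

```python
# Exit codes as listed in the table above; 0 means successful formatting.
EXIT_CODES = {
    0: "success",
    1: "unspecified error",
    2: "CLI argument parsing error",
    3: "I/O error",
    4: "Topiary query error",
    5: "source parsing error",
    6: "language detection error",
    7: "idempotency error",
    8: "unspecified formatting error",
    9: "multiple errors",
}


def describe_exit_code(code):
    """Return a human-readable reason for a Topiary exit code."""
    return EXIT_CODES.get(code, f"unknown exit code {code}")
```

This could be paired with the `returncode` attribute of a `subprocess.run` result, for example to decide whether a rerun with `-v` is worthwhile.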

Example

Once built, the program can be run like this:

echo '{"foo":"bar"}' | topiary fmt --language json

topiary can also be built and run from source via either Cargo or Nix, if you have those installed:

echo '{"foo":"bar"}' | cargo run -- fmt --language json
echo '{"foo":"bar"}' | nix run . -- fmt --language json

It will output the following formatted code:

{ "foo": "bar" }

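Configuration

Topiary is configured using languages.toml files. There are up to four sources where Topiary checks for such a file.

Configuration Sources

At build time, the languages.toml in the root of this repository is embedded into Topiary. This file is parsed at runtime. The purpose of this languages.toml file is to provide sane defaults for users of Topiary (both the library and the binary).

The next two are read by the Topiary binary at runtime and allow the user to configure Topiary to their needs. The first is intended to be user-specific and can thus be found in the configuration directory of the OS:

OS	Typical Configuration Path
Unix	`/home/alice/.config/topiary/languages.toml`
Windows	`C:\Users\Alice\AppData\Roaming\Topiary\config\languages.toml`
macOS	`/Users/Alice/Library/Application Support/Topiary/languages.toml`

This file is not automatically created by Topiary.

The next source is intended to be a project-specific settings file for Topiary. When running Topiary in some directory, it will ascend the file tree until it finds a .topiary directory. It will then read any languages.toml file present in that directory.

Finally, an explicit configuration file may be specified using the -C/--configuration command line argument (or the TOPIARY_CONFIG_FILE environment variable). This is intended for driving Topiary under very specific use cases.

The Topiary binary parses these sources in the following order. The action taken on matching items depends on the collation mode.

1. The built-in configuration file.
2. The user configuration file in the OS's configuration directory.
3. The project-specific Topiary configuration.
4. The explicit configuration file specified as a CLI argument.

Configuration Options

The configuration file contains a list of languages, with each language configuration headed by [[language]]. For instance, the one for Nickel is defined as such: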

[[language]]
name = "nickel"
extensions = ["ncl"]

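The name field is used by Topiary to associate the language entry with the query file and Tree-sitter grammar. This value should be written in lowercase. The name field is mandatory for every [[language]] block in every configuration file.

The list of extensions is mandatory for every language, but does not necessarily need to exist in every configuration file. It is sufficient if, for every language, there is a single configuration file that defines the list of extensions for that language.

A final optional field, called indent, defines the indentation method for that language. Topiary defaults to two spaces "  " if it cannot find the indent field in any configuration file for a specific language.

Configuration Collation

When parsing configuration from multiple sources, Topiary can collate matching configuration items (matched on language name) in various ways. The collation mode is set by the --configuration-collation command line argument (or the TOPIARY_CONFIG_COLLATION environment variable).

The different modes are best explained by example. Consider the following two configurations, in priority order from lowest to highest (comments have been added for illustrative purposes):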

# Lowest priority configuration

[[language]]
name = "example"
extensions = ["eg"]

[[language]]
name = "demo"
extensions = ["demo"]

# Highest priority configuration

[[language]]
name = "example"
extensions = ["example"]
indent = "    "

Merge Mode (Default)

Matching items are updated from the higher priority source, with collections merged as the union of sets.

# For the "example" language:
# * The collated extensions is the union of the source extensions
# * The indentation is taken from the highest priority source
[[language]]
name = "example"
extensions = ["eg", "example"]
indent = "    "

# The "demo" language is unchanged
[[language]]
name = "demo"
extensions = ["demo"]

Revise Mode

Matching items (including collections) are superseded from the higher priority source.

# The "example" language's values are taken from the highest priority source
[[language]]
name = "example"
extensions = ["example"]
indent = "    "

# The "demo" language is unchanged
[[language]]
name = "demo"
extensions = ["demo"]

Override Mode

The highest priority source is taken. All values from lower priority sources are discarded.

# The "example" language's values are taken from the highest priority source
[[language]]
name = "example"
extensions = ["example"]
indent = "    "

# The "demo" language does not exist in the highest priority source, so is omitted
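
The three collation modes can be summarised in a few lines of Python. This is a minimal sketch of the behaviour described above, not Topiary's implementation: `collate`, `LOW` and `HIGH` are hypothetical names, each configuration is modelled as a list of dictionaries mirroring the `[[language]]` tables, and revise mode is assumed to supersede values field by field.

```python
# The two example configurations from above, lowest priority first.
LOW = [
    {"name": "example", "extensions": ["eg"]},
    {"name": "demo", "extensions": ["demo"]},
]
HIGH = [
    {"name": "example", "extensions": ["example"], "indent": "    "},
]


def collate(sources, mode="merge"):
    """Collate language entries from `sources`, ordered lowest priority first."""
    if mode not in ("merge", "revise", "override"):
        raise ValueError(f"unknown collation mode: {mode}")
    if mode == "override":
        # Only the highest priority source survives.
        return [dict(entry) for entry in sources[-1]]

    collated = {}  # language name -> collated entry, in first-seen order
    for source in sources:
        for entry in source:
            current = collated.setdefault(entry["name"], {})
            for key, value in entry.items():
                if mode == "merge" and isinstance(value, list):
                    # Union of the collections, keeping first-seen order.
                    existing = current.get(key, [])
                    current[key] = existing + [v for v in value if v not in existing]
                elif isinstance(value, list):
                    # Revise mode: the collection is superseded as a whole.
                    current[key] = list(value)
                else:
                    # Scalars are always taken from the higher priority source.
                    current[key] = value
    return list(collated.values())
```

Calling `collate([LOW, HIGH], mode)` with each of the three modes yields the same entries as the three collated configurations shown above.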

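Design

As long as there is a Tree-sitter grammar defined for a language, Tree-sitter can parse it and provide a concrete syntax tree (CST). Tree-sitter also allows us to run queries against this tree. We can make use of that to define how a language should be formatted. Here is an example query: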

[
  (infix_operator)
  "if"
  ":"
] @append_space

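This will match any node that the grammar has identified as an infix_operator, as well as any anonymous node containing if or :. The match will be captured with the name @append_space. Our formatter runs through all matches and captures, and when it processes any capture called @append_space, it appends a space after the matched node.

The formatter goes through the CST nodes and detects all that span more than one line. This is interpreted as an indication from the programmer who wrote the input that the node in question should be formatted as multi-line. Any other nodes will be formatted as single-line. Whenever a query match has inserted a softline, it will be expanded to a newline if the node is multi-line, or to a space or nothing if the node is single-line, depending on whether @append_spaced_softline or @append_empty_softline was used.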
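
The softline rule above can be sketched as follows. This is a simplified illustration rather than Topiary's code: `is_multi_line` and `expand_softline` are hypothetical names, and node positions are assumed to be (row, column) pairs like those in Tree-sitter's CST output.

```python
def is_multi_line(start, end):
    """A node is multi-line when it ends on a later row than it starts.

    `start` and `end` are (row, column) pairs.
    """
    return end[0] > start[0]


def expand_softline(kind, multi_line):
    """Expand a softline of the given kind ("spaced" or "empty")."""
    if kind not in ("spaced", "empty"):
        raise ValueError(f"unknown softline kind: {kind}")
    if multi_line:
        # Both kinds become a line break in a multi-line node.
        return "\n"
    # In a single-line node, only the spaced variant leaves a space.
    return " " if kind == "spaced" else ""
```

A node spanning (0, 1) to (1, 1) is multi-line under this rule, so both kinds of softline would expand to a newline there, while a node spanning (0, 1) to (0, 4) would get a space or nothing.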

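Before rendering the output, the formatter performs a number of cleanup operations, such as reducing consecutive spaces and newlines to one, trimming spaces at the end of lines and leading and trailing blank lines, and ordering indenting and newline instructions consistently.

This means that you can, for example, both prepend and append spaces to if and true, and we will still output if true with just one space between the words.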
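
To make the cleanup step more concrete, here is a rough Python sketch of the whitespace-collapsing part. It assumes, purely for illustration, that the formatter's intermediate output is a flat list of atoms in which `SPACE` and `NEWLINE` are markers and everything else is literal text; the names `cleanup` and `render` are hypothetical, and indentation handling and blank-line allowances are left out.

```python
SPACE = " "
NEWLINE = "\n"


def cleanup(atoms):
    """Collapse redundant whitespace atoms in a flat atom list."""
    cleaned = []
    for atom in atoms:
        if atom == SPACE:
            # Drop a space at the very start, after a space, or after a newline.
            if not cleaned or cleaned[-1] in (SPACE, NEWLINE):
                continue
        elif atom == NEWLINE:
            # A newline absorbs preceding spaces (no trailing whitespace).
            while cleaned and cleaned[-1] == SPACE:
                cleaned.pop()
            # Drop leading and repeated newlines.
            if not cleaned or cleaned[-1] == NEWLINE:
                continue
        cleaned.append(atom)
    # Trim whitespace atoms left at the very end.
    while cleaned and cleaned[-1] in (SPACE, NEWLINE):
        cleaned.pop()
    return cleaned


def render(atoms):
    """Join the cleaned atoms into the output text."""
    return "".join(cleanup(atoms))
```

With this sketch, `render(["if", SPACE, SPACE, "true"])` produces `if true`, matching the example in the text where a space is both appended to if and prepended to true.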

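Supported capture instructions

This assumes you are already familiar with the Tree-sitter query language.

Note that a capture is put after the node it is associated with. If you want to put a space in front of a node, you do it like this: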

(infix_operator) @prepend_space

This, on the other hand, will not work:

@append_space (infix_operator)

`@allow_blank_line_before`

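The matched nodes will be allowed to have a blank line before them, if one is present in the input. For any other nodes, blank lines will be removed.

Example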

; Allow comments and type definitions to have a blank line above them
[
  (comment)
  (type_definition)
] @allow_blank_line_before

`@append_delimiter` / `@prepend_delimiter`

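The matched nodes will have a delimiter appended (or prepended) to them. The delimiter must be specified using the predicate #delimiter!.

Example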

; Put a semicolon delimiter after field declarations, unless they already have
; one, in which case we do nothing.
(
  (field_declaration) @append_delimiter
  .
  ";"* @do_nothing
  (#delimiter! ";")
)

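If there is already a semicolon, the @do_nothing instruction will be activated and prevent the other instructions in the query (here, @append_delimiter) from applying. Otherwise, ";"* captures nothing, and in that case the associated instruction (@do_nothing) does not activate.

Note that @append_delimiter is the same as @append_space when the delimiter is set to " " (i.e., a space).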

`@append_multiline_delimiter` / `@prepend_multiline_delimiter`

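The matched nodes will have a multi-line-only delimiter appended (or prepended) to them. It is printed only in multi-line nodes and omitted in single-line nodes. The delimiter must be specified using the predicate #delimiter!.

Example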

; Add a semicolon at the end of lists only if they are multi-line, to avoid [1; 2; 3;].
(list_expression
  (#delimiter! ";")
  (_) @append_multiline_delimiter
  .
  ";"? @do_nothing
  .
  "]"
  .
)

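If there is already a semicolon, the @do_nothing instruction will be activated and prevent the other instructions in the query (here, @append_multiline_delimiter) from applying. Likewise, if the node is single-line, the delimiter will not be appended either.

`@append_empty_softline` / `@prepend_empty_softline`

The matched nodes will have an empty softline appended or prepended to them. This will be expanded to a newline for multi-line nodes and to nothing for single-line nodes.

Example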

; Put an empty softline before dots, so that in multi-line constructs we start
; new lines for each dot.
(_
  "." @prepend_empty_softline
)

`@append_hardline` / `@prepend_hardline`

The matched nodes will have a newline appended or prepended to them.

Example

; Consecutive definitions must be separated by line breaks
(
  (value_definition) @append_hardline
  .
  (value_definition)
)

`@append_indent_start` / `@prepend_indent_start`

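The matched nodes will trigger indentation before or after them. This will only apply to the lines that follow, until an indentation end is signalled. If indentation is started and ended on the same line, nothing will happen. This is useful, because we get the correct behaviour whether a node is formatted as single-line or multi-line. It is important that all indentation starts and ends are balanced.

Example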

; Start an indented block after these
[
  "begin"
  "else"
  "then"
  "{"
] @append_indent_start

`@append_indent_end` / `@prepend_indent_end`

The matched nodes will trigger the end of indentation before or after them.

Example

; End the indented block before these
[
  "end"
  "}"
] @prepend_indent_end

; End the indented block after these
[
  (else_clause)
  (then_clause)
] @append_indent_end

`@append_input_softline` / `@prepend_input_softline`

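The matched nodes will have an input softline appended or prepended to them. An input softline is a newline if the node has a newline in front of it in the input document; otherwise it is a space.

Example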

; Input softlines before and after all comments. This means that the input
; decides if a comment should have line breaks before or after. But don't put a
; softline directly in front of commas or semicolons.

(comment) @prepend_input_softline

(
  (comment) @append_input_softline
  .
  [ "," ";" ]* @do_nothing
)

`@append_space` / `@prepend_space`

The matched nodes will have a space appended or prepended to them. Note that this is the same as @append_delimiter / @prepend_delimiter with a space as the delimiter.

Example

[
  (infix_operator)
  "if"
  ":"
] @append_space

`@append_antispace` / `@prepend_antispace`

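It is often the case that tokens need to be juxtaposed with spaces, except in a few isolated contexts. Rather than writing complicated rules that enumerate every exception, an "antispace" can be inserted with @append_antispace / @prepend_antispace; this will destroy any spaces (not newlines) from that node, including those added by other formatting rules.

Example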

[
  ","
  ";"
  ":"
  "."
] @prepend_antispace

`@append_spaced_softline` / `@prepend_spaced_softline`

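The matched nodes will have a spaced softline appended or prepended to them. This will be expanded to a newline for multi-line nodes and to a space for single-line nodes.

Example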

; Append spaced softlines, unless there is a comment following.
(
  [
    "begin"
    "else"
    "then"
    "->"
    "{"
    ";"
  ] @append_spaced_softline
  .
  (comment)* @do_nothing
)

`@delete`

Remove the matched node from the output.

Example

; Move semicolon after comments.
(
  ";" @delete
  .
  (comment)+ @append_delimiter
  (#delimiter! ";")
)

`@do_nothing`

If any of the captures in a query match are @do_nothing, then the match will be ignored.

Example

; Put a semicolon delimiter after field declarations, unless they already have
; one, in which case we do nothing.
(
  (field_declaration) @append_delimiter
  .
  ";"* @do_nothing
  (#delimiter! ";")
)

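`@multi_line_indent_all`

To be used on comments or other leaf nodes, to indicate that all of their lines should be indented, not just the first.

Example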

(#language! ocaml)
(comment) @multi_line_indent_all

`@single_line_no_indent`

The matched node will be printed alone, on a single line, with no indentation.

Example

(#language! ocaml)
; line number directives must be alone on their line, and can't be indented
(line_number_directive) @single_line_no_indent

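Understanding the different newline captures

Type	Single-Line Context	Multi-Line Context
Hardline	Newline	Newline
Empty Softline	Nothing	Newline
Spaced Softline	Space	Newline
Input Softline	Input-Dependent	Input-Dependent

"Input softlines" are rendered as newlines whenever the targeted node follows a newline in the input. Otherwise, they are rendered as spaces.

Example

Consider the following JSON, which has been hand-formatted to exhibit every context under which the different newline capture names operate: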

{
  "single-line": [1, 2, 3, 4],
  "multi-line": [
    1, 2,
    3
    , 4
  ]
}

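We will apply a simplified set of JSON format queries that:

1. Open (and close) an indented block for objects;
2. Put each key-value pair on its own line, with the value split onto a second line;
3. Apply the different newline capture names on array delimiters.

That is, iterating over each @NEWLINE type, we apply the following: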

(#language! json)

(object . "{" @append_hardline @append_indent_start)
(object "}" @prepend_hardline @prepend_indent_end .)
(object (pair) @prepend_hardline)
(pair . _ ":" @append_hardline)

(array "," @NEWLINE)

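The rules for objects and pairs are only there for clarity. The last rule is the important one; its results are demonstrated below:

`@append_hardline`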

{
  "single-line":
  [1,
  2,
  3,
  4],
  "multi-line":
  [1,
  2,
  3,
  4]
}

`@prepend_hardline`

{
  "single-line":
  [1
  ,2
  ,3
  ,4],
  "multi-line":
  [1
  ,2
  ,3
  ,4]
}

`@append_empty_softline`

{
  "single-line":
  [1,2,3,4],
  "multi-line":
  [1,
  2,
  3,
  4]
}

`@prepend_empty_softline`

{
  "single-line":
  [1,2,3,4],
  "multi-line":
  [1
  ,2
  ,3
  ,4]
}

`@append_spaced_softline`

{
  "single-line":
  [1, 2, 3, 4],
  "multi-line":
  [1,
  2,
  3,
  4]
}

`@prepend_spaced_softline`

{
  "single-line":
  [1 ,2 ,3 ,4],
  "multi-line":
  [1
  ,2
  ,3
  ,4]
}

`@append_input_softline`

{
  "single-line":
  [1, 2, 3, 4],
  "multi-line":
  [1, 2,
  3, 4]
}

`@prepend_input_softline`

{
  "single-line":
  [1 ,2 ,3 ,4],
  "multi-line":
  [1 ,2 ,3
  ,4]
}

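Custom scopes and softlines

So far, we have expanded softlines into line breaks depending on whether the CST node they are associated with is multi-line. Sometimes, CST nodes define scopes that are either too big or too small for our needs. For instance, consider this piece of OCaml code: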

(1,2,
3)

Its CST is the following:

{Node parenthesized_expression (0, 0) - (1, 2)} - Named: true
  {Node ( (0, 0) - (0, 1)} - Named: false
  {Node product_expression (0, 1) - (1, 1)} - Named: true
    {Node product_expression (0, 1) - (0, 4)} - Named: true
      {Node number (0, 1) - (0, 2)} - Named: true
      {Node , (0, 2) - (0, 3)} - Named: false
      {Node number (0, 3) - (0, 4)} - Named: true
    {Node , (0, 4) - (0, 5)} - Named: false
    {Node number (1, 0) - (1, 1)} - Named: true
  {Node ) (1, 1) - (1, 2)} - Named: false

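We would like to add a line break after the first comma, but because the CST structure is nested, the node containing this comma (product_expression (0, 1) - (0, 4)) is not multi-line. Only the top-level node product_expression (0, 1) - (1, 1) is multi-line.

To solve this issue, we introduce user-defined scopes and softlines.

`@prepend_begin_scope` / `@append_begin_scope` / `@prepend_end_scope` / `@append_end_scope`

These tags are used to define custom scopes. In conjunction with the #scope_id! predicate, they define scopes that can span multiple CST nodes, or only part of one. For instance, this scope matches anything between the parentheses in a parenthesized_expression: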

(parenthesized_expression
  "(" @append_begin_scope
  ")" @prepend_end_scope
  (#scope_id! "tuple")
)

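Scoped softlines

We define four capture names that insert softlines in custom scopes, used in conjunction with the #scope_id! predicate:

@prepend_empty_scoped_softline
@prepend_spaced_scoped_softline
@append_empty_scoped_softline
@append_spaced_scoped_softline

When one of these scoped softlines is used, its behaviour depends on the innermost encompassing scope with the corresponding scope_id. If that scope is multi-line, the softline expands into a line break. In every other regard, scoped softlines behave like their non-scoped counterparts.

Example

This Tree-sitter query: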

(#language! ocaml)

(parenthesized_expression
  "(" @append_begin_scope @append_empty_softline @append_indent_start
  ")" @prepend_end_scope @prepend_empty_softline @prepend_indent_end
  (#scope_id! "tuple")
)

(product_expression
  "," @append_spaced_scoped_softline
  (#scope_id! "tuple")
)

...formats this piece of code:

(1,2,
3)

...as:

(
  1,
  2,
  3
)

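...while the single-line (1, 2, 3) is kept as is.

If we used @append_spaced_softline rather than @append_spaced_scoped_softline, the 1, would be followed by a space rather than a newline, because it is inside a single-line product_expression.

Testing context with predicates

Sometimes, similarly to what happens with softlines, we want a query to match only if the context is single-line, or multi-line. Topiary has several predicates that achieve this result.

#single_line_only! / #multi_line_only!

These predicates allow the query to trigger only if the matched nodes are in a single-line (or, respectively, multi-line) context.

Example

; Allow (and enforce) the optional "|" before the first match case
; in OCaml if and only if the context is multi-line
(
  "with"
  .
  "|" @delete
  .
  (match_case)
  (#single_line_only!)
)

(
  "with"
  .
  "|"? @do_nothing
  .
  (match_case) @prepend_delimiter
  (#delimiter! "| ")
  (#multi_line_only!)
)

#single_line_scope_only! / #multi_line_scope_only!

These predicates allow the query to trigger only if the associated custom scope containing the matched nodes is single-line (or, respectively, multi-line).

Example

; Allow (and enforce) the optional "|" before the first match case
; in function expressions in OCaml if and only if the scope is multi-line
(function_expression
  (match_case)? @do_nothing
  .
  "|" @delete
  .
  (match_case)
  (#single_line_scope_only! "function_definition")
)
(function_expression
  "|"? @do_nothing
  .
  (match_case) @prepend_delimiter
  (#multi_line_scope_only! "function_definition")
  (#delimiter! "| ") ; sic
)

Suggested workflow

In order to work productively on query files, the following is one suggested way to work:

1. Add a sample file to topiary-cli/tests/samples/input.

2. Copy the same file to topiary-cli/tests/samples/expected, and make any changes to how you want the output to be formatted.

3. If this is a new language, add its Tree-sitter grammar, extend crate::language::Language and process it everywhere, then make a mostly empty query file with just the (#language!) configuration.

4. Run:

   RUST_LOG=debug \
   cargo test -p topiary-cli \
   input_output_tester \
   -- --nocapture

   Provided it works, it should output a lot of log messages. Copy that output to a text editor. You are particularly interested in the CST output that starts with a line like this: CST node: {Node compilation_unit (0, 0) - (5942, 0)} - Named: true.

   💡 As an alternative to using the debugging output, the vis visualisation subcommand can output the Tree-sitter syntax tree in a variety of formats.

5. The test run will output all the differences between the actual output and the expected output, e.g. missing spaces between tokens. Pick a difference you would like to fix, and find the line number and column in the input file.

   💡 Keep in mind that the CST output uses 0-based line and column numbers, so if your editor reports line 40, column 37, you probably want line 39, column 36.

6. In the CST debug or visualisation output, find the nodes in this region, such as the following:

   [DEBUG atom_collection] CST node: {Node constructed_type (39, 15) - (39, 42)} - Named: true
   [DEBUG atom_collection] CST node: {Node type_constructor_path (39, 15) - (39, 35)} - Named: true
   [DEBUG atom_collection] CST node: {Node type_constructor (39, 15) - (39, 35)} - Named: true
   [DEBUG atom_collection] CST node: {Node type_constructor_path (39, 36) - (39, 42)} - Named: true
   [DEBUG atom_collection] CST node: {Node type_constructor (39, 36) - (39, 42)} - Named: true

7. This may indicate that you would like spaces after all type_constructor_path nodes:

   (type_constructor_path) @append_space

   Or, more likely, you just want spaces between pairs of them:

   (
     (type_constructor_path) @append_space
     .
     (type_constructor_path)
   )

   Or maybe you want spaces between all children of constructed_type:

   (constructed_type
     (_) @append_space
     .
     (_)
   )

8. Run cargo test again, to see if the output has improved, then return to step 5.

Syntax Tree Visualisation

To support the development of formatting queries, the Tree-sitter syntax tree for a given input can be produced using the visualise subcommand.

This currently supports JSON output, covering the same information as the debugging output, as well as GraphViz DOT output, which is useful for generating syntax diagrams. (Note that the text positions in the visualisation output are 1-based, unlike the debugging output's 0-based positions.)

Terminal-Based Playground

Nix users may also find the playground.sh script helpful in aiding the interactive development of query files. When run in a terminal, it will format the given source input with the requested query file, updating the output on any inotify event against those files.

Usage: ${PROGNAME} LANGUAGE [QUERY_FILE] [INPUT_SOURCE]

LANGUAGE can be one of the supported languages (e.g., "ocaml", "rust",
etc.). The packaged formatting queries for this language can be
overridden by specifying a QUERY_FILE.

The INPUT_SOURCE is optional. If not specified, it defaults to trying
to find the bundled integration test input file for the given language.

For example, the playground can be run in a tmux pane, with your editor of choice open in another.

Related Tools

Tree-Sitter Specific

- Syntax Tree Playground: An interactive, online playground for experimenting with Tree-sitter and its query language.
- Neovim Treesitter Playground: A Tree-sitter playground plugin for Neovim.
- Difftastic: A tool that utilises Tree-sitter to perform syntactic diffing.

Meta and Multi-Language Formatters

- format-all: A formatter orchestrator for Emacs.
- null-ls.nvim: An LSP framework for Neovim that facilitates formatter orchestration.
- prettier: A formatter with support for multiple (web-development related) languages.
- treefmt: A general formatter orchestrator, which unifies formatters under a common interface.

Related Formatters

- gofmt: The default formatter for Go, and a source of inspiration for the style of our formatters.
- ocamlformat: A formatter for OCaml.
- ocp-indent: A tool to indent OCaml code.
- Ormolu: Our formatter for Haskell, which follows similar design principles as Topiary.
- rustfmt: The default formatter for Rust.
- shfmt: A parser, formatter and interpreter for Bash et al.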

No runtime dependencies. Features: bash css json nickel ocaml ocaml_interface ocamllex rust toml tree_sitter_query
