2 个版本

0.1.1	2024 年 3 月 6 日
0.1.0	2024 年 3 月 5 日

#133 在编程语言中

MIT 许可证

115KB
3.5K SLoC

PL/0（又称 PL_0）

❤️ 如果您喜欢这个项目，请给我 Star / Follow！❤️

开始使用

这是南京航空航天大学（又称 NUAA）《编译原理》课程的设计。

简介

PL/0 是 Pascal 的一个子集语言。

这是 PL/0 编译器的简单 Rust 实现。

BNF

<prog> -> program <id> ; <block>
<block> -> [<const-decl>][<var-decl>][<proc>]<body>
<const-decl> -> const <const> {, <const>} ;
<const> -> <id> := <integer>
<var-decl> -> var <id> {, <id>} ;
<proc> -> procedure <id> ([<id> {, <id>}]) ; <block> {; <proc>}
<body> -> begin <statement> {; <statement>} end
<statement> -> <id> := <exp>
              | if <l-exp> then <statement> [else <statement>]
              | while <l-exp> do <statement>
              | call <id> ([<exp> {, <exp>}])
              | <body>
              | read (<id> {, <id>})
              | write (<exp> {, <exp>})
<l-exp> -> <exp> <lop> <exp> | odd <exp>
<exp> -> [+|-] <term> {<aop> <term>}
<term> -> <factor> {<mop> <factor>}
<factor> -> <id> | <integer> | (<exp>)
<lop> -> = | <> | < | <= | > | >=
<aop> -> + | -
<mop> -> * | /
<id> -> <letter> {<letter> | <digit>}
<integer> -> <digit> {<digit>}
<letter> -> a | b | ... | z | A | B | ... | Z
<digit> -> 0 | 1 | ... | 9

结构

$$ \set{源代码} \Longrightarrow \textbf{词法分析器} \stackrel{标记}{\Longrightarrow} \textbf{语法分析器} \stackrel{抽象语法树}{\Longrightarrow} \textbf{代码生成} \Longrightarrow \set{PCode} \longrightarrow \textbf{虚拟机} \longrightarrow \set{结果} $$

部分	分析列表
词法分析器	`词法分析`
语法分析器	`语法分析`
代码生成	`语义分析`

概述

词法分析器/标记器

这部分非常简单，我完全手动实现了它，没有使用任何其他工具。

（但是，如果您愿意，可以使用像 flex 或 pest 这样的工具来自动生成词法分析器/标记器）

语法分析器

借助 递归下降算法，语法分析器也不是很难实现。

然而，在用递归下降算法实现语法分析器之前，有必要证明给定的 BNF 满足 LL(1) 的定义。

证明将在后面给出。错误处理我采用了受欢迎的 panic-mode-liked 错误处理策略，以确保编译器在一次运行中尽可能多地找到错误，而不是被第一个错误阻止。为了确保错误可以同步处理，必须有一个 FIRST-FOLLOW 表（我已经手动构建了它，这可以通过使用自动工具进一步改进）。代码生成 AST 到 PCode 的代码生成器是这一部分的默认策略。我还在开发一个 AST 到 Lua-Backend-Adapted-Representation（LBAR）的代码生成器（尚未实现）。虚拟机（简称 VM / 解释器）注意 PCode 是 codegen 的默认执行结果，而 Simple-PCode-Interpreter 是 Virtual Machine 的默认实现。尽管如此，我仍在尝试为 LBAR 实现一个 Lua-VM-Liked-VM。可行性分析证明：BNF 是 LL(1) 为了满足这一点，需要满足以下 3 个条件 $$ \begin{align*} \text{条件 1} &\dots \text{语法中没有检测到 \textit{左递归模式}} \ \text{条件 2} &\dots \forall A \in V_N (A \rightarrow \alpha_1 | \alpha_2 | \dots | \alpha_n) \Rightarrow First(\alpha_i) \cap First(\alpha_j) = \Phi ~ (i \ne j) \ \text{条件 3} &\dots \forall A \in V_N (\epsilon \in First(A)) \Rightarrow First(A) \cap Follow(A) = \Phi \end{align*} $$ 现在，让我们逐一证明它们！条件 1 ~ 语法中没有检测到左递归模式仔细查看给定的 BNF 后，我们可以轻松证明 $$ \forall A \in V_N (A \rightarrow B \wedge B \in V_N ) \Rightarrow A \ne B $$ 这意味着在语法中没有检测到左递归模式。条件 2 这可以很容易地通过参考 BNF 和 first_follow_table 来完成条件 3 与条件 2 相同斐波那契演示源代码 program fibonacci; const index := 30; var return,i,a; procedure fib(a,x); var sum; begin sum := 0; if x<2 then return := x else begin call fib(a+1,x-1); sum := sum+return; call fib(a+1,x-2); sum := sum+return; return := sum end end begin i := 1; a := 2; while i<=index do begin call fib(a+1,i); write(return); i := i+1 end end 结果控制台 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946 17711 28657 46368 75025 121393 196418 317811 514229 832040 PCode PCode List: ====================================================================== 0| JMP 0 39 1| JMP 0 4 2| STA 1 4 3| STA 2 3 4| INT 0 6 5| LIT 0 0 6| STO 0 5 7| LOD 0 4 8| LIT 0 2 9| OPR 0 10 10| JPC 0 14 11| LOD 0 4 12| STO 1 3 13| JMP 0 38 14| LOD 0 3 15| LIT 0 1 16| OPR 0 2 17| LOD 0 4 18| LIT 0 1 19| OPR 0 3 20| CAL 1 2 21| LOD 0 5 22| LOD 1 3 23| OPR 0 2 24| STO 0 5 25| LOD 0 3 26| LIT 0 1 27| OPR 0 2 28| LOD 0 4 29| LIT 0 2 30| OPR 0 3 31| CAL 1 2 32| LOD 0 5 33| LOD 1 3 34| OPR 0 2 35| STO 0 5 36| LOD 0 5 37| STO 1 3 38| OPR 0 0 39| INT 0 7 40| LIT 0 1 41| STO 0 4 42| LIT 0 2 43| STO 0 5 44| LOD 0 4 45| LIT 0 30 46| OPR 0 13 47| JPC 0 61 48| LOD 0 5 49| LIT 0 1 50| OPR 0 2 51| LOD 0 4 52| CAL 0 2 53| LOD 0 3 54| OPR 0 14 55| OPR 0 15 56| LOD 0 4 57| LIT 0 1 58| OPR 0 2 59| STO 0 4 60| JMP 0 44 61| OPR 0 0 ====================================================================== 符号表 Symbol Table: ====================================================================== name | type | val | level | addr | size | scope_list ====================================================================== index | const | 30 | 0 | 3 | 0 | ["#"] return | var | 0 | 0 | 3 | 0 | ["#"] i | var | 0 | 0 | 4 | 0 | ["#"] a | var | 0 | 0 | 5 | 0 | ["#"] fib | proc | 2 | 0 | 6 | 2 | ["#"] a | var | 0 | 1 | 3 | 0 | ["#", "fib"] x | var | 0 | 1 | 4 | 0 | ["#", "fib"] sum | var | 0 | 1 | 5 | 0 | ["#", "fib"] ====================================================================== 错误处理演示正如所述，这个 pl/0 编译器的实现具有完整的错误处理策略，这意味着它可以在一次运行中找到尽可能多的错误，而不是在第一个错误处停止。以下是一些简单的演示语法错误（可能与 Lexical Error 共同存在） src program ; var a, b, c; begin a 1; b := ; é : 3; if 1 = 1 then write(1 else write 0); write a + b + c; wrçte(1) end console SyntaxError{ Line: 1, Col: 9 } | ~~ Expected <id> field, but not found! SyntaxError{ Line: 4, Col: 8 } | ~~ Expected `:=`, but got `Integer(1)` SyntaxError{ Line: 5, Col: 9 } | ~~ Expected `<id>` / `<integer>` / `(<exp>)` field, but got an unmatchable token `;` LexicalError{ Line: 6, Col: 3 } | ~~ 'é' is not an ASCII character LexicalError{ Line: 6, Col: 5 } | ~~ ':' is an undefined sign, did you mean ':='? SyntaxError{ Line: 6, Col: 7 } | ~~ Expected `:=`, but got `Integer(3)` SyntaxError{ Line: 6, Col: 7 } | ~~ Expected <statement> field, but not found! SyntaxError{ Line: 9, Col: 6 } | ~~ Expected `)`, but got `Else` SyntaxError{ Line: 10, Col: 11 } | ~~ Expected `(`, but got `Integer(0)` SyntaxError{ Line: 11, Col: 9 } | ~~ Expected `(`, but got `Identifier("a")` SyntaxError{ Line: 11, Col: 18 } | ~~ Expected `)`, but got `;` LexicalError{ Line: 12, Col: 5 } | ~~ 'ç' is not an ASCII character SyntaxError{ Line: 12, Col: 7 } | ~~ Expected `:=`, but got `Identifier("te")` SyntaxError{ Line: 12, Col: 7 } | ~~ Expected <statement> field, but not found! thread 'main' panicked at src/parser/mod.rs:149:7: |> Errors above occurred (during `parsing`), compiling stopped ... <| note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace 语义错误重复定义 src program MultiDef; var a, a, a, a; procedure proc(); begin write(1) end; procedure proc(); begin write(2) end begin write(1) end console SemanticError{ Line: 3, Col: 8 } | ~~ `a` is defined before SemanticError{ Line: 3, Col: 11 } | ~~ `a` is defined before SemanticError{ Line: 3, Col: 14 } | ~~ `a` is defined before SemanticError{ Line: 10, Col: 14 } | ~~ `proc` is defined before thread 'main' panicked at src/translator/mod.rs:116:7: attempt to subtract with overflow note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace 未定义 src program undef; begin a := 1; b := 2; write(c) end console SemanticError{ Line: 3, Col: 3 } | ~~ `a` is undefined SemanticError{ Line: 4, Col: 3 } | ~~ `b` is undefined SemanticError{ Line: 5, Col: 9 } | ~~ `c` is undefined thread 'main' panicked at src/translator/mod.rs:73:7: |> Errors above occurred (during `translation/codegen`), compiling stopped ... <| note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace args_list.length 无法与定义（签名）匹配 src program WrongArgsListLength; var a; procedure proc(); begin write(1) end; procedure procc(x, t, z); begin write(1) end begin call proc(1, 1, 1); call procc(3) end console SemanticError{ Line: 16, Col: 11 } | ~~ `proc` expects 0 args, but received 3 SemanticError{ Line: 17, Col: 12 } | ~~ `procc` expects 3 args, but received 1 thread 'main' panicked at src/translator/mod.rs:73:7: |> Errors above occurred (during `translation/codegen`), compiling stopped ... <| 将值赋给 const / procedure src program AssignToConstProc; const i := 1; procedure proc(); begin write(i) end begin i := 16; proc := 16 end console SemanticError{ Line: 10, Col: 3 } | ~~ `i` is not a variable SemanticError{ Line: 11, Col: 6 } | ~~ `proc` is not a variable thread 'main' panicked at src/translator/mod.rs:73:7: |> Errors above occurred (during `translation/codegen`), compiling stopped ... <|

依赖项 ~3MB ~59K SLoC derive_more once_cell pest pest_derive project-root 其他特性 debug

bin+lib pl_0

2 个版本

PL/0（又称 PL_0）

❤️ 如果您喜欢这个项目，请给我 `Star` / `Follow`！❤️

开始使用

简介

BNF

结构

概述

词法分析器/标记器

语法分析器

错误处理

代码生成

虚拟机（简称 VM / 解释器）

可行性分析

证明：BNF 是 `LL(1)`

条件 1 ~ 语法中没有检测到左递归模式

条件 2

条件 3

斐波那契演示

错误处理演示

语法错误（可能与 `Lexical Error` 共同存在）

语义错误

重复定义

未定义

`args_list.length` 无法与定义（签名）匹配

将值赋给 `const` / `procedure`

依赖项

2 个版本

PL/0（又称 PL_0）

❤️ 如果您喜欢这个项目，请给我 Star / Follow！❤️

开始使用

简介

BNF

结构

概述

词法分析器/标记器

语法分析器

错误处理

代码生成

虚拟机（简称 VM / 解释器）

可行性分析

证明：BNF 是 LL(1)

条件 1 ~ 语法中没有检测到 左递归模式

条件 2

条件 3

斐波那契演示

错误处理演示

语法错误（可能与 Lexical Error 共同存在）

语义错误

重复定义

未定义

args_list.length 无法与定义（签名）匹配

将值赋给 const / procedure

依赖项

❤️ 如果您喜欢这个项目，请给我 `Star` / `Follow`！❤️

证明：BNF 是 `LL(1)`

条件 1 ~ 语法中没有检测到左递归模式

语法错误（可能与 `Lexical Error` 共同存在）

`args_list.length` 无法与定义（签名）匹配

将值赋给 `const` / `procedure`