Code reference:
200 lines of JS code to implement lambda interpreter
Interpreter construction
A lambda interpreter consists mainly of the following components:
- Lexer: decomposes the character stream into a token stream
- Parser: builds an abstract syntax tree (AST) from the token stream according to the grammar
- Interpreter (syntax-directed translator): traverses the AST and evaluates it
Other tools:
- TokenType: an enumeration of the token types
- AST: an interface that provides the syntax tree's toString() method and has three implementing classes (one per syntax rule):
  - Application
  - Abstraction
  - Identifier
TokenType
LPAREN: '('
RPAREN: ')'
LAMBDA: '\' // "\" is used instead of λ for ease of typing
DOT: '.'
LCID: /[a-z][a-zA-Z]*/
EOF: end of the input stream
Lexer
Auxiliary methods for handling tokens (self-defined):
- nextChar(): reads the next character
- nextToken(): skips whitespace, then reads the type of the next token and, for LCID, its value
- next(Token t): checks whether the next token is t. If so, the token is consumed and printed; otherwise nothing is printed and the index is restored
- match(Token t): asserts that the next token is t (i.e. next(t) returns true); otherwise an error is reported and the program exits [exception handling could be added here; I only report the error and exit]
Note: whenever next or match succeeds, the token type followed by a line feed is written to the console
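The helper methods above can be sketched roughly as follows. This is a minimal illustration, not the original implementation: the class names SketchLexer and Token are hypothetical, next and match take a TokenType rather than a Token, and errors are thrown as exceptions instead of exiting the program.

```java
/** Token kinds from the TokenType section. */
enum TokenType { LPAREN, RPAREN, LAMBDA, DOT, LCID, EOF }

/** A token: its type plus the matched text (only meaningful for LCID). */
final class Token {
    final TokenType type;
    final String value;
    Token(TokenType type, String value) { this.type = type; this.value = value; }
}

/** Hypothetical minimal lexer; the method names follow the list above. */
final class SketchLexer {
    private final String input;
    private int index = 0;   // position of the next unread character
    private Token last;      // token most recently accepted by next()

    SketchLexer(String input) { this.input = input; }

    /** Reads the next character, or '\0' once the input is exhausted. */
    private char nextChar() {
        return index < input.length() ? input.charAt(index++) : '\0';
    }

    private static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    /** Skips whitespace, then reads one token. */
    private Token nextToken() {
        char c = nextChar();
        while (Character.isWhitespace(c)) c = nextChar();
        switch (c) {
            case '\0': return new Token(TokenType.EOF, "");
            case '(':  return new Token(TokenType.LPAREN, "(");
            case ')':  return new Token(TokenType.RPAREN, ")");
            case '\\': return new Token(TokenType.LAMBDA, "\\");
            case '.':  return new Token(TokenType.DOT, ".");
            default:
                if (c < 'a' || c > 'z') {
                    throw new IllegalArgumentException("unexpected character: " + c);
                }
                // LCID: one lowercase letter followed by any ASCII letters
                StringBuilder name = new StringBuilder().append(c);
                while (index < input.length() && isAsciiLetter(input.charAt(index))) {
                    name.append(nextChar());
                }
                return new Token(TokenType.LCID, name.toString());
        }
    }

    /** If the next token has type t, consumes and prints it; otherwise restores the index. */
    boolean next(TokenType t) {
        int saved = index;
        Token token = nextToken();
        if (token.type != t) {
            index = saved;
            return false;
        }
        last = token;
        System.out.println(t);
        return true;
    }

    /** Asserts that the next token has type t and returns it. */
    Token match(TokenType t) {
        if (!next(t)) throw new IllegalStateException("expected " + t);
        return last;
    }
}
```

Saving and restoring the index in next is what lets the parser probe for a token type without committing to it.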
Parser
Grammar rules
1. Term ::= Application | LAMBDA LCID DOT Term
2. Application ::= Application Atom | Atom
   // This rule needs special treatment: it is left-recursive, which would lead to infinite recursion in a recursive-descent parser. It is therefore rewritten as right recursion:
   Application ::= Atom Application'
   Application' ::= Atom Application' | ε
3. Atom ::= LPAREN Term RPAREN | LCID
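One common way to realise the rewritten Application rule is a loop that keeps consuming atoms and nests them to the left. The sketch below assumes a simplified setting: it reads characters directly instead of using the Lexer, and returns a fully parenthesised string rather than an AST. The class name GrammarSketch and its helpers are hypothetical.

```java
/** Hypothetical recursive-descent sketch; returns a fully parenthesised string. */
final class GrammarSketch {
    private final String src;
    private int pos = 0;

    private GrammarSketch(String src) { this.src = src; }

    static String parse(String src) {
        GrammarSketch p = new GrammarSketch(src);
        String term = p.term();
        if (p.peek() != '\0') throw new IllegalArgumentException("unexpected input at " + p.pos);
        return term;
    }

    /** Skips whitespace and returns the next character without consuming it. */
    private char peek() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    private void expect(char c) {
        if (peek() != c) throw new IllegalArgumentException("expected '" + c + "' at " + pos);
        pos++;
    }

    private static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    /** An atom starts with LPAREN or with the first letter of an LCID. */
    private static boolean startsAtom(char c) {
        return c == '(' || (c >= 'a' && c <= 'z');
    }

    // Term ::= Application | LAMBDA LCID DOT Term
    private String term() {
        if (peek() == '\\') {
            pos++;
            String id = lcid();
            expect('.');
            return "\\" + id + "." + term();
        }
        return application();
    }

    // Application  ::= Atom Application'
    // Application' ::= Atom Application' | ε   (realised as the loop below)
    private String application() {
        String lhs = atom();
        while (startsAtom(peek())) {
            lhs = "(" + lhs + " " + atom() + ")";   // nest to the left
        }
        return lhs;
    }

    // Atom ::= LPAREN Term RPAREN | LCID
    private String atom() {
        if (peek() == '(') {
            pos++;
            String inner = term();
            expect(')');
            return inner;
        }
        return lcid();
    }

    // LCID: /[a-z][a-zA-Z]*/
    private String lcid() {
        char c = peek();
        if (c < 'a' || c > 'z') throw new IllegalArgumentException("expected identifier at " + pos);
        int start = pos++;
        while (pos < src.length() && isAsciiLetter(src.charAt(pos))) pos++;
        return src.substring(start, pos);
    }
}
```

For example, GrammarSketch.parse("a b c") yields ((a b) c): the loop preserves the left associativity of application even though the grammar was rewritten as right recursion.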
Abstract syntax tree AST construction
The AST of lambda calculus is very simple because we only have three kinds of nodes:
- Abstraction: "\ x. t1"
- Application: "t1 t2"
- Identifier: "x"
Abstraction
    Identifier param; // variable
    AST body; // expression
    toString is displayed as: "\." + body.toString()
Application
    AST lhs; // left subtree
    AST rhs; // right subtree
    toString is displayed as: "(" + lhs.toString() + " " + rhs.toString() + ")"
Identifier
    String name; // variable name
    String value; // De Bruijn index
    toString is displayed as: value
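The three node types might be written along these lines. This is a sketch under assumptions: the fields follow the descriptions above, except that the De Bruijn index is stored as an int rather than a String, and the constructors are hypothetical.

```java
/** Common supertype of the three node kinds; each node overrides toString(). */
interface AST {}

final class Identifier implements AST {
    final String name;   // variable name as written in the source
    final int value;     // De Bruijn index (an int here; an assumption of this sketch)

    Identifier(String name, int value) {
        this.name = name;
        this.value = value;
    }

    @Override public String toString() { return String.valueOf(value); }
}

final class Abstraction implements AST {
    final Identifier param;   // bound variable
    final AST body;           // expression

    Abstraction(Identifier param, AST body) {
        this.param = param;
        this.body = body;
    }

    // The parameter name is dropped, so alpha-equivalent terms print identically.
    @Override public String toString() { return "\\." + body; }
}

final class Application implements AST {
    final AST lhs;   // left subtree
    final AST rhs;   // right subtree

    Application(AST lhs, AST rhs) {
        this.lhs = lhs;
        this.rhs = rhs;
    }

    @Override public String toString() { return "(" + lhs + " " + rhs + ")"; }
}
```

Building (\x.\y.x \f.\g.g) by hand with these classes and calling toString() gives (\.\.1 \.\.0).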
De Bruijn index
De Bruijn indices are used so that duplicate variable names cannot lead to different reduction results.
(\x.\y.x \f.\g.g)
is first transformed into the following. The binders are left unchanged; numbering starts from 0, where 0 refers to the variable bound at the same level and 1 to the variable bound one level further out:
(\x.\y.1 \f.\g.0)
toString then displays the term with the variable names removed, to prevent inconsistencies caused by alpha conversion:
(\.\.1 \.\.0)
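One way to compute these indices while parsing is to keep a stack of the names bound so far and count how many binders lie between a variable and its own binder. The sketch below assumes this approach; DeBruijnSketch and indexOf are hypothetical names, and returning -1 for a free variable is a choice made here, not something the text specifies.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class DeBruijnSketch {
    /**
     * Returns the De Bruijn index of name: the number of binders between
     * the use and the binder that introduces it, or -1 if name is free.
     * ctx holds the bound names with the innermost binder first.
     */
    static int indexOf(Deque<String> ctx, String name) {
        int distance = 0;
        for (String bound : ctx) {   // iterates from the innermost binder outwards
            if (bound.equals(name)) return distance;
            distance++;
        }
        return -1;
    }
}
```

A parser would push the parameter name when it enters an abstraction and pop it when it leaves, for example:

```java
Deque<String> ctx = new ArrayDeque<>();
ctx.push("x");                          // entering \x.
ctx.push("y");                          // entering \y.
DeBruijnSketch.indexOf(ctx, "x");       // 1: bound one level further out
DeBruijnSketch.indexOf(ctx, "y");       // 0: bound at the same level
```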
Interpreter
Evaluation: evalAST (recursive)
evalAST rules:
- First, check whether ast is an Application. If so, branch on the type of its left subtree:
  - If the left subtree is an Abstraction, substitute the right subtree for the parameter in the body of the left subtree (this replaces the variable bound by the outermost lambda)
  - If the left subtree is an Application, evaluate the left and right subtrees separately, then inspect the form of the result to decide whether it can be reduced further
  - If the left subtree is an Identifier, evaluate the right subtree on its own and replace it with the result
- If ast is an Abstraction, evaluate its body
- If ast is an Identifier, return the leaf node unchanged
substitute
An adjustment is needed here. Substitution eliminates the outermost lambda, so every variable in the result has to be shifted by -1; to compensate, the leaf nodes of value are shifted by +1 before the substitution.
subst: variable substitution
- If node is an Application, substitute in the left and right subtrees respectively;
- If node is an Abstraction, substitute in node.body at depth outDepth + 1;
- If node is an Identifier, replace it only when its De Bruijn index equals outDepth (the substituted value is itself shifted by outDepth)
shift: De Bruijn index shift
- If node is an Application, shift the left and right subtrees respectively;
- If node is an Abstraction, the new body is the old body shifted by `by` at depth inDepth + 1;
- If node is an Identifier, add `by` to its De Bruijn index when the index is greater than or equal to inDepth; otherwise add 0 (only variables that refer to binders outside the current scope need to be shifted)
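Taken together, the evalAST, substitute, subst and shift rules above could be sketched as follows. This is an illustration under assumptions rather than the original code: Identifier is reduced to its De Bruijn index (stored as an int), nodes are immutable so every step builds new nodes, and SketchInterpreter is a hypothetical name. Like any reducer of this kind, it does not terminate on terms without a normal form.

```java
interface AST {}

final class Identifier implements AST {
    final int index;   // De Bruijn index
    Identifier(int index) { this.index = index; }
    @Override public String toString() { return String.valueOf(index); }
}

final class Abstraction implements AST {
    final AST body;
    Abstraction(AST body) { this.body = body; }
    @Override public String toString() { return "\\." + body; }
}

final class Application implements AST {
    final AST lhs, rhs;
    Application(AST lhs, AST rhs) { this.lhs = lhs; this.rhs = rhs; }
    @Override public String toString() { return "(" + lhs + " " + rhs + ")"; }
}

final class SketchInterpreter {
    /** Adds `by` to every index >= inDepth, i.e. to variables bound outside the current scope. */
    static AST shift(int by, AST node, int inDepth) {
        if (node instanceof Application) {
            Application app = (Application) node;
            return new Application(shift(by, app.lhs, inDepth), shift(by, app.rhs, inDepth));
        }
        if (node instanceof Abstraction) {
            return new Abstraction(shift(by, ((Abstraction) node).body, inDepth + 1));
        }
        Identifier id = (Identifier) node;
        return new Identifier(id.index + (id.index >= inDepth ? by : 0));
    }

    /** Replaces identifiers whose index equals outDepth with value, shifted by outDepth. */
    static AST subst(AST value, AST node, int outDepth) {
        if (node instanceof Application) {
            Application app = (Application) node;
            return new Application(subst(value, app.lhs, outDepth), subst(value, app.rhs, outDepth));
        }
        if (node instanceof Abstraction) {
            return new Abstraction(subst(value, ((Abstraction) node).body, outDepth + 1));
        }
        Identifier id = (Identifier) node;
        return id.index == outDepth ? shift(outDepth, value, 0) : id;
    }

    /** Shift value by +1, substitute it, then shift the result by -1 for the removed lambda. */
    static AST substitute(AST value, AST node) {
        return shift(-1, subst(shift(1, value, 0), node, 0), 0);
    }

    static AST eval(AST ast) {
        if (ast instanceof Application) {
            Application app = (Application) ast;
            if (app.lhs instanceof Abstraction) {
                // Beta reduction: substitute the argument into the body, then keep evaluating
                return eval(substitute(app.rhs, ((Abstraction) app.lhs).body));
            }
            if (app.lhs instanceof Application) {
                AST lhs = eval(app.lhs);
                AST rhs = eval(app.rhs);
                AST reduced = new Application(lhs, rhs);
                // If the left side became an abstraction, another reduction is possible
                return lhs instanceof Abstraction ? eval(reduced) : reduced;
            }
            // Left side is an identifier: only the right side can be reduced
            return new Application(app.lhs, eval(app.rhs));
        }
        if (ast instanceof Abstraction) {
            return new Abstraction(eval(((Abstraction) ast).body));
        }
        return ast;   // identifiers are already values
    }
}
```

Evaluating the term (\.\.\.(1 ((2 1) 0)) \.\.0) with this sketch gives \.\.(1 0), the same strings that the parser and interpreter tests in the Test section expect.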
Main method
public static void main(String[] args) {
    String source = "(\\x.\\y.x)(\\x.x)(\\y.y)";
    Lexer lexer = new Lexer(source);
    Parser parser = new Parser(lexer);
    Interpreter interpreter = new Interpreter(parser);
    AST result = interpreter.eval();
    System.out.println(result.toString());
}
Test
@Test
public void testLexer() {
    Lexer lexer = new Lexer(sources[1]);
    Parser parser = new Parser(lexer);
    // Parsing drives the lexer, which prints each matched token type on its own line
    AST ast = parser.parse();
    // bytes presumably captures the console output
    assertEquals(
        "LPAREN" + lineBreak
        + "LPAREN" + lineBreak
        + "LAMBDA" + lineBreak + "LCID" + lineBreak + "DOT" + lineBreak
        + "LAMBDA" + lineBreak + "LCID" + lineBreak + "DOT" + lineBreak
        + "LAMBDA" + lineBreak + "LCID" + lineBreak + "DOT" + lineBreak
        + "LCID" + lineBreak
        + "LPAREN" + lineBreak
        + "LCID" + lineBreak + "LCID" + lineBreak + "LCID" + lineBreak
        + "RPAREN" + lineBreak
        + "RPAREN" + lineBreak
        + "LPAREN" + lineBreak
        + "LAMBDA" + lineBreak + "LCID" + lineBreak + "DOT" + lineBreak
        + "LAMBDA" + lineBreak + "LCID" + lineBreak + "DOT" + lineBreak
        + "LCID" + lineBreak
        + "RPAREN" + lineBreak
        + "RPAREN" + lineBreak
        + "EOF" + lineBreak,
        bytes.toString());
}

@Test
public void testParser() {
    Lexer lexer = new Lexer(sources[1]);
    Parser parser = new Parser(lexer);
    AST ast = parser.parse();
    // Names are dropped in toString, leaving only De Bruijn indices
    assertEquals("(\\.\\.\\.(1 ((2 1) 0)) \\.\\.0)", ast.toString());
}

@Test
public void testInterpreter() {
    Lexer lexer = new Lexer(sources[1]);
    Parser parser = new Parser(lexer);
    interpreter = new Interpreter(parser);
    AST ast = parser.parse();
    AST result = interpreter.eval(ast);
    assertEquals("\\.\\.(1 0)", result.toString());
}