Two-week self-made scripting language - Day 5: designing the parser

Posted by bostonmacosx on Fri, 17 Jan 2020 15:44:27 +0100

Day 5: designing the parser

5.1 Syntax of Stone

Listing 5.1 Syntax definition of Stone

    primary   : "(" expr ")" | NUMBER | IDENTIFIER | STRING
    factor    : "-" primary | primary
    expr      : factor { OP factor }
    block     : "{" [ statement ] { (";" | EOL) [ statement ] } "}"
    simple    : expr
    statement : "if" expr block [ "else" block ]
              | "while" expr block
              | simple
    program   : [ statement ] (";" | EOL)
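To make the first three rules concrete, here is a minimal hand-written sketch (not the book's code) that parses already-split tokens according to primary, factor, and expr and prints the grouping it found. The class name MiniExprParser is made up for this note, IDENTIFIER and STRING are left out, and every OP is treated as left-associative with equal precedence; Stone itself resolves precedence with an operator table, described in 5.2.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the Stone sources.
class MiniExprParser {
    private final List<String> tokens;
    private int pos = 0;

    MiniExprParser(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : null;
    }

    private boolean isOp(String t) {
        return t != null && Arrays.asList("+", "-", "*", "/").contains(t);
    }

    // primary : "(" expr ")" | NUMBER   (IDENTIFIER and STRING omitted)
    String primary() {
        String t = peek();
        if (t == null) throw new IllegalStateException("unexpected end of input");
        if (t.equals("(")) {
            pos++;
            String e = expr();
            if (!")".equals(peek())) throw new IllegalStateException("expected )");
            pos++;
            return e;
        }
        if (!t.matches("[0-9]+")) throw new IllegalStateException("unexpected token: " + t);
        pos++;
        return t;
    }

    // factor : "-" primary | primary
    String factor() {
        if ("-".equals(peek())) {
            pos++;
            return "(- " + primary() + ")";
        }
        return primary();
    }

    // expr : factor { OP factor }
    String expr() {
        String left = factor();
        while (isOp(peek())) {
            String op = tokens.get(pos++);
            left = "(" + left + " " + op + " " + factor() + ")";
        }
        return left;
    }

    // Parses a whole token sequence and returns a parenthesized tree.
    static String parse(String... tokens) {
        MiniExprParser p = new MiniExprParser(tokens);
        String tree = p.expr();
        if (p.peek() != null) throw new IllegalStateException("trailing token: " + p.peek());
        return tree;
    }
}
```

Each grammar rule becomes one method, and the braces { ... } of the expr rule become a while loop. For the tokens - 1 + ( 2 * 3 ), MiniExprParser.parse returns ((- 1) + (2 * 3)).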

5.2 Using parsers and combinators

The Parser library is a simple parser combinator library. Its job is to let syntax rules written in BNF be rewritten as Java code. Its internals are explained in Chapter 17 of the book.

Listing 5.2 Parser for the Stone language

Listing 5.2 is a basic parser for Stone, obtained by converting the grammar in Listing 5.1 into Java.
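package stone;

import stone.Parser.Operators;
import stone.ast.*;

import java.util.HashSet;

import static stone.Parser.rule;

public class BasicParser {
    HashSet<String> reserved = new HashSet<String>();
    Operators operators = new Operators();
    // expr0 is a placeholder so that primary can refer to expr before expr is defined
    Parser expr0 = rule();
    // primary : "(" expr ")" | NUMBER | IDENTIFIER | STRING
    Parser primary = rule(PrimaryExpr.class)
        .or(rule().sep("(").ast(expr0).sep(")"),
            rule().number(NumberLiteral.class),
            rule().identifier(Name.class, reserved),
            rule().string(StringLiteral.class));
    // factor : "-" primary | primary
    Parser factor = rule().or(rule(NegativeExpr.class).sep("-").ast(primary),
                              primary);
    // expr : factor { OP factor }
    Parser expr = expr0.expression(BinaryExpr.class, factor, operators);

    // statement0 is a placeholder so that block can refer to statement
    Parser statement0 = rule();
    // block : "{" [ statement ] { (";" | EOL) [ statement ] } "}"
    Parser block = rule(BlockStmnt.class)
        .sep("{").option(statement0)
        .repeat(rule().sep(";", Token.EOL).option(statement0))
        .sep("}");
    // simple : expr
    Parser simple = rule(PrimaryExpr.class).ast(expr);
    // statement : "if" ... | "while" ... | simple
    Parser statement = statement0.or(
            rule(IfStmnt.class).sep("if").ast(expr).ast(block)
                               .option(rule().sep("else").ast(block)),
            rule(WhileStmnt.class).sep("while").ast(expr).ast(block),
            simple);

    // program : [ statement ] (";" | EOL)
    Parser program = rule().or(statement, rule(NullStmnt.class))
                           .sep(";", Token.EOL);

    public BasicParser() {
        // tokens that must never be parsed as identifiers
        reserved.add(";");
        reserved.add("}");
        reserved.add(Token.EOL);

        operators.add("=", 1, Operators.RIGHT);
        operators.add("==", 2, Operators.LEFT);
        operators.add(">", 2, Operators.LEFT);
        operators.add("<", 2, Operators.LEFT);
        operators.add("+", 3, Operators.LEFT);
        operators.add("-", 3, Operators.LEFT);
        operators.add("*", 4, Operators.LEFT);
        operators.add("/", 4, Operators.LEFT);
        operators.add("%", 4, Operators.LEFT);
    }

    public ASTree parse(Lexer lexer) throws ParseException {
        return program.parse(lexer);
    }
}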

Both the Parser class and the Operators class are provided by the library.

The rule method is a static method of the Parser class that returns a new Parser object.

The primary field is defined from the grammar rule for the nonterminal primary. In the same way, factor and block are the corresponding grammar rules expressed in Java.

Terminal: a symbol that never appears on its own on the left-hand side of a production, that is, a symbol that cannot be expanded any further.

Nonterminal: any symbol that is not a terminal. A nonterminal can be thought of as an element that can still be broken down, whereas a terminal is the smallest element and cannot be split.

Table 5.1 Methods of the Parser class

Converting grammar rules

paren : "(" expr ")"

Converted to Java, this rule becomes the following code:

Parser paren = rule().sep("(").ast(expr).sep(")");

factor : "-" primary | primary

The corresponding factor field is defined as follows

Parser factor = rule().or(rule().sep("-").ast(primary), primary);

expr : factor { OP factor }

This rule is converted with the expression method:

Parser expr = expr0.expression(BinaryExpr.class, factor, operators);

The third argument of the expression method is the operator table, which is held in an Operators object.

operators.add("=", 1, Operators.RIGHT);
// Operators.RIGHT: right-associative
// Operators.LEFT: left-associative

The arguments of the add method are the string that represents the operator, its precedence, and its associativity (left or right). The precedence is an int starting at 1; the higher the value, the higher the precedence.
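How a table of this shape can drive parsing is sketched below with the precedence-climbing technique. This is an illustration under assumptions, not the library's implementation: the class OpTableDemo, its add method taking a boolean for left associativity, and the parenthesized string output are all made up for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; these names are not taken from the Parser library.
class OpTableDemo {
    // One table entry: precedence and associativity.
    private static final class Prec {
        final int value;
        final boolean leftAssoc;
        Prec(int value, boolean leftAssoc) {
            this.value = value;
            this.leftAssoc = leftAssoc;
        }
    }

    private final Map<String, Prec> table = new HashMap<>();
    private List<String> tokens;
    private int pos;

    void add(String op, int prec, boolean leftAssoc) {
        table.put(op, new Prec(prec, leftAssoc));
    }

    String parse(List<String> input) {
        tokens = input;
        pos = 0;
        return expr(1);
    }

    // Parses operators whose precedence is at least minPrec.
    private String expr(int minPrec) {
        String left = tokens.get(pos++);   // operand: a number or a name
        while (pos < tokens.size()) {
            Prec p = table.get(tokens.get(pos));
            if (p == null || p.value < minPrec) break;
            String op = tokens.get(pos++);
            // Left-associative: the right side may only contain tighter operators.
            // Right-associative: the right side may contain the same operator again.
            String right = expr(p.leftAssoc ? p.value + 1 : p.value);
            left = "(" + left + " " + op + " " + right + ")";
        }
        return left;
    }

    // A table with a few of the entries used for Stone.
    static OpTableDemo stoneLike() {
        OpTableDemo d = new OpTableDemo();
        d.add("=", 1, false);
        d.add("==", 2, true);
        d.add("+", 3, true);
        d.add("-", 3, true);
        d.add("*", 4, true);
        return d;
    }
}
```

With this table, the tokens a = b = 1 + 2 * 3 group as (a = (b = (1 + (2 * 3)))), because = is right-associative and has the lowest precedence, while 10 - 4 - 3 groups as ((10 - 4) - 3), because - is left-associative.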

5.3 Abstract syntax tree generated by the parser

When parsing succeeds, the parse method of a Parser object returns the result as an abstract syntax tree.

Consider the following grammar rule:

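adder: NUMBER "+" NUMBER

Rewritten in Java:

Parser adder = rule().number().token("+").number();

The number and token methods add terminals to the pattern. Passing a class to rule or number specifies the type of node to create:

Parser adder = rule(BinaryExpr.class).number(NumberLiteral.class)
                                     .token("+")
                                     .number(NumberLiteral.class);

The sep method also adds a terminal to the pattern, but as a separator: it is matched and then left out of the abstract syntax tree.

Parser adder = rule().number().sep("+").number();

The ast method adds a nonterminal to the pattern.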

Parser eq = rule().ast(adder).token("==").ast(adder);
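The difference between token and sep, and the way ast nests one pattern inside another, can be illustrated with a toy combinator. The MiniRule class below is a hypothetical, heavily simplified stand-in written for this note; it is not the book's Parser class, it only mimics the method names number, token, sep, and ast, and it represents the tree as nested lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for illustration only; not the book's Parser class.
class MiniRule {
    // Each element consumes tokens and may append nodes to the result list.
    private interface Element {
        void parse(Iterator<String> in, List<Object> out);
    }

    private final List<Element> elements = new ArrayList<>();

    static MiniRule rule() {
        return new MiniRule();
    }

    // Terminal: a number, kept in the tree.
    MiniRule number() {
        elements.add((in, out) -> out.add(Integer.valueOf(in.next())));
        return this;
    }

    // Terminal: a fixed token, kept in the tree.
    MiniRule token(String t) {
        elements.add((in, out) -> out.add(expect(in, t)));
        return this;
    }

    // Terminal: a fixed token that is matched but left out of the tree.
    MiniRule sep(String t) {
        elements.add((in, out) -> expect(in, t));
        return this;
    }

    // Nonterminal: the subtree produced by another rule.
    MiniRule ast(MiniRule other) {
        elements.add((in, out) -> out.add(other.parse(in)));
        return this;
    }

    // Runs every element in order and collects the children of this node.
    List<Object> parse(Iterator<String> in) {
        List<Object> out = new ArrayList<>();
        for (Element e : elements) {
            e.parse(in, out);
        }
        return out;
    }

    // Convenience entry point for already-split tokens.
    static List<Object> run(MiniRule r, String... tokens) {
        return r.parse(Arrays.asList(tokens).iterator());
    }

    private static String expect(Iterator<String> in, String t) {
        String actual = in.next();
        if (!actual.equals(t)) {
            throw new IllegalStateException("expected " + t + " but got " + actual);
        }
        return actual;
    }
}
```

In this sketch, rule().number().token("+").number() applied to 1 + 2 yields [1, +, 2], while the sep variant yields [1, 2]. A pattern built as rule().ast(adder).token("==").ast(adder) yields one nested list per adder.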

Special rule: if a pattern yields only one child node, the Parser library does not create an extra node around it; that child is used directly as the result. This special rule does not apply when the rule method receives a class as its argument: rule(NegativeExpr.class), for example, always creates a subtree with a NegativeExpr object as its root.

If you want the special rule to apply even when the rule method receives a class, define a static method with the following signature in that class:

public static ASTree create(List<ASTree> c) {
    return c.size() == 1 ? c.get(0) : new PrimaryExpr(c);
}
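One way a library could honor such a create method is to look it up by reflection and fall back to the constructor when it is absent. The following is a sketch under that assumption, with hypothetical Node, Leaf, Plain, Collapsing, and NodeFactory classes standing in for the real AST classes; it is not the library's source.

```java
import java.lang.reflect.Method;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for ASTree and its subclasses.
class Node {
    final List<Node> children;
    Node(List<Node> children) {
        this.children = children;
    }
}

class Leaf extends Node {
    final String text;
    Leaf(String text) {
        super(Collections.emptyList());
        this.text = text;
    }
}

// No create method: a node is always built, even for a single child.
class Plain extends Node {
    public Plain(List<Node> c) {
        super(c);
    }
}

// Has create: a single child is returned as-is instead of being wrapped.
class Collapsing extends Node {
    public Collapsing(List<Node> c) {
        super(c);
    }
    public static Node create(List<Node> c) {
        return c.size() == 1 ? c.get(0) : new Collapsing(c);
    }
}

class NodeFactory {
    // Prefer a static create(List) method; otherwise call the (List) constructor.
    static Node make(Class<? extends Node> type, List<Node> children) {
        try {
            try {
                Method m = type.getMethod("create", List.class);
                return (Node) m.invoke(null, children);
            } catch (NoSuchMethodException e) {
                return type.getConstructor(List.class).newInstance(children);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With a single child, NodeFactory.make(Collapsing.class, ...) hands back that child unchanged, whereas NodeFactory.make(Plain.class, ...) wraps it in a new Plain node.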

Rewriting the grammar rule for the nonterminal program in Java:

program : [ statement ] (";" | EOL)

Parser program = rule().or(statement, rule(NullStmnt.class))
                       .sep(";", Token.EOL);

5.4 Testing the parser

Supplementary knowledge

What is a DSL?

After a quick look into DSLs, I found the topic important enough to deserve detailed study when I have time.

Wikipedia's definition of a DSL:

A specialized computer language designed for a specific task.

DSL stands for Domain-Specific Language. Its counterpart is the GPL.

GPL stands for General-Purpose Language; familiar examples include Objective-C, Java, Python, and C.

A DSL gains efficiency within its domain by giving up some expressive power.
