Kinx Library-Parser Combinator (Part 2)

Introduction

** "looks JavaScript, brain (contents) Ruby, (stability is AC / DC)" ** Scripting language Kinx ). This time is also a parser combinator.

Reference
First motivation ... Script language KINX (introduction)
All links to individual articles are collected here.
Repository ... https://github.com/Kray-G/kinx
We are waiting for Pull Request etc.

Last time, I made an AST with Kinx Library-Parser Combinator (Part 1).

This time I will interpret this and execute it. Finally, JIT compiles and runs.

before that

review

Let's reconfirm the last final program.

using Parsek;

function makeAST(first, rest) {
    var expr = first;
    for (var i = 0, l = rest.length(); i < l; ++i) {
        expr = { lhs: expr, op: rest[i][0], rhs: rest[i][1] };
    }
    return expr;
}

var $ = new Parsek();
var number = $.regex(/[1-9][0-9]*|[0-9]/).map(Integer.parseInt);
var addsub = $.oneOf("+-");
var muldiv = $.oneOf("*/%");
var lbr = $.string("(");
var rbr = $.string(")");

var expression;
var factor = $.lazy(&() => $.alt(number, $.seq(lbr, expression, rbr).map(&(value) => value[1])));
var term = $.seqMap(factor, $.seq(muldiv, factor).many(), makeAST);
expression = $.seqMap(term, $.seq(addsub, term).many(), makeAST);

// test
System.println(expression.parseAll("1+2*3+2*(14-2)").value.toJsonString(true));

This. Let's fix just one problem with this. It's not a problem, though.

Actually, this parser will fail if there are whitespace characters. Of course, there is no place to interpret the space, so the character is not supposed to encounter the space, so he confidently answers, "There is no grammar!".

Whitespace skipping process

In ordinary programming languages, token breaks are any number of whitespace characters, and whitespace is generally something to skip. Therefore, let's skip the whitespace characters. There are two things to use:

$ .optWhitespace is a parser that interprets zero or more arbitrary whitespace characters. Specifically, it is the same as $ .regex (/ \ s * /). Therefore, it matches whitespace, newlines (CR / LF), and tabs. There is also $ .whitespace, which is$ .regex (/ \ s + /)to match one or more whitespace characters.
Use parser.skip (anotherParser) to match parser, then try to match ʻanotherParser, and if so, return the result of parser. That is, it skips the match result of ʻanotherParser.

Using these two, you can use .skip ($ .optWhitespace) to skip whitespace after a particular parser matching.

Here, prepare a function called lexeme () as shown below, and add a process to skip whitespace characters. Define ʻignore` so that you can add it here when there are more things to ignore, and you need to skip whitespace even at the first start.

var ignore = $.optWhitespace;
var lexeme = &(p) => p.skip(ignore);

This lexeme () function only needs to be at the token break, so it should be in the parser of the smallest unit of number, ʻaddsub, muldiv, lbr, rbr`. If you rewrite the first program, it will be as follows.

using Parsek;

function makeAST(first, rest) {
    var expr = first;
    for (var i = 0, l = rest.length(); i < l; ++i) {
        expr = { lhs: expr, op: rest[i][0], rhs: rest[i][1] };
    }
    return expr;
}

var $ = new Parsek();
var ignore = $.optWhitespace;
var lexeme = &(p) => p.skip(ignore);

var number = lexeme($.regex(/[1-9][0-9]*|[0-9]/)).map(Integer.parseInt);
var addsub = lexeme($.oneOf("+-"));
var muldiv = lexeme($.oneOf("*/%"));
var lbr = lexeme($.string("("));
var rbr = lexeme($.string(")"));

var expression;
var factor = $.lazy(&() => $.alt(number, $.seq(lbr, expression, rbr).map(&(value) => value[1])));
var term = $.seqMap(factor, $.seq(muldiv, factor).many(), makeAST);
expression = $.seqMap(term, $.seq(addsub, term).many(), makeAST);

// test
System.println(expression.parseAll("1+2*3+2*(14-2)").value.toJsonString(true));
System.println(ignore.then(expression).parseAll("  1 + 2 * 3 + 2 * ( 14 - 2 )  ").value.toJsonString(true));
/* Both results are the same.
{
    "lhs": {
        "lhs": 1,
        "op": "+",
        "rhs": {
            "lhs": 2,
            "op": "*",
            "rhs": 3
        }
    },
    "op": "+",
    "rhs": {
        "lhs": 2,
        "op": "*",
        "rhs": {
            "lhs": 14,
            "op": "-",
            "rhs": 2
        }
    }
}
*/

Both return the same result.

The first ʻignore.then ()is the part that ignores leading whitespace.parser.then (anotherParser) returns ʻanotherParser, unlike .skip (). This will cause the parser to be ignored.

Now let's get into the main subject.

Interpreter

First, let's implement an interpreter before JIT. This is easier. The interpreter runs sequentially while traversing the AST. Evaluate the left side, evaluate the right side, and calculate with the result. The ʻInterpreter` class looks like this:

class Interpreter(opts_) {
    private sequence(r, op, lhs, rhs) {
        return if (!opts_.enableSequence);
        System.println("%d %s %d -> %d" % lhs % op % rhs % r);
    }
    public eval(ast) {
        var lhs = ast.lhs.isObject ? eval(ast.lhs) : ast.lhs;
        var rhs = ast.rhs.isObject ? eval(ast.rhs) : ast.rhs;
        var r = 0;
        switch (ast.op) {
        case '+':
            r = lhs + rhs;
            break;
        case '-':
            r = lhs - rhs;
            break;
        case '*':
            r = lhs * rhs;
            break;
        case '/':
            r = lhs / rhs;
            break;
        case '%':
            r = lhs % rhs;
            break;
        default:
            throw RuntimeException('Invalid operator');
        }
        sequence(r, ast.op, lhs, rhs);
        return r;
    }
}

I also tried to make the calculation process visible (passing {enableSequence: true} to the constructor will show the calculation process). Let's run it like this.

System.println(
    "Result = ",
    new Interpreter({ enableSequence: true })
        .eval(ignore.then(expression)
                    .parseAll("  1 + 2 * 3 + 2 * ( 14 - 2 )  ").value)
);
/*
2 * 3 -> 6
1 + 6 -> 7
14 - 2 -> 12
2 * 12 -> 24
7 + 24 -> 31
Result = 31
*/

It is calculated according to the priority! Not surprisingly, the AST is built according to priority. Let's try another formula.

System.println(
    "Result = ",
    new Interpreter({ enableSequence: true })
        .eval(ignore.then(expression)
                    .parseAll("(( 123 ) * 2 * 4 - 3 * ( 12 + 10 )) % 100").value)
);
/*
123 * 2 -> 246
246 * 4 -> 984
12 + 10 -> 22
3 * 22 -> 66
984 - 66 -> 918
918 % 100 -> 18
Result = 18
*/

There is.

JIT compilation

This is the long-awaited JIT. However, although the JIT technology itself is wonderful, honestly, with this level of computational complexity (no loops or conditional branching), the above interpreter is honestly faster. Since I am compiling with the same root as the interpreter and the interpreter itself is not doing much, the JIT compilation time is like the interpreter time (+ α). Hmmm.

It's just a technology base. I should have made an example that makes a difference in speed. Would you like to rewrite Brainf * ck as a subject, referring to the article here?

By the way, in ʻInterpreter, the calculation formula was executed directly, but it is OK if you rewrite it so that the equivalent machine language code is output instead of executing the calculation. Looking at the [JIT Library](https://qiita.com/Kray-G/items/a237f195630f3f0a91df), the Compilerclass to convert to machine language code will be as follows. It's a little long, but I'll put it as it is. Again, you can output the intermediate format compile code by setting{enableListing: true}` in the constructor.

class Compiler(opts_) {
    var regs_, regsLen_;
    enum { MOV, BOP, DIVMOD, RET }
    private initialize() {
        regs_ = [
            // Jit.R0 and Jit.R1 is used as a work register when it is division.
            { reg: Jit.R2, used: false, name: "R2" },
            { reg: Jit.R3, used: false, name: "R3" },
            { reg: Jit.R4, used: false, name: "R4" },
            { reg: Jit.R5, used: false, name: "R5" },
            { reg: Jit.S0, used: false, name: "S0" },
            { reg: Jit.S1, used: false, name: "S1" },
            { reg: Jit.S2, used: false, name: "S2" },
            { reg: Jit.S3, used: false, name: "S3" },
            { reg: Jit.S4, used: false, name: "S4" },
            { reg: Jit.S5, used: false, name: "S5" },
        ];
        regsLen_ = regs_.length();
    }
    private listing(type, op, dst, op1, op2) {
        return if (!opts_.enableListing);
        switch (type) {
        case MOV:
            System.println("%s <- %s" % dst % op1);
            break;
        case BOP:
            System.println("%s <- %s %s %s" % dst % op1 % op % op2);
            break;
        case DIVMOD:
            System.println("R0 <- %s" % op1);
            System.println("R1 <- %s" % op2);
            System.println("%s <- R0 %s R1" % dst % op);
            break;
        case RET:
            System.println("ret %s" % dst);
            break;
        }
    }
    private getReg() {
        for (var i = 0; i < regsLen_; ++i) {
            if (!regs_[i].used) {
                regs_[i].used = true;
                return i;
            }
        }
        throw RuntimeException("Not enough register");
    }
    private releaseReg(i) {
        regs_[i].used = false;
    }
    private compileLeaf(c, leaf) {
        var r = getReg();
        c.mov(regs_[r].reg, Jit.IMM(leaf));
        listing(MOV, null, regs_[r].name, leaf);
        return r;
    }
    private compileNode(c, ast) {
        var rl = ast.lhs.isObject ? compileNode(c, ast.lhs) : compileLeaf(c, ast.lhs);
        var rr = ast.rhs.isObject ? compileNode(c, ast.rhs) : compileLeaf(c, ast.rhs);
        var r = getReg();
        switch (ast.op) {
        case '+':
            c.add(regs_[r].reg, regs_[rl].reg, regs_[rr].reg);
            listing(BOP, ast.op, regs_[r].name, regs_[rl].name, regs_[rr].name);
            break;
        case '-':
            c.sub(regs_[r].reg, regs_[rl].reg, regs_[rr].reg);
            listing(BOP, ast.op, regs_[r].name, regs_[rl].name, regs_[rr].name);
            break;
        case '*':
            c.mul(regs_[r].reg, regs_[rl].reg, regs_[rr].reg);
            listing(BOP, ast.op, regs_[r].name, regs_[rl].name, regs_[rr].name);
            break;
        case '/':
            c.mov(Jit.R0, regs_[rl].reg);
            c.mov(Jit.R1, regs_[rr].reg);
            c.sdiv();
            c.mov(regs_[r].reg, Jit.R0);
            listing(DIVMOD, ast.op, regs_[r].name, regs_[rl].name, regs_[rr].name);
            break;
        case '%':
            c.mov(Jit.R0, regs_[rl].reg);
            c.mov(Jit.R1, regs_[rr].reg);
            c.sdivmod();
            c.mov(regs_[r].reg, Jit.R1);
            listing(DIVMOD, ast.op, regs_[r].name, regs_[rl].name, regs_[rr].name);
            break;
        default:
            throw RuntimeException('Invalid operator');
        }
        releaseReg(rl);
        releaseReg(rr);
        return r;
    }
    public compile(ast) {
        var c = new Jit.Compiler();
        c.enter();
        var r = compileNode(c, ast);
        c.ret(regs_[r].reg);
        listing(RET, null, regs_[r].name);
        return c.generate();
    }
    public run(ast) {
        var code = compile(ast);
        return code.run();
    }
}

Registers are available from R2 to R5 and S0 to S5. Save R0 and R1 for division as they need to be used for division. The register allocation method is "Reserve when using and return when finished".

Note that the R type registers may be destroyed after the function is called, but this time the function is not called, so it can be treated in the same way as the S type. An error will occur if the available registers are used up, but many of the intermediate values are short-lived, so this should not be a problem. If the number of values to be saved increases, issue a code to save it to memory. Also, Kinx's JIT library can have a local variable area on the stack, so it's easiest to allocate it when you run out.

Considering register allocation in earnest, it is quite profound, but in the case of JIT, ** optimization does not take much time **, so this is enough. Because it's a waste of optimization time. In the case of AOT Compilation, the compilation time is not a concern, and the faster the execution time, the better the optimization, but in JIT, the compilation time is also within the execution time, so that is not the case.

Let's calculate it now. Don't forget ʻusing Jit;`.

using Parsek;
using Jit;

/*abridgement*/
System.println(
    "Result = ",
    new Compiler({ enableListing: true })
        .run(ignore.then(expression)
                   .parseAll("  1 + 2 * 3 + 2 * ( 14 - 2 )  ").value)
);
/*
R2 <- 1
R3 <- 2
R4 <- 3
R5 <- R3 * R4
R3 <- R2 + R5
R2 <- 2
R4 <- 14
R5 <- 2
S0 <- R4 - R5
R4 <- R2 * S0
R2 <- R3 + R4
ret R2
Result = 31
*/

There is. Let's try another example.

System.println(
    "Result = ",
    new Compiler({ enableListing: true })
        .run(ignore.then(expression)
                   .parseAll("(( 123 ) * 2 * 4 - 3 * ( 12 + 10 )) % 100").value)
);
/*
R2 <- 123
R3 <- 2
R4 <- R2 * R3
R2 <- 4
R3 <- R4 * R2
R2 <- 3
R4 <- 12
R5 <- 10
S0 <- R4 + R5
R4 <- R2 * S0
R2 <- R3 - R4
R3 <- 100
R0 <- R2
R1 <- R3
R4 <- R0 % R1
ret R4
Result = 18
*/

There is also this.

You can also dump the actual machine language code, so let's try it.

var code = new Compiler()
       .compile(ignore.then(expression)
                      .parseAll("  1 + 2 * 3 + 2 * ( 14 - 2 )  ").value);
code.dump();
System.println(code.run());

Output on Linux.

       0:   53                                          push rbx
       1:   41 57                                       push r15
       3:   41 56                                       push r14
       5:   48 8b df                                    mov rbx, rdi
       8:   4c 8b fe                                    mov r15, rsi
       b:   4c 8b f2                                    mov r14, rdx
       e:   48 83 ec 10                                 sub rsp, 0x10
      12:   48 c7 c7 01 00 00 00                        mov rdi, 0x1
      19:   48 c7 c1 02 00 00 00                        mov rcx, 0x2
      20:   49 c7 c0 03 00 00 00                        mov r8, 0x3
      27:   49 89 cb                                    mov r11, rcx
      2a:   4d 0f af d8                                 imul r11, r8
      2e:   4a 8d 0c 1f                                 lea rcx, [rdi+r11]
      32:   48 c7 c7 02 00 00 00                        mov rdi, 0x2
      39:   49 c7 c0 0e 00 00 00                        mov r8, 0xe
      40:   49 c7 c3 02 00 00 00                        mov r11, 0x2
      47:   4c 89 c3                                    mov rbx, r8
      4a:   49 2b db                                    sub rbx, r11
      4d:   49 89 f8                                    mov r8, rdi
      50:   4c 0f af c3                                 imul r8, rbx
      54:   4a 8d 3c 01                                 lea rdi, [rcx+r8]
      58:   48 89 f8                                    mov rax, rdi
      5b:   48 83 c4 10                                 add rsp, 0x10
      5f:   41 5e                                       pop r14
      61:   41 5f                                       pop r15
      63:   5b                                          pop rbx
      64:   c3                                          ret
31

Let's read it roughly.

The prologue is up to the 0xe line.
Assign 0x01 to the rdi register on line 0x12
Assign 0x02 to the rcx register on line 0x19
Assign 0x03 to the r8 register on line 0x20
Substitute the result of rcx * r8 into the r11 register on line 0x27-0x2a
Assign the result of rdi + r11 to the rcx register on line 0x2e
Assign 0x02 to the rdi register on line 0x32
Assign 0x0e to the r8 register on line 0x39
Substitute the result of r8 --r11 into the rbx register on line 0x40-0x4a
Substitute the result of rdi * rbx into the r8 register on line 0x4d-0x50
Assign the result of rcx + r8 to the rdi register on line 0x54
Assign rdi to the rax register on line 0x58
From the 0x5b line, it is an epilogue.

The return value of the function is determined to be the rax register on x64, so this will return the result.

Then in another example.

var code = new Compiler()
       .compile(ignore.then(expression)
                      .parseAll("(( 123 ) * 2 * 4 - 3 * ( 12 + 10 )) % 100").value);
code.dump();
System.println(code.run());

This is also the output on Linux.

       0:   53                                          push rbx
       1:   41 57                                       push r15
       3:   41 56                                       push r14
       5:   48 8b df                                    mov rbx, rdi
       8:   4c 8b fe                                    mov r15, rsi
       b:   4c 8b f2                                    mov r14, rdx
       e:   48 83 ec 10                                 sub rsp, 0x10
      12:   48 c7 c7 7b 00 00 00                        mov rdi, 0x7b
      19:   48 c7 c1 02 00 00 00                        mov rcx, 0x2
      20:   49 89 f8                                    mov r8, rdi
      23:   4c 0f af c1                                 imul r8, rcx
      27:   48 c7 c7 04 00 00 00                        mov rdi, 0x4
      2e:   4c 89 c1                                    mov rcx, r8
      31:   48 0f af cf                                 imul rcx, rdi
      35:   48 c7 c7 03 00 00 00                        mov rdi, 0x3
      3c:   49 c7 c0 0c 00 00 00                        mov r8, 0xc
      43:   49 c7 c3 0a 00 00 00                        mov r11, 0xa
      4a:   4b 8d 1c 18                                 lea rbx, [r8+r11]
      4e:   49 89 f8                                    mov r8, rdi
      51:   4c 0f af c3                                 imul r8, rbx
      55:   48 89 cf                                    mov rdi, rcx
      58:   49 2b f8                                    sub rdi, r8
      5b:   48 c7 c1 64 00 00 00                        mov rcx, 0x64
      62:   48 89 f8                                    mov rax, rdi
      65:   48 89 ce                                    mov rsi, rcx
      68:   48 99                                       cqo
      6a:   48 f7 fe                                    idiv rsi
      6d:   48 89 d6                                    mov rsi, rdx
      70:   49 89 f0                                    mov r8, rsi
      73:   4c 89 c0                                    mov rax, r8
      76:   48 83 c4 10                                 add rsp, 0x10
      7a:   41 5e                                       pop r14
      7c:   41 5f                                       pop r15
      7e:   5b                                          pop rbx
      7f:   c3                                          ret
18

There are also calculation results!

in conclusion

Normally, scripting languages convert AST to its own intermediate format code (bytecode) and execute it. So is Kinx. However, what you are doing is not much different from the above. It's more like doing something similar to Compiler and running it in your own ʻInterpreter. ʻIntrpreter is not needed if it can be converted to native machine code. Normally, it deals with multiple length operations and class objects, and dynamically typed languages do not know the type until runtime, so it is only troublesome to compile it into machine language. In particular, if you want to support x64 (Windows / Linux) / ARM and so on, you have to make a compiler for each. So bytecoding is a best practice.

That said, JIT is doing its best for serious products. Kinx can also be partially JIT-enabled with native functions. Moreover, since it is realized within the range that can be expressed by the abstract assembler (SLJIT), it should be highly portable to various platforms.

In any case, JIT has always been a hot topic in language processing, so I thought that the JIT library of Kinx would be fun as a little toy. I praise myself. Moreover, since I worked hard this time to implement a parser combinator, I think I can easily create my own DSL. I think it's quite fun, but how about it?

The catchphrase for the Kinx JIT library is ** "feel free to JIT" **, isn't it?

So, next time.