Ruby/Racc: Get the position (row, column) where parsing failed

When parsing fails, you usually want to show where it failed, like this:

parse error: line 2, col 5
foo bar
    ^

The code is short, so here it is in full.

# parser.y

class Parser

rule
  program: "a" "b" "c" "d" ";"
  {
    puts "program found"
    result = val
  }
end

---- header

TokenValue = Struct.new(:value, :pos)

---- inner

def initialize(src, tokens)
  @src = src
  @tokens = tokens
end

def next_token
  @tokens.shift
end

def parse
  do_parse
end

# Returns [line number (1-based), character range of that line within @src]
# for the line that contains the character offset pos.
def get_lineno_and_range(pos)
  lineno = 1
  line_start = 0
  range = nil

  @src.each_line do |line|
    next_line_start = line_start + line.size
    range = line_start .. (next_line_start - 1)
    break if range.include?(pos)

    line_start = next_line_start
    lineno += 1
  end

  [lineno, range]
end

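# Called by Racc on a parse error; val is the TokenValue of the failed token.
def on_error(token_id, val, vstack)
  lineno, range = get_lineno_and_range(val.pos)
  colno = val.pos - range.begin + 1 # 1-based column
  line = @src[range]

  puts "parse error: line #{lineno}, col #{colno}"

  puts line
  print " " * (colno - 1), "^\n" # caret under the failed token
end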

---- footer

def tokenize(src)
  tokens = []

  pos = 0
  while pos < src.size
    c = src[pos]
    case c
    when /\s/
      pos += 1
    else
      tokens << [c, TokenValue.new(c, pos)]
      pos += 1
    end
  end

  tokens
end

src = File.read(ARGV[0])
tokens = tokenize(src)
parser = Parser.new(src, tokens)
result = parser.parse()
puts "result: " + result.inspect

This parser accepts only the token sequence a b c d ;. In the input source code, separate tokens with spaces or line breaks.

Successful parsing example:

a
b

c d
;

For example, if you give it the following source code, with d replaced by X, parsing fails.

a
b

c X
;

Execution example when it fails:

$ racc parser.y -o parser.rb

$ ruby parser.rb sample_ng.txt
parse error: line 4, col 3
c X
  ^
result: nil

The code above is short, so reading it is probably the fastest way to understand it, but here is a quick summary.

- Give each token value the start position of its token (a character offset into the original source code).
- Keep the original source code in the parser as @src.
- When parsing fails, use @src and the start position of the failed token to determine the line and column in the original source code.
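The offset-to-position step can also be tried on its own, without Racc. The following is only a minimal sketch: the helper names locate and format_error are made up for this illustration and do not appear in the parser above, and it assumes the offset falls inside the source string.

```ruby
# Hypothetical helper (not part of the parser above): convert a 0-based
# character offset into [line number, column number, line text].
# Line and column numbers are 1-based. Returns nil if pos is past the end.
def locate(src, pos)
  line_start = 0
  src.each_line.with_index(1) do |line, lineno|
    line_end = line_start + line.size
    return [lineno, pos - line_start + 1, line.chomp] if pos < line_end
    line_start = line_end
  end
  nil
end

# Hypothetical helper: build the same kind of message that on_error prints.
# Assumes pos is inside src, so locate does not return nil.
def format_error(src, pos)
  lineno, colno, line = locate(src, pos)
  "parse error: line #{lineno}, col #{colno}\n#{line}\n#{' ' * (colno - 1)}^"
end

src = "a\nb\n\nc X\n;\n"
puts format_error(src, src.index("X"))
```

With the failing sample input from above, this should print the same three lines as the parser does: the message, the line c X, and a caret under the X.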

Note

Every symbol, terminal or nonterminal, has a corresponding value (or, to put it in a cooler way, a semantic value), which can be used to exchange information. (Omitted) You can use any single Ruby object you like as a value, which amounts to saying that you can *pass anything*.

"Book for using Ruby 256 times (Amazon)" p46

That is why the code above uses an instance of the TokenValue class as the token value.
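Because the value can be any object, the same idea carries over to tokens longer than one character. The following is a rough sketch, not taken from the parser above: tokenize_words is a hypothetical tokenizer that treats each run of non-whitespace characters as one token and records where it starts.

```ruby
TokenValue = Struct.new(:value, :pos)

# Hypothetical tokenizer (an assumption for illustration): every run of
# non-whitespace characters becomes one token, and its value remembers the
# character offset at which the token starts.
def tokenize_words(src)
  tokens = []
  src.scan(/\S+/) do |word|
    # Inside the scan block, Regexp.last_match refers to the current match;
    # begin(0) is the character offset where that match starts.
    tokens << [word, TokenValue.new(word, Regexp.last_match.begin(0))]
  end
  tokens
end

tokenize_words("foo bar\n  baz").each do |sym, val|
  puts "#{sym.inspect} starts at offset #{val.pos}"
end
```

The [symbol, value] pairs have the same shape as the ones next_token returns in the parser above, so on_error could read val.pos in the same way.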

Related

- Ruby/Racc: Visualize stack movement during parsing like FlameGraph

The following are not about Racc, but they are related to Ruby and parsers.

- I made a simple compiler for my own language in Ruby
- I made an expr command with only four arithmetic operations and remainders in Ruby
- Regular expression engine (Rob Pike's backtrack implementation) copied in Ruby
