I wrote a C parser (of sorts) using PEG in Ruby

This article is about a C parser of sorts that I wrote in Ruby. Although I call it a C parser, it is not a strict, highly complete one like pycparser (implemented in Python); it is a rough implementation that took about three days to write.

Repository: github.com/hsssnow23/Captain

Sample input:

typedef struct {
    unsigned int id;
    float x;
    float y;
} Actor;

Output:

#<CTypedef:0x000000037809a8
 @from=
  #<CStruct:0x0000000376a068
   @body=
    [#<CVariable:0x000000034f0350
      @name="id",
      @type=
       #<CType:0x000000034f3780
        @const=false,
        @name="int",
        @pointer=false,
        @prefix="unsigned">,
      @value=nil>,
     #<CVariable:0x000000035b6ca8
      @name="x",
      @type=
       #<CType:0x000000035ad950
        @const=false,
        @name="float",
        @pointer=false,
        @prefix=nil>,
      @value=nil>,
     #<CVariable:0x000000036a0df8
      @name="y",
      @type=
       #<CType:0x000000036a3a30
        @const=false,
        @name="float",
        @pointer=false,
        @prefix=nil>,
      @value=nil>],
   @name=nil>,
 @to="Actor">
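The node classes behind this dump are not shown in the article. As a rough sketch only, plain Ruby classes like the following would produce the same shape. The class names and instance variables are read off the dump above; the constructor signatures are assumptions, and the real definitions in the repository may differ.

```ruby
# Hypothetical node classes inferred from the instance variables in the dump.
# Constructor signatures are assumptions, not the repository's actual API.
class CType
  attr_reader :name, :prefix, :const, :pointer

  def initialize(name, prefix: nil, const: false, pointer: false)
    @name = name
    @prefix = prefix
    @const = const
    @pointer = pointer
  end
end

class CVariable
  attr_reader :type, :name, :value

  def initialize(type, name, value = nil)
    @type = type
    @name = name
    @value = value
  end
end

class CStruct
  attr_reader :name, :body

  def initialize(name, body)
    @name = name
    @body = body
  end
end

class CTypedef
  attr_reader :from, :to

  def initialize(from, to)
    @from = from
    @to = to
  end
end

# Build by hand the same tree that the sample input is parsed into.
actor = CTypedef.new(
  CStruct.new(nil, [
    CVariable.new(CType.new("int", prefix: "unsigned"), "id"),
    CVariable.new(CType.new("float"), "x"),
    CVariable.new(CType.new("float"), "y")
  ]),
  "Actor"
)
```

An anonymous struct has a nil name here, which matches the @name=nil on the CStruct in the dump.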

I originally built this parser for a tool that generates code automatically from C sources extended with annotations, but it is really slow. I think the main reason is that the PEG parser I implemented underneath it does not use packrat parsing. So, in this article, I would like to write about what it was like to actually use a PEG parser.

A rough summary first

Benefits of PEG

- In a language like Ruby that supports operator overloading, a grammar can be written like a DSL, which makes it easy to read. (Parser generators such as lex and yacc have a lot of quirks, so I find them a little hard to get along with.)
- A PEG parser is itself easy to implement if you settle for a simple one that ignores speed. (In the end, my PEG parser implementation is 269 lines.)
- Lexical analysis and parsing happen in the same pass, which saves time and effort. As a result, it can be used as casually as a regular expression.
- Unlike regular expressions, it can parse nested parentheses.
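To illustrate the DSL point and the parentheses point, here is a minimal sketch of PEG-style combinators that overload >> for sequence and | for ordered choice. It only recognizes input (it builds no syntax tree), and the names PegParser, str, lazy, brackets, and balanced? are made up for this example; they are not taken from the Captain repository.

```ruby
# A minimal PEG combinator sketch. All names are hypothetical.
class PegParser
  def initialize(&block)
    @block = block
  end

  # Try to match at pos; returns the position after the match, or nil.
  def parse(input, pos = 0)
    @block.call(input, pos)
  end

  # Sequence: self, then other.
  def >>(other)
    PegParser.new do |input, pos|
      mid = parse(input, pos)
      mid && other.parse(input, mid)
    end
  end

  # Ordered choice: other is tried only if self fails.
  def |(other)
    PegParser.new do |input, pos|
      parse(input, pos) || other.parse(input, pos)
    end
  end

  # Greedy repetition, zero or more times.
  def many
    PegParser.new do |input, pos|
      while (nxt = parse(input, pos)) && nxt > pos
        pos = nxt
      end
      pos
    end
  end
end

# Matches a literal string.
def str(s)
  PegParser.new { |input, pos| input[pos, s.length] == s ? pos + s.length : nil }
end

# Defers building a parser until parse time, which allows recursive rules.
def lazy(&rule)
  PegParser.new { |input, pos| rule.call.parse(input, pos) }
end

# brackets <- ( "(" brackets ")" / "[" brackets "]" )*
def brackets
  (str("(") >> lazy { brackets } >> str(")") |
   str("[") >> lazy { brackets } >> str("]")).many
end

# True when the whole text consists of properly nested brackets.
def balanced?(text)
  brackets.parse(text) == text.length
end
```

With this sketch, balanced?("([])") is true and balanced?("([)]") is false. The grammar rule reads almost like its PEG notation, and no separate lexer is involved.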

Disadvantages of PEG

- A naive implementation does not run in O(n) time.
- Left recursion is not supported.
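The first point can be seen with a small experiment. The sketch below is a hand-written recognizer for a toy grammar chosen to force backtracking; it is not code from the repository, and NestedParser and its methods are hypothetical names. With memoize: true it caches the result of each rule at each position, which is the idea behind packrat parsing.

```ruby
# Toy grammar (an assumption for illustration):
#   expr <- "(" expr ")" "x" / "(" expr ")" "y" / "a"
# When the suffix is "y", the first alternative fails late and the inner
# expr is parsed again by the second alternative.
class NestedParser
  attr_reader :calls

  def initialize(input, memoize:)
    @input = input
    @memoize = memoize
    @memo = {}   # position => result of expr at that position
    @calls = 0   # number of times expr was actually evaluated
  end

  # Returns the position after a match of expr at pos, or nil.
  def expr(pos)
    return @memo[pos] if @memoize && @memo.key?(pos)

    @calls += 1
    result = wrapped(pos, "x") || wrapped(pos, "y") || lit(pos, "a")
    @memo[pos] = result if @memoize
    result
  end

  private

  # "(" expr ")" followed by the given suffix.
  def wrapped(pos, suffix)
    after_open = lit(pos, "(")
    after_expr = after_open && expr(after_open)
    after_close = after_expr && lit(after_expr, ")")
    after_close && lit(after_close, suffix)
  end

  # Matches a literal string at pos.
  def lit(pos, s)
    @input[pos, s.length] == s ? pos + s.length : nil
  end
end

depth = 10
input = "(" * depth + "a" + ")y" * depth

naive = NestedParser.new(input, memoize: false)
naive.expr(0)
packrat = NestedParser.new(input, memoize: true)
packrat.expr(0)

puts "naive calls:   #{naive.calls}"    # 2047 for depth 10
puts "packrat calls: #{packrat.calls}"  # 11 for depth 10
```

Without the cache, each nesting level doubles the work, so the number of calls grows exponentially with depth; with it, each position is evaluated once. The cost is the memory for the memo table.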

Impressions

Easy. It is overwhelmingly easy. I think the big advantage over other parsing approaches is that you can start writing as soon as an idea comes to mind. You can skip lexical analysis and go straight to a parser that builds a syntax tree, so I think it is best suited to cases where you need a simple parser but want something richer than regular expressions. On the other hand, I felt it would be somewhat difficult to write a PEG parser for a language that already has a specification, as C does. Many existing programming languages are implemented with parser generators such as lex and yacc, so it is hard to guarantee that a PEG grammar stays consistent with them, and PEG is still young, so it does not seem to be very clear how much it can parse. (Honestly, I am not confident my parser can handle C source that pokes at the corner cases of the language.)

That said, I felt PEG would make it easier to write a parser whose grammar changes depending on the content being parsed. (I suspect there are few situations where that is needed.)

Summary

My final conclusion is that PEG is the parsing approach I recommend most. However, while I am sure it is a good choice for a small format whose specification you control yourself, I thought it might be a questionable choice elsewhere.

Reference article: http://kmizu.hatenablog.com/entry/20100203/1265183754
