Saturday, December 1, 2012

Rapid Programming Language Prototypes with Ruby & Racc, Commentary

I just watched a tolerable Ruby conference video: Tom Lee's talk on Rapid Programming Language Prototypes with Ruby & Racc.

What he showed, he showed fairly well. His decision to "introduce compiler theory" was, he admitted, last-minute, and the hesitation in its delivery bore testimony to that. The demonstration of the compiler pipeline using his intended tools (Ruby and Racc) was done quite well, with a natural progression through the dependent concepts along the way. By the end of the talk he had a functional compiler construction tool chain going from an EBNF-ish grammar through to generated C code, which he then compiled with gcc.
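For readers who have not seen the talk, the shape of that pipeline (lex, parse, emit C) can be sketched in a few lines of Ruby. This is an illustration only, not Tom's code: it swaps the Racc-generated LALR(1) parser for a small hand-written recursive-descent parser so that it runs without a generation step, and the names `tokenize`, `Parser`, `emit_c_expr` and `compile_to_c`, along with the toy arithmetic language itself, are all made up for this example.

```ruby
require "strscan"

# Lexer: turn source text into [type, value] pairs, dropping whitespace here
# so that the grammar never has to mention it.
def tokenize(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    next if scanner.skip(/\s+/)
    if (text = scanner.scan(/\d+/))
      tokens << [:INT, text.to_i]
    elsif (text = scanner.scan(/[-+*\/()]/))
      tokens << [text, text]
    else
      raise ArgumentError, "unexpected character #{scanner.peek(1).inspect}"
    end
  end
  tokens
end

# Parser (hand-written stand-in for a generated one). Grammar:
#   expr   : term   (('+' | '-') term)*
#   term   : factor (('*' | '/') factor)*
#   factor : INT | '(' expr ')'
class Parser
  def initialize(tokens)
    @tokens = tokens
    @pos = 0
  end

  def parse
    tree = expr
    raise ArgumentError, "unexpected trailing input" if @pos < @tokens.length
    tree
  end

  private

  def peek
    type, _value = @tokens[@pos]
    type
  end

  def advance
    token = @tokens[@pos]
    @pos += 1
    token
  end

  # Parse a left-associative chain of `operand` separated by `operators`.
  def binary(operators, operand)
    left = send(operand)
    while operators.include?(peek)
      operator, _value = advance
      left = [:binop, operator, left, send(operand)]
    end
    left
  end

  def expr
    binary(["+", "-"], :term)
  end

  def term
    binary(["*", "/"], :factor)
  end

  def factor
    type, value = advance
    case type
    when :INT then [:int, value]
    when "("
      inner = expr
      closing, _value = advance
      raise ArgumentError, "expected )" unless closing == ")"
      inner
    else
      raise ArgumentError, "unexpected token #{type.inspect}"
    end
  end
end

# Code generation: walk the tree and emit a fully parenthesised C expression.
def emit_c_expr(node)
  case node[0]
  when :int then node[1].to_s
  when :binop
    _tag, operator, left, right = node
    "(#{emit_c_expr(left)} #{operator} #{emit_c_expr(right)})"
  end
end

# Wrap the expression in a C program that prints its value.
def compile_to_c(source)
  tree = Parser.new(tokenize(source)).parse
  <<~C
    #include <stdio.h>

    int main(void) {
        printf("%d\\n", #{emit_c_expr(tree)});
        return 0;
    }
  C
end

puts compile_to_c("1 + 2 * (3 - 4)")
```

Saving the printed output to a file and running it through gcc completes the chain. With Racc, the grammar in the comment above would instead live in a grammar file that Racc turns into the parser class, while the lexer and code generator would stay much the same.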

I was surprised that nobody in the audience asked the question I was burning to ask from halfway through the live-coding session: Why not use Treetop? (Or, more generically: why not use a PEG parser generator, or a parser generator that does more of the heavy lifting for you?)

The whole point of Tom's presentation is: use Ruby + Racc because it saves you from all the headaches of setting up the equivalent tool chain in C/C++. And it does, he's right. But it feels to me that Treetop does even more of that hard work for you, allowing you to more quickly get to the fun part of actually building your new language. I'm angling for simplicity here.

I could be wrong, though, so let me ask it here (as Confreaks seems not to allow comments): Why not Treetop (or an equally 'simple' parser generator) for something like this? (Answers along the lines of EBNF > PEG are not really what I'm after, but if you have a concrete example of that, I'd like to hear it too.)

On a completely separate note: Tom, you need to add some flying love to your Vim habits. :-)

1 comment:

  1. Hey Barry,

    Thanks for watching the video. :)

    "Why not use Treetop/PEG?"

    Folks actually did come up and ask about Treetop & Parslet after the presentation. I looked at Treetop a year or so back, got cranky with the whitespace thing & purged what little I knew from my fickle memory. I didn't have a great answer for the folks at the talk, but you get the benefit of further contemplation. Yay!

    I think I kinda mentioned in passing (parsing? haw haw) that I had originally imagined the audience of the presentation to be folks with some exposure to the C/Flex/Bison tool chain. Racc and Bison both generate LALR(1) parsers out of the box, and the whole thing just feels more like "home" for somebody familiar with Bison.

    It was also nice to show the explicit lexical analysis step. And I hate seeing my grammars polluted by rules dealing with whitespace. But that's just me. :) I guess you might be able to do a pass over the source to remove said whitespace in advance, but then you're kinda doing a poor-man's lex anyway. Meh.

    And porting a Racc grammar to Bison would be a snap. Well, comparatively speaking. :)

    If you're looking for *real* reasons why PEGs may not be everybody's cup of tea (in contrast to my opinionated, hand-wavy, borderline bullshit reasons):
    http://en.wikipedia.org/wiki/Parsing_expression_grammar#Disadvantages

    Of course, all these reasons are complete bunk if you're just screwing around anyway, so hey -- use what works for you. :)

    "... it feels to me that Treetop does even more of that hard work for you, allowing you to more quickly get to the fun part of actually building your new language."

    Assuming you're talking about productivity with Racc vs. something else here.

    Again, I admit I didn't give PEGs much of a chance (and I'm sure there are real, proper language implementation experts out there shaking their heads at me), but my knee-jerk reaction is to disagree with you, in that I suspect productivity would be comparable once you get over the learning curve.

    "Tom, you need to add some flying love to your Vim habits."

    Done and done. :)

    Happy to talk more about any of this stuff. Feel free to drop me a line.

    Cheers,
    Tom
