NAME

Parse::RecDescent::Simple - Quick and dirty use of the excellent Parse::RecDescent if you just want a simple tree structure

VERSION

Version 0.02

SYNOPSIS

I love Parse::RecDescent, but I don't love writing the callback code for every stupid little thing I write, and I really don't love the overcomplicated structure you get back from autotree mode. So this module uses XML::xmlapi (a structure manipulation module of my own authorship that bears, at this juncture, only a fleeting relation to XML-specific usage) to produce a simple tree structure based on the recursive descent parser specification you pass it.

This structure does not bless tree nodes into classes; you can do that yourself. It doesn't preserve information about which branch of a rule is selected, either. The name of each node will correspond to the rule name in the parser that sanctioned it.

I originally started dabbling in Parse::RecDescent when I wanted to parse strings like this:

    dialog (parameter, parameter) [option, option] "title"

Here, the function word is mandatory, while the parameter list (which parameterizes the dialog), the option list (which determines how the dialog fits into its parent, not shown), and the title are all optional. I used to write string manipulation routines - badly - to handle this kind of thing, and this year I finally got fed up with it. Parse::RecDescent is the obvious answer.

But Parse::RecDescent has too damn many options. All I want is to grab the stuff you see up there. The <autotree> directive kinda sorta gives me something like what I want, but it rapidly gets too difficult to extract the good stuff as soon as the grammar has any alternate subrules, because each alternate subrule is reflected in the output structure. Not what I want. Hence this.

The grammar I want for the line above is pretty simple:

    line: word parmlist(?) optionlist(?) label(?) colon(?)
    colon: ":"
    parmlist: "(" option(s /,\s*|\s+/) ")"
    optionlist: "[" option(s /,\s*|\s+/) "]"
    label: <perl_quotelike>
    word: /[A-Za-z0-9_\-]+/
    option: /[A-Za-z0-9_\- ]+/ | <perl_quotelike>
        

The <perl_quotelike> directive rocks my world, by the way. Anyway, we'd use that grammar as follows:

    use Parse::RecDescent::Simple;

    my $parser = Parse::RecDescent::Simple->new(q{
        parse: line
        line: word parmlist(?) optionlist(?) label(?) colon(?)
        colon: ":"
        parmlist: "(" option(s /,\s*|\s+/) ")"
        optionlist: "[" option(s /,\s*|\s+/) "]"
        label: <perl_quotelike>
        word: /[A-Za-z0-9_\-]+/
        option: /[A-Za-z0-9_\- ]+/ | <perl_quotelike>
    });

    my $parse = $parser->parse("this (is, a) ['test, ing', of all] \"this stuff\"");
    print $parse->string() . "\n";

This just prints the XML representation of the parse tree for the string passed. If all goes well, the output is:

   <line>
   <word>this</word>
   <parmlist>(
   <option>is</option>
   <option>a</option>
   )</parmlist>
   <optionlist>[
   <option>test, ing</option>
   <option>of all</option>
   ]</optionlist>
   <label>this stuff</label>
   </line>
   

There are obviously a lot of things that could be improved to make this more flexible, but let's face it - this already gives me everything I need today. Perhaps it will help you, too. And frankly, it's about twenty lines of Perl and two hours invested making it something that works - it's just that it's two hours I'll never have to spend again, unlike every other time I've wanted to do something with Parse::RecDescent.

SUBROUTINES/METHODS

new(specification)

Creates a new parser object based on the specification you pass. The current version requires that one rule be named 'parse'; that is the rule that will be called. Very Nasty Errors will occur if you forget this. If parse just fronts for another rule, though, the topmost tag will be named for the other rule, so it all kind of makes sense in the end.

parse($string)

Given a parser object you created earlier, parses the string you pass into an XML::xmlapi structure.

process(@item)

Called by Parse::RecDescent during parsing. This part you could actually override if you wanted to subclass.
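
To get a feel for what an override has to deal with, here is a minimal sketch in plain Perl. It is not the module's actual implementation (which builds XML::xmlapi nodes), and build_node and to_string are hypothetical names invented for this illustration. It also assumes the callback receives the rule's @item list directly; whether the real process is invoked as a method, with an invocant in front, is not shown here. It leans on two documented Parse::RecDescent conventions only: $item[0] holds the rule name, and repeated or optional subrules such as option(s) or label(?) arrive as array references.

```perl
use strict;
use warnings;

# Hypothetical stand-in for process(@item): turn one rule match into a
# plain hash node. The first element is the rule name ($item[0]); the
# rest are whatever the rule's items matched.
sub build_node {
    my ($rule, @matched) = @_;
    my @children;
    for my $m (@matched) {
        # Repetitions like option(s) or label(?) come back as array refs
        # (possibly empty), so flatten them into the child list.
        push @children, ref($m) eq 'ARRAY' ? @$m : $m;
    }
    return { name => $rule, children => \@children };
}

# Hypothetical helper: render a node as XML-ish text. Plain strings
# (matched tokens) are emitted as-is.
sub to_string {
    my ($node) = @_;
    return $node unless ref($node) eq 'HASH';
    my $inner = join '', map { to_string($_) } @{ $node->{children} };
    return "<$node->{name}>$inner</$node->{name}>";
}

# Simulate the calls a parser would make, innermost rules first,
# for the input "this (is, a)".
my $word     = build_node('word', 'this');
my $parmlist = build_node('parmlist', '(',
    [ build_node('option', 'is'), build_node('option', 'a') ], ')');
# parmlist(?) matched once; optionlist(?), label(?), colon(?) matched nothing.
my $line     = build_node('line', $word, [ $parmlist ], [], [], []);

print to_string($line), "\n";
# prints: <line><word>this</word><parmlist>(<option>is</option><option>a</option>)</parmlist></line>
```

Note how the literal "(" and ")" tokens end up as text inside the parmlist node, which matches the shape of the output shown in the SYNOPSIS. An override that wanted to drop them, or to bless nodes into classes, would do so at the point where build_node assembles its children.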

AUTHOR

Michael Roberts, <michael at vivtek.com>

BUGS

Please report any bugs or feature requests to bug-parse-recdescent-simple at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Parse-RecDescent-Simple. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

Let me assure you, there will be bugs. I haven't even started to write a hundredth of the test cases this should actually be run through. It will stop silently mid-parse and give you perfectly legitimate-looking results if things don't match later; it will fail in horrible and unexpected ways on perfectly reasonable grammars; it won't really do what you expect, unless you expect it to do what I intended it to do today. You've been warned. That said, please send me your grammars and what you expected them to do; the Muse willing, I will write up test cases and (someday) make them work. Or send money. The Muse is sometimes convinced by money, and it's worth a try.

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc Parse::RecDescent::Simple

ACKNOWLEDGEMENTS

LICENSE AND COPYRIGHT

Copyright 2010 Michael Roberts.

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.