NAME

Parse::Pyapp - PCFG Parser

SYNOPSIS

  use Parse::Pyapp;

  my $parser = Parse::Pyapp->new();

  $parser->addrule($LHS, [ $RHS_1, $P_RHS_1 ], [ $RHS_2, $P_RHS_2 ]);

  $parser->addlex($LHS, [ $RHS_1, $P_RHS_1 ], [ $RHS_2, $P_RHS_2 ]);

  $parser->start($LHS);

  $parser->parse(@words) or print "Parse error\n";

DESCRIPTION

This module is a parser for probabilistic context-free grammars (PCFG), also known as stochastic context-free grammars (SCFG). You can use it to do stochastic parsing.

USAGE

Creating a parser

    $parser = Parse::Pyapp->new();

Adding lexicon entries

    $parser->addlex('N',
                      [ 'house', .5 ],
                      [ 'book', .5 ]
                      );

You can attach a semantic action to a lexicon entry. For instance,

    $parser->addlex('N',
                      [ 'house', .5 ],
                      [ 'book', .5 ],
                      sub { print $_[1] }
                      );

Parse::Pyapp passes the parser itself as the first argument and the lexicon entry as the second. The left-hand-side symbol can be accessed with $_[0]->{lhs}.
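
As a minimal sketch of such an action, the callback below records each entry together with its category instead of printing it. It assumes the second argument is the matched word. The names @seen and $record are illustrative, and the final call uses a plain hash reference as a stand-in for the parser object so that the snippet runs without the module; with a real parser, $record would be passed as the last argument to addlex.

```perl
use strict;
use warnings;

# Collects [ category, word ] pairs as lexicon entries are matched.
my @seen;

# Semantic action: $_[0] is the parser, $_[1] is assumed to be the matched word.
my $record = sub {
    my ($parser, $word) = @_;
    push @seen, [ $parser->{lhs}, $word ];
    return $word;
};

# Stand-in call for illustration only: a hash reference plays the parser.
# In real use: $parser->addlex('N', [ 'house', .5 ], [ 'book', .5 ], $record);
$record->({ lhs => 'N' }, 'house');

print "$seen[0][0] -> $seen[0][1]\n";    # prints "N -> house"
```

Keeping the action as a named code reference makes it easy to reuse the same callback for several categories.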

Adding rules

    $parser->addrule('VP',
                   [ 'V', 0.5 ],
                   [ 'V', 'NP', .5 ]
                   );

The first argument is the LHS symbol. It is followed by all the possible right-hand-side derivations, each ending with its probability.

Similarly, you can hook semantic actions to the end of a derivation. For instance,

    $parser->addrule('VP',
                   [ 'V', 0.5, sub { print $_[1] } ],
                   [ 'V', 'NP', .5 ]
                   );

Parse::Pyapp passes the parser itself as the first argument and the corresponding tokens as the remaining arguments. The left-hand-side symbol can be accessed with $_[0]->{lhs}, and the right-hand-side POS tags with @{$_[0]->{pos}}.

Currently, this module does not check if the sum of probabilities going out from a non-terminal is equal to 1.
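
Because the module does not perform this check, you may want to validate the probabilities yourself before calling addrule or addlex. The following is a sketch only: probability_sum and sums_to_one are hypothetical helpers, not part of Parse::Pyapp, and the tolerance is an assumed value. They take the same alternatives you would pass after the LHS symbol.

```perl
use strict;
use warnings;

# Returns the sum of the probabilities in a list of alternatives.
# Each alternative is an array reference shaped like the addrule/addlex
# arguments: symbols first, then the probability, then an optional action.
sub probability_sum {
    my @alternatives = @_;
    my $sum = 0;
    for my $alt (@alternatives) {
        next unless ref($alt) eq 'ARRAY';    # skip a trailing addlex action
        my @items = @$alt;
        pop @items if @items && ref($items[-1]) eq 'CODE';    # drop a per-rule action
        next unless @items;                  # nothing left to read
        $sum += $items[-1];                  # the probability comes last
    }
    return $sum;
}

# True if the alternatives sum to 1 within a small tolerance.
sub sums_to_one {
    my $tolerance = 1e-9;    # assumed tolerance for floating-point error
    return abs(probability_sum(@_) - 1) < $tolerance;
}

my @vp = ( [ 'V', 0.5, sub { print $_[1] } ], [ 'V', 'NP', .5 ] );
warn "VP probabilities do not sum to 1\n" unless sums_to_one(@vp);
```

If the check passes, the same list can be handed to the parser, for example $parser->addrule('VP', @vp).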

Setting the starting symbol

    $parser->start('S');

Parsing a sentence

You need to tokenize the sentence yourself.

    $parser->parse(@words);

It returns a defined value if the sentence is parsed without error.
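
Since tokenization is left to the caller, here is a minimal tokenizer sketch. The tokenize function is a hypothetical helper, and it assumes that words are separated by whitespace, that punctuation can be discarded, and that the lexicon is stored in lowercase. Adjust it to match your own lexicon.

```perl
use strict;
use warnings;

# Splits a sentence into lowercase word tokens.
# Assumption: anything other than word characters and apostrophes
# is a separator and is discarded.
sub tokenize {
    my ($sentence) = @_;
    my @words = map  { lc $_ }
                grep { length $_ }          # drop an empty leading field
                split /[^\w']+/, $sentence;
    return @words;
}

my @words = tokenize('The student reads a book.');
print join(' ', @words), "\n";    # prints "the student reads a book"
```

The resulting list can then be passed directly to $parser->parse(@words).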

CAVEATS

This is still an alpha version, and everything is subject to change. Use it with caution. Because it is written entirely in Perl, expect it to be slow.

TO DO

Grammar learning, lexical relations, structural modeling, yacc-like input, error handling, etc. There is a lot of room for improvement.

COPYRIGHT

xern <xern@cpan.org>

This module is free software; you can redistribute it or modify it under the same terms as Perl itself.