Parse::Token::Lite - Simply parse String into tokens with rules which are similar to Lex.
version 0.200
    use Parse::Token::Lite;

    my %rules = (
        MAIN => [
            { name=>'NUM', re=> qr/\d[\d,\.]*/ },
            { name=>'STR', re=> qr/\w+/ },
            { name=>'SPC', re=> qr/\s+/ },
            { name=>'ERR', re=> qr/.*/ },
        ],
    );

    my $parser = Parse::Token::Lite->new(rulemap=>\%rules);
    $parser->from("This costs 1,000won.");
    while( ! $parser->eof ){
        my ($token,@extra) = $parser->nextToken;
        print $token->rule->name."-->".$token->data."<--\n";
    }
Results are:

    STR -->This<--
    SPC --> <--
    STR -->costs<--
    SPC --> <--
    NUM -->1,000<--
    STR -->won<--
    ERR -->.<--
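The results above are consistent with rules being tried in order at the start of the remaining text, with the first matching rule winning. The following is a minimal, self-contained sketch of that idea only. It does not use Parse::Token::Lite, and the `tokenize` helper is a hypothetical name introduced for illustration, not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Parse::Token::Lite): try each rule in
# order at the start of the remaining text and take the first match.
sub tokenize {
    my ($rules, $text) = @_;
    my @tokens;
    while (length $text) {
        my $matched = 0;
        for my $rule (@$rules) {
            my ($name, $re) = @{$rule}{qw(name re)};
            # Anchor at the start of the text and skip empty matches.
            next unless $text =~ /\A($re)/ && length $1;
            my $lexeme = $1;
            push @tokens, [ $name, $lexeme ];
            substr($text, 0, length $lexeme, '');   # consume the lexeme
            $matched = 1;
            last;
        }
        die "no rule matches at: $text" unless $matched;
    }
    return @tokens;
}

my @rules = (
    { name => 'NUM', re => qr/\d[\d,\.]*/ },
    { name => 'STR', re => qr/\w+/ },
    { name => 'SPC', re => qr/\s+/ },
    { name => 'ERR', re => qr/.*/ },
);

my @tokens = tokenize(\@rules, "This costs 1,000won.");
print "$_->[0]-->$_->[1]<--\n" for @tokens;
```

Under this first-match assumption, rule order matters: if STR were listed before NUM, the "1" in "1,000" would be matched as STR, because \w also matches digits.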
rulemap contains a hash reference of rule objects grouped by STATE. rulemap must have a 'MAIN' entry.
    my %rule = (
        MAIN => [
            Parse::Token::Lite::Rule->new(name=>'any', re=>qr/./),
        ],
    );
    $parser->rulemap(\%rule);
In the constructor, a Rule object can be replaced with a hash reference describing the attributes of the Parse::Token::Lite::Rule class.
    my %rule = (
        MAIN => [
            {name=>'any', re=>qr/./}, # ditto
        ],
    );
    my $parser = Parse::Token::Lite->new( rulemap=>\%rule );
'data' is set by the from() method. It contains the rest of the text that has not yet been processed by nextToken(). Remember that 'data' changes as parsing proceeds.
If the length of 'data' is 0, eof() returns 1.
Initially, state_stack contains ['MAIN']. It is reset by from().
from() sets the data to parse.
It also resets state_stack.
In scalar context: returns 1.
In list context: returns an array of [Parse::Token::Lite::Token, @return_values_of_callback].
Parses all tokens in an event-driven manner by calling nextToken() until eof() returns 1.
If $data is defined, from($data) is called.
To do something with the tokens, set a callback function in the 'func' attribute of the rules in 'rulemap'.
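The event-driven style described here can be pictured as a loop that consumes tokens until no text remains and invokes a rule's 'func' callback for each match. The sketch below is a self-contained stand-in for that loop, not the module's code: `parse_events` is a hypothetical name, tokens are plain hash refs rather than Parse::Token::Lite::Token objects, and the callback receives only the token (the module's callbacks receive the parser and the token).

```perl
use strict;
use warnings;

# Hypothetical stand-in for an event-driven parse loop; not the module's code.
# A rule may carry a 'func' callback, which is called with the matched token
# (a hash ref with 'name' and 'data' keys in this sketch).
sub parse_events {
    my ($rules, $text) = @_;
    my @results;
    while (length $text) {                  # loop until no text remains
        my $token;
        for my $rule (@$rules) {
            my $re = $rule->{re};
            next unless $text =~ /\A($re)/ && length $1;
            $token = { name => $rule->{name}, data => $1 };
            substr($text, 0, length $token->{data}, '');
            # Collect whatever the callback returns, if there is one.
            push @results, $rule->{func}->($token) if $rule->{func};
            last;
        }
        die "no rule matched" unless $token;
    }
    return @results;
}

my $sum = 0;
my @rules = (
    { name => 'NUM', re => qr/\d+/,
      func => sub { my ($token) = @_; $sum += $token->{data}; return $token->{data} } },
    { name => 'ANY', re => qr/./s },        # no callback: silently consumed
);

my @seen = parse_events(\@rules, "3 apples, 4 pears");
print "numbers: @seen, sum: $sum\n";
```

Rules without a callback still consume input, so the caller only has to write handlers for the tokens it cares about.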
Returns an array reference to the rules of the current state.
See Parse::Token::Lite::Rule.
In scalar context: returns a Parse::Token::Lite::Token object.
In list context: returns (Parse::Token::Lite::Token, @return_values_of_callback).
    my ($token, @ret) = $parser->nextToken;
    print $token->rule->name . '->' . $token->data . "\n";
See Parse::Token::Lite::Token and Parse::Token::Lite::Rule.
Returns 1 when no text remains to be parsed.
start() and end() push and pop a state on state_stack to implement an automaton.
They are also called by a 'state' definition of Parse::Token::Lite::Rule.
You can define rules in a lexer-like way.
    my $rulemap = {
        MAIN => [
            { name=>'QUOTE', re=>qr/'/,
              func=> sub{
                  my ($parser,$token) = @_;
                  $parser->start('STATE_QUOTE'); # push
              }
            },
            { name=>'ANY', re=>qr/.+/ },
        ],
        STATE_QUOTE => [
            { name=>'QUOTE_PAIR', re=>qr/'/,
              func=> sub{
                  my ($parser,$token) = @_;
                  $parser->end('STATE_QUOTE'); # pop
              }
            },
            { name=>'QUOTED_TEXT', re=>qr/.+/ }
        ],
    };
You can also do it in a simpler way, using the 'state' attribute.
    my $rulemap = {
        MAIN => [
            { name=>'QUOTE', re=>qr/'/, state=>['+STATE_QUOTE'] }, # push
            { name=>'ANY', re=>qr/.+/ },
        ],
        STATE_QUOTE => [
            { name=>'QUOTE_PAIR', re=>qr/'/, state=>['-STATE_QUOTE'] }, # pop
            { name=>'QUOTED_TEXT', re=>qr/.+/ }
        ],
    };
Returns the current state by peeking at the top of 'state_stack'.
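The push/pop behaviour of the state stack can be illustrated without the module. The sketch below is a hypothetical re-creation of the idea, not the module's implementation: `tokenize_with_states` is an invented name, and the text rules use `[^']+` instead of `.+` (an assumption made here) so that they stop at a quote and give the quote rules a chance to fire.

```perl
use strict;
use warnings;

# Hypothetical re-creation of the state-stack idea; not the module's code.
# In a rule's 'state' list, '+NAME' pushes a state and '-NAME' pops one.
sub tokenize_with_states {
    my ($rulemap, $text) = @_;
    my @state_stack = ('MAIN');
    my @tokens;
    while (length $text) {
        my $state = $state_stack[-1];       # peek at the current state
        my $hit;
        for my $rule (@{ $rulemap->{$state} }) {
            my $re = $rule->{re};
            next unless $text =~ /\A($re)/ && length $1;
            $hit = [ $state, $rule->{name}, $1 ];
            substr($text, 0, length $hit->[2], '');
            # Apply the state transitions attached to the rule, if any.
            for my $action (@{ $rule->{state} || [] }) {
                if    ($action =~ /\A\+(.+)/) { push @state_stack, $1 }
                elsif ($action =~ /\A-(.+)/)  { pop @state_stack }
            }
            last;
        }
        die "no rule matched in state $state" unless $hit;
        push @tokens, $hit;
    }
    return @tokens;
}

my $rulemap = {
    MAIN => [
        { name => 'QUOTE', re => qr/'/, state => ['+STATE_QUOTE'] },
        { name => 'ANY',   re => qr/[^']+/ },
    ],
    STATE_QUOTE => [
        { name => 'QUOTE_PAIR',  re => qr/'/, state => ['-STATE_QUOTE'] },
        { name => 'QUOTED_TEXT', re => qr/[^']+/ },
    ],
};

my @tokens = tokenize_with_states($rulemap, "say 'hi' now");
printf "%-12s %-12s [%s]\n", @$_ for @tokens;
```

Each token is reported together with the state in which it was matched, which makes it easy to see that only the rules of the state on top of the stack are consulted.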
See also the 'samples' directory in the source.
khs <sng2nara@gmail.com>
This software is copyright (c) 2013 by khs.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.