Language::Basic::Token - Module to handle lexing BASIC statements.
See Language::Basic for the overview of how the Language::Basic module works. This pod page is more technical.
    use Language::Basic::Token;

    # Lex a line of BASIC into a group of tokens.
    my $token_group = Language::Basic::Token::Group->new;
    $token_group->lex('PRINT "YES","NO" : A=A+1');

    # Look at the next token without removing it.
    my $tok = $token_group->lookahead;
    print $tok->text if defined $tok;

    # Eat expected tokens (each call returns undef if the next token doesn't match).
    my $comma   = $token_group->eat_if_string(",");
    my $keyword = $token_group->eat_if_class("Keyword");
BASIC tokens are pretty simple. They include Keywords, Identifiers (Variable or Function names), String and Numeric Constants, and a few one- or two-character operators, like ':' and '<='. Tokens aren't very ambiguous, so for example, you don't need to know what type of Statement you're looking at in order to lex a line of BASIC. (The only remotely ambiguous thing is that '=' can be either a Relational Operator or an Assignment statement.)
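
As a rough illustration of how little context such a lexer needs, here is a standalone sketch that splits a line in a single left-to-right pass. It is not the module's actual lexer: the sub name sketch_lex, the abbreviated keyword list, and the category labels (in particular the catch-all "Operator") are assumptions made for this example only.

```perl
use strict;
use warnings;

# Hypothetical, abbreviated keyword list; assumed for this sketch only.
my %is_keyword = map { $_ => 1 } qw(PRINT LET IF THEN GOTO);

# Split one line of BASIC into [category, text] pairs.
sub sketch_lex {
    my ($line) = @_;
    my @tokens;
    pos($line) = 0;
    while (pos($line) < length $line) {
        if ($line =~ /\G\s+/gc) {
            next;                                    # skip whitespace
        }
        elsif ($line =~ /\G"([^"]*)"/gc) {
            push @tokens, [ String => $1 ];          # kept exactly as entered
        }
        elsif ($line =~ /\G(\d+(?:\.\d*)?|\.\d+)/gc) {
            push @tokens, [ Number => $1 ];
        }
        elsif ($line =~ /\G([A-Za-z][A-Za-z0-9]*\$?)/gc) {
            my $word = uc $1;                        # words are stored in upper case
            push @tokens,
                [ ($is_keyword{$word} ? 'Keyword' : 'Identifier') => $word ];
        }
        elsif ($line =~ /\G(<=|>=|<>|[-+*\/<>=,;:])/gc) {
            push @tokens, [ Operator => $1 ];        # one- or two-character operators
        }
        else {
            die "Unrecognized text at column " . pos($line) . "\n";
        }
    }
    return @tokens;
}

my @tokens = sketch_lex('print "Yes","No" : a=a+1');
print join(' ', map { "$_->[0]($_->[1])" } @tokens), "\n";
```

Note that no branch needs to know which statement it is in; the '=' is simply emitted as an operator and its meaning is left to the parser.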
The subclasses of LB::Token represent the various sorts of tokens. The Token::Group class isn't really a subclass at all; it's a group of tokens. See "Language::Basic::Token::Group" for more info.
The "text" method returns the text that makes up the token. Note that text is stored in upper case (except for string constants, which are stored exactly as entered).
This important class handles a group of tokens. Text from the BASIC program is lexed and turned into LB::Tokens which are stored in a Token::Group. Any access to these Tokens (including creating them) is through the Token::Group methods. Other classes' parse methods will usually eat their way through the tokens in the Token::Group until it's empty.
The new method just creates a new Token::Group.
The lex method breaks BASIC text arg1 into LB::Tokens and puts them in Token::Group arg0.
The lookahead method returns the next token in the Token::Group without removing it from the group. That means lookahead can be called many times and will keep returning the same token (as long as eat is never called). It returns undef if there are no more Tokens left.
The eat method removes the next Token from the Token::Group and returns it. It returns undef if there are no more Tokens left.
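
The difference between peeking and consuming can be sketched with a small stand-in class. Sketch::Group below is hypothetical and holds plain strings rather than real token objects; it only mirrors the behavior described above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a token group: tokens are plain strings here.
package Sketch::Group;

sub new {
    my ($class, @tokens) = @_;
    return bless { tokens => [@tokens] }, $class;
}

# Peek at the next token without removing it (undef if none are left).
sub lookahead {
    my ($self) = @_;
    return $self->{tokens}[0];
}

# Remove and return the next token (undef if none are left).
sub eat {
    my ($self) = @_;
    return shift @{ $self->{tokens} };
}

package main;

my $group  = Sketch::Group->new('PRINT', 'A');
my $first  = $group->lookahead;    # peeking does not consume anything
my $again  = $group->lookahead;    # so this is the same token
my $eaten  = $group->eat;          # now the token is removed
my $second = $group->eat;
my $none   = $group->eat;          # the group is empty by now
```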
The eat_if_string method eats the next token from Group arg0 if its text matches string arg1. If it ate a token, it returns it. Otherwise (or if there are no tokens left) it returns undef.
Note that the string to match should be upper case, since all \w tokens are stored as uppercase.
The eat_if_class method eats the next token from Group arg0 if the token is of class "Language::Basic::Token::" . arg1. (I.e., it's called with "Keyword" to get a Language::Basic::Token::Keyword Token.) If it ate a token, it returns it. Otherwise (or if there are no tokens left) it returns undef.
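
The two conditional methods can be sketched in the same way. Everything in the Sketch:: namespace below is hypothetical, and the text comparison is assumed to be plain string equality; the point is only the "peek, test, then consume" shape and the class-name concatenation.

```perl
use strict;
use warnings;

# Hypothetical token classes, present only so that isa() has something to test.
package Sketch::Token;
sub new  { my ($class, $text) = @_; return bless { text => $text }, $class }
sub text { my ($self) = @_; return $self->{text} }

package Sketch::Token::Keyword;
our @ISA = ('Sketch::Token');

package Sketch::Token::Identifier;
our @ISA = ('Sketch::Token');

package Sketch::Group;
sub new { my ($class, @tokens) = @_; return bless { tokens => [@tokens] }, $class }

# Consume the next token only if its text equals $string.
sub eat_if_string {
    my ($self, $string) = @_;
    my $next = $self->{tokens}[0];
    return undef unless defined $next && $next->text eq $string;
    return shift @{ $self->{tokens} };
}

# Consume the next token only if it belongs to Sketch::Token::$class.
sub eat_if_class {
    my ($self, $class) = @_;
    my $next = $self->{tokens}[0];
    return undef unless defined $next && $next->isa("Sketch::Token::$class");
    return shift @{ $self->{tokens} };
}

package main;

my $group = Sketch::Group->new(
    Sketch::Token::Keyword->new('PRINT'),
    Sketch::Token::Identifier->new('A'),
);
my $miss    = $group->eat_if_string('print');    # no match: text is upper case
my $keyword = $group->eat_if_class('Keyword');   # eats the PRINT token
my $wrong   = $group->eat_if_class('Keyword');   # no match: A is an Identifier
my $ident   = $group->eat_if_string('A');        # eats the A token
```

A failed test leaves the group untouched, which is what lets a parser try several alternatives in a row.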
Eats tokens from Group arg1 and puts them in Group arg0 until it gets to a Token whose text matches string arg2 or it reaches the end of arg1. (The matching Token is left in arg1.)
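
A minimal sketch of that transfer, using plain arrays of strings instead of Token::Groups. The helper name take_until is hypothetical and chosen for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: move tokens from @$source to @$dest until the next
# token equals $stop or @$source runs out. The stop token stays in @$source.
sub take_until {
    my ($dest, $source, $stop) = @_;
    push @$dest, shift @$source while @$source && $source->[0] ne $stop;
    return;
}

my @source = ('A', '+', '1', 'THEN', 'GOTO', '10');
my @dest;
take_until(\@dest, \@source, 'THEN');
print "moved: @dest\n";
print "left:  @source\n";
```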
Returns true if there's stuff left in the Statement we're parsing (i.e., if there are still tokens left in the Token::Group and the next token isn't a colon).
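
This test is what lets a parser walk a multi-statement line one statement at a time. The sketch below uses an array of strings and a hypothetical helper named more_in_statement; it is not the module's own code.

```perl
use strict;
use warnings;

# Hypothetical helper: true if tokens remain and the next one isn't a colon.
sub more_in_statement {
    my ($tokens) = @_;
    return @$tokens && $tokens->[0] ne ':' ? 1 : 0;
}

my @tokens = ('A', '=', '1', ':', 'B', '=', '2');
my @statements;
while (@tokens) {
    my @statement;
    push @statement, shift @tokens while more_in_statement(\@tokens);
    shift @tokens if @tokens;            # discard the ':' between statements
    push @statements, "@statement";
}
print "$_\n" for @statements;
```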
For debugging purposes. Returns the Tokens in Group arg0 nicely formatted.
The other subclasses are actually kinds of Tokens, unlike Token::Group. There are no "new" methods for these classes. Creation of Tokens is done by Token::Group::lex. In fact, these classes don't have any public methods. They're mostly there to use "isa" on.
A BASIC keyword (reserved word)
An Identifier matches /[A-Z][A-Z0-9]*\$?/. It's a variable or function name.
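
To see which names that pattern accepts, it can be tried against a few candidates. The anchors \A and \z are added here as an assumption, so that the whole string has to match; the candidate names are made up for the example.

```perl
use strict;
use warnings;

# The pattern from the text, anchored so the whole string must match.
my $identifier = qr/\A[A-Z][A-Z0-9]*\$?\z/;

for my $candidate ('A', 'X1', 'NAME$', '1A', 'A$B') {
    printf "%-6s %s\n", $candidate,
        $candidate =~ $identifier ? 'identifier' : 'not an identifier';
}
```

The optional trailing '$' is only allowed at the end of the name, so 'A$B' is rejected as a whole.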
Stuff inside double quotes.
A float (or integer, currently)
Comma or semicolon (separators in arglists, PRINT statements)
Plus or minus
Multiply or divide operators ('*' and '/')
Greater than, less than, equals, and their combinations. Note that the equals sign is also used to assign values in BASIC.
AND, OR, NOT
REM statement (includes the whole rest of the line, even if there are colons in it)
End of a statement (i.e., a colon)