Amir Karger > Language-Basic-1.44 > Language::Basic::Token

NAME

Language::Basic::Token - Module to handle lexing BASIC statements.

SYNOPSIS

See Language::Basic for the overview of how the Language::Basic module works. This pod page is more technical.

     # Lex a line of BASIC into a group of tokens.
     my $token_group = Language::Basic::Token::Group->new;
     $token_group->lex('PRINT "YES","NO" : A=A+1');

     # Look at the next token without removing it
     my $tok = $token_group->lookahead;
     print $tok->text if $tok;
     # Eat expected tokens (each call returns undef if the next token doesn't match)
     $tok = $token_group->eat_if_string(",");
     $tok = $token_group->eat_if_class("Keyword");

DESCRIPTION

BASIC tokens are pretty simple. They include Keywords, Identifiers (Variable or Function names), String and Numeric Constants, and a few one- or two-character operators, like ':' and '<='. Tokens aren't very ambiguous, so for example, you don't need to know what type of Statement you're looking at in order to lex a line of BASIC. (The only remotely ambiguous thing is that '=' can be either a Relational Operator or an Assignment statement.)

The subclasses of LB::Token represent the various sorts of tokens. The Token::Group class isn't really a subclass at all; it's a group of tokens. See "Language::Basic::Token::Group" for more info.

The "text" method returns the text that makes up the token. Note that text is stored in upper case (except for string constants, which are stored exactly as entered).
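The case rule can be illustrated with a small standalone sketch. The helper sketch_normalized_text is hypothetical and is not part of Language::Basic; it also assumes a string constant is recognized by its leading double quote and that the quotes are kept, which this page does not state.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Language::Basic): report a raw token's
# text the way the "text" method is described as doing.
sub sketch_normalized_text {
    my ($raw) = @_;
    return $raw if $raw =~ /\A"/;   # string constant: keep exactly as entered
    return uc $raw;                 # anything else: upper case
}

print sketch_normalized_text('print'), "\n";     # PRINT
print sketch_normalized_text('"Hello"'), "\n";   # "Hello"
```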

class Language::Basic::Token::Group

This important class handles a group of tokens. Text from the BASIC program is lexed and turned into LB::Tokens which are stored in a Token::Group. Any access to these Tokens (including creating them) is through the Token::Group methods. Other classes' parse methods will usually eat their way through the tokens in the Token::Group until it's empty.

new

This method just creates a new Language::Basic::Token::Group.

lex

This method breaks BASIC text arg1 into LB::Tokens and puts them in Token::Group arg0.

lookahead

This method returns the next token in the Token::Group without removing it from the group. That means lookahead can be called many times and keep getting the same token (as long as eat is never called). It returns undef if there are no more Tokens left.

eat

This method eats the next Token from the Token::Group and returns it. It returns undef if there are no more Tokens left.
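The difference between lookahead and eat can be sketched with a toy queue. Sketch::Group is a hypothetical stand-in that stores plain strings rather than token objects; it only mirrors the behavior described above and is not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a token group: a blessed array of token texts.
package Sketch::Group;

sub new       { my ($class, @toks) = @_; return bless [@toks], $class }
sub lookahead { my ($self) = @_; return $self->[0] }     # peek, do not remove
sub eat       { my ($self) = @_; return shift @$self }   # remove and return

package main;

my $group = Sketch::Group->new('PRINT', '"YES"');
print $group->lookahead, "\n";   # PRINT
print $group->lookahead, "\n";   # PRINT again; nothing was consumed
print $group->eat, "\n";         # PRINT, now removed
print $group->eat, "\n";         # "YES"
print defined($group->lookahead) ? "more\n" : "empty\n";   # empty
```

Both methods return undef on an empty group, which is why the last line tests with defined rather than printing the result directly.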

eat_if_string

This method eats the next token from Group arg0 if it matches string arg1. If it ate a token, it returns it. Otherwise (or if there are no tokens left) it returns undef.

Note that the string to match should be upper case, since all \w tokens are stored as uppercase.

eat_if_class

This method eats the next token from Group arg0 if the token is of class "Language::Basic::Token::" . arg1. (I.e., it's called with "Keyword" to get a Language::Basic::Token::Keyword Token.) If it ate a token, it returns it. Otherwise (or if there are no tokens left) it returns undef.
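A minimal sketch of the two conditional eaters follows. All Sketch::* packages are hypothetical and exist only for illustration; the real classes live under Language::Basic::Token and their internals may differ.

```perl
use strict;
use warnings;

# Hypothetical token classes, named Sketch::* to show they are not the
# module's own code.
package Sketch::Token;
sub new  { my ($class, $text) = @_; return bless { text => $text }, $class }
sub text { return $_[0]{text} }

package Sketch::Token::Keyword;   our @ISA = ('Sketch::Token');
package Sketch::Token::Separator; our @ISA = ('Sketch::Token');

package Sketch::TokenGroup;
sub new { my ($class, @toks) = @_; return bless { toks => [@toks] }, $class }

# Eat the next token only if its text equals $string.
sub eat_if_string {
    my ($self, $string) = @_;
    my $next = $self->{toks}[0];
    return unless defined($next) && $next->text eq $string;
    return shift @{ $self->{toks} };
}

# Eat the next token only if it isa "Sketch::Token::$class".
sub eat_if_class {
    my ($self, $class) = @_;
    my $next = $self->{toks}[0];
    return unless defined($next) && $next->isa("Sketch::Token::$class");
    return shift @{ $self->{toks} };
}

package main;

my $tokens = Sketch::TokenGroup->new(
    Sketch::Token::Keyword->new('PRINT'),
    Sketch::Token::Separator->new(','),
);
my $miss = $tokens->eat_if_string(',');        # next token is PRINT: undef
my $kw   = $tokens->eat_if_class('Keyword');   # eats PRINT
my $sep  = $tokens->eat_if_string(',');        # eats the comma
print defined($miss) ? "ate\n" : "no match\n";   # no match
print $kw->text, " ", $sep->text, "\n";          # PRINT ,
```

A failed match leaves the group untouched, so a parser can try several alternatives in turn without backing up.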

slurp

Eats tokens from Group arg1 and puts them in Group arg0 until it gets to a Token whose text matches string arg2 or it reaches the end of arg1. (The matching Token is left in arg1.)
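The argument order (destination first, then source, then the stop string) is easy to get backwards, so here is a small sketch using plain arrays of token texts. sketch_slurp is a hypothetical function, not the module's method.

```perl
use strict;
use warnings;

# Hypothetical helper: move token texts from @$from onto @$to until the next
# one equals $stop (which stays in @$from) or @$from runs out.
sub sketch_slurp {
    my ($to, $from, $stop) = @_;
    while (@$from && $from->[0] ne $stop) {
        push @$to, shift @$from;
    }
    return $to;
}

my @source = ('A', '=', '1', 'THEN', 'PRINT', 'A');
my @condition;
sketch_slurp(\@condition, \@source, 'THEN');
print "@condition\n";   # A = 1
print "@source\n";      # THEN PRINT A
```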

stuff_left

Returns true if there's stuff left in the Statement we're parsing (i.e., if there are still tokens left in the Token::Group and the next token isn't a colon).
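The test amounts to two conditions, sketched here on a plain array of token texts. sketch_stuff_left is a hypothetical function that compares against the literal ':' text; the module itself may check the token's class instead.

```perl
use strict;
use warnings;

# Hypothetical helper: true while tokens remain and the next one is not the
# colon that ends a statement.
sub sketch_stuff_left {
    my ($tokens) = @_;
    return (@$tokens && $tokens->[0] ne ':') ? 1 : 0;
}

my @tokens = ('PRINT', '"YES"', ':', 'A', '=', 'A', '+', '1');
my @first_statement;
push @first_statement, shift @tokens while sketch_stuff_left(\@tokens);
print "@first_statement\n";   # PRINT "YES"
print "@tokens\n";            # : A = A + 1
```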

print

For debugging purposes. Returns the Tokens in Group arg0 nicely formatted.

Other Language::Basic::Token subclasses

The other subclasses are actually kinds of Tokens, unlike Token::Group. There are no "new" methods for these classes. Creation of Tokens is done by Token::Group::lex. In fact, these classes don't have any public methods. They're mostly there to use "isa" on.

Keyword

A BASIC keyword (reserved word)

Identifier

An Identifier matches /[A-Z][A-Z0-9]*\$?/. It's a variable or function name.
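A quick way to see what the pattern accepts is to try it on a few candidates. The \A and \z anchors below are an addition for whole-string testing and are not in the pattern as documented; the sample names are made up.

```perl
use strict;
use warnings;

# The documented pattern, anchored so the whole string must match.
my $identifier = qr/\A[A-Z][A-Z0-9]*\$?\z/;

for my $name ('A', 'X1', 'NAME$', '1X', 'A$B') {
    my $verdict = $name =~ $identifier ? 'identifier' : 'not an identifier';
    print "$name: $verdict\n";
}
# A, X1 and NAME$ match; 1X (leading digit) and A$B ('$' not at the end) do not.
```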

String_Constant

Stuff inside double quotes.

Numeric_Constant

A float (or integer, currently)

Left_Paren

A "("

Right_Paren

A ")"

Separator

Comma or semicolon (separators in arglists, PRINT statements)

Arithmetic_Operator

Plus or minus operators ('+' and '-')

Multiplicative_Operator

Multiply or divide operators ('*' and '/')

Relational_Operator

Greater than, less than, equals, and their combinations. Note that equals sign is also used to assign values in BASIC.

Logical_Operator

AND, OR, NOT

Comment

REM statement (includes the whole rest of the line, even if there are colons in it)
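The effect of this rule is that a colon after REM does not start a new statement. The sketch below uses a hypothetical function, sketch_split_statements, that works on raw text rather than tokens and, as a simplifying assumption, ignores colons inside string constants.

```perl
use strict;
use warnings;

# Hypothetical helper: split a line into statements at colons, except that a
# statement starting with REM takes everything up to the end of the line.
sub sketch_split_statements {
    my ($line) = @_;
    my @statements;
    while ($line =~ /\S/) {
        $line =~ s/\A\s+//;
        if ($line =~ /\AREM\b/i) {          # comment: take the rest as-is
            push @statements, $line;
            last;
        }
        $line =~ s/\A([^:]*):?//;           # otherwise cut at the next colon
        (my $statement = $1) =~ s/\s+\z//;
        push @statements, $statement;
    }
    return @statements;
}

my @parts = sketch_split_statements('A=1 : REM set A: not a statement');
print scalar(@parts), "\n";   # 2
print "$parts[1]\n";          # REM set A: not a statement
```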

Statement_End

End of a statement (i.e., a colon)
