=encoding utf8
=head1 API documentation: The AST API from the Macro Perspective
Macros in Perl have access to the raw, just-parsed representation of
the program through their parameters. This representation is called
the C<AST>. The AST is documented in full elsewhere, but this document
is written from the perspective of the macro itself, and demonstrates
the major components of the AST that must be in place for typical
macros to function.
=head2 Macro definition
A macro definition looks like this:
macro fizzle(...) { ... }
Within the parameter list, any parameters can be defined, but they will
always be of type C<AST>, some sub-type of C<AST> or plain scalars.
An invocant may be declared. If so, it will be the C<$/> object that
contains the macro invocation itself, and thus contains all state
relevant to the macro.
A macro can have an operator type:
macro quote:<q> (...) { ... }
If it does, the operator type will constrain the parameters (if any)
that are available to the macro, and a mis-match between the operator
type and the parameter count will generate a compile-time error.
=head2 Macro types
Macros can be defined for each of the following operator types:
=over
=item list
List operators are typical function-like macros that take arguments
as a list of named or positional parameters. Each parameter is an
C<Expression> node.
=item term
Term macros are like list macros, but never take any arguments.
=item quote
[ Note: notice camel-casing of LiteralString. Is that what was intended?
-ajs
]
Quote macros take a single C<LiteralString> node. Backslash escaping
is performed only for the balancing quote terminator. Should some other
sort of parsing be required, use the "is parsed" trait.
=item prefix
Prefix macros take only a single C<Expression> parameter.
=item infix
Infix macros take two C<Expression> parameters.
=item postfix
Postfix macros can never be "is parsed" because their only parameter
(an C<Expression> node) is found before the macro name in the
program text.
=item circumfix
Circumfix macros take a single C<Expression> as a parameter.
=item postcircumfix
Postcircumfix macros take two parameters. The first a C<Value>. The
second is an C<Expression> that comes inside of the circumfix delimeters.
=item regex_metachar
=item regex_backslash
=item regex_mod_internal
=item regex_assertion
=item regex_mod_external
TBD
=item trait_verb
=item trait_auxiliary
[ Ok, here's a scary kind of question... can a macro be multi,
dispatched at compile-time based on AST sub-types? If so, that
answers the question of how "class Foo is Bar" is distinguished
from "my Foo $x is Bar" which is similar, but certainly has very
different types of parameters (the LHS is an AST node that
is either a class name or a variable declaration). -ajs
]
=item scope_declarator
Scope_delcatator macros take a C<Pad> node which contains the information
about the variable being declared.
=item statement_control
Statement_controls are like if or while, and take an C<Expression>
and a C<Block>.
[ Question: How is elsif/else handled? Are they named as part of the
macro somehow, or must any elsif block ever be called "elsif"? -ajs
]
=item statement_modifier
Statement_modifier macros take two parameters, a C<Statement> and
a C<expression>.
=item infix_prefix_meta_operator
=item infix_postfix_meta_operator
=item prefix_circumfix_meta_operator
All of these macro types take three paramters: a C<Expression> for
the LHS, an C<InfixOperator> node for the operator that it is modifying,
and an C<Expression> for the RHS.
[ Question: infix_postfix_meta_operator is designed for defining
'=' which will take an LValue for its first parameter, but
not all such operators will be for assignment, potentially.
Again, this brings up the question of multi dispatch on AST
node types. Is that what's intended, or should macro operator
types be richer? -ajs
]
=item postfix_prefix_meta_operator
Postfix_prefix_meta_operator macros take an C<LValue> and a
C<PostfixOperator> as parameters.
=item prefix_postfix_meta_operator
Prefix_postfix_meta_operator macros take a C<PrefixOperator>
and an C<LValue> as parameters.
=item infix_circumfix_meta_operator
Infix_circumfix_meta_operator macros take an C<InfixOperator> and
an C<Expression> as parameters.
[ Question: What about sub, is that a statement_control? What
about use? Is that just a list op that does its thing at
run-time? -ajs ]
=head2 Accessing AST Internals
C<AST>s can be treated much the same as any rule state object. They
can be indexed like a hash to extract their match terms (in this case,
subrule names that match AST nodes). However, they also carry state
information about the file being parsed.
[ Question: how is that state information extracted? Is there
a method that can be called to get line number for example? -ajs ]
Expressions are the most often-seen element of a macro's parameter list.
Expressions
An Expression is either a Literal, a LValue (Variable/Apply/Call), or
one of the special forms (Binding/Assignment). It can be tested like so:
macro debug(*@exprs) {
for @exprs -> $expr {
if exists $expr<Literal> {
...
} elsif exists $expr<LValue> {
...
} elsif exists $expr<Binding> or exists $expr<Assignment> {
...
} else {
die "Unknown Expression type '$expr'\n";
}
}
}
But in many cases, such a test is not required, as Expressions are so
universally useful:
use AST::Tools :all;
macro debug(*@exprs) {
q:code{ say {{{ astlist(@exprs) }}} };
}
[ Question: we need some tools for constructing AST nodes from other
AST nodes. One of the most obvious to me is the above, but my name
for it ("astlist") is just a suggestion. It's probably exported
by something like AST::Tools -ajs ]
[ Question: Another way to do that would be to have a generic
AST initializer so that this worked:
q:code{ say {{{ AST.new('List',@exprs) }}} };
which might make more sense, as you could do arbitrarily complex
things with the combination of high-level tools and initializers like:
macro curry($subroutine, *@args) {
my $body = call_as_block($subroutine,\(=@args));
return AST.new('Closure', :$body);
}
-ajs
]
... More to come once we work out the questions above ...
=cut