The London Perl and Raku Workshop takes place on 26th Oct 2024. If your company depends on Perl, please consider sponsoring and/or attending.

=encoding utf8

=head1 API documentation: The AST API from the Macro Perspective

Macros in Perl have access to the raw, just-parsed representation of
the program through their parameters. This representation is called
the C<AST>. The AST is documented in full elsewhere, but this document
is written from the perspective of the macro itself, and demonstrates
the major components of the AST that must be in place for typical
macros to function.

=head2 Macro definition

A macro definition looks like this:

 macro fizzle(...) { ... }

Within the parameter list, any parameters can be defined, but they will
always be of type C<AST>, some sub-type of C<AST> or plain scalars.

An invocant may be declared. If so, it will be the C<$/> object that
contains the macro invocation itself, and thus contains all state
relevant to the macro.

A macro can have an operator type:

 macro quote:<q> (...) { ... }

If it does, the operator type will constrain the parameters (if any)
that are available to the macro, and a mis-match between the operator
type and the parameter count will generate a compile-time error.

=head2 Macro types

Macros can be defined for each of the following operator types:

=over

=item list

List operators are typical function-like macros that take arguments
as a list of named or positional parameters. Each parameter is an
C<Expression> node.

=item term

Term macros are like list macros, but never take any arguments.


=item quote

[ Note: notice camel-casing of LiteralString. Is that what was intended?
  -ajs
]

Quote macros take a single C<LiteralString> node. Backslash escaping
is performed only for the balancing quote terminator. Should some other
sort of parsing be required, use the "is parsed" trait.

=item prefix

Prefix macros take only a single C<Expression> parameter.

=item infix

Infix macros take two C<Expression> parameters.

=item postfix

Postfix macros can never be "is parsed" because their only parameter
(an C<Expression> node) is found before the macro name in the
program text.

=item circumfix

Circumfix macros take a single C<Expression> as a parameter.

=item postcircumfix

Postcircumfix macros take two parameters. The first a C<Value>. The
second is an C<Expression> that comes inside of the circumfix delimeters.

=item regex_metachar

=item regex_backslash

=item regex_mod_internal

=item regex_assertion

=item regex_mod_external

TBD

=item trait_verb

=item trait_auxiliary

[ Ok, here's a scary kind of question... can a macro be multi,
  dispatched at compile-time based on AST sub-types? If so, that
  answers the question of how "class Foo is Bar" is distinguished
  from "my Foo $x is Bar" which is similar, but certainly has very
  different types of parameters (the LHS is an AST node that
  is either a class name or a variable declaration). -ajs
]

=item scope_declarator

Scope_delcatator macros take a C<Pad> node which contains the information
about the variable being declared.

=item statement_control

Statement_controls are like if or while, and take an C<Expression>
and a C<Block>.

[ Question: How is elsif/else handled? Are they named as part of the
  macro somehow, or must any elsif block ever be called "elsif"? -ajs
]

=item statement_modifier

Statement_modifier macros take two parameters, a C<Statement> and
a C<expression>.

=item infix_prefix_meta_operator

=item infix_postfix_meta_operator

=item prefix_circumfix_meta_operator

All of these macro types take three paramters: a C<Expression> for
the LHS, an C<InfixOperator> node for the operator that it is modifying,
and an C<Expression> for the RHS.

[ Question: infix_postfix_meta_operator is designed for defining
  '=' which will take an LValue for its first parameter, but
  not all such operators will be for assignment, potentially.
  Again, this brings up the question of multi dispatch on AST
  node types. Is that what's intended, or should macro operator
  types be richer? -ajs
]

=item postfix_prefix_meta_operator

Postfix_prefix_meta_operator macros take an C<LValue> and a
C<PostfixOperator> as parameters.

=item prefix_postfix_meta_operator

Prefix_postfix_meta_operator macros take a C<PrefixOperator>
and an C<LValue> as parameters.

=item infix_circumfix_meta_operator

Infix_circumfix_meta_operator macros take an C<InfixOperator> and
an C<Expression> as parameters.

[ Question: What about sub, is that a statement_control? What
  about use? Is that just a list op that does its thing at
  run-time? -ajs ]

=head2 Accessing AST Internals

C<AST>s can be treated much the same as any rule state object. They
can be indexed like a hash to extract their match terms (in this case,
subrule names that match AST nodes). However, they also carry state
information about the file being parsed.

[ Question: how is that state information extracted? Is there
  a method that can be called to get line number for example? -ajs ]

Expressions are the most often-seen element of a macro's parameter list.
Expressions

An Expression is either a Literal, a LValue (Variable/Apply/Call), or
one of the special forms (Binding/Assignment). It can be tested like so:

 macro debug(*@exprs) {
   for @exprs -> $expr {
     if exists $expr<Literal> {
       ...
     } elsif exists $expr<LValue> {
       ...
     } elsif exists $expr<Binding> or exists $expr<Assignment> {
       ...
     } else {
       die "Unknown Expression type '$expr'\n";
     }
   }
 }

But in many cases, such a test is not required, as Expressions are so
universally useful:

 use AST::Tools :all;
 macro debug(*@exprs) {
   q:code{ say {{{ astlist(@exprs) }}} };
 }

[ Question: we need some tools for constructing AST nodes from other
  AST nodes. One of the most obvious to me is the above, but my name
  for it ("astlist") is just a suggestion. It's probably exported
  by something like AST::Tools -ajs ]

[ Question: Another way to do that would be to have a generic
  AST initializer so that this worked:
	q:code{ say {{{ AST.new('List',@exprs) }}} };
  which might make more sense, as you could do arbitrarily complex
  things with the combination of high-level tools and initializers like:
	macro curry($subroutine, *@args) {
	  my $body = call_as_block($subroutine,\(=@args));
	  return AST.new('Closure', :$body);
	}
  -ajs
]

... More to come once we work out the questions above ...

=cut