
NAME

macropp - A macro pre-processor with embedded perl capability

DESCRIPTION

See Text::MacroScript.pm for the most complete and up-to-date documentation.

See macrodir for a different approach.

The commands listed below may appear in separate `macro' files, and/or in the body of files. Wherever a macro name or script name is encountered it is replaced by the body of the macro, or by the result of evaluating the script, using any parameters that are given.

If we choose the embedded perl style, then all definitions and macro names must be enclosed in the delimiters (<: and :> by default, but user-definable). The documentation assumes a macro approach, but simply by using the delimiters around all macros, scripts, variables and their names (and parameters if any) in the body text we can use the embedded perl approach.

    %DEFINE macroname [macro body]

    %DEFINE macroname
    multi-line
    macro body
    #0, #1 are the first and second parameters if any used
    %END_DEFINE

    %UNDEFINE macroname

    %UNDEFINE_ALL   # Undefine all macros

    %DEFINE_SCRIPT scriptname [script body]

    %DEFINE_SCRIPT scriptname
    multi-line
    script body
    arbitrary perl
    optional parameters are in @Param, although #0, etc may be used
    any variables are in %Var, although #varname may be used
    %END_DEFINE

    %UNDEFINE_SCRIPT scriptname

    %UNDEFINE_ALL_SCRIPT

    %DEFINE_VARIABLE variablename [variable value]

    %UNDEFINE_VARIABLE variablename

    %UNDEFINE_ALL_VARIABLE

    %LOAD[path/filename]    # Instantiate macros/scripts/variables in this
                            # file, but discard the text

    %INCLUDE[path/filename] # Instantiate macros/scripts/variables in this
                            # file and output the resultant text at the point
                            # where this command appears

 
    %CASE [condition]       # Provides #ifdef-type functionality, so we can
    %END_CASE               # skip selected parts of the text depending on the
                            # condition. The condition is usually dependent on
                            # some previously defined variable

    %REQUIRE[macroutil.pl]  # Load in the named Perl library

Thus, in the body of a file we may have, for example:

    %DEFINE &B [Billericky Rickety Builders]
    Some arbitrary text. We are writing to complain to the &B about the shoddy
    work they did.

Or, in the case of using the -e option for embedded style processing our files might look like this:

    <:
    %CASE[0]
    We keep our definitions etc as embedded commands - they will be
    instantiated but not output.
    %END_CASE
    %DEFINE BB [Billericky Rickety Builders]
    :> Some arbitrary text. We are writing to complain to the <:BB:> about the
    shoddy work they did.

Note how we must enclose the definition in the body text between embedded delimiters, <: and :> (these are user-definable). Processing using this second approach should be a lot faster (since we only seek to expand text between the delimiters).

Macro systems vs embedded systems

Macro systems read all the text, substituting anything which matches a macro name with the macro's body (or a script name with the result of executing the script). This makes macro systems slower (they have to check for macro/script names everywhere, not just in a delimited section) and riskier (if we choose a macro/script name that normally occurs in the text we'll end up with a mess) than embedded systems. On the other hand, because they work on the whole text and not just the delimited bits, macro systems can perform processing that embedded systems can't. Macro systems are used extensively; one example is cpp, the C pre-processor, with its #defines.

Essentially, embedded systems print all text until they hit an opening delimiter. They then execute any code up until the closing delimiter. The text that results replaces everything between and including the delimiters. They then carry on printing text until they hit the next opening delimiter, and so on until they've finished processing all the text. This script provides a macro approach by default, but if it is called with the -e option then it behaves as an embedded perl processor and only performs macro operations on text between the delimiters.
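
The embedded approach can be sketched in a few lines of plain perl. This is only an illustrative model and not how Text::MacroScript.pm is implemented: the %macros table and the expand_embedded() function are hypothetical names, the `code' between the delimiters is reduced to a simple name lookup, and unknown names are assumed to expand to nothing.

```perl
use strict ;
use warnings ;

# Hypothetical table of macro names and bodies (illustration only).
my %macros = ( BB => 'Billericky Rickety Builders' ) ;

# Text outside the delimiters passes through untouched; everything from
# the opening to the closing delimiter (inclusive) is replaced.
sub expand_embedded {
    my( $text, $open, $close ) = @_ ;
    $open  = '<:' unless defined $open ;    # default delimiters
    $close = ':>' unless defined $close ;

    $text =~ s{\Q$open\E\s*(.*?)\s*\Q$close\E}{
        exists $macros{$1} ? $macros{$1} : ''
    }gse ;

    return $text ;
}

print expand_embedded( "We complain to the <:BB:> about BB.\n" ) ;
# prints: We complain to the Billericky Rickety Builders about BB.
```

Note that the second BB is left alone because it is not inside the delimiters; a macro-style processor would have expanded it too.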

Defining Macros with %DEFINE

In files we would write:

    %DEFINE MAC [The Mackintosh Macro]

or using the embedded style:

    <:%DEFINE MAC [The Mackintosh Macro]:>

We can call our macro anything, excluding white-space characters and [, although ] is not advised. So a name like %*&! is fine - indeed names which could not normally appear in the text are recommended to avoid having the wrong thing substituted. We should also avoid giving macros, scripts or variables names beginning with #. All names are case-sensitive.

Note that if we define a macro and then a script with the same name the script will effectively replace the macro.

We can have parameters:

    %DEFINE *P [The forename is #0 and the surname is #1]

or using the embedded style:

    <:%DEFINE *P [The forename is #0 and the surname is #1]:>

Parameters can contain square brackets, since everything up to the last square bracket on the line is taken as the parameter list. The only character we can't pass is `|', since it is used to separate parameters. White-space between the macro name and the [ is optional.

Parameters are named #0, #1, etc. There is no limit on their number, but every parameter that the definition uses must be supplied when the macro is called. In the example above we must use *P[param1|param2], e.g. *P[Jim|Hendrix]; if we don't, Text::MacroScript.pm will croak.

Because we use # to signify parameters, if we require text that consists of a # followed by digits then we should escape the #, e.g.

    %DEFINE *GRAY[<font color="\#121212">#0</font>]

On the other hand we can pass more parameters than the macro needs, for example adding a third one as documentation: *P[Jim|Hendrix|Musician] will become `The forename is Jim and the surname is Hendrix', just as in the previous example; the third parameter, `Musician', will simply be thrown away.
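
The parameter rules above (split on `|', croak on a missing parameter, ignore extra ones, \# for a literal #) can be modelled in standalone perl. This is a sketch under those stated rules only; fill_params() is a hypothetical name and is not part of Text::MacroScript.pm.

```perl
use strict ;
use warnings ;
use Carp ;

# Substitute #0, #1, ... in a macro body with the '|'-separated
# parameters taken from a call such as *P[Jim|Hendrix].
sub fill_params {
    my( $body, $arglist ) = @_ ;
    my @param = split /\|/, $arglist, -1 ;

    # Replace each unescaped #N; croak if parameter N was not supplied.
    # Parameters beyond those the body uses are simply never looked at.
    $body =~ s{(?<!\\)#(\d+)}{
        defined $param[$1] ? $param[$1]
                           : croak "missing parameter #$1"
    }ge ;

    $body =~ s/\\#/#/g ;    # \# stands for a literal #
    return $body ;
}

print fill_params( 'The forename is #0 and the surname is #1',
                   'Jim|Hendrix|Musician' ), "\n" ;
# prints: The forename is Jim and the surname is Hendrix

print fill_params( '<font color="\#121212">#0</font>', 'grey text' ), "\n" ;
# prints: <font color="#121212">grey text</font>
```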

If we define a macro, script or variable and later define the same name the later definition will replace the earlier one. This is useful for making local macro definitions over-ride global ones, simply by loading the global ones first.

Although macros can have plain textual names like this:

    %DEFINE MAX_INT [32767]

It is generally wise to use a prefix and/or suffix to make sure we don't expand something unintentionally, e.g.

    %DEFINE $MAX_INT [65535]

Macro expansion is no respecter of quoted strings or anything else - if the name matches the expansion will take place!

Multi-line definitions are permitted (here's an example I use with the lout typesetting language):

    %DEFINE SCENE
    @Section @Title {#0} @Begin @PP @Include {#1} @End @Section
    %END_DEFINE

Multi-line definitions are available using the embedded style:

    <:
    %DEFINE SCENE
    @Section @Title {#0} @Begin @PP @Include {#1} @End @Section
    %END_DEFINE
    :>

This allows us to write the following in our lout files:

    SCENE[ The title of the scene | scene1.lt ]

If using the embedded style we must use the delimiters:

    <:SCENE[ The title of the scene | scene1.lt ]:>

either of which is a lot shorter than the definition.

Defining Scripts with %DEFINE_SCRIPT

Instead of straight textual substitution, we can have some perl executed (after any parameters have been replaced in the perl text):

    %DEFINE_SCRIPT *ADD ["#0 + #1 = " . (#0 + #1)]

These would be used as *ADD[5|11] in the text which would be output as:

    These would be used as 5 + 11 = 16 in the text...

In script definitions we can use an alternative way of passing parameters instead of or in addition to the #0 syntax.

This is particularly useful if we want to take a variable number of parameters, since the #0 etc. syntax does not provide for this. Any parameters are available to our perl code in an array called @Param. This allows things like the following to be achieved:

    %DEFINE_SCRIPT ^PEOPLE
    # We don't use the name hash number params but read straight from the
    # array:
    my $a = "friends and relatives are " ; $a .= join ", ", @Param ; $a ;
    %END_DEFINE

The above would expand in the following text:

    Her ^PEOPLE[Anna|John|Zebadiah].

to

    Her friends and relatives are Anna, John, Zebadiah.
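
How a script body and @Param fit together can be modelled in standalone perl. This is a simplified sketch, not Text::MacroScript.pm's code: run_script() is a hypothetical helper that puts the parameters in @Param, evaluates the body, and returns the value of its last expression as the replacement text.

```perl
use strict ;
use warnings ;

our @Param ;    # visible to the evaluated script text

# Evaluate a script body with its parameters in @Param and return the
# value of its last expression, which is what replaces the call.
sub run_script {
    my( $body, @args ) = @_ ;
    local @Param = @args ;
    my $result = eval $body ;
    die "script failed: $@" if $@ ;
    return $result ;
}

# The body of the ^PEOPLE script shown above.
my $people = q{
    my $a = "friends and relatives are " ; $a .= join ", ", @Param ; $a ;
} ;

print 'Her ', run_script( $people, split /\|/, 'Anna|John|Zebadiah' ), ".\n" ;
# prints: Her friends and relatives are Anna, John, Zebadiah.
```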

Macro names can be any length and consist of any characters (including non-printable ones, which is probably only useful within code), except white-space and [, although ] is not recommended.

Here's a simple date-stamp using the macro approach:

    %DEFINE_SCRIPT *DATESTAMP
    { my( $d, $m, $y ) = (localtime( time ))[3..5] ; $m++ ; $m = "0$m" if $m <
    10 ; $d = "0$d" if $d < 10 ; $y += 1900 ; "#0 on $y/$m/$d" ; }
    %END_DEFINE

Here's (a somewhat contrived example of) how the above would be used:

    <HTML> <HEAD><TITLE>Test Page</TITLE></HEAD> <BODY> *DATESTAMP[Last
    Updated]<P> This page is up-to-date and will remain valid until
    *DATESTAMP[midnight] </BODY> </HTML>

Thus we could have a file, test.html.m containing:

    %DEFINE_SCRIPT *DATESTAMP
    { my( $d, $m, $y ) = (localtime( time ))[3..5] ; $m++ ; $m = "0$m" if $m <
    10 ; $d = "0$d" if $d < 10 ; $y += 1900 ; "#0 on $y/$m/$d" ; }
    %END_DEFINE
    <HTML> <HEAD><TITLE>Test Page</TITLE></HEAD> <BODY> *DATESTAMP[Last
    Updated]<P> This page is up-to-date and will remain valid until
    *DATESTAMP[midnight] </BODY> </HTML>

or a real embedded perl file, test.html.e:

    <:
    %CASE[0]
    Notice we don't need to delimit macro or script names using embedded style
    since we must use delimiters in the body text anyway
    %END_CASE

    %DEFINE_SCRIPT DATESTAMP
    { my( $d, $m, $y ) = (localtime( time ))[3..5] ; $m++ ; $m = "0$m" if $m <
    10 ; $d = "0$d" if $d < 10 ; $y += 1900 ; "#0 on $y/$m/$d" ; }
    %END_DEFINE
    :> <HTML> <HEAD><TITLE>Test Page</TITLE></HEAD> <BODY> <!-- Note how the
    parameter must be within the delimiters. --> <:DATESTAMP[Last
    Updated]:><P> This page is up-to-date and will remain valid until
    <:DATESTAMP[midnight]:> </BODY> </HTML>

Either of these can be expanded, either in code using $Macro->expand(), or using the simple macropp utility supplied with Text::MacroScript.pm:

    [1]% macropp test.html.m > test.html

or

    [1]% macropp -e test.html.e > test.html

test.html will contain just this:

    <HTML> <HEAD><TITLE>Test Page</TITLE></HEAD> <BODY> Last Updated on
    1999/08/21<P> This page is up-to-date and will remain valid until midnight
    on 1999/08/21 </BODY> </HTML>

Of course in practice we wouldn't want to define everything in-line like this. See %LOAD later for an alternative.
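
For comparison, the date formatting done by hand in the *DATESTAMP script can be written as ordinary standalone perl with strftime() from the core POSIX module. This is a sketch only: datestamp() is a hypothetical function, and whether POSIX is available inside a %DEFINE_SCRIPT body is not something this document states.

```perl
use strict ;
use warnings ;
use POSIX qw( strftime ) ;

# Same output format as the *DATESTAMP script: "<label> on YYYY/MM/DD".
# The optional second argument is an epoch time; it defaults to now.
sub datestamp {
    my( $label, $time ) = @_ ;
    $time = time unless defined $time ;
    return "$label on " . strftime( '%Y/%m/%d', localtime( $time ) ) ;
}

print datestamp( 'Last Updated' ), "\n" ;
```

strftime() takes care of the zero padding and the 1900 year offset that the script handles explicitly.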

Defining Variables with %DEFINE_VARIABLE

We can also define variables:

    %DEFINE_VARIABLE &*! [89.1232]

Note that there is no multi-line version of %DEFINE_VARIABLE.

All current variables are available inside %DEFINE_SCRIPT scripts in the %Var hash:

    %DEFINE_SCRIPT *TEST1
    $a = '' ; while( my( $k, $v ) = each( %Var ) ) { $a .= "$k = $v\n" ; } $a
    ;
    %END_DEFINE

Variables are also used with %CASE (covered later).

Loading and including files with %LOAD and %INCLUDE

Although we can define macros directly in the files that require them it is often more useful to define them separately and include them in all those that need them.

From the command line it would be achieved thus:

    [2]% macropp -f ~/.macro/html.macros test.html.m > test.html

One disadvantage of this approach, especially if we have lots of macro files, is that we can easily forget which macro files are required by which text files. One solution to this is to go back to %DEFINEing in the text files themselves, but this would lose reusability. The answer to both these problems is to use the %LOAD command which loads the definitions from the named file at the point it appears in the text file:

    %LOAD[~/.macro/html.macros]
    <HTML> <HEAD><TITLE>Test Page Again</TITLE></HEAD> <BODY> *DATESTAMP[Last
    Updated]<P> This page will remain valid until *DATESTAMP[midnight] </BODY>
    </HTML>

The above text has the same output but we don't have to remember or explicitly load the macros so our command line becomes:

    [3]% macropp test.html.m > test.html

At the beginning of our lout typesetting files we might put this line:

    %LOAD[local.macros]

The first line of the local.macros file is:

    %LOAD[~/.macro/lout.macros]

So this loads the global macros first and then the local ones (which, if they have the same name, will of course over-ride the global ones).

This saves repeating the %DEFINE definitions in all the files and makes maintenance easier.

%LOAD loads perl scripts and macros, but ignores any other text. Thus we can use %LOAD, or its method equivalent load_file(), on any file, and it will only ever instantiate macros and scripts and produce no output.

If we want to include the entire contents of another file, and perform macro expansion on that file then use %INCLUDE, e.g.

    %INCLUDE[/path/to/file/with/scripts-and-macros-and-text]

The %INCLUDE command will instantiate any macros and scripts it encounters and include all other lines of text (with macro/script expansion) in the output stream.
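
The difference between the two commands can be modelled in a few lines of standalone perl. This is a deliberately reduced sketch: it only understands single-line %DEFINE, and process() and expand() are hypothetical names rather than anything provided by Text::MacroScript.pm.

```perl
use strict ;
use warnings ;

my %macros ;    # name => body, shared by both modes

# Replace every known macro name in $text, longest names first.
sub expand {
    my( $text ) = @_ ;
    for my $name ( sort { length( $b ) <=> length( $a ) } keys %macros ) {
        $text =~ s/\Q$name\E/$macros{$name}/g ;
    }
    return $text ;
}

# $keep_text false models %LOAD, true models %INCLUDE.
sub process {
    my( $keep_text, @lines ) = @_ ;
    my $out = '' ;
    for my $line ( @lines ) {
        if( $line =~ /^%DEFINE\s+([^\s\[]+)\s*\[(.*)\]\s*$/ ) {
            $macros{$1} = $2 ;           # instantiate, never output
        }
        elsif( $keep_text ) {
            $out .= expand( $line ) ;    # %INCLUDE keeps expanded text
        }                                # %LOAD discards other text
    }
    return $out ;
}

my @file = ( "%DEFINE &B [Billericky Rickety Builders]\n",
             "We are writing to the &B.\n" ) ;

print 'LOAD    gives: [', process( 0, @file ), "]\n" ;
print 'INCLUDE gives: [', process( 1, @file ), "]\n" ;
```

In both modes the &B macro ends up defined; only the second call produces any text.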

Macros and scripts are expanded in the following order:

    1. scripts (longest named first, shortest named last)
    2. macros  (longest named first, shortest named last)
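
Why the longest names go first is easiest to see when one name is a prefix of another. The following standalone perl is an illustrative model only (expand_in_order() and the %macros table are hypothetical, and scripts are left out):

```perl
use strict ;
use warnings ;

my %macros = ( 'MAX' => 'maximum', 'MAX_INT' => '32767' ) ;

# Build one alternation from the names in the order given and expand
# the text in a single pass; earlier names win when several match.
sub expand_in_order {
    my( $text, @names ) = @_ ;
    my $pattern = join '|', map { quotemeta( $_ ) } @names ;
    $text =~ s/($pattern)/$macros{$1}/g ;
    return $text ;
}

my @longest_first  = sort { length( $b ) <=> length( $a ) } keys %macros ;
my @shortest_first = reverse @longest_first ;

print expand_in_order( "MAX_INT is the MAX\n", @longest_first ) ;
# prints: 32767 is the maximum
print expand_in_order( "MAX_INT is the MAX\n", @shortest_first ) ;
# prints: maximum_INT is the maximum
```

With the shortest name first, MAX is expanded inside MAX_INT and the longer macro never gets a chance to match.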

Skipping text using %CASE and %END_CASE

It is possible to selectively skip parts of the text.

    %CASE[0]
    All the text here will be discarded. No matter how much there is.
    %END_CASE

The above is useful for multi-line comments.

We can also skip selectively. Here's an if...then:

    %CASE[#OS eq 'Linux']
    Skipped if the condition is FALSE. 
    %END_CASE

The condition can be any perl fragment. We can use previously defined variables either with the #variable syntax shown above or with the exported perl name; in this case either #OS or $Var{'OS'}, whichever we prefer.

If the condition is true the text is output with macro/script expansion as normal; if the condition is false the text is skipped.
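
The way a condition is evaluated can be modelled in standalone perl. This sketch assumes only what is described here, namely that #name is rewritten to $Var{'name'} and the result is passed to eval; case_is_true() is a hypothetical name and not part of Text::MacroScript.pm.

```perl
use strict ;
use warnings ;

our %Var = ( OS => 'Linux' ) ;    # as set by %DEFINE_VARIABLE OS[Linux]

sub case_is_true {
    my( $condition ) = @_ ;
    # Rewrite #name as $Var{'name'} so both spellings reach eval identically.
    $condition =~ s/#(\w+)/sprintf q{$Var{'%s'}}, $1/ge ;
    my $result = eval $condition ;
    die "bad condition: $@" if $@ ;
    return $result ? 1 : 0 ;
}

for my $cond ( q{#OS eq 'Linux'}, q{$Var{'OS'} eq 'Linux'}, q{#OS ne 'Linux'} ) {
    printf "%-24s %s\n", $cond, case_is_true( $cond ) ? 'output' : 'skipped' ;
}
```

The first two conditions report `output' and the third reports `skipped'.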

The if...then...else structure:

    %DEFINE_VARIABLE OS[Linux]

    %CASE[$Var{'OS'} eq 'Linux']
    Linux specific stuff.
    %CASE[#OS ne 'Linux']
    Non-linux stuff - note that both references to the OS variable are
    identical in the expression (#OS is converted internally to $Var{'OS'} so
    the eval sees the same code in both cases)
    %END_CASE

Although nested %CASEs are not supported we can get the same functionality (and indeed more versatility because we can use full perl expressions), e.g.:

    %DEFINE_VARIABLE TARGET[Linux]

    %CASE[#TARGET eq 'Win32' or #TARGET eq 'DOS']
    Win32/DOS stuff.
    %CASE[#TARGET eq 'Win32']
    Win32 only stuff.
    %CASE[#TARGET eq 'DOS']
    DOS only stuff.
    %CASE[#TARGET eq 'Win32' or #TARGET eq 'DOS']
    More Win32/DOS stuff.
    %END_CASE

The expressions can be as simple as [1] (always true, so the text is always used), or any arbitrary perl. Perl uses the following comparison operators:

These operators treat each argument as a case-sensitive ASCII string:

    eq  Equal
    ne  Not equal
    lt  Less than
    le  Less than or equal
    gt  Greater than
    ge  Greater than or equal

To get case-insensitive comparisons use lc #VAR1 eq lc #VAR2, etc. (lc is the perl lower-case operator.)

To compare numbers use these:

    ==  Equal
    !=  Not equal
    <   Less than
    <=  Less than or equal
    >   Greater than
    >=  Greater than or equal

Perl has the following logical operators which can be used with the comparison operators in our expressions:

    and Both are true
    or  At least one is true
    not Negate the expression

Although macropp doesn't support nested %CASE's we can still represent logic like this:

    if cond1 then
        if cond2
            do cond1 + cond2 stuff
        else
            do cond1 stuff
        end if
    else
        do other stuff
    end if

By `unrolling' the expression and writing something like this:

    %CASE[#cond1 and #cond2]
        do cond1 + cond2 stuff
    %CASE[#cond1 and (not #cond2)]
        do cond1 stuff
    %CASE[not #cond1]
        do other stuff
    %END_CASE

In other words we must fully specify the conditions for each case.

We can use any other macro/script command within %CASE commands, e.g. %DEFINEs, etc., as well as have any text that will be macro/script expanded as normal.

Undefining with %UNDEFINE

Macros, scripts and variables may be undefined in files:

    %UNDEFINE *P
    %UNDEFINE_SCRIPT *DATESTAMP
    %UNDEFINE_VARIABLE &*!

All macros, scripts and variables can be undefined:

    %UNDEFINE_ALL
    %UNDEFINE_ALL_SCRIPT
    %UNDEFINE_ALL_VARIABLE

One use of undefining everything is to ensure we get a clean start. We might head up our files thus:

    %UNDEFINE_ALL
    %UNDEFINE_ALL_SCRIPT
    %UNDEFINE_ALL_VARIABLE
    %LOAD[mymacros]
    text goes here

Loading Perl libraries with %REQUIRE

We often want our scripts to have access to a bundle of functions that we have created or that are in other modules. This can now be achieved by:

    %REQUIRE[/path/to/mylibrary.pl]

An example library macroutil.pl is provided with examples of usage in html.macro.

Commenting

Generally the text files that we process are in formats that support commenting, e.g. HTML:

    <!-- This is a comment -->

Sometimes, however, we want to put comments in our macro source files that won't end up in the output files. One simple way of achieving this is to define a macro whose body is empty; when it's called with any number of parameters (our comments), their text is thrown away:

    %DEFINE %%[]

which is used like this in texts:

    The comment comes %%[Here | [anything] put here will disappear]here!

The output of the above will be:

    The comment comes here!

However the easiest way to comment is to use %CASE:

    %CASE[0]
    This unconditionally skips text up until the end marker since the
    condition is always false.
    %END_CASE

Text::MacroScript.pm

Text::MacroScript.pm provides the underlying code for macropp, and for perl programmers it provides the ability to create and manipulate Macro objects in code similarly to how we can create and manipulate them through embedding them in files. See Text::MacroScript.pm's documentation for further details.

BUGS

Lousy error reporting for embedded perl in most cases.

CHANGES

1999/08/18 Created.

1999/08/23 Version 1.00.

1999/09/01 Minor documentation corrections.

1999/09/07 Took out embedded processing code - this is now taken care of by Text::MacroScript.pm, so we just need to call that with the right options.

1999/09/10 Renamed package Text::MacroScript.pm as per John Porter's (CPAN) suggestion.

2000/02/01 Minor documentation updates.

AUTHOR

Mark Summerfield. I can be contacted as <summer@perlpress.com> - please include the word 'macro' in the subject line.

COPYRIGHT

Copyright (c) Mark Summerfield 1999-2000. All Rights Reserved.

This module may be used/distributed/modified under the GPL.

PREREQUISITES

strict

Getopt::Long

Text::MacroScript

COREQUISITES

OSNAMES

any

SCRIPT CATEGORIES

Text-processing Macros