Andy Wardley > Text-MetaText-0.22 > Text::MetaText

Module Version: 0.22

NAME

Text::MetaText - Perl extension implementing meta-language for processing "template" text files.

SYNOPSIS

    use Text::MetaText;

    my $mt = Text::MetaText->new();

    # process file content or text string 
    print $mt->process_file($filename, \%vardefs);
    print $mt->process_text($textstring, \%vardefs);

    # pre-declare a BLOCK for subsequent INCLUDE
    $mt->declare($textstring, $blockname);
    $mt->declare(\@content, $blockname);

SUMMARY OF METATEXT DIRECTIVES

    %% DEFINE 
       variable1 = value          # define variable(s)
       variable2 = "quoted value"  
    %%

    %% SUBST variable  %%         # insert variable value
    %% variable %%                # short form of above

    %% BLOCK blockname %%         # define a block 'blockname'
       block text... 
    %% ENDBLOCK %%

    %% INCLUDE blockname %%       # include 'blockname' block text
    %% INCLUDE filename  %%       # include external file 'filename'

    %% INCLUDE file_or_block      # a more complete example...
       variable = value           # additional variable definition(s)
       if       = condition       # conditional inclusion
       unless   = condition       # conditional exclusion
       format   = format_string   # printf-like format string with '%s'
       filter   = fltname(params) # post-process filter 
    %%

    %% TIME                       # current system time, as per time(2)
       format   = format_string   # display format, as per strftime(3C) 
    %%

DESCRIPTION

MetaText is a text processing and markup meta-language which can be used for processing "template" files. This module is a Perl 5 extension implementing a MetaText object class which processes text files, interpreting and acting on the embedded MetaText directives within.

Like a glorified pre-processor, MetaText can include files, define and substitute variable values, execute conditional actions based on variables, call other Perl functions or object methods and capture the resulting output back into the document, and more. It can format the resulting output of any of these operations in a number of ways. The objects and, by extension, the format and semantics of the MetaText language itself are highly configurable.

MetaText was originally designed to aid in the creation of HTML documents in a large web site. It remains well suited for this and similar tasks, being able to create web pages (dynamically or statically) that are consistent with each other, yet easily customisable.

PREREQUISITES

MetaText requires Perl 5.004 or later. The Date::Format module should also be installed. This is available from CPAN (in the "TimeDate" distribution) as described in the following section. The metapage utility also requires the File::Recurse module, distributed in the "File-Tools" bundle, also available from CPAN.

OBTAINING AND INSTALLING THE METATEXT MODULE

The MetaText module is available from CPAN. As the 'perlmod' man page explains:

    CPAN stands for the Comprehensive Perl Archive Network.
    This is a globally replicated collection of all known Perl
    materials, including hundreds of unbundled modules.

    [...]

    For an up-to-date listing of CPAN sites, see
    http://www.perl.com/perl/ or ftp://ftp.perl.com/perl/ .

Within the CPAN archive, MetaText is in the "Text::" group which forms part of the category:

  *) String Processing, Language Text Processing, 
     Parsing and Searching

The module is available in the following directories:

    /modules/by-module/Text/Text-MetaText-<version>.tar.gz
    /authors/id/ABW/Text-MetaText-<version>.tar.gz

For the latest information on MetaText or to download the latest pre-release/beta version of the module, consult the definitive reference, the MetaText Home Page:

    http://www.kfs.org/~abw/perl/metatext/

MetaText is distributed as a single gzipped tar archive file:

    Text-MetaText-<version>.tar.gz

Note that "<version>" represents the current MetaText Revision number, of the form "0.18". See REVISION below to determine the current version number for Text::MetaText.

Unpack the archive to create a MetaText installation directory:

    gunzip Text-MetaText-<version>.tar.gz
    tar xvf Text-MetaText-<version>.tar

'cd' into that directory, make, test and install the MetaText module:

    cd Text-MetaText-<version>
    perl Makefile.PL
    make
    make test
    make install

The 't' sub-directory contains a number of small sample files which are processed by the test script (called by 'make test'). See the README file in that directory for more information. A logfile (test.log) is generated to report any errors that occur during this process. Please note that the test suite is incomplete and very much in an 'alpha' state. Any further contributions here are welcome.

The 'make install' will install the module on your system. You may need root access to perform this task. If you install the module in a local directory (for example, by executing "perl Makefile.PL LIB=~/lib" in the above - see perldoc ExtUtils::MakeMaker for full details), you will need to ensure that the PERL5LIB environment variable is set to include the location, or add a line to your scripts explicitly naming the library location:

    use lib '/local/path/to/lib';

The metapage utility is a script designed to automate MetaText processing of files. It can traverse directory trees, identify modified files (by comparing the time stamp of the equivalent file in both "source" and "destination" directories), process them and direct the resulting output to the appropriate file location in the destination tree. One can think of metapage as the MetaText equivalent of the Unix make(1S) utility.

The installation process detailed above should install metapage in your system's perl 'installbin' directory (try perl '-V:installbin' to check this location). See the metapage documentation (perldoc metapage) for more information on configuring and using metapage.

USING THE METATEXT MODULE

To import and use the MetaText module the following line should appear in your Perl script:

    use Text::MetaText;

MetaText is implemented using object-oriented methods. A new MetaText object is created and initialised using the Text::MetaText->new() method. This returns a reference to a new MetaText object.

    my $mt = Text::MetaText->new;

A number of configuration options can be specified when creating a MetaText object. A reference to a hash array of options and their associated values should be passed as a parameter to the new() method.

    my $mt = Text::MetaText->new( { 'opt1' => 'val1', 'opt2' => 'val2' } );

The configuration options available are described in full below. All keywords are treated case-insensitively (i.e. "LIB", "lib" and "Lib" are all considered equal).

LIB

The INCLUDE directive causes the external file specified ("INCLUDE <file>") to be imported into the current document. The LIB option specifies one or more directories in which the file can be found. Multiple directories should be separated by a colon or comma. The current directory is also searched by default.

    my $mt = Text::MetaText->new( { LIB => "/tmp:/usr/metatext/lib" } );

CASE

The default behaviour for MetaText is to treat variable names and identifiers case insensitively. Thus, the following are treated identically:

    %% INCLUDE foo %%
    %% INCLUDE Foo %%
    %% INCLUDE FOO %%

When running with CASE sensitivity disabled, the MetaText processor converts all variable and symbol names to lower case.

Setting the CASE option to any non-zero value causes the document to be processed case sensitively.

    my $mt = Text::MetaText->new( { CASE => 1 } ); # case sensitive

Note that the configuration options described in this section are always treated case insensitively regardless of the CASE setting.

CASEVARS

When running in the default case-insensitive mode (CASE => 0), all variable names are folded to lower case. It is convenient to allow applications to specify some variables that are upper or mixed case to distinguish them from normal variables. The metapage utility uses this to define a number of 'system variables' that hold information about the file being processed: FILETIME, FILEMOD, FILEPATH, etc. By defining these as CASEVARS, the processor will attempt to differentiate them from normal variables by their case. Thus, the calling application can define variables that are guaranteed not to conflict with any user-defined variables (while CASE insensitive) and are also effectively read-only.

    my $mt = Text::MetaText->new( { 
        CASEVARS => [ 'AUTHOR', 'COPYRIGHT' ],
    });

    print $mt->process_file($file, {
        AUTHOR    => 'Andy Wardley',
        COPYRIGHT => '(C) Copyright Andy Wardley 1998',
    });

The input file:

    %% DEFINE copyright = "(C) Ima Plagiarist" %%
    %% COPYRIGHT %%
    %% copyright %%

produces the following output:

    (C) Copyright Andy Wardley 1998        # COPYRIGHT
    (C) Ima Plagiarist                     # copyright 

Note that CASEVARS can only apply to variables that are pre-defined (i.e. specified in the hash array that is passed to process_xxxx() as a second parameter). It is not possible to re-define a CASEVARS variable with a DEFINE directive because the variable name will always be folded to lower case (when CASE == 0). e.g.

    %% DEFINE COPYRIGHT = "..." %% 

is interpreted as:

    %% DEFINE copyright = "..." %%

It is recommended that such variables always be specified in UPPER CASE as a visual clue to indicate that they have a special meaning and behaviour.
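
The folding rule described above can be pictured with a small core-Perl sketch. It is an illustration only, not the module's own code: the helper names lookup_name() and define_name() and the %casevars table are hypothetical, and it assumes that a name keeps its case only when it exactly matches a CASEVARS entry.

```perl
use strict;
use warnings;

# Hypothetical table of the names registered through CASEVARS.
my %casevars = map { $_ => 1 } qw( AUTHOR COPYRIGHT );

# Name used when a directive reads a variable (CASE => 0): an exact
# CASEVARS match keeps its case, everything else is folded to lower case.
sub lookup_name {
    my ($name) = @_;
    return $casevars{$name} ? $name : lc $name;
}

# Name used when DEFINE writes a variable (CASE => 0): always folded,
# which is why a CASEVARS variable cannot be re-defined from a document.
sub define_name {
    my ($name) = @_;
    return lc $name;
}

my %vars = ( COPYRIGHT => '(C) Copyright Andy Wardley 1998' );  # pre-defined
$vars{ define_name('COPYRIGHT') } = '(C) Ima Plagiarist';       # via DEFINE

print $vars{ lookup_name('COPYRIGHT') }, "\n";   # the pre-defined value
print $vars{ lookup_name('copyright') }, "\n";   # the DEFINE'd value
```

Because the DEFINE path always folds the name, the upper case entry supplied by the calling application is never overwritten.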

MAGIC

MetaText directives are identified in the document being processed as text blocks surrounded by special "magic" identifiers. The default identifiers are a double percent string, "%%", for both opening and closing identifiers. Thus, a typical directive looks like:

    %% INCLUDE some/file %%

and may be embedded within other text:

    normal text, blah, blah %% INCLUDE some/file %% more normal text

The MAGIC option allows new identifiers to be defined. A single value assigned to MAGIC defines a token to be used for both opening and closing identifiers:

    my $mt = Text::MetaText->new( { MAGIC => '++' } );

    ++ INCLUDE file ++

A reference to an array providing two values (elements 0 and 1) indicates separate tokens to be used for opening and closing identifiers:

    my $mt = Text::MetaText->new( { MAGIC => [ '<!--', '-->' ] } );

    <!-- INCLUDE file -->
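
As a rough illustration of how one or two MAGIC tokens could be turned into a pattern that locates directives, the following core-Perl sketch may help. The function directive_regex() is hypothetical and is not part of Text::MetaText; the module's real parser may work quite differently.

```perl
use strict;
use warnings;

# Build a regex that captures the body of a directive. $magic is either a
# single token or a reference to an array of [ open, close ] tokens.
sub directive_regex {
    my ($magic) = @_;
    my ($open, $close) = ref($magic) eq 'ARRAY' ? @$magic : ($magic, $magic);
    # \Q...\E quotes the tokens so that '++' and friends match literally.
    return qr/\Q$open\E\s*(.*?)\s*\Q$close\E/s;
}

for my $case ( [ '%%',              'text %% INCLUDE some/file %% more' ],
               [ '++',              '++ INCLUDE file ++'                ],
               [ [ '<!--', '-->' ], '<!-- INCLUDE file -->'             ] ) {
    my ($magic, $text) = @$case;
    my $re = directive_regex($magic);
    print "$1\n" if $text =~ $re;    # prints the directive body
}
```

Quoting the tokens matters because characters such as '+' would otherwise be read as regex quantifiers.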

CHOMP

When MetaText processes a file it identifies directives and replaces them with the result of whatever magical process the directive represents (e.g. file contents for an INCLUDE, variable value for a SUBST, etc). Anything outside the directive, including newline characters, is left intact. Where a directive is defined that has no corresponding output (DEFINE, for example, which silently sets a variable value), the trailing newline characters can leave large tracts of blank lines in the output documents.

For example:

  line 1
  %% DEFINE f="foo" %%
  %% DEFINE b="bar" %%
  line 2 

Produces the following output:

  line 1


  line 2

This happens because the newline characters at the end of the second and third lines are left intact in the output text.

Setting CHOMP to any true value will remove any newline characters that appear immediately after a MetaText directive. Any characters coming between the directive and the newline, including whitespace, will override this behaviour and cause the intervening characters and newline to be output intact.

With CHOMP set, the following example demonstrates the behaviour:

  line 1
  %% DEFINE f="foo" %%
  %% DEFINE b="bar" %%<space>
  line 2

Produces the following output (Note that "<space>" is intended to represent a single space character, not the string "<space>" itself, although the effect would be identical):

  line 1
  <space>
  line 2
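
The difference between the two modes can be reproduced with a pair of substitutions. This is a minimal core-Perl sketch rather than the module's implementation: strip_directives() is a hypothetical helper that treats every directive as producing no output, which is true of DEFINE but not of directives in general.

```perl
use strict;
use warnings;

# Hypothetical helper: delete every %% ... %% directive (as if each one
# produced no output) and, when $chomp is true, also delete a newline that
# follows the closing %% immediately.
sub strip_directives {
    my ($text, $chomp) = @_;
    if ($chomp) {
        $text =~ s/%%.*?%%\n?//gs;
    }
    else {
        $text =~ s/%%.*?%%//gs;
    }
    return $text;
}

# Note the single space after the second directive.
my $doc = qq{line 1\n%% DEFINE f="foo" %%\n%% DEFINE b="bar" %% \nline 2\n};

print strip_directives($doc, 0);   # CHOMP off: both line endings survive
print "----\n";
print strip_directives($doc, 1);   # CHOMP on: only the " \n" line survives
```

The optional "\n" in the first pattern only matches when nothing, not even a space, separates the directive from the end of the line.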

TRIM

The TRIM configuration parameter, when set to any true value, causes the leading and trailing newlines (if present) within a defined BLOCK to be deleted. This behaviour is enabled by default. The following block definition:

  %% BLOCK camel %%
  The eye of the needle
  %% ENDBLOCK %%

would define the block as "The eye of the needle" rather than "\nThe eye of the needle\n". With TRIM set to 0, the newlines are left intact.

It is possible to override the TRIM behaviour by specifying the trim value as a parameter in a BLOCK definition directive:

  %% BLOCK trim %%
  ...content...
  %% ENDBLOCK %%

or conversely:

  %% BLOCK trim=0 %% 
  ...content...
  %% ENDBLOCK %%

FILTER

There may be times when you want to INCLUDE a file or element in a document but filter the contents in some way. You may wish to escape (i.e. prefix with a backslash '\') certain characters such as quotes, search for certain text and replace it with an alternative phrase, or perform some other post-processing task. The FILTER option allows you to define one or more code blocks that can be called as filter functions from an INCLUDE directive. Each code block is given a unique name to identify it and may have calling parameters (parenthesised and separated by commas) that can be specified as part of the directive. e.g.

    %% INCLUDE foo filter="slurp(prm1, prm2, ...)" %%

Two default filters are pre-defined: escape() and sr(). escape() takes as a parameter a perl-like regular expression pattern that indicates characters that should be 'escaped' (i.e. prefixed by a backslash '\') in the text. For example, to escape any of the character class ["'\] you would specify the filter as:

    %% INCLUDE foo filter="escape([\"'\\])" %%

The second filter, sr(), takes two arguments, a search string and a replace string. A simple substitution is made on the included text. e.g.

    %% INCLUDE foo filter="sr(spam, \"processed meat\")" %%

Note that quotes and other special metacharacters should be escaped within the filter string as shown in the two examples above.

Additional filters can be specified by passing a reference to a hash array that contains the name of the filter and the code itself in each key/value pair. Your filter function should be designed to accept the name of the function as the first parameter, followed by a line of text to be processed. Any additional parameters specified in the INCLUDE directive follow. The filter function is called for each line of an INCLUDE block and should return the modified text.

Example:

    my $mt = Text::MetaText->new( { 
        FILTER => {
            'xyzzy' => sub { 
                 my ($filtername, $text, @params) = @_;
                 $text = uc $text;   # do something here, e.g. upper-case it
                 return $text;       # return modified text
            }
        }
    } );

    %% INCLUDE file1 filter="xyzzy(...)" %%

A new FILTER definition will replace any existing filter with the same name.
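
The calling convention described above (filter name first, then the line of text, then any directive parameters) can be tried out without MetaText. The sketch below uses plain Perl code references; my_escape and my_sr are hypothetical stand-ins that mimic the documented behaviour of escape() and sr(), not the built-in filters themselves, and the search string is assumed to be matched literally.

```perl
use strict;
use warnings;

my %filter = (
    # Prefix every character matching the pattern in $params[0] with '\'.
    my_escape => sub {
        my ($filtername, $text, @params) = @_;
        my $pattern = $params[0];
        $text =~ s/($pattern)/\\$1/g;
        return $text;
    },
    # Replace each occurrence of $params[0] with $params[1]
    # (assumption: the search string is matched literally).
    my_sr => sub {
        my ($filtername, $text, @params) = @_;
        my ($search, $replace) = @params;
        $text =~ s/\Q$search\E/$replace/g;
        return $text;
    },
);

# Call a named filter on one line of text, as would happen per line.
sub apply_filter {
    my ($name, $text, @params) = @_;
    return $filter{$name}->($name, $text, @params);
}

print apply_filter('my_escape', q{say "hi"}, q{["']}), "\n";
print apply_filter('my_sr', 'spam and eggs', 'spam', 'processed meat'), "\n";
```

A filter written this way can be passed to new() through the FILTER option exactly as in the 'xyzzy' example.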

EXECUTE

The SUBST directive performs a simple substitution for the value of the named variable. In the example shown below, the entire directive, including the surrounding 'magic' tokens '%%', is replaced with the value of the variable 'foo':

    %% SUBST foo %%  (or more succinctly, %% foo %%)

If the named variable has not been defined, MetaText can interpret the variable as the name of an object method in the current class or as a function in the main package.

If the EXECUTE flag is set to any true value, the MetaText processor will interpret the variable as an object method and attempt to apply it to its own object instance (i.e. $self->$method(...)). If the method is not defined, the processor fails quietly (but see ROGUE below to see what can happen next). This allows classes to be derived from MetaText that implement methods that can be called (when EXECUTE == 1) as follows:

    %% method1 ... %%       # calls $self->method1(...);
    %% method2 ... %%       # calls $self->method2(...);

The text returned from the method is used as a replacement value for the directive.

The following pseudo-code example demonstrates this:

    package MyMetaText;
    @ISA = qw( Text::MetaText );

    sub foo { "This is method 'foo'" }  # simple return string
    sub bar { "This is method 'bar'" }  # "        "         "

    package main;

    my $mt = MyMetaText->new( { EXECUTE => 1 } );
    print $mt->process("myfile");

which, for the file 'myfile':

    %% foo %%
    %% bar %%

generates the following output:

    This is method 'foo'
    This is method 'bar'

If the EXECUTE flag is set to a value > 1 and the variable name does not correspond to a class method, the processor tries to interpret the variable as a function in the main package. Like the example above, the processor fails silently if the function is not defined (but see ROGUE below).

The following pseudo-code extract demonstrates this:

    my $mt = Text::MetaText->new( { EXECUTE => 2 } );
    print $mt->process("myfile");

    sub foo { "This is function 'foo'" }  # simple return string
    sub bar { "This is function 'bar'" }  # "        "         "

which, for the file 'myfile':

    %% foo %%
    %% bar %%

generates the following output:

    This is function 'foo'
    This is function 'bar'

Any additional parameters specified in the directive are passed to the class method or function as a hash array reference. The original parameter string is also passed. Note that the first parameter passed to class methods is the MetaText (or derivative) object reference itself.

Example:

    %% foo name="Seuss" title="Dr" %%

causes the equivalent of (when EXECUTE is any true value):

    $self->foo(                                  # implicit $self ref
        { 'name' => 'Seuss', 'title' => 'Dr' },  # hash ref of params
          'name="Seuss" title="Dr"' );           # parameter string

and/or (when EXECUTE > 1):

    &main::foo(
        { 'name' => 'Seuss', 'title' => 'Dr' },  # hash ref of params
          'name="Seuss" title="Dr"' );           # parameter string
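
The lookup order described in this section (variable, then object method, then function in the main package) can be sketched in core Perl using can(). The MyProcessor class and its resolve() method are hypothetical and only mirror the documented order; they are not how Text::MetaText is implemented internally.

```perl
use strict;
use warnings;

package MyProcessor;

sub new {
    my ($class, %opt) = @_;
    return bless { EXECUTE => $opt{EXECUTE} || 0, vars => {} }, $class;
}

sub foo { "This is method 'foo'" }

# Resolve a SUBST name: variable first, then method, then main:: function.
# Returns undef when nothing matches (the point where ROGUE would apply).
sub resolve {
    my ($self, $name, $params, $paramstr) = @_;
    return $self->{vars}{$name} if defined $self->{vars}{$name};

    if ($self->{EXECUTE} and my $method = $self->can($name)) {
        return $self->$method($params, $paramstr);
    }
    if ($self->{EXECUTE} > 1 and my $func = main->can($name)) {
        return $func->($params, $paramstr);   # no object reference passed
    }
    return;
}

package main;

sub bar { "This is function 'bar'" }

my $mt = MyProcessor->new(EXECUTE => 2);
print $mt->resolve('foo', {}, ''), "\n";   # found as a method
print $mt->resolve('bar', {}, ''), "\n";   # found as a main:: function
```

With EXECUTE set to 1 the second lookup is skipped, so 'bar' would remain unresolved.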

ROGUE

This configuration item determines how MetaText behaves when it encounters a directive it does not recognise. The ROGUE option may contain one or more of the ROGUE keywords separated by any non-word character. The keywords and their associated meanings are:

    warn    Issue a warning (via the ERROR function, if 
            specified) when the directive is encountered.

    delete  Delete any unrecognised directives.

The default behaviour is to silently leave any unrecognised directive in the processed text.

Example:

    my $mt = Text::MetaText->new( { ROGUE => "delete,warn" } );

DELIMITER

The DELIMITER item specifies the character or character sequence that is used to delimit lists of data. This is used, for example, by the "in" operator which can be used in evaluation conditions. e.g.

    %% INCLUDE hardenuf if="uid in abw,wrigley" %%

In this case, the condition evaluates true if the uid variable contains the value "abw" or "wrigley". The default delimiter character is a comma.

The example:

    my $mt = Text::MetaText->new( { DELIMITER => ":" } );

would thus correctly process:

    %% INCLUDE hardenuf if="uid in abw:wrigley" %%
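
A membership test of this kind amounts to splitting the list on the delimiter and comparing each item. The following core-Perl sketch shows the idea; in_list() is a hypothetical helper, and it assumes items are compared exactly, with no trimming of surrounding whitespace.

```perl
use strict;
use warnings;

# True when $value equals one of the items in the delimited $list.
sub in_list {
    my ($value, $list, $delimiter) = @_;
    $delimiter = ',' unless defined $delimiter;     # documented default
    my @items = split /\Q$delimiter\E/, $list;      # delimiter taken literally
    return scalar grep { $_ eq $value } @items;
}

for my $case ( [ 'abw',  'abw,wrigley'      ],
               [ 'abw',  'abw:wrigley', ':' ],
               [ 'root', 'abw,wrigley'      ] ) {
    my $verdict = in_list(@$case) ? 'included' : 'skipped';
    print "$verdict\n";
}
```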

ERROR

The ERROR configuration item allows an alternative error reporting function to be specified for error handling. The function should expect a printf() like calling convention.

Example:

    my $mt = Text::MetaText->new( { 
        ERROR => sub {
            my ($format, @params) = @_;
            printf(STDERR "ERROR: $format", @params);
        }
    } );

DEBUG

The DEBUG item allows an alternative debug function to be provided. The function should expect a printf() like calling convention, as per the ERROR option described above. The default DEBUG function sends debug messages to STDERR, prefixed by a debug string: 'D> '.

DEBUGLEVEL

The DEBUGLEVEL item specifies which, if any, of the debug messages are displayed during the operation of the MetaText object. Like the ROGUE option described above, the DEBUGLEVEL value should be constructed from one or more of the following keywords:

    none      no debugging information (default)
    info      general processing information
    config    MetaText object configuration items
    preproc   pre-processing phase
    process   processing phase
    postproc  post-processing phase
    data      additional data parameters in debug messages
    content   content of pre-processed INCLUDE blocks
    function  list function calls as executed
    evaluate  trace conditional evaluations
    test      used for any temporary test code
    all       all of the above (excluding "none", obviously)

Example:

    my $mt = Text::MetaText->new( { 
        DEBUGLEVEL => "preproc,process,data" 
    } );

MAXDEPTH

It is possible for MetaText to become stuck in an endless loop if a circular dependency exists between one or more files. For example:

    foo:
        %% INCLUDE bar %%

    bar:
        %% INCLUDE foo %%

To detect and avoid such conditions, MetaText allows files to be nested up to MAXDEPTH times. By default, this value is 32. If you are processing a file which has nested INCLUDE directives to a depth greater than 32 and MetaText returns with a "Maximum recursion exceeded" warning, set this configuration item to a higher value. e.g.

    my $mt = Text::MetaText->new( { MAXDEPTH => 42 } );
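
A depth counter is enough to break the circular foo/bar case shown above. The sketch below is a simplified, self-contained illustration in core Perl: the %file table stands in for files on disk, expand() is a hypothetical function, and the way the real module counts nesting levels may differ.

```perl
use strict;
use warnings;

my $MAXDEPTH = 32;                       # documented default
my %file = (                             # hypothetical in-memory "files"
    foo => "[foo] %% INCLUDE bar %%",
    bar => "[bar] %% INCLUDE foo %%",
);

# Expand INCLUDE directives recursively, giving up beyond $MAXDEPTH levels.
sub expand {
    my ($name, $depth) = @_;
    $depth ||= 1;
    if ($depth > $MAXDEPTH) {
        warn "Maximum recursion exceeded\n";
        return '';
    }
    my $text = $file{$name};
    return '' unless defined $text;
    $text =~ s/%%\s*INCLUDE\s+(\S+)\s*%%/expand($1, $depth + 1)/ge;
    return $text;
}

my $out    = expand('foo');
my $levels = () = $out =~ /\[/g;         # count the files that were expanded
print "stopped after $levels levels\n";
```

Without the depth check, the two files would include each other until Perl itself ran out of resources.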

PROCESSING TEXT FILES AND STRINGS

The MetaText methods for processing text files and strings are:

    process_file($file, ...);
    process_text($text, ...);

The process() method is also supported for backward compatibility with older versions of MetaText. The process() method simply calls process_file(), passing all arguments to it.

The process_file() method processes a text file interpreting any MetaText directives embedded within it. The first parameter should be the name of the file which should reside in the current working directory or in one of the directories specified in the LIB configuration option. A filename starting with a slash '/' or a period '.' is considered to be an absolute path or a path relative to the current working directory, respectively. In these cases, the LIB path is not searched. The optional second parameter may be a reference to a hash array containing a number of variable/value definitions that should be pre-defined when processing the file.

    print $mt->process_file("somefile", { name => "Fred" });

If "somefile" contains:

    Hello %% name %%

then the output generated would be:

    Hello Fred

Pre-defining variables in this way is equivalent to using the DEFINE directive (described below) at the start of the INCLUDE file:

    %% DEFINE name="Fred" %%
    Hello %% name %%

The process_file() function will continue until it reaches the end of the file or a line containing the pattern "__END__" or "__MTEND__" by itself ("END" or "MTEND" enclosed by double underscores, no other characters or whitespace on the line).

Note that the pre-processor (a private method which is called by process(), so feel free to forget all about it) does scan past any __END__ or __MTEND__ marker. In practice, that means you can define blocks after, but use them before, the terminating marker. e.g.

    Martin, %% INCLUDE taunt %%

    __MTEND__               << processor stops here and ignores 
                               everything following
    %% BLOCK taunt %%       << but the pre-processor has correctly 
    you Camper!                continued and parsed this block so that
    %% ENDBLOCK %%             it can be included in the main body

produces the output:

    Martin, you Camper!
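
The rule for recognising the terminating marker (the marker alone on its line, with no other characters or whitespace) can be expressed as an anchored pattern. This core-Perl sketch uses a hypothetical helper, body_before_marker(), and only models where the processor stops; it does not model the pre-processor pass that still reads the blocks after the marker.

```perl
use strict;
use warnings;

# Return the part of $text that precedes a line consisting solely of
# __END__ or __MTEND__; the whole text if there is no such line.
sub body_before_marker {
    my ($text) = @_;
    my @kept;
    for my $line (split /\n/, $text) {
        last if $line =~ /^__(?:MT)?END__$/;
        push @kept, $line;
    }
    return join "\n", @kept;
}

my $doc = join "\n",
    'Martin, %% INCLUDE taunt %%',
    '__MTEND__',
    '%% BLOCK taunt %%',
    'you Camper!',
    '%% ENDBLOCK %%';

print body_before_marker($doc), "\n";                              # first line only
print body_before_marker("a line with __END__ inside\nmore"), "\n"; # unchanged
```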

The process_file() function returns a string containing the processed file or block output. On error, a warning is generated (see "USING THE METATEXT MODULE") and undef is returned.

    my $output = $mt->process_file("myfile");
    print $output if defined $output;

The process_text() method is identical to process_file() except that the first parameter should represent a text string to be processed rather than the name of a file. All other parameters, behaviour and return values are the same as for process_file().

    my $text   = "%% INCLUDE header %% test! %% INCLUDE footer %%";
    my $output = $mt->process_text($text);
    print $output if defined $output;

METATEXT DIRECTIVES

A MetaText directive is a block of text in a file that is enclosed by the MAGIC identifiers (by default '%%'). A directive may span multiple lines and may include blank lines within it. Whitespace within a directive is generally ignored except where quoted as part of a specific value.

    %% DEFINE
       name    = Yorick
       age     = 30
       comment = "A fellow of infinite jest"
    %%

The first word of the directive indicates the directive type. Directives may be specified in upper, lower or mixed case, irrespective of the CASE sensitivity flag (which affects only variable names). The general convention is to specify the directive type in UPPER CASE to aid clarity.

The MetaText directives are:

DEFINE

Define the values for one or more variables

SUBST

Substitute the value of a named variable

INCLUDE

Process and include the contents of the named file or block

BLOCK

Define a named block which can be subsequently INCLUDE'd

ENDBLOCK

Marks the end of a BLOCK definition

To improve clarity and reduce excessive, unnecessary and altogether undesirable verbosity, a directive block that doesn't start with a recognised MetaText directive is assumed to be a 'SUBST' variable substitution. Thus,

    %% SUBST foo %%

can be written more succinctly as

    %% foo %%

When MetaText processes directives, it is effectively performing a "search and replace". The MetaText directive block is replaced with whatever text is appropriate for the directive specified. Generally speaking, MetaText does not alter any text content or formatting outside of directive blocks. The only exception to this rule is when CHOMP is turned on (see "USING THE METATEXT MODULE") and newlines immediately following a directive are subsequently deleted.

DEFINE

The DEFINE directive allows simple variables to be assigned values. Multiple variables may be defined in a single DEFINE directive.

    %% DEFINE 
       name  = Caliban
       quote = "that, when I waked, I cried to dream again."
    %%

It is also possible to use other variable values to DEFINE new variables. Use the '$' prefix to indicate a variable rather than an absolute value. If necessary, surround the variable name with braces '{' '}' to separate it from any surrounding text.

    %% DEFINE 
       server = www.kfs.org
       home   = /~abw/
    %%

    %% DEFINE
       homepage = http://$server${home}index.html
    %%

In the above example, the 'homepage' variable adopts the value 'http://www.kfs.org/~abw/index.html' which is constructed from the text strings 'http://' and 'index.html' and the values for $server and $home. Notice how the 'home' variable is enclosed in braces. Without these, the homepage variable would not be constructed correctly, looking instead for a variable called 'homeindex.html':

    %% DEFINE
       homepage = http://$server$homeindex.html   ## WRONG!
    %%

See " " below for further information.

Variables defined within a file or passed to the process_file() or process_text() functions as a hash array remain defined until the file or block is processed in entirety. Variable values will be inherited by any nested files or blocks INCLUDE'd into the file. Re-definitions of existing variables will persist within the file or block, masking any existing values, until the end of the file or block when the previous values will be restored.

The following example illustrates this:

    foo:
        Hello %% name %%              # name assumes any predefined value
        %% DEFINE name=tom %%
        Hello %% name %%              # name = 'tom'
        %% INCLUDE bar name='dick' %% # name = 'dick' for "INCLUDE bar"
        Hello %% name %%              # name = 'tom'

    bar:
        Hello %% name %%              # name = 'dick'
        %% DEFINE name='harry' %%     # name = 'harry'
        Hello %% name %%

Processing the file 'foo' as follows:

    print $mt->process_file('foo', { 'name' => 'nobody' });

produces the following output (with explanatory comments added for clarity):

    Hello nobody                      # value from process_file() hash
    Hello tom                         # from foo
    Hello dick                        # from bar
    Hello harry                       # re-defined in bar
    Hello tom                         # restored to previous value in foo
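
The inherit-then-restore behaviour falls out naturally if each file or block works on a private copy of its caller's variables. The core-Perl sketch below replays the foo/bar example that way. It is a model of the scoping rule only: the templates are written as lists of steps rather than parsed from text, and %template and render() are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical in-memory templates, written as lists of steps.
my %template = (
    foo => [
        [ 'subst',   'name' ],
        [ 'define',  name => 'tom' ],
        [ 'subst',   'name' ],
        [ 'include', 'bar', name => 'dick' ],
        [ 'subst',   'name' ],
    ],
    bar => [
        [ 'subst',  'name' ],
        [ 'define', name => 'harry' ],
        [ 'subst',  'name' ],
    ],
);

# Run a template with a private copy of the caller's variables, so that
# nothing defined here leaks back when the template ends.
sub render {
    my ($name, %vars) = @_;
    my @out;
    for my $step (@{ $template{$name} }) {
        my ($op, @args) = @$step;
        if ($op eq 'subst') {
            push @out, 'Hello ' . $vars{ $args[0] };
        }
        elsif ($op eq 'define') {
            my ($var, $value) = @args;
            $vars{$var} = $value;
        }
        elsif ($op eq 'include') {
            my ($target, %extra) = @args;
            push @out, render($target, %vars, %extra);
        }
    }
    return @out;
}

print "$_\n" for render('foo', name => 'nobody');
```

Since bar only ever changes its own copy, foo sees 'tom' again once the INCLUDE returns.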

SUBST

A SUBST directive performs a simple variable substitution. If the variable is defined, its value will be inserted in place of the directive.

Example:

    %% DEFINE place = World %%
    Hello %% SUBST place %%!

generates the following output:

    Hello World!

The SUBST keyword can be omitted for brevity. Thus "%% place %%" is processed identically to "%% SUBST place %%".

If the variable is undefined, the MetaText processor will, according to the value of the EXECUTE configuration value, try to execute a class method or a function in the main package with the same name as the SUBST variable. If EXECUTE is set to any true value, the processor will try to make a corresponding method call for the current object (that is, the current instantiation of the MetaText or derived class). If no such method exists and EXECUTE is set to any value greater than 1, the processor will then try to execute a function in the main package with the same name as the SUBST variable. In either case, the text returned from the method or function is included into the current block in place of the SUBST directive (non-text values are automatically coerced to text strings). If no variable, method or function of that name exists, the SUBST directive will either be deleted or left intact (and additionally, a warning may be issued), depending on the value of the ROGUE configuration item.

See EXTENDING METATEXT below for more information on deriving MetaText classes and using EXECUTE to extend the meta-language.

The "format" and "filter" options as described in the INCLUDE section below are applied to the processed SUBST result before being inserted back into the document.

Some MetaText variables have a special meaning. Unless specifically defined otherwise, the variable(s) listed below generate the following output:

    TIME    The current system time in seconds since the epoch, 
            00:00:00 Jan 1 1970.  Use the "format" option to 
            specify a time/date format.
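
For readers unfamiliar with strftime-style format strings, the sketch below shows the two outcomes described above, raw seconds and a formatted date. It uses strftime() from the core POSIX module purely as a stand-in; the prerequisites suggest MetaText itself relies on Date::Format, whose conversion codes may not match POSIX in every detail. The helper time_value() is hypothetical.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Format an epoch time with a strftime-style format string; with no
# format, return the raw number of seconds.
sub time_value {
    my ($epoch, $format) = @_;
    return $epoch unless defined $format;
    return strftime($format, localtime($epoch));
}

my $now = time;
print time_value($now), "\n";                       # seconds since the epoch
print time_value($now, '%d-%b-%Y %H:%M:%S'), "\n";  # formatted local time
```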

INCLUDE

The INCLUDE directive instructs MetaText to load and process the contents of the file or block specified. If the target is a file, it should reside in the current directory or a directory specified in the LIB configuration variable. Alternatively, the target may be a text block specified with BLOCK..ENDBLOCK directives (see below).

    %% INCLUDE chapter1 %%

The target may also be a variable name and should be prefixed with a '$' to identify it as such. On evaluation, the value of the named variable will be used as the target:

Example:

    %% DEFINE chapter=ch1 %%
    %% INCLUDE $chapter   %%  

is equivalent to:

    %% INCLUDE ch1 %%

Additional variables may be defined for substitution within the file:

    %% INCLUDE chapter2 bgcolor=#ffffff title="Chapter 2" %%

The contents of the file "chapter2":

    <html><head><title>%%title%%</title></head>
    <body bgcolor="%% bgcolor %%">
      ...
    </body>

would produce the output:

    <html><head><title>Chapter 2</title></head>
    <body bgcolor="#ffffff">
      ...
    </body>

Defining variables in this way is equivalent to using the DEFINE directive. Variables remain in scope for the lifetime of the file being processed and then revert to any previously defined values (or undefined). Any additional files processed via further INCLUDE directives within the file will also inherit any defined variable values.

Example:

    %% INCLUDE file1 name="World" %%

for the files:

    file1:                   # name => "World" from INCLUDE directive
        %% INCLUDE file2 %% 
  
    file2:                   # inherits "name" variable from file1
        %% INCLUDE file3 %%    

    file3:                   # inherits "name" variable from file2
        Hello %% name %%

produces the output:

    Hello World
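The inheritance rule can be pictured with a small stand-alone sketch. This is not the module's implementation: the %files hash, the include() function and the extra 'outer' entry are hypothetical, and only the simple "name=value" option form is handled. Each call receives its own copy of the variables, so a nested INCLUDE sees its caller's values while the caller's values are untouched when the call returns.

```perl
use strict;
use warnings;

# Hypothetical in-memory "files"; file1..file3 follow the example above.
my %files = (
    file1 => '%% INCLUDE file2 %%',
    file2 => '%% INCLUDE file3 %%',
    file3 => 'Hello %% name %%',
    outer => '%% INCLUDE file3 name=Inner %% and %% name %%',
);

sub include {
    my ($target, %vars) = @_;          # %vars is a private copy per call
    die "no such file: $target\n" unless exists $files{$target};
    my $text = $files{$target};

    # nested INCLUDEs see the current variables plus their own options
    $text =~ s{%%\s*INCLUDE\s+(\w+)((?:\s+\w+=\w+)*)\s*%%}{
        my ($name, $options) = ($1, $2);
        include($name, %vars, $options =~ /(\w+)=(\w+)/g);
    }ge;

    # remaining '%% name %%' tokens are simple substitutions
    $text =~ s{%%\s*(\w+)\s*%%}{ defined $vars{$1} ? $vars{$1} : '' }ge;
    return $text;
}

print include('file1', name => 'World'), "\n";   # Hello World
print include('outer', name => 'Outer'), "\n";   # Hello Inner and Outer
```

The second call shows the reverting behaviour: "name" is "Inner" only while file3 is being processed, and is "Outer" again afterwards.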

The output generated by INCLUDE and SUBST directives can be formatted using a printf-like template. The format string should be specified as a "format" option in the INCLUDE or SUBST directive. Each line of the included text is formatted and concatenated to create the final output. Within the format string, '%s' is used to represent the text.

For example, the 'author' element below could be used to display details of the author of the current document.

    author:
        File:   %% file %%
        Author: %% name %%
        Date:   %% date %%

For inclusion in an HTML document, the text can be encapsulated in HTML comment tags ("<!--" and "-->") using a format string:

    %% INCLUDE author 
       file   = index.html
       name   = "Andy Wardley" 
       date   = 19-Mar-1987
       format = "<!-- %-20s -->" 
    %%

Which produces the following output:

    <!-- File:   index.html   -->
    <!-- Author: Andy Wardley -->
    <!-- Date:   19-Mar-1987  -->

Note that the print format is applied to each line of the included text. To encapsulate the element as a whole, simply apply the formatting outside of the INCLUDE directive:

    <!--
       %% INCLUDE author
       ...
       %%
    -->

In these examples, the formatting is applied as if the replacement value/line is a character string. Any of the standard printf(3) format tokens can be used to coerce the value into a specific type.

There are a number of pre-defined format types:

    dquoted      # encloses each line in double quotes: "like this"
    squoted      # encloses each line in single quotes: 'like this'
    quoted       # same as "dquoted"

Examples:

    %% some_quote format=quoted %%
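As a rough illustration of line-by-line formatting, the sketch below applies a printf-style format to each line of a block with Perl's built-in sprintf. The format_lines() function and the %PREDEFINED table are hypothetical names, and the mapping of the pre-defined types to '"%s"' and "'%s'" is an assumption based on the descriptions above rather than a copy of the module's internals.

```perl
use strict;
use warnings;

# Assumed expansions of the pre-defined format names.
my %PREDEFINED = (
    dquoted => '"%s"',
    squoted => q('%s'),
    quoted  => '"%s"',
);

# Hypothetical helper: format every line of $text and rejoin the result.
sub format_lines {
    my ($format, $text) = @_;
    $format = $PREDEFINED{$format} if exists $PREDEFINED{$format};
    return join '', map { sprintf($format, $_) . "\n" } split /\n/, $text;
}

my $author = "File:   index.html\nAuthor: Andy Wardley\nDate:   19-Mar-1987\n";
print format_lines('<!-- %-20s -->', $author);
print format_lines('squoted', "like this\n");
```

A field width of 20 is used here because the longest line is 20 characters wide; a narrower width pads nothing, since printf never truncates a string to fit the field.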

As mentioned in the SUBST section above, the TIME variable is used to represent the current system time in seconds since the epoch (see time(2)). The "format" option can also be employed to represent such values in a more user-friendly format. Any format string that does not contain a '%s' token is assumed to be a time-based value and is formatted using the time2str() function from the Date::Format module (distributed as part of the TimeDate package).

Example:

    The date is %% TIME format="%d-%b-%y" %%

Generates:

    The date is 19-Mar-98

See perldoc Date::Format for information on the formatting characters available.

The pragmatic token '%P' can be added to a format to override this behaviour and force the use of printf(). The '%P' token is otherwise ignored.

Example:

    %% DEFINE foo=123456789  %%
    %% foo format="%d-%b-%y" %%  # "day-month-year" using time2str
    %% foo format="%d"       %%  # "day" using time2str
    %% foo format="%P%d"     %%  # decimal value using printf
    %% foo format="%s"       %%  # string value using printf

Generates:

    29-Nov-73
    29
    123456789
    123456789
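The choice between time formatting and printf formatting can be sketched as follows. This is only an approximation of the rule described above: format_value() is a hypothetical name, and POSIX::strftime with gmtime (UTC) stands in for time2str() from Date::Format so that the example needs nothing beyond core Perl.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper implementing the documented decision rule.
sub format_value {
    my ($value, $format) = @_;

    # '%P' forces printf-style handling and is then dropped
    my $force_printf = ($format =~ s/%P//g);

    if ($force_printf || $format =~ /%s/) {
        return sprintf($format, $value);
    }

    # no '%s' and no '%P': treat the value as seconds since the epoch
    return strftime($format, gmtime($value));
}

print format_value(123456789, '%Y-%m-%d'), "\n";   # a date, via strftime
print format_value(123456789, '%d'),       "\n";   # day of month, via strftime
print format_value(123456789, '%P%d'),     "\n";   # decimal value, via sprintf
print format_value(123456789, '%s'),       "\n";   # string value, via sprintf
```

Month and day names produced by '%b' and similar tokens depend on the locale in strftime, which is why the sketch sticks to numeric tokens.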

Text that is inserted with an INCLUDE or SUBST directive can also be filtered. Two default filters are provided: 'escape', which escapes (prefixes with a backslash '\') certain characters, and 'sr', which performs simple search and replace actions. Other filters may be added with the FILTER option when creating the object (see the FILTER section in "USING THE METATEXT MODULE", above).

Like the 'format' option, output filters work on a line of text at a time. Any parameters required for the filter can be specified in parentheses after the filter name. The 'escape' filter expects a perl-style character class indicating the characters to escape. The 'sr' filter expects two parameters, a search pattern and a replacement string, separated by a comma. Note that parameters that include embedded spaces should be quoted. The quote characters themselves must also be escaped as they already form part of a quoted string (the filter text). (This way of representing parameters is admittedly far from ideal and may be improved in a future version.)

Example:

    %% DEFINE text="Madam I'm Adam" %%
    %% SUBST  text filter="escape(['])"               %%
    %% SUBST  text filter="sr(Adam, \"Frank Bough\")" %%

Generates:

    Madam I\'m Adam
    Madam I'm Frank Bough
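In plain Perl, the effect of the two default filters on a single line is roughly what the following sketch does. The function names escape_filter() and sr_filter() are hypothetical, and whether the real 'sr' filter replaces every match or only the first is an assumption here (the sketch replaces all of them).

```perl
use strict;
use warnings;

# 'escape': prefix every character matching the class with a backslash
sub escape_filter {
    my ($line, $class) = @_;
    $line =~ s/($class)/\\$1/g;
    return $line;
}

# 'sr': replace each match of the search pattern with the replacement
sub sr_filter {
    my ($line, $search, $replace) = @_;
    $line =~ s/$search/$replace/g;
    return $line;
}

print escape_filter("Madam I'm Adam", "[']"), "\n";
print sr_filter("Madam I'm Adam", 'Adam', 'Frank Bough'), "\n";
```

Note that the search is case-sensitive, so the "adam" inside "Madam" is left alone.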

Conditional tests can be applied to INCLUDE blocks to determine if the block should be evaluated or ignored. Variables and absolute values can be used and can be evaluated in the following ways:

    a == b       # a is equal to b
    a != b       # a is not equal to b
    a >  b       # a is greater than b
    a <  b       # a is less than b
    a => b       # a is greater than or equal to b
    a <= b       # a is less than or equal to b
    a =~ b       # a matches the perl regex pattern b
    a !~ b       # a does not match the perl regex pattern b
    a in b,c,d   # a appears in the list b, c, d (see DELIMITER)

The items on the right of the evaluations can be absolute values or variable names which should be prefixed by a '$'. The items on the left of the evaluation are assumed to be variable names. There is no need to prefix these with a '$', but you can if you choose.

The single equality, "a = b", is treated identically to a double equality "a == b" although the two traditionally represent different things (the first, an assignment, the second, a comparison). In this context, I consider the former usage confusing and would recommend use of the latter at all times.

Variables without any comparison operator or operand are tested for a true/false value.

Examples:

    %% INCLUDE foo if="name==fred"        %%
    %% INCLUDE foo if="$name==fred"       %%  # equivalent to above
    %% INCLUDE foo if="name==$goodguy"    %%
    %% INCLUDE foo if="hour > 10"         %%
    %% INCLUDE foo if="tonk =~ [Ss]pl?at" %%
    %% INCLUDE foo if="camper"            %%
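The left/right rule for operands can be made concrete with a small evaluator for a single test. This is a sketch under several assumptions, not the module's parser: test_condition() is a hypothetical name, only a subset of the operators is handled, "==" and "!=" are treated as string comparisons, ">" and "<" as numeric ones, and "in" always splits on commas.

```perl
use strict;
use warnings;

sub test_condition {
    my ($cond, $vars) = @_;

    # the right side is a literal unless it carries a '$' prefix
    my $value_of = sub {
        my $item = shift;
        return $item unless $item =~ /^\$(\w+)$/;
        return defined $vars->{$1} ? $vars->{$1} : '';
    };

    # a lone variable name is a plain true/false test
    if ($cond =~ /^\s*\$?(\w+)\s*$/) {
        return $vars->{$1} ? 1 : 0;
    }

    # the left side is always a variable name; its '$' prefix is optional
    my ($name, $op, $rhs) =
        $cond =~ /^\s*\$?(\w+)\s*(==|!=|=~|!~|>|<|\bin\b)\s*(.*?)\s*$/
        or die "cannot parse condition: $cond\n";

    my $left = defined $vars->{$name} ? $vars->{$name} : '';

    if ($op eq 'in') {
        my $hits = grep { $left eq $value_of->($_) } split /\s*,\s*/, $rhs;
        return $hits ? 1 : 0;
    }

    my $right = $value_of->($rhs);
    return ($left eq $right   ? 1 : 0) if $op eq '==';
    return ($left ne $right   ? 1 : 0) if $op eq '!=';
    return ($left >  $right   ? 1 : 0) if $op eq '>';
    return ($left <  $right   ? 1 : 0) if $op eq '<';
    return ($left =~ /$right/ ? 1 : 0) if $op eq '=~';
    return ($left !~ /$right/ ? 1 : 0);
}

my %vars = (name => 'fred', goodguy => 'fred', hour => 11, tonk => 'Splat');
print test_condition('name==fred',        \%vars), "\n";
print test_condition('name==$goodguy',    \%vars), "\n";
print test_condition('tonk =~ [Ss]pl?at', \%vars), "\n";
print test_condition('camper',            \%vars), "\n";
```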

Multiple conditions can be joined using the following boolean operators:

    a && b       # condition 'a' and 'b' 
    a || b       # condition 'a' or  'b' 
    a ^  b       # condition 'a' xor 'b'
    a and b      # same as "a && b" but with lower precedence
    a or  b      # same as "a || b" but with lower precedence
    a xor b      # same as "a ^  b" but with lower precedence

Conditional equations are evaluated left to right and may include parentheses to explicitly set precedence.

Examples:

    %% INCLUDE tonk     
       if="hardenuf && uid in abw,wrigley"           
    %%
    %% INCLUDE tapestry 
       if="(girly && studly < 1) || uid == neilb"    
    %%
    %% INCLUDE tapestry 
       if="($girly && $studly < 1) || $uid == neilb" 
    %%

Note that the third example above is identical in meaning to the second, but explicitly prefixes variable names with '$'. This is optional for elements on the left hand side of comparison operators, but mandatory for those on the right that might otherwise be interpreted as absolute values.

BLOCK..ENDBLOCK

In some cases it is desirable to have a block of text available to be inserted via INCLUDE without having to define it in an external file. The BLOCK..ENDBLOCK directives allow this.

A BLOCK directive with a unique identifier marks the start of a block definition. The block continues, including any valid MetaText directives, until an ENDBLOCK directive is found.

A BLOCK..ENDBLOCK definition may appear anywhere in the file. It is in fact possible to INCLUDE the block before it has been defined as long as the block definition resides in the same file.

Processing of a file stops when it encounters an __END__ or __MTEND__ marker on a line by itself. Blocks can be defined after this marker even though the contents of the file after the marker are ignored by the processor.

    # include a block defined later
    %% INCLUDE greeting name=Prospero %%

    __END__
    %% BLOCK greeting %%
    Hello %% name %%
    %% ENDBLOCK %%

This produces the following output:

    # include a block defined later
    Hello Prospero

Additional variable definitions specified in an INCLUDE directive will be applied to blocks just as they would to external files.

By default, BLOCK definitions are "trimmed". That is, the leading and trailing newlines (if present) in the block definition are deleted. This allows blocks to be defined:

    %% BLOCK example1 %%
    Like this!
    %% ENDBLOCK %%

and not:

    %% BLOCK example2 %%Like this!%% ENDBLOCK %%

This behaviour can be disabled by specifying a TRIM configuration parameter with a zero value. See the TRIM option, mentioned above. A "trim" or "trim=0" parameter can be added to a block to override the behaviour for that BLOCK definition only. e.g.

    %% BLOCK sig trim=0 %%
    --
    This is my .signature
    %% ENDBLOCK %%
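The trimming rule amounts to something like the sketch below. The trim_block() name is hypothetical, and the reading that exactly one leading and one trailing newline is removed is an assumption drawn from the description above.

```perl
use strict;
use warnings;

# Hypothetical helper: strip one leading and one trailing newline
# from a block body when trimming is enabled.
sub trim_block {
    my ($text, $trim) = @_;
    return $text unless $trim;
    $text =~ s/\A\n//;
    $text =~ s/\n\z//;
    return $text;
}

print '[', trim_block("\nLike this!\n", 1), "]\n";   # trimmed
print '[', trim_block("\nLike this!\n", 0), "]\n";   # left as written
```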

A BLOCK..ENDBLOCK definition that appears in the main part of a document (i.e. before, or in the absence of an __END__ line) will not appear in the processed output. A simple "print" flag added to the BLOCK directive overrides this behaviour, causing a copy of the BLOCK to appear in its place:

    %% DEFINE name=Caliban %%

    %% BLOCK greeting print %%
    Hello %% name %%
    %% ENDBLOCK %%

    %% INCLUDE greeting name="Prospero" %%

produces the following output:

    Hello Caliban

    Hello Prospero

Conditions ("if" and "unless") can be applied to BLOCK directives, but they affect how and when the BLOCK itself is printed, rather than determining if the block gets defined or not. Conditionals have no effect on BLOCK directives that do not include a "print" flag.

It is possible to pre-declare blocks for subsequent inclusion by using the public declare() method. The first parameter should be a text string containing the content of the block. The second parameter is the block name by which it should consequently be known. The content string is parsed and an internal block definition is stored.

Example:

    $mt->declare("<title>%%title%%</title>", 'html_title');

This can subsequently be used as if the block was defined in any other way:

    %% INCLUDE html_title
       title = "My test page"
    %%

It is also possible to pass an array reference to declare() as the content parameter. In this context, it is assumed that the array is a pre-parsed list of text strings or Text::MetaText::Directive (or derivative) references which should be installed as the block definition for the named block. This process assumes an understanding of the MetaText directive structure and internal symbol table entries. If you don't know why you would want to do this, then the chances are that you don't need to do it. "Experts only" in other words.

VARIABLE INTERPOLATION

MetaText allows variable values to be interpolated into directive operands and other variable values. This is useful for style-sheet processing and other applications where a particular view required can be encoded in a variable and interpolated by the processor.

For example, the file 'mousey.html':

    %% INCLUDE $style/header %%

    The cat sat on the mouse.

    %% INCLUDE $style/footer %%

can be processed in the following ways to create customised output:

    $t1 = $mt->process_file('mousey.html', {'style' => 'text'});
    $t2 = $mt->process_file('mousey.html', {'style' => 'graphics'});

Variable interpolation is also useful for building up complex variables based on sub-elements:

    %% DEFINE root=/user/abw %%

    %% DEFINE 
       docs   = $root/docs
       images = $root/images 
    %%

Note though, that there is no guaranteed order of definition for multiple variables within a single DEFINE directive. The following is INCORRECT as there is no guarantee that 'base' will be defined before 'complex'.

    %% DEFINE 
       base    = /here
       complex = $base/and/there    # WRONG! $base may not be defined yet
    %%

In such circumstances, it is necessary to define variables in separate directives.

    %% DEFINE base=/here %%
    %% DEFINE complex=$base/and/there %%

Where necessary, variable names may be enclosed in braces to delimit them from surrounding text:

    %% DEFINE
       homepage = http://$server${home}index.html
    %%
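The role of the braces can be seen in a short stand-alone sketch of interpolation. The interpolate() function and the sample values for 'server' and 'home' are hypothetical; undefined variables are replaced by an empty string here, which is an assumption rather than documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical helper: expand $name and ${name} references from a hash.
sub interpolate {
    my ($text, $vars) = @_;
    $text =~ s{\$(?:\{(\w+)\}|(\w+))}{
        my $value = $vars->{ defined $1 ? $1 : $2 };
        defined $value ? $value : '';
    }ge;
    return $text;
}

my %vars = (server => 'www.example.org', home => '/abw/', root => '/user/abw');

print interpolate('http://$server${home}index.html', \%vars), "\n";
print interpolate('$root/docs', \%vars), "\n";
```

Without the braces, '$homeindex' would be read as a single variable name, because the name extends over every following word character.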

EXTENDING METATEXT

MetaText may be used as a base class for deriving other text processing modules. Any member function of a derived class can be called directly as a MetaText directive. See the EXECUTE configuration option for more details.

Pseudo-code example:

    package MyMetaText;
    use Text::MetaText;
    @ISA = qw( Text::MetaText );

    # define a new derived class method, get_name()
    sub get_name {
        my $self   = shift;
        my $params = shift;

        # return name from an ID hash, for example
        $self->{ PEOPLE }->{ $params->{'id'} } || 'nobody';
    }

    package main;

    # use the new derived class
    my $mmt = MyMetaText->new({ EXECUTE => 1 });

    # process 'myfile'
    print $mmt->process_file('myfile');

which, for a sample file, 'myfile':

    %% get_name id=foo %%
    %% get_name id=bar %%

is equivalent to:

    print $mmt->get_name({ 'id' => 'foo' }), "\n";
    print $mmt->get_name({ 'id' => 'bar' }), "\n";
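The idea of treating a directive keyword as a method name can be sketched without MetaText at all. The MyProcessor class and its execute() method below are hypothetical and only illustrate name-based dispatch with can(); they make no claim about how the EXECUTE option is implemented internally.

```perl
use strict;
use warnings;

{
    package MyProcessor;    # hypothetical class, not part of Text::MetaText

    sub new {
        my ($class, %args) = @_;
        return bless { PEOPLE => $args{people} || {} }, $class;
    }

    sub get_name {
        my ($self, $params) = @_;
        return $self->{PEOPLE}{ $params->{id} } || 'nobody';
    }

    # look the directive keyword up as a method and call it with the params
    sub execute {
        my ($self, $keyword, $params) = @_;
        my $method = $self->can($keyword)
            or die "no such method: $keyword\n";
        return $self->$method($params);
    }
}

my $proc = MyProcessor->new(people => { foo => 'Fred' });
print $proc->execute('get_name', { id => 'foo' }), "\n";   # Fred
print $proc->execute('get_name', { id => 'bar' }), "\n";   # nobody
```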

Alternatively, a simple calling script can be written that defines functions that themselves can be called from within a document:

    my $mt = Text::MetaText->new( { EXECUTE => 2 } );

    print $mt->process_file("myfile");

    sub get_name {
        my $params = shift;
        $global_people->{ $params->{'id'} } || 'nobody';
    }

WARNINGS AND ERRORS

The following list indicates warning or error messages that MetaText can generate and their associated meanings.

"CASEVARS option expects an array reference"

The configuration hash array passed to Text::MetaText->new() contained a CASEVARS entry that did not contain an array reference. See "USING THE METATEXT MODULE".

"Closing directive tag missing in %s"

A MetaText directive was found that was not terminated before the end of the file. e.g. %% INCLUDE something ... The processor attempts to compensate, but check your source files and add any missing MAGIC tokens.

"Directive constructor failed: %s"

The MetaText parser detected a failed attempt to construct a Directive object. This error should only happen in cases where a derived Directive class has been used (which should imply you know what you're doing and what the error means). The specific Directive constructor error is appended to the error message.

"Invalid configuration parameter: %s"

An invalid configuration parameter was identified in the hash array passed to Text::MetaText->new(). See "USING THE METATEXT MODULE".

"Invalid debug/error function"

The debug or error handling routine specified for the ERROR or DEBUG configuration options was not a code reference. See the ERROR and/or DEBUG sections for more details.

"Invalid debug option: %s"

A token was specified for the DEBUGLEVEL configuration item which was invalid. See the DEBUGLEVEL section for a complete list of valid tokens.

"Invalid factory object"

A FACTORY configuration item was specified which did not contain a reference to a Text::MetaText::Factory object, or derivative.

"Invalid input reference passed to declare()"

The declare() method was called and the first parameter was not a reference to an ARRAY or a text string. These are (currently) the only two valid input types.

"Invalid rogue option: %s"

A token was specified for the ROGUE configuration item which was invalid. See the ROGUE section for a complete list of valid tokens.

"Maximum recursion exceeded"

The processed file had multiple INCLUDE directives that nested to a depth greater than MAXDEPTH (default: 32). Set MAXDEPTH higher to avoid this problem, or check your files for circular dependencies.

"Missing directive keyword"

A MetaText directive was identified that had no keyword or other content. e.g. %% %%

"Parse error at %s line %s: %s"

The pre-processor was unable to correctly parse a block or file. The error message reports the file name and line number (or 'text string' in the case of process_text()) and the specific error details.

"Text::MetaText->new expects a hash array reference"

The new() method can accept a reference to a hash array as the first parameter which contains configuration variables and values. This error is generated if the parameter is not a hash array reference.

"Unrecognise directive: %s"

An internal error that should never happen. The pre-processor has identified a directive type that the processor then failed to recognise.

"Unrecognised token: %s"

A %% SUBST <variable> %% or %% <variable> %% directive was found for which there was no corresponding <variable> defined. This warning is only generated when the 'warn' token is set for the ROGUE option.

"Unmatched parenthesis: %s"

A conditional evaluation ("if" or "unless") for a directive is missing a closing parenthesis. e.g. %% INCLUDE foobar if="(foo && bar || baz" %%

"%s: non-existant or invalid filter"

An INCLUDE or SUBST directive included a "filter" option that refers to a non-existent filter. e.g. %% INCLUDE foo filter=nosuchfilter() %%

"%s: no such block defined"

The _process($symbol) method could not process the named symbol because it was not defined in the symbol table.

AUTHOR

Andy Wardley <abw@kfs.org>

See also:

    http://www.kfs.org/~abw/

My thanks extend to the people who have used and tested MetaText. In particular, the members of the Peritas Online team; Simon Matthews, Simon Millns and Gareth Scott; who brutally tested the software over a period of many months and provided valuable feedback, ideas and of course, bug reports. Deep respect is also due to the members of the SAS Team at Canon Research Centre Europe Ltd; Tim "TimNix" O'Donoghue, Neil "NeilOS" Bowers, Ave "AveSki" Wrigley, Martin "MarTeX" Portman, Channing "Chango" Walton and Gareth "Gazola" Rees. Don't go changing now... :-)

I welcome bug reports, enhancement suggestions, comments, criticisms (hopefully constructive) and patches related to MetaText. I would appreciate hearing from you if you find MetaText particularly useful or indeed if it doesn't do what you want, for whatever reason. Hopefully this will help me make MetaText help you more.

It pains me to say that MetaText comes without guarantee or warranty of suitability for any purpose whatsoever. That doesn't mean it doesn't do anything good, but just that I don't want some scrupulous old git to sue me because they thought I implied it did something it doesn't. <sigh>

Text::MetaText is based on a template processing language I developed while working at Peritas Ltd. I am indebted to Peritas for allowing me to use this work as the basis for MetaText and to release it to the public domain. I am also pleased to note that Canon Research Centre Europe supports the Perl community and the Free Software ideology in general.

REVISION

$Revision: 0.22 $

COPYRIGHT

Copyright (c) 1996-1998 Andy Wardley. All Rights Reserved.

This program is free software; you can redistribute it and/or modify it under the terms of the Perl Artistic License.

SEE ALSO

For more information, see the accompanying documentation and support files:

    README    Text based version of this module documentation.
    Changes   Somewhat verbose list of per-version changes.
    Todo      Known bugs and possible future enhancements.
    Features  A summary of MetaText features and brief comparison to 
              other perl 'template' modules.

For information about the metapage utility, consult the specific documentation:

    perldoc metapage
  or 
    man metapage

For more information about the author and other Perl development work:

    http://www.kfs.org/~abw/
    http://www.kfs.org/~abw/perl/
    http://www.cre.canon.co.uk/perl/

For more information about Perl in general:

    http://www.perl.com/