NAME

Apache2::ModXml2 - makes mod_xml2 functionality available to Perl modules

SYNOPSIS

  use XML::LibXML;

  use Apache2::ModXml2 qw(:all);

  # The usual filter stuff is omitted
  # ...

    for (my $b = $bb->first; $b; $b = $bb->next($b)) {
      if ($b->type->name eq 'NODE') {
        # This is the most important interface function
        my $node = Apache2::ModXml2::unwrap_node($b);
        if (defined($node)) {
          # The nodes are not connected but still know their document
          # (see init_doc before actually using it)
          my $doc = $node->ownerDocument;
          if ($node->isa('XML::LibXML::Element')) {
            my $end = Apache2::ModXml2::end_bucket($b);
            if ($end) {
                # If it knows the end bucket, it is a start bucket
                $node->setAttribute('class', 'mod_xml2');
            }
          }
        }
      }
    }

DESCRIPTION

Apache2::ModXml2 is a wrapper for the mod_xml2 API. It allows you to write filters that modify the outgoing XML/HTML by modifying XML::LibXML nodes.

The Apache module mod_xml2 implements the "node" filter. This filter runs the libxml2 parser on the outgoing XML/HTML and wraps the SAX events into a special bucket type, called node buckets.

Subsequent filters then modify the outgoing document by modifying the node bucket stream. With Apache2::ModXml2 this can be done in Perl.

Node buckets hold a libxml2 node. ModXml2 wraps it into an XML::LibXML::Node that can be used with the set of functions provided by XML::LibXML.

Note that for element nodes the start and end buckets hold the same node, and the start bucket already knows its end bucket. Although the node continues to exist until the end bucket is reached, modifying it may be pointless once the start bucket has been passed on by the filter: the node may already have been sent over the network.

Apache2::ModXml2 also offers XPath callbacks that get called on matches of (very) simple XPath selectors. Unlike the simpler ModXml2 functions, these can do DOM tree manipulation, since the matches are passed in as trees.

FUNCTIONS

BASIC FUNCTIONS

wrap_node
  wrap_node($alloc, $node, $r_log);

Returns an APR::Bucket object created by wrapping $node into a mod_xml2 node bucket, using the APR::BucketAllocator $alloc.

$r_log is a request object to use for logging.

unwrap_node
  unwrap_node($b);

Returns the XML::LibXML::Node held by the APR::Bucket $b given as a parameter.

end_bucket
  end_bucket($b);

Returns the associated end bucket if $b is a start element bucket, and undef otherwise.

make_start_bucket
  make_start_bucket($b);

Turns the bucket $b into a start element bucket and returns the end bucket created in the process.

init_doc
  init_doc($doc, $pool);

This function is needed because wrapping the document node (e.g. by calling $node->ownerDocument) will delete the document when the Perl node goes out of scope.

So if the document is used, this function needs to be called with the document and a pool, so that the node deletion is appended to that pool as a cleanup.
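As an illustration, a minimal sketch of how this might be used inside a filter. The helper name document_of is hypothetical, and the choice of the request pool ($f->r->pool) as the pool is an assumption. How often init_doc should be called for the same document is not covered here, so treat the per-call use below as an assumption as well.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch the owner document of an unwrapped node and
# register its deletion as a pool cleanup via init_doc.
sub document_of {
    my ($f, $node) = @_;

    my $doc = $node->ownerDocument;
    return unless defined $doc;

    # Loaded at run time so that this sketch compiles outside of mod_perl.
    require Apache2::ModXml2;

    # Assumption: the request pool is a suitable lifetime for the document.
    Apache2::ModXml2::init_doc($doc, $f->r->pool);

    return $doc;
}
```

Inside the bucket loop of a filter this would be called as document_of($f, $node) instead of calling $node->ownerDocument directly.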

XPATH FILTERING

mod_xml2 implements functions for a filter that builds a DOM subtree each time a streaming XPath expression (called a pattern by libxml2) matches. The tree is passed to a callback function and decomposed into single nodes again afterwards. The streaming XPath expressions come from a very limited XPath subset, described here: http://www.w3.org/TR/xmlschema-1/#Selector

xpath_filter_init
  xpath_filter_init($f, $xpath, $namespaces, \&transform);

To create a streaming XPath filter, this function needs to be called from filter init. Its return value is suitable for returning from filter init.

Every time $xpath matches, &transform is called with the subtree's root node as a parameter. The namespaces needed to compile the pattern are passed as a list [URI, prefix, ...]. Be aware that these prefixes are just aliases for use in the pattern. They do not need to coincide with the prefixes in the document.
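The following is a minimal sketch of such a callback together with a matching init function. The names mark_external_links and link_filter_init, the pattern and the prefix x are hypothetical. It also assumes that the namespace list is passed as an array reference, that the callback is passed as a code reference, and that the callback's return value is not significant.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical callback: receives the root node of the matched subtree.
# Since the match arrives as a tree, attributes and children can be changed.
sub mark_external_links {
    my ($root) = @_;

    my $href = $root->getAttribute('href');
    return unless defined $href && $href =~ m{^https?://};

    $root->setAttribute('rel', 'external');
    $root->appendTextNode(' (external)');
    return;
}

# Hypothetical filter init function, to be called with the filter object $f.
sub link_filter_init {
    my ($f) = @_;

    # Loaded at run time so that this sketch compiles outside of mod_perl.
    require Apache2::ModXml2;

    return Apache2::ModXml2::xpath_filter_init(
        $f,
        './/x:a',                                 # simple selector-style pattern
        [ 'http://www.w3.org/1999/xhtml', 'x' ],  # [URI, prefix, ...]
        \&mark_external_links,
    );
}
```

The prefix x only has to match the one used in the pattern. The callback itself depends only on XML::LibXML, so it can be tried outside of Apache on the document element of any parsed XHTML fragment.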

xpath_filter
  xpath_filter($f, $bb);

This is simply the workhorse filter function.

EXPORT

None by default.

SEE ALSO

The concept for this implementation:

http://www.heute-morgen.de/site/03_Web_Tech/50_Building_an_Apache_XML_Rewriting_Stack.shtml

The mod_xml2 apache module:

http://www.heute-morgen.de/modules/mod_xml2/

AUTHOR

Joachim Zobel, <jz-2012@heute-morgen.de>

COPYRIGHT AND LICENSE

Copyright (C) 2012 by Joachim Zobel

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.10.1 or, at your option, any later version of Perl 5 you may have available.