NAME

XML::Struct - Represent XML as data structure preserving element order

VERSION

version 0.15

SYNOPSIS

    use XML::Struct qw(readXML writeXML simpleXML removeXMLAttr);

    my $xml = readXML( "input.xml" );
    # [ root => { xmlns => 'http://example.org/' }, [ '!', [ x => {}, [42] ] ] ]

    my $doc = writeXML( $xml );
    # <?xml version="1.0" encoding="UTF-8"?>
    # <root xmlns="http://example.org/">!<x>42</x></root>

    my $simple = simpleXML( $xml, root => 'record' );
    # { record => { xmlns => 'http://example.org/', x => 42 } }

    my $xml2 = removeXMLAttr($xml);
    # [ root => [ '!', [ x => [42] ] ] ]

DESCRIPTION

XML::Struct implements a mapping between XML and Perl data structures. By default, the mapping preserves element order, so it is also suitable for "document-oriented" XML. In short, an XML element is represented as an array reference with three parts:

   [ $name => \%attributes, \@children ]

This data structure corresponds to the abstract data model of MicroXML, a simplified subset of XML.
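Because every element has the same three-part shape, the structure can be walked with a short recursive function. The following is a minimal sketch in plain Perl that does not load XML::Struct at all; the helper name text_content is hypothetical and only serves to illustrate the layout described above:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::Struct): concatenate all text found
# below an element given as [ $name => \%attributes, \@children ].
sub text_content {
    my ($element) = @_;
    my ($name, $attributes, $children) = @$element;
    my $text = '';
    for my $child (@{ $children || [] }) {
        # A child is either a nested element (array reference) or plain text.
        $text .= ref($child) ? text_content($child) : $child;
    }
    return $text;
}

# The structure shown in the SYNOPSIS.
my $xml = [ root => { xmlns => 'http://example.org/' }, [ '!', [ x => {}, [42] ] ] ];

my ($name, $attributes, $children) = @$xml;
my $text = text_content($xml);
print "$name: $text\n";    # prints "root: !42"
```

Child order is kept because children live in an array, which is what makes the format usable for mixed content.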

If your XML documents don't contain relevant attributes, you can also choose to map to this format:

   [ $name => \@children ]

Both parsing (with XML::Struct::Reader or function readXML) and serializing (with XML::Struct::Writer or function writeXML) are fully based on XML::LibXML, so performance is better than that of XML::Simple and similar to that of XML::LibXML::Simple.

MODULES

XML::Struct::Reader

Parse XML as a stream into XML data structures.

XML::Struct::Writer

Write XML data structures to XML streams for serializing, SAX processing, or creating a DOM object.

FUNCTIONS

The following functions are exported on request:

readXML( $source [, %options ] )

Read an XML document with XML::Struct::Reader. The type of source (string, filename, URL, IO Handle...) is detected automatically. Options not known to XML::Struct::Reader are passed to XML::LibXML::Reader.

writeXML( $xml [, %options ] )

Write an XML document/element with XML::Struct::Writer.
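To make concrete what serializing this structure involves, here is a minimal sketch in plain Perl. It is not how XML::Struct::Writer works (the writer is based on XML::LibXML and also handles the XML declaration, encodings, and namespaces); the helper names escape_xml and to_xml_string are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: escape characters that are special in XML text
# and attribute values.
sub escape_xml {
    my ($string) = @_;
    $string =~ s/&/&amp;/g;
    $string =~ s/</&lt;/g;
    $string =~ s/>/&gt;/g;
    $string =~ s/"/&quot;/g;
    return $string;
}

# Hypothetical helper: serialize [ $name => \%attributes, \@children ]
# to a string, without an XML declaration.
sub to_xml_string {
    my ($element) = @_;
    my ($name, $attributes, $children) = @$element;
    my $xml = "<$name";
    for my $key (sort keys %$attributes) {
        $xml .= sprintf ' %s="%s"', $key, escape_xml($attributes->{$key});
    }
    return $xml . '/>' unless @$children;    # empty element
    $xml .= '>';
    for my $child (@$children) {
        $xml .= ref($child) ? to_xml_string($child) : escape_xml($child);
    }
    return $xml . "</$name>";
}

my $xml = [ root => { xmlns => 'http://example.org/' }, [ '!', [ x => {}, [42] ] ] ];
my $string = to_xml_string($xml);
print "$string\n";
# prints: <root xmlns="http://example.org/">!<x>42</x></root>
```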

simpleXML( $element [, %options ] )

Transform an XML document/element into simple key-value format as known from XML::Simple: Attributes and child elements are treated as hash keys with their content as value. Text elements without attributes are converted to text and empty elements without attributes are converted to empty hashes. The following options are supported:

root

Keep the root element (just as option KeepRoot in XML::Simple). In addition one can set the name of the root element if a non-numeric value is passed.

depth

Only transform to a given depth. See XML::Struct::Reader for documentation.

All elements below the given depth are returned unmodified (not cloned) as array elements:

    my $data = simpleXML($xml, depth => 2);
    my $content = $data->{x}->{y}; # array or scalar (if existing)

attributes

Assume input without attributes if set to a false value. The special value remove first removes the attributes, so the following three calls are equivalent:

    my @children = (['a'],['b']);

    simpleXML( [ $name => \@children ], attributes => 0 );
    simpleXML( removeXMLAttr( [ $name => \%attributes, \@children ] ), attributes => 0 );
    simpleXML( [ $name => \%attributes, \@children ], attributes => 'remove' );

Key attributes (KeyAttr in XML::Simple) and the option ForceArray are not supported yet.

removeXMLAttr( $element )

Transform XML structure with attributes to XML structure without attributes. The function does not modify the passed element but creates a modified copy.
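The transformation can be pictured as a recursive copy that drops the second part of every element. The sketch below is plain Perl and is not the module's implementation; the helper name strip_attributes is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's code): return a copy of an element
# [ $name => \%attributes, \@children ] in the form [ $name => \@children ].
sub strip_attributes {
    my ($element) = @_;
    my ($name, $attributes, $children) = @$element;
    my @copy;
    for my $child (@{ $children || [] }) {
        # Recurse into child elements, copy text nodes as they are.
        push @copy, ref($child) ? strip_attributes($child) : $child;
    }
    return [ $name => \@copy ];
}

my $xml   = [ root => { xmlns => 'http://example.org/' }, [ '!', [ x => {}, [42] ] ] ];
my $plain = strip_attributes($xml);
# $plain is [ root => [ '!', [ x => [42] ] ] ], $xml is left untouched
```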

EXAMPLE

To give an example, with XML::Struct::Reader, this XML document:

    <root>
      <foo>text</foo>
      <bar key="value">
        text
        <doz/>
      </bar>
    </root>

is transformed to this structure:

    [
      "root", { }, [
        [ "foo", { }, [ "text" ] ],
        [ "bar", { key => "value" }, [
          "text",
          [ "doz", { }, [ ] ]
        ] ]
      ]
    ]

This module also supports a simple key-value (aka "data-oriented") format, as used by XML::Simple. With option simple (or function simpleXML) the document given above would be transformed to this structure:

    {
        foo => "text",
        bar => {
            key => "value",
            doz => {}
        }
    }
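The rules behind this result can be sketched in a few lines of plain Perl. This is an illustration under simplifying assumptions, not the module's implementation: the helper name to_simple is hypothetical, children are assumed to be given as array references, text inside mixed content is dropped, and the root and depth options are ignored.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's code): map an element
# [ $name => \%attributes, \@children ] to the simple key-value form.
sub to_simple {
    my ($element) = @_;
    my (undef, $attributes, $children) = @$element;
    my @elements = grep { ref($_) } @$children;

    # Text-only element without attributes: return the text itself.
    if (!%$attributes && !@elements && @$children) {
        return join '', @$children;
    }

    # Otherwise attributes and child elements become hash keys.
    my %hash = %$attributes;
    for my $child (@elements) {
        my ($key, $value) = ($child->[0], to_simple($child));
        if (!exists $hash{$key}) {
            $hash{$key} = $value;
        } elsif (ref($hash{$key}) eq 'ARRAY') {
            push @{ $hash{$key} }, $value;          # third and later repeats
        } else {
            $hash{$key} = [ $hash{$key}, $value ];  # second occurrence
        }
    }
    return \%hash;
}

my $xml = [
    root => {}, [
        [ foo => {}, ['text'] ],
        [ bar => { key => 'value' }, [ 'text', [ doz => {}, [] ] ] ],
    ]
];
my $simple = to_simple($xml);
# { foo => "text", bar => { key => "value", doz => {} } }
```

Collecting repeated child elements into an array follows the usual XML::Simple convention; it is included here only to keep the sketch from silently overwriting values.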

SEE ALSO

This module was first created to be used in Catmandu::XML and turned out to also become a replacement for XML::Simple.

See XML::Twig for another popular and powerful module for stream-based processing of XML documents.

See XML::Smart, XML::Hash::LX, XML::Parser::Style::ETree, XML::Fast, and XML::Structured for different representations of XML data as data structures (feel free to implement converters from/to XML::Struct).

See XML::GenericJSON for an (outdated and incomplete) attempt to capture more parts of XML Infoset in another data structure.

See JSONx for a kind of reverse direction (JSON in XML).

AUTHOR

Jakob Voß

COPYRIGHT AND LICENSE

This software is copyright (c) 2013 by Jakob Voß.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.