NAME

XML::DT::Sequence - Down Translator (XML::DT) for sequence XMLs

VERSION

Version 0.01

SYNOPSIS

A lot of XML files nowadays are just catalogues: simple sequences of small chunks that repeat over and over. These files can get enormous, which makes DOM processing hard. SAX processing is interesting, but not always the best approach.

This module splits the XML file into a header, a sequence of repeating blocks, and a footer. Each of these chunks can then be processed as a DOM, using XML::DT technology.

    use XML::DT::Sequence;

    my $dt = XML::DT::Sequence->new();

    $dt->process("file.xml",
                 -tag => 'item',
                 -head => sub {
                      my ($self, $xml) = @_;
                      # do something with $xml
                 },
                 -body => {
                        item => sub {
                            # XML::DT like handler
                        }
                 },
                 -foot => sub {
                      my ($self, $xml) = @_;
                      # do something with $xml
                 },
                );
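The chunking idea can be pictured with a few lines of core Perl. This is only a sketch, not the module's implementation: split_sequence is a hypothetical helper, it works on an in-memory string rather than a file, and it assumes the repeating tag is never nested and that its closing tag is written without extra whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::DT::Sequence): split an XML string
# into the text before the first <$tag>, the list of <$tag> elements, and
# the text after the last one. Assumes <$tag> is never nested.
sub split_sequence {
    my ($xml, $tag) = @_;
    my $close = "</$tag>";
    my ($head, $foot, @items) = ('', '');

    # Read the string through an in-memory filehandle, using the closing
    # tag as the record separator: one record per repeated element.
    open my $in, '<', \$xml or die "Cannot open in-memory handle: $!";
    local $/ = $close;

    while (defined(my $record = <$in>)) {
        # A record that does not end with the closing tag is the footer.
        if ($record !~ /\Q$close\E\z/) {
            $foot = $record;
            last;
        }
        # Separate whatever precedes the opening tag from the element itself.
        my ($before, $element) = $record =~ /\A(.*?)(<\Q$tag\E(?=[\s>\/]).*)\z/s
            or die "Closing tag without a matching <$tag>";
        $head = $before unless @items;   # only the first record carries the header
        push @items, $element;
    }
    close $in;
    return ($head, \@items, $foot);
}

my ($head, $items, $foot) =
    split_sequence('<c><item>a</item><item>b</item></c>', 'item');
print scalar(@$items), " items\n";   # prints "2 items"
```

A real streaming processor would hand each element to a handler as soon as it is read instead of collecting them all in an array, which is what keeps memory use low on very large files.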

EXPLANATION

There are four options, of which only two are mandatory: -tag and -body. -tag is the name of the element that repeats in the XML file and that you want to process one at a time. -body is the handler that processes each of these elements.

-head is the handler that processes the XML appearing before the first instance of the repeating element, and -foot is the handler that processes the XML appearing after the last instance of the repeating element.

Each of these handlers can be either a code reference, which receives the XML::DT::Sequence object and the XML string, or a hash reference with XML::DT handlers to process each XML snippet.

Note that when processing the header or the footer, the XML is incomplete, and the parser can recover in weird ways.

The process method returns a hash reference with three keys: -head holds the return value of the -head handler, -foot holds the return value of the -foot handler, and -body holds the number of elements of the sequence that were processed.
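As an illustration of reading that return value, the following sketch formats the three keys into a one-line summary. summarize_result is a hypothetical helper, the hash shown is a hand-written stand-in rather than real output, and treating a missing or undefined key as "handler returned nothing" is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the hash reference returned by process()
# into a one-line summary. Only the three documented keys are read.
sub summarize_result {
    my ($result) = @_;
    my $count = $result->{-body} // 0;      # number of elements processed
    my @parts = ("$count element(s) processed");
    for my $key (qw(-head -foot)) {
        my $value = $result->{$key};
        push @parts, defined $value
            ? "$key returned '$value'"
            : "$key returned nothing";      # assumption: undef means no value
    }
    return join '; ', @parts;
}

# Stand-in for a real return value, for illustration only.
my $result = { -head => 'Catalogue 2012', -body => 3, -foot => undef };
print summarize_result($result), "\n";
```

Note that the keys keep their leading dash, so they are accessed as $result->{-body}, not $result->{body}.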

METHODS

new

Constructor.

process

Processor. See the explanation above.

break

Forces the process to finish. Useful when you have processed enough elements. Note that if you break the process, the -foot code will not be run.

If you are using a code reference as a handler, call break on the first argument (the reference to the object). If you are using an XML::DT handler, $u holds the object, so just call break on it.
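A minimal sketch of the code-reference case follows. limited_body is a hypothetical factory that builds a -body handler which calls break after a given number of elements. Demo::Processor is a stand-in written only so the example runs on its own; it mimics the documented contract (the handler receives the object and the XML string, and break stops the loop) and is not the module's code.

```perl
use strict;
use warnings;

# Hypothetical factory: wrap a per-element callback in a -body code
# reference that asks the processor to stop after $limit elements.
sub limited_body {
    my ($limit, $callback) = @_;
    my $seen = 0;
    return sub {
        my ($self, $xml) = @_;     # object and XML string, as documented
        $callback->($xml);
        $self->break if ++$seen >= $limit;
        return;
    };
}

# Minimal stand-in for the processor, for demonstration only. With the real
# module the handler would be passed as: -body => limited_body(2, sub { ... })
package Demo::Processor {
    sub new   { my ($class) = @_; return bless { break => 0 }, $class }
    sub break { $_[0]{break} = 1 }
    sub run {
        my ($self, $handler, @chunks) = @_;
        for my $chunk (@chunks) {
            $handler->($self, $chunk);
            last if $self->{break};    # stop as soon as break was requested
        }
    }
}

my @got;
Demo::Processor->new->run(
    limited_body(2, sub { push @got, shift }),
    '<item>a</item>', '<item>b</item>', '<item>c</item>',
);
print scalar(@got), " elements handled\n";   # prints "2 elements handled"
```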

AUTHOR

Alberto Simões, <ambs at cpan.org>

BUGS

Please report any bugs or feature requests to bug-xml-dt-sequence at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=XML-DT-Sequence. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc XML::DT::Sequence

KNOWN BUGS AND LIMITATIONS

  • Spaced tags

    It is unusual, but XML allows whitespace inside element tags, for instance between the element name and the closing > of an end tag (as in </item >). Tags written with such extra whitespace are NOT supported.

  • Multiple usage tags

    If the same tag is used in different levels of the XML hierarchy, it is likely that the implemented algorithm will not work.
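The second limitation can be seen with a small experiment. The sketch below assumes a strategy that cuts the document after every closing tag; that is an assumption made for illustration, not a description of the module's code, and both cut_at_closing_tag and is_balanced are hypothetical helpers. With a nested use of the tag, the first piece ends at the inner closing tag and leaves the outer element open.

```perl
use strict;
use warnings;

# Hypothetical check: does a fragment contain as many opening <$tag>
# tags as closing ones? (Self-closing tags are not considered.)
sub is_balanced {
    my ($fragment, $tag) = @_;
    my $opens  = () = $fragment =~ /<\Q$tag\E(?=[\s>\/])/g;
    my $closes = () = $fragment =~ /<\/\Q$tag\E>/g;
    return $opens == $closes;
}

# Cut a document after every closing tag (an assumed strategy).
sub cut_at_closing_tag {
    my ($xml, $tag) = @_;
    return split /(?<=<\/\Q$tag\E>)/, $xml;
}

my $nested = '<list><item>outer<item>inner</item></item></list>';
my @pieces = cut_at_closing_tag($nested, 'item');

# The first piece is '<list><item>outer<item>inner</item>':
# two opening tags, one closing tag.
print is_balanced($pieces[0], 'item') ? "balanced\n" : "unbalanced\n";   # prints "unbalanced"
```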

ACKNOWLEDGEMENTS

LICENSE AND COPYRIGHT

Copyright 2012 Alberto Simões.

This program is free software; you can redistribute it and/or modify it under the terms of either: the GNU General Public License as published by the Free Software Foundation; or the Artistic License.

See http://dev.perl.org/licenses/ for more information.