NAME

Test::Formats::XML - Test::Formats specialization that tests XML content

VERSION

Version 0.11

SYNOPSIS

    use Test::Formats::XML;

    our $schema  = (<schema/*.xsd>)[0];
    our $relaxng = (<relaxng/*.rng>)[0];
    our $sgmldtd = (<dtd/*.dtd>)[0];

    our @schema_tests  = <schema/*.xml>;
    our @relaxng_tests = <relaxng/*.xml>;
    our @sgmldtd_tests = <dtd/*.xml>;

    plan tests => (1 + @schema_tests + @relaxng_tests + @sgmldtd_tests);

    is_well_formed_xml($schema, "Test that the XML Schema parses");

    is_valid_against_xmlschema($schema, $_) for (@schema_tests);

    is_valid_against_relaxng($relaxng, $_)  for (@relaxng_tests);

    is_valid_against_sgmldtd($sgmldtd, $_)  for (@sgmldtd_tests);

DESCRIPTION

Test::Formats::XML is a specialization module for Test::Formats that provides test functions for evaluating XML content against XML Schema, RelaxNG schemas and Document Type Definitions (DTDs).

This module is built on the framework provided by Test::Builder (see Test::Builder and Test::More), and works under the TAP-based Test::Harness system. It can be used directly as the only testing module a given suite uses, or it can be used in conjunction with other harness-friendly modules.

The module uses the XML::LibXML module from CPAN, and provides the user with simple-to-use wrappers around the various forms of validation provided by XML::LibXML::Schema, XML::LibXML::RelaxNG and XML::LibXML::Dtd.

FUNCTIONS

This section covers only the functions specific to this module. All functionality provided by Test::Builder/Test::More is accessible here as well. See those modules for more information.

Parameters

All of the functions described in the next section take the same sequence of parameters, with the same meaning. These are:

$schema

For all of the test routines, the first argument represents the schema used to validate the document (the second argument). The type of schema matters to the function being called: if you pass a DTD to the RelaxNG test, it will not automatically re-route you to the DTD test. The value of this argument may be any of the following:

pre-parsed XML::LibXML::* object

The easiest form to deal with, of course, is when the user is generous enough to compile the schema themselves with the appropriate XML::LibXML::* class and pass the resulting object. The object is then used directly. This also saves slightly on processing and overhead time when you intend to use the same schema for a large number of tests.

open filehandle

If the argument is a filehandle, the contents are read and the resulting document parsed. None of the schema-related classes can (currently) take a filehandle directly, so this is offered to the user as a matter of convenience. If you are re-using the same file across multiple tests, you can use the seek function to move the filehandle back to the start of the file and re-use the existing filehandle as well.

scalar reference

If the argument is a scalar reference, it is presumed to contain the text of the schema and is passed to the parser as such.

string (scalar)

If the argument is a (non-reference) scalar, it is treated as a string. It is first tested with some regular expressions to see if the content looks like a schema of the given type. If it does not look like the text of a schema, it is passed to the constructor method of the relevant schema-class as a location of the schema. The particular XML::LibXML::* class will try to read it and parse it into an object.

Any of the forms that have to read and/or parse the schema text are wrapped in eval blocks. If they fail for any reason, the test reports a failure and the text of the error is output as diagnostic information.

The tests done to match plain text data to one of the specific schema-types are somewhat limited, and are not guaranteed to work in every case. Generally, it is best to use the straight string parameter only for filenames. If you have the schema in string-form, consider passing it as a scalar reference.
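The four schema forms described above can be sketched as follows. This is a minimal illustration, not code from the module's distribution: the tiny RelaxNG schema, the temporary files and the write_file helper are all assumptions made for the example, and it presumes Test::Formats::XML exports plan as the SYNOPSIS suggests.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use XML::LibXML;
use Test::Formats::XML;

# Illustrative fixtures only; these are not shipped with the module.
my $rng_text = <<'RNG';
<element name="greeting" xmlns="http://relaxng.org/ns/structure/1.0">
  <text/>
</element>
RNG

my $dir      = tempdir(CLEANUP => 1);
my $rng_file = "$dir/greeting.rng";
my $doc_file = "$dir/greeting.xml";
write_file($rng_file, $rng_text);
write_file($doc_file, "<greeting>hello</greeting>\n");

plan tests => 5;

# 1. Pre-parsed schema object: compiled once, reusable across many tests.
my $compiled = XML::LibXML::RelaxNG->new(location => $rng_file);
is_valid_against_relaxng($compiled, $doc_file, 'schema as object');

# 2. Open filehandle: the schema text is read from the handle.
open my $fh, '<', $rng_file or die "Cannot open $rng_file: $!";
is_valid_against_relaxng($fh, $doc_file, 'schema as filehandle');

# Rewind before re-using the same handle for another test.
seek $fh, 0, 0;
is_valid_against_relaxng($fh, $doc_file, 'schema as rewound filehandle');
close $fh;

# 3. Scalar reference: unambiguously "this is schema text".
is_valid_against_relaxng(\$rng_text, $doc_file, 'schema as scalar reference');

# 4. Plain string: safest when it is a filename.
is_valid_against_relaxng($rng_file, $doc_file, 'schema as filename');

# Hypothetical helper that writes $content to $path.
sub write_file {
    my ($path, $content) = @_;
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} $content;
    close $out or die "Cannot close $path: $!";
    return;
}
```

Passing schema text by reference avoids relying on the content-sniffing step altogether, which is why the plain string is used here only for the filename case.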

$document

This argument represents the document being tested against the schema provided in the first argument. As with the schema, you have a choice of ways in which to pass this:

pre-parsed XML document

If the user has pre-parsed the document, the resulting XML::LibXML::Document object can be passed in as this parameter. This can be useful if the test suite wishes to distinguish document well-formedness (the document is parseable without errors) from document validity (the parsed document conforms to a given schema).

open filehandle

Unlike the schema-related classes, the XML::LibXML objects have a method to parse directly from a filehandle. If the parameter passed in appears to be an open filehandle, it is passed to this method in order to obtain a document object.

scalar reference

If the parameter is a scalar reference, it is assumed to be a reference to the document in memory. The de-referenced scalar is passed to the parse_string method of an XML::LibXML object, to result in a document object.

string (scalar)

Lastly, if the value is a (non-reference) scalar, it is first examined to see if it looks like an XML document. Regular expressions are used to see if either a DOCTYPE declaration or an XML document declaration (the initial <?xml ...?> line that most XML documents have) is present. If neither of these is found, the string is presumed to be a file-name and is passed to the parse_file method of XML::LibXML. If the string looks like XML content after all, it is passed to the parse_string method of that class.

As with the schema argument, any of the forms that have to directly handle reading a file and/or parsing the document itself are wrapped in eval blocks to catch any fatal errors. If such errors occur, the test reports a failure and the error is given as diagnostic information for the test.
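To make the plain-string case concrete, here is a rough sketch of the kind of check described above. The looks_like_xml helper and its regular expressions are hypothetical and written for this illustration; the module's actual patterns may differ.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Test::Formats::XML): guess whether a
# plain string holds XML text or should be treated as a filename.
sub looks_like_xml {
    my ($string) = @_;
    return 1 if $string =~ /\A\s*<\?xml\b/;   # leading XML declaration
    return 1 if $string =~ /<!DOCTYPE\s/;     # DOCTYPE declaration
    return 0;                                 # otherwise assume a filename
}

for my $candidate (
    qq{<?xml version="1.0"?><greeting/>},
    qq{<!DOCTYPE greeting SYSTEM "greeting.dtd"><greeting/>},
    'dtd/sample.xml',
    '<greeting/>',
) {
    printf "%-55s => %s\n", $candidate,
        looks_like_xml($candidate) ? 'XML text' : 'filename';
}
```

Note that the last candidate is XML content with neither declaration, so a check of this kind would treat it as a filename. Passing such content as a scalar reference sidesteps the guess.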

$name

This argument is the only optional parameter of the three. If passed, it should be a string identifying the test. It is displayed in the TAP output stream, just as the name parameter to more-familiar test functions (ok(), like(), etc.) is used.
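Putting the document forms together with the optional name, a test file might look like the sketch below. The schema, the sample document and the temporary file are assumptions made for this example; a real temporary file is used for the filehandle case to stay close to the behavior described above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::LibXML;
use Test::Formats::XML;

# Illustrative schema and document; neither comes from the module itself.
my $schema = XML::LibXML::RelaxNG->new(string => <<'RNG');
<element name="greeting" xmlns="http://relaxng.org/ns/structure/1.0">
  <text/>
</element>
RNG
my $xml = qq{<?xml version="1.0"?>\n<greeting>hello</greeting>\n};

# Write the document to a temporary file for the file-based forms.
my ($tmp, $doc_file) = tempfile(SUFFIX => '.xml', UNLINK => 1);
print {$tmp} $xml;
close $tmp or die "Cannot close $doc_file: $!";

plan tests => 5;

# Pre-parsed document: well-formedness was already settled by the parse.
my $dom = XML::LibXML->new->parse_string($xml);
is_valid_against_rng($schema, $dom, 'document as XML::LibXML::Document');

# Open filehandle.
open my $fh, '<', $doc_file or die "Cannot open $doc_file: $!";
is_valid_against_rng($schema, $fh, 'document as filehandle');
close $fh;

# Scalar reference to the document text.
is_valid_against_rng($schema, \$xml, 'document as scalar reference');

# Plain string that starts with an XML declaration, so it is parsed as text.
is_valid_against_rng($schema, $xml, 'document as XML string');

# Plain string without XML markers, so it is treated as a filename.
is_valid_against_rng($schema, $doc_file, 'document as filename');
```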

Tests

The following test functions are provided:

is_valid_against_relaxng($schema, $document, $name)
is_valid_against_rng($schema, $document, $name)

This pair test a document against a RelaxNG schema. For more on the RelaxNG syntax, see http://relaxng.org/. The second name is provided as a shorter alias for the full name.

is_valid_against_sgmldtd($schema, $document, $name)
is_valid_against_dtd($schema, $document, $name)

This pair test a document against a DTD. The name is slightly misleading, as both SGML and XML DTDs are supported by XML::LibXML::Dtd. There are some minor syntactical differences between SGML DTDs and XML DTDs, but you can use whichever is best for your needs. The second name is a shorter alias provided for convenience.
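As a small illustration of the DTD pair, the sketch below compiles a one-line DTD with XML::LibXML::Dtd->parse_string and runs the same document through both names. The DTD and the document are made up for the example.

```perl
use strict;
use warnings;
use XML::LibXML;
use Test::Formats::XML;

# Illustrative DTD, compiled once into an XML::LibXML::Dtd object.
my $dtd = XML::LibXML::Dtd->parse_string(<<'DTD');
<!ELEMENT greeting (#PCDATA)>
DTD

# Illustrative document, passed by reference so it is never taken for a filename.
my $xml = qq{<?xml version="1.0"?>\n<greeting>hello</greeting>\n};

plan tests => 2;

is_valid_against_sgmldtd($dtd, \$xml, 'validates under the full name');
is_valid_against_dtd($dtd, \$xml, 'validates under the short alias');
```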

is_valid_against_xmlschema($schema, $document, $name)
is_valid_against_xsd($schema, $document, $name)

This pair validate documents against XML Schemas. See http://www.w3.org/TR/xmlschema-0/ and http://www.w3.org/TR/xmlschema-1/ for more about using XML Schema to define document structure. The second name is provided as a shorter alias.

is_well_formed_xml($document, $name)
xml_parses_ok($document, $name)

This pair test that an XML document is well-formed, which is to say that it parses without errors. This is not the same as validation. A passing test here says nothing about the validity of the XML content itself, only that all tags are properly closed, etc. Note that these functions do not take a schema argument, only the XML document and (optionally) the test name.

These tests are a convenience, as the same basic functionality can be found in other test-related modules on CPAN. However, as long as XML::LibXML is already being used, there is no harm in making things easier for the user by providing them here and cutting down on the list of dependencies.

The second name is provided as a shorter alias, and for aesthetic purposes for those who prefer tests that follow the *ok() naming pattern.

All of the tests capture any fatal errors thrown by the underlying XML::LibXML classes used, and report them as diagnostic data to accompany a failed test report. See the diag method of Test::Builder for more information.
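The capture-and-report pattern described here can be sketched with Test::Builder directly. The parses_or_diag helper below is hypothetical and is not how Test::Formats::XML is necessarily implemented; it only shows the general shape of wrapping a parse in eval and routing the error text to diag.

```perl
use strict;
use warnings;
use Test::Builder;
use XML::LibXML;

my $Test = Test::Builder->new;

# Hypothetical helper: parse $xml_text inside eval, report the outcome as
# a test result, and emit the parser's error text as a diagnostic on failure.
sub parses_or_diag {
    my ($xml_text, $name) = @_;

    my $doc   = eval { XML::LibXML->new->parse_string($xml_text) };
    my $error = $@;

    my $ok = $Test->ok(defined $doc, $name);
    $Test->diag("XML parse error: $error") unless $ok;

    return $ok;
}

parses_or_diag('<greeting>hello</greeting>', 'sample document parses');
$Test->done_testing;
```

Given malformed input such as an unclosed tag, the same helper would record a failed test and print the parser's message through diag instead of letting the exception end the test script.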

BUGS

Please report any bugs or feature requests to bug-test-formats at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Test-Formats. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

ACKNOWLEDGMENTS

The original idea for this stemmed from a blog post on http://use.perl.org by Curtis "Ovid" Poe. He proffered some sample code, based on recent work he'd done, that validated against a RelaxNG schema. I generalized it for all the validation types that XML::LibXML offers, and expanded the idea to cover more general cases of structured, formatted text.

COPYRIGHT AND LICENSE

Copyright (c) 2008 Randy J. Ray, all rights reserved.

This module and the code within are released under the terms of the Artistic License 2.0 (http://www.opensource.org/licenses/artistic-license-2.0.php). This code may be redistributed under either the Artistic License or the GNU Lesser General Public License (LGPL) version 2.1 (http://www.opensource.org/licenses/lgpl-license.php).

SEE ALSO

Test::Formats, Test::More, Test::Builder, XML::LibXML::Schema, XML::LibXML::RelaxNG, XML::LibXML::Dtd

AUTHOR

Randy J. Ray, <rjray at blackperl.com>