
HTML::Validator - HTML validation using nsgmls

    use HTML::Validator;

    $doc = HTML::Validator->new($file);
    $doc->validate;
    print "Document is valid\n" if $doc->is_valid;

This module can be used to validate HTML (or SGML) documents. For the validation itself, it uses nsgmls and a set of document type definition files (aka DTDs).
HTML::Validator uses libwww-perl to validate remote files.
The file or URL will be used as the document to validate. This method will be called implicitly if the constructor is called with an argument.
Returns the document type.
The return value is undefined if no filename has been passed to the object via the constructor or the open method.
If the file has not been retrieved yet, it is retrieved first.
If you want to replace the document type, you must do so with the first call to this method. The document types are defined in $doc->{dtdmap}.
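As a rough illustration of the kind of mapping $doc->{dtdmap} holds, the sketch below maps short type names to public identifiers and builds a DOCTYPE declaration from them. The hash contents and the doctype_declaration helper are assumptions made for this example; the entries shipped with HTML::Validator may differ.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $doc->{dtdmap}: short type name => public identifier.
# The real map in HTML::Validator may use other names or other values.
my %dtdmap = (
    'html4'  => '-//W3C//DTD HTML 4.0//EN',
    'html32' => '-//W3C//DTD HTML 3.2 Final//EN',
    'html2'  => '-//IETF//DTD HTML 2.0//EN',
);

# Build a DOCTYPE declaration for a short type name.
# Returns undef if the type is not in the map.
sub doctype_declaration {
    my ($type) = @_;
    my $public_id = $dtdmap{$type};
    return undef unless defined $public_id;
    return qq{<!DOCTYPE HTML PUBLIC "$public_id">};
}

print doctype_declaration('html4'), "\n";
```

An unknown type name yields undef, which mirrors the "undefined if the document type is not defined" convention used elsewhere in this module.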
Finds the actual name of the document type definition file that is used. The return value is the name of the file, or undefined if the document type is not defined.
Validates the document. The return value is a reference to an array containing the modified output from nsgmls.
Internal method to get the file and process the doctype information.
If there is a URL in the doctype, it is replaced, to support nsgmls binaries that do not handle URLs.
Replaces the document type definition in the file. The new DTD is the first argument, or the default DTD if no argument is supplied.
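The doctype replacement described above can be pictured as a substitution on the document source. The following is a minimal sketch, not the module's own code: replace_doctype is a hypothetical helper, and the regular expression assumes a declaration without an internal subset (no '>' inside the declaration).

```perl
use strict;
use warnings;

# Replace the first DOCTYPE declaration in $source with $new_doctype,
# or prepend $new_doctype if the document has no declaration at all.
sub replace_doctype {
    my ($source, $new_doctype) = @_;
    if ($source =~ s/<!DOCTYPE\b[^>]*>/$new_doctype/i) {
        return $source;
    }
    return "$new_doctype\n$source";
}

# A declaration carrying a URL, as some nsgmls binaries cannot fetch it.
my $with_url = qq{<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">\n}
             . qq{<html><head><title>t</title></head><body><p>x</p></body></html>\n};

print replace_doctype($with_url, '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">');
```

Dropping the system identifier leaves only the public identifier, which nsgmls can then resolve through its catalog file.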
Returns an error from the nsgmls error output queue.
Internal method to parse the raw nsgmls output into a more readable form. If you want to call this method more than once per object, purge the error output queue with $doc->errors first.
This method will call a parser method to do the actual parsing, which is $doc->parser() by default. It can be overridden by setting $doc->{parser}.
The default nsgmls output parser. This is called from parse_errors. If the return value is undef, then the parser is assumed to have found no errors. Otherwise the parser will return a reference to an array containing the errors.
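To make the parser return convention concrete (undef for no errors, otherwise a reference to an array), here is a standalone sketch of a parser for the message lines nsgmls commonly prints, in the form prog:file:line:column:type: message. The function name, its argument (a reference to an array of raw lines), and the hash layout of each error are assumptions for illustration; how HTML::Validator actually invokes a custom $doc->{parser} is not described here.

```perl
use strict;
use warnings;

# Parse raw nsgmls message lines of the form
#   nsgmls:FILE:LINE:COLUMN:TYPE: MESSAGE
# Lines that do not match this shape are skipped.
# Returns undef when nothing matched, else a reference to an array
# of hash references (one per message).
sub parse_nsgmls_lines {
    my ($raw_lines) = @_;
    my @errors;
    for my $line (@$raw_lines) {
        next unless $line =~ /^[^:]*:(.+?):(\d+):(\d+):([A-Z]):\s*(.*)$/;
        push @errors, {
            file    => $1,
            line    => $2,
            column  => $3,
            type    => $4,
            message => $5,
        };
    }
    return @errors ? \@errors : undef;
}

my $errors = parse_nsgmls_lines([
    'nsgmls:index.html:12:7:E: element "BLINK" undefined',
]);

if (defined $errors) {
    for my $e (@$errors) {
        print "$e->{file}, line $e->{line}, column $e->{column}: $e->{message}\n";
    }
}
```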
Returns 1 if the document is valid, 0 if the document is invalid, and undef if the document has not been validated yet.
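Because the result has three states, a plain truth test cannot tell "invalid" from "not validated yet". The sketch below shows one way to keep the cases apart; describe_validity is a hypothetical helper, not part of the module.

```perl
use strict;
use warnings;

# Turn the three possible is_valid results into a status string.
sub describe_validity {
    my ($is_valid) = @_;
    return 'not validated yet' unless defined $is_valid;
    return $is_valid ? 'valid' : 'invalid';
}

print describe_validity($_), "\n" for (1, 0, undef);
```

With a real object this would be called as describe_validity($doc->is_valid).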
Contains the source of the HTML file as a scalar.
Contains the message queue. If called with an argument, it places a new message on the queue; without an argument, a message (if any) is removed from the queue.
If the argument is '-1', the last message on the queue is returned.
Resets the object to its original state so that it can be reused.

The nsgmls binary to use
The catalog file to use
Array of messages
The maximum number of errors. This is passed to nsgmls with the -E option.
The document type for the document
The default type for the document. By default this is 'html4'.
The DTD used for the document
The mapping between document types and their explicit document type definition strings
The custom parser to use. See the information for the parser method.
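The nsgmls binary, catalog file, and maximum error count listed above end up as parts of an nsgmls command line. The sketch below shows how such an argument list might be assembled; the nsgmls_command helper, its option names, and the example paths are assumptions, and the module may build its command differently. The -E option is the one named above; -s and -c are standard nsgmls options.

```perl
use strict;
use warnings;

# Assemble an nsgmls argument list.
#   -s        suppress the normal parse output, leaving only messages
#   -E N      stop after N errors
#   -c FILE   use FILE as the catalog
sub nsgmls_command {
    my (%opt) = @_;
    my @cmd = ($opt{nsgmls} // 'nsgmls', '-s');
    push @cmd, '-E', $opt{max_errors} if defined $opt{max_errors};
    push @cmd, '-c', $opt{catalog}    if defined $opt{catalog};
    push @cmd, $opt{file};
    return @cmd;
}

# Hypothetical paths, for illustration only.
my @cmd = nsgmls_command(
    nsgmls     => '/usr/local/bin/nsgmls',
    catalog    => '/usr/local/lib/sgml/catalog',
    max_errors => 50,
    file       => 'index.html',
);

print join(' ', @cmd), "\n";
```

Keeping the command as a list rather than a single string means it can be handed to Perl's list form of system or open without any shell quoting.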

HTML::Validator requires that


Thanks go to:
- Heikki Kantola <hezu@iki.fi>, for his help in the early testing phases and his excellent knowledge about HTML standards.

The latest version of HTML::Validator can be found at http://www.iki.fi/si/HTML-Validator/.
It is also available from CPAN (http://www.perl.com/CPAN/).

HTML::Validator is (c) 1997-1999 Sami Itkonen <si@iki.fi>
HTML::Validator is distributed under the GNU General Public License.