W3C::LogValidator, module version 1.012, from the W3C-LogValidator-0.4.1 distribution. Latest release: W3C-LogValidator-1.4.


W3C::LogValidator - The W3C Log Validator - Quality-focused Web Server log processing engine

Checks quality/validity of most popular content on a Web server


Generic, basic use of the W3C::LogValidator module. Parse configuration file and process relevant logs.

    use W3C::LogValidator;
    my $logprocessor = W3C::LogValidator->new("sample.conf");

Alternatively (use default config and process logs)

    my $logprocessor = W3C::LogValidator->new;


W3C::LogValidator is the main module of the W3C Log Validator, which combines a Web server log analysis and statistics tool with a Web content quality checker.

As an easy alternative to using this module directly, the Perl script logprocess.pl is bundled in the W3C::LogValidator distribution.



$processor = W3C::LogValidator->new

Constructs a new W3C::LogValidator processor. You may pass the constructor a configuration file name, followed by a reference to a hash of attribute-value pairs.

e.g. for mail output:

  my %conf = (
    "UseOutputModule" => "W3C::LogValidator::Output::Mail",
    "ServerAdmin" => 'webmaster@example.com',
    "verbose" => "3"
  );
  my $processor = W3C::LogValidator->new("path/to/config.conf", \%conf);

Or e.g. for HTML output:

  my %conf = (
    "UseOutputModule" => "W3C::LogValidator::Output::HTML",
    "OutputTo" => 'path/to/file.html',
    "verbose" => "0"
  );
  my $processor = W3C::LogValidator->new("path/to/config.conf", \%conf);

If given the path to a configuration file, new() will call the W3C::LogValidator::Config module to get its configuration variables. Otherwise, a default set of values is used.

Main processing method


Do-it-all method: Read configuration file (if any), parse log files, run them through processing modules, send result to output module.

Module methods


Creates a configuration hash for a specific module, adding module-specific configuration variables and overriding existing values where necessary.
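The override behaviour described above can be illustrated with a plain hash merge. This is only a sketch, not the module's own code: merge_config and the MaxDocuments key are hypothetical names chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of W3C::LogValidator: build a per-module
# configuration by copying the global settings, then applying the
# module-specific ones, which win whenever a key appears in both.
sub merge_config {
    my ($global, $module_specific) = @_;
    my %merged = (%{$global}, %{$module_specific});
    return \%merged;
}

my %global     = (verbose => 1, ServerAdmin => 'webmaster@example.com');
my %for_module = (verbose => 3, MaxDocuments => 10);   # MaxDocuments is a made-up key

my $conf = merge_config(\%global, \%for_module);
print "$_ => $conf->{$_}\n" for sort keys %{$conf};
```

Because the merge builds a new hash, the global settings are left untouched and can be reused for the next module.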


Run the data parsed off the log files through the various processing (validation) modules specified by UseValidationModule in the configuration.

Log parsing and URI methods


Loops through and parses all log files specified in the configuration


Extracts URIs and their number of hits from a given log file, and feeds them to the processor's URI/Hits table.


Given a log record and the type of the log (common log format, flat list of URIs, etc.), extracts the URI.
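As a rough illustration of this kind of extraction, the sketch below pulls the requested path out of a record. It is not the module's implementation: extract_uri and the type labels 'common' and 'plain' are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's own code. $type is an illustrative
# label: 'common' for common log format, 'plain' for a flat list of URIs.
sub extract_uri {
    my ($record, $type) = @_;
    chomp $record;
    if ($type eq 'plain') {
        # One URI per line: take the first run of non-space characters.
        return $record =~ /^\s*(\S+)/ ? $1 : undef;
    }
    # Common log format:
    #   host ident user [date] "METHOD path PROTOCOL" status bytes
    if ($record =~ /"[A-Z]+\s+(\S+)(?:\s+HTTP\/[\d.]+)?"/) {
        return $1;
    }
    return;
}

my $line = '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326';
print extract_uri($line, 'common'), "\n";
print extract_uri("http://www.example.org/page.html\n", 'plain'), "\n";
```

Note that a common log format record only carries the request path, so a caller would still need the server name to build an absolute URI.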


Given a URI, removes "directory index" suffixes such as index.html, so that http://foobar/ and http://foobar/index.html are counted as one resource.
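A minimal sketch of such a normalization follows. The helper name strip_directory_index and the list of index file names are assumptions for the example; the names actually recognized by the module are not given here.

```perl
use strict;
use warnings;

# Hypothetical list of directory index names; a real setup would take
# these from configuration.
my @index_names = ('index.html', 'index.htm', 'index.php');

# Hypothetical helper: drop a trailing directory index file name so that
# both spellings of a directory URI collapse to the same key.
sub strip_directory_index {
    my ($uri) = @_;
    for my $name (@index_names) {
        # Only strip when the index name is the whole last path segment.
        return $uri if $uri =~ s{/\Q$name\E\z}{/};
    }
    return $uri;
}

print strip_directory_index('http://foobar/index.html'), "\n";
print strip_directory_index('http://foobar/'), "\n";
print strip_directory_index('http://foobar/myindex.html'), "\n";
```

Anchoring the match on the preceding slash keeps files such as myindex.html from being mistaken for a directory index.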


Add a URI to the processor's URI/Hits table


Returns the list of URIs in the processor's table, sorted by popularity (hits)
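The URI/Hits table and its popularity ordering can be pictured as a hash of counters sorted by value. The sketch below assumes that shape; record_hit and uris_by_popularity are hypothetical names, not the module's methods.

```perl
use strict;
use warnings;

my %hits;   # URI => number of hits

# Hypothetical helper: add a URI to the table, or increase its count.
sub record_hit {
    my ($uri, $count) = @_;
    $hits{$uri} += defined $count ? $count : 1;
    return $hits{$uri};
}

# Hypothetical helper: URIs ordered from most to least popular. Ties are
# broken alphabetically so the result is deterministic.
sub uris_by_popularity {
    return sort { $hits{$b} <=> $hits{$a} or $a cmp $b } keys %hits;
}

record_hit('http://example.org/') for 1 .. 3;
record_hit('http://example.org/about', 5);
record_hit('http://example.org/contact');

print "$hits{$_}\t$_\n" for uris_by_popularity();
```

Sorting on demand keeps insertion cheap, which suits a log run where additions far outnumber reads of the sorted list.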


Tests whether a given URI contains a CGI query string
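One simple way to perform such a test is to look for a "?" ahead of any fragment. The helper has_query_string below is a hypothetical illustration and may differ from how the module decides.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when the URI carries a query string,
# i.e. a "?" that appears before any "#fragment", and 0 otherwise.
sub has_query_string {
    my ($uri) = @_;
    (my $without_fragment = $uri) =~ s/#.*\z//s;
    return index($without_fragment, '?') >= 0 ? 1 : 0;
}

for my $uri ('http://example.org/search?q=w3c',
             'http://example.org/page.html',
             'http://example.org/page.html#what?') {
    printf "%-40s %s\n", $uri, has_query_string($uri) ? 'query string' : 'no query string';
}
```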


Returns the number of hits for a given URI. Essentially a public accessor for $hits{$uri}.


Public bug-tracking interface at http://www.w3.org/Bugs/Public/


Olivier Thereaux <ot@w3.org> for The World Wide Web Consortium


perl(1). Up-to-date information on this tool at http://www.w3.org/QA/Tools/LogValidator/
