
Download:
Marpa-HTML-0.112000.tar.gz


NAME

html_score - Show complexity metric and other stats for web page

SYNOPSIS

    html_score [--html] [uri|file]

EXAMPLES

    html_score http://perl.org

    html_score --html http://perl6.org

DESCRIPTION

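Given a URI or a file name, html_score treats its referent as HTML and prints a complexity metric, the maximum element depth, and per-element statistics. The per-element statistics appear in rows, one per tag name. For each tag name, its row contains the maximum nesting of elements with that tag name, the number of such elements, their total size in characters, and their average size.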

The argument to html_score can be either a URI or a file name. If it starts with alphanumerics followed by a colon, it is treated as a URI. Otherwise it is treated as a file name. If the --html option is specified, the output is written as an HTML table.
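
The URI-or-file rule can be sketched in a few lines of Perl. This is only an illustration of the rule as stated: the helper name looks_like_uri and the exact pattern are assumptions, not code taken from html_score.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of html_score: returns 1 if the argument
# starts with one or more alphanumerics followed by a colon, 0 otherwise.
sub looks_like_uri {
    my ($arg) = @_;
    return $arg =~ /\A [[:alnum:]]+ [:] /xms ? 1 : 0;
}

for my $arg ( 'http://perl.org', 'index.html', './http:page.html' ) {
    printf "%-20s %s\n", $arg, looks_like_uri($arg) ? 'URI' : 'file';
}
```

Under this reading, an argument such as ./http:page.html counts as a file name, because the characters before the colon are not all alphanumeric.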

The complexity metric is the average depth (or nesting level), in elements, of a character, divided by the logarithm of the length of the HTML. Whitespace and comments are ignored in calculating the complexity metric. The division by the logarithm of the HTML length is based on the idea that, all else being equal, it is reasonable for the nesting to increase logarithmically as a web page grows in length.
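
As a rough sketch of the arithmetic, the metric could be computed as follows. The sketch takes text segments that are already paired with their element depth, so it does no parsing. The function name complexity_score, the use of the natural logarithm, and the choice to measure length in non-whitespace characters are all assumptions; the description above does not specify them.

```perl
use strict;
use warnings;

# Hypothetical sketch, not the html_score implementation.
# Each segment is [ depth, text ]: the element depth at which the text occurs.
sub complexity_score {
    my (@segments) = @_;
    my ( $depth_total, $length ) = ( 0, 0 );
    for my $segment (@segments) {
        my ( $depth, $text ) = @{$segment};
        ( my $counted = $text ) =~ s/\s+//gxms;    # whitespace is ignored
        $depth_total += $depth * length($counted);
        $length      += length($counted);
    }
    return 0 if $length <= 1;    # log(1) is 0, so avoid dividing by zero
    my $average_depth = $depth_total / $length;
    return $average_depth / log($length);    # natural log: an assumption
}

# Two characters at depth 1 and two at depth 3: average depth 2, length 4.
printf "%.3f\n", complexity_score( [ 1, 'ab' ], [ 3, 'cd  ' ] );    # prints 1.443
```

Because the divisor grows only logarithmically, a page must get much longer before a given average depth stops raising the score noticeably.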

SAMPLE OUTPUT

Here is the first part of the output for http://perl.org.

    http://perl.org
    Complexity Score = 0.873
    Maximum Depth = 12
                  Maximum   Number of  Size in      Average
       Element    Nesting   Elements  Characters     Size  
    a                    1         56       3533         63
    body                 1          1       7615       7615
    div                  5         30      24695        823
    em                   1          1         13         13
    h1                   1          1         37         37
    h4                   1         11        559         50
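
The last column of the sample is consistent with the size in characters divided by the number of elements, truncated to an integer (for h4, 559 / 11 is about 50.8 and the table shows 50). The short check below reproduces that column for some of the rows. The truncation is inferred from the sample rather than from the html_score source, and average_size is a hypothetical name.

```perl
use strict;
use warnings;

# Rows copied from the sample output above:
# tag => [ number of elements, size in characters ].
my %rows = (
    a    => [ 56, 3533 ],
    body => [ 1,  7615 ],
    div  => [ 30, 24695 ],
    h4   => [ 11, 559 ],
);

# Assumption: the average is truncated, which matches the sample rows.
sub average_size {
    my ( $count, $size ) = @_;
    return $count ? int( $size / $count ) : 0;
}

for my $tag ( sort keys %rows ) {
    printf "%-6s %6d\n", $tag, average_size( @{ $rows{$tag} } );
}
```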

INTERPRETING THE COMPLEXITY METRIC

With caution, the complexity metric can be used as a self-assessment of website quality. Well-designed websites often have low numbers, particularly if fast loading is an important goal. But high values of the complexity metric do not necessarily mean low quality. Everything depends on what the site's mission is, and how well complexity is being used to serve that mission.

PURPOSE

This program is a demo of a demo. Its purpose is to show how easy it is to write applications that look at the structure of web pages using Marpa::HTML. And the purpose of Marpa::HTML is to demonstrate the power of its parse engine, Marpa. Marpa::HTML was written in a few days, and its logic is a straightforward, natural expression of the structure of HTML.

ACKNOWLEDGMENTS

The starting template for this code was HTML::TokeParser, by Gisle Aas. See also the acknowledgments for Marpa as a whole.

LICENSE AND COPYRIGHT

Copyright 2007-2010 Jeffrey Kegler, all rights reserved. Marpa is free software under the Perl license. For details see the LICENSE file in the Marpa distribution.
