Module Version: 1.02

NAME

Apache::ParseLog - Object-oriented Perl extension for parsing Apache log files

SYNOPSIS

    use Apache::ParseLog;
    $base = new Apache::ParseLog();
    $transferlog = $base->getTransferLog();
    %dailytransferredbytes = $transferlog->bytebydate();
    ...

DESCRIPTION

Apache::ParseLog provides an easy way to parse the Apache log files, using object-oriented constructs. The data obtained with this module are generic enough to be used flexibly in your own applications, such as CGI scripts, simple text-only report generators, RDBMS feeds, Perl/Tk-based GUI applications, etc.

FEATURES

  1. Easy and Portable Log-Parsing Methods

    Because all of the work (parsing logs, constructing regexes, matching and assigning to variables, etc.) is done inside this module, you can easily create log reports (unless your logs need intense scrutiny). Read the rest of this manpage, as well as the "EXAMPLES" section, to see how easy it is to create log reports with this module.

    Also, this module does not require a C compiler, and it can (should) run on any platform supported by Perl.

  2. Support for LogFormat/CustomLog

    The Apache Web Server 1.3.x's new LogFormat/CustomLog feature (with mod_log_config) is supported.

    The log format specified with Apache's LogFormat directive in the httpd.conf file will be parsed and the regular expressions will be created dynamically inside this module, so re-writing your existing code will be minimal when the log format is changed.

  3. Reports on Unique Visitor Counts

    Traditionally, the hit count is calculated based on the number of files requested by visitors (the simplest being the total number of lines in the log file, counted as the "total hit").

    As such, the hit count obviously can be misleading in the sense of "how many visitors have actually visited my site?", especially if the pages of your site contain many images (because each image is counted as one hit).

    Apache::ParseLog provides the methods to obtain such traditional data, because those data also are very important for monitoring your web site's activities. However, this module also provides the methods to obtain the unique visitor counts, i.e., the actual number of "people" (well, IP or hostname) who visited your site, by date, time, and date and time.

    See the "LOG OBJECT METHODS" for details about those methods.

  4. Pre-Compiled Regex

    The new pre-compiled regex feature introduced by Perl 5.005 is used (if you have the version installed on your machine).

    For the pre-compiled regex and the new quote-like assignment operator (qr), see perlop(1) and perlre(1) manpages.
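
The ideas in features 2 to 4 can be sketched in a few lines of plain Perl. The code below is only an illustration, not the module's actual implementation: the %pattern table, compile_format(), tally(), and the sample log lines are all hypothetical, and only a handful of LogFormat directives are covered. It builds a regex from a LogFormat string, compiles it once with qr, and then compares the hit count with the unique visitor count per date (keyed here by the raw dd/Mon/yyyy part of the timestamp, for brevity).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical mapping from a few LogFormat directives to capturing patterns.
my %pattern = (
    '%h'  => '(\S+)',         # remote host
    '%l'  => '(\S+)',         # remote logname
    '%u'  => '(\S+)',         # remote user
    '%t'  => '\[([^\]]+)\]',  # timestamp, e.g. [10/Oct/1998:13:55:36 -0700]
    '%r'  => '([^"]*)',       # request line (assumed to be logged in quotes)
    '%>s' => '(\d{3})',       # last status
    '%b'  => '(\S+)',         # bytes sent
);

# Turn a LogFormat string into a compiled regex plus the list of captured fields.
sub compile_format {
    my ($format) = @_;
    my ($regex, @fields) = ('');
    for my $token (split /(%>?[a-zA-Z])/, $format) {
        next if $token eq '';
        if (exists $pattern{$token}) {
            push @fields, $token;
            $regex .= $pattern{$token};
        }
        else {
            $regex .= quotemeta $token;    # literal text between directives
        }
    }
    return (qr/\A$regex\z/, @fields);      # compiled once, reused for every line
}

# Count hits and unique hosts per date.
sub tally {
    my ($re, $fields, @lines) = @_;
    my (%hits, %seen);
    for my $line (@lines) {
        my @values = $line =~ $re or next; # skip lines that do not match
        my %rec;
        @rec{@$fields} = @values;
        my ($date) = $rec{'%t'} =~ m{\A([^:]+):} or next;
        $hits{$date}++;
        $seen{$date}{ $rec{'%h'} } = 1;    # a host counts once per date
    }
    my %visitors = map { $_ => scalar keys %{ $seen{$_} } } keys %seen;
    return (\%hits, \%visitors);
}

# Made-up sample entries in the common log format.
my @lines = (
    '192.0.2.1 - - [10/Oct/1998:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [10/Oct/1998:13:55:37 -0700] "GET /logo.gif HTTP/1.0" 200 512',
    '192.0.2.7 - - [10/Oct/1998:14:02:10 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [11/Oct/1998:09:00:00 -0700] "GET /index.html HTTP/1.0" 200 2326',
);

my ($re, @fields) = compile_format('%h %l %u %t "%r" %>s %b');
my ($hits, $visitors) = tally($re, \@fields, @lines);
for my $date (sort keys %$hits) {
    print "$date: $hits->{$date} hits, $visitors->{$date} unique visitors\n";
}
```

Because the regex is derived from the format string, changing the LogFormat only changes the argument passed to compile_format(); the counting code stays the same.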

CONSTRUCTOR

To construct an Apache::ParseLog object, the new() method is available, just as in other modules.

The new() constructor returns an Apache::ParseLog base object with which to obtain basic server information as well as to construct log objects.

New Method

new([$path_to_httpd_conf[, $virtual_host]]);

With the new() method, an Apache::ParseLog object can be created in three different ways.

  1. $base = new Apache::ParseLog();

    This first method creates an empty object, which means that the fields of the object are undefined (undef); i.e., the object does not know what the server name is, where the log files are, etc. It is useful when you need to parse log files that are not created on the local Apache server (e.g., the log files FTP'd from elsewhere).

    You have to call the config() method (see below) before calling any other methods.

  2. $base = new Apache::ParseLog($httpd_conf);

    This is the second way to create an object with necessary information extracted from the $httpd_conf. $httpd_conf is a scalar string containing the absolute path to the httpd.conf file; e.g.,

        $httpd_conf = "/usr/local/httpd/conf/httpd.conf";

    This method tries to extract the information from $httpd_conf, specified by the following Apache directives: ServerName, Port, ServerAdmin, TransferLog, ErrorLog, AgentLog, RefererLog, and any user-defined CustomLog along with LogFormat.

    If any of the directives cannot be found or are commented out in the $httpd_conf, then the field(s) for those directive(s) will be empty (undef), and the corresponding methods that use those fields will return an empty string when called, or error out (for log object methods, refer to the section below).

  3. $base = new Apache::ParseLog($httpd_conf, $virtual_host);

    This method creates an object just like the second method, but for the VirtualHost specified by $virtual_host only. The Apache directives and rules not specified within the <VirtualHost xxx> and </VirtualHost> tags are parsed from the "regular" server section in the httpd.conf file.

    Note that the $httpd_conf must be specified in order to create an object for the $virtual_host.

BASE OBJECT METHODS

This section describes the methods available for the base object created by the new() constructor described above.

Unless the object is created without arguments, the Apache::ParseLog module parses the basic information configured in the httpd.conf file (passed as the first argument). The object uses this information to construct the log objects.

The available methods are (return values are in parentheses):

    $base->config([%fields]); # (object)
    $base->version(); # (scalar)
    $base->serverroot(); # (scalar)
    $base->servername(); # (scalar)
    $base->httpport(); # (scalar)
    $base->serveradmin(); # (scalar)
    $base->transferlog(); # (scalar)
    $base->errorlog(); # (scalar)
    $base->agentlog(); # (scalar)
    $base->refererlog(); # (scalar)
    $base->customlog(); # (array)
    $base->customlogLocation($name); # (scalar)
    $base->customlogExists($name); # (scalar boolean, 1 or 0)
    $base->customlogFormat($name); # (scalar)
    $base->getTransferLog(); # (object)
    $base->getErrorLog(); # (object)
    $base->getRefererLog(); # (object)
    $base->getAgentLog(); # (object)
    $base->getCustomLog(); # (object)

*

config([%fields]);

    $base = $base->config(field1 => value1,
                          field2 => value2,
                          fieldN => valueN);

This method configures the Apache::ParseLog object. Possible fields are:

    Field Name                     Value
    ---------------------------------------------------------
    serverroot  => absolute path to the server root directory
    servername  => name of the server, e.g., "www.mysite.com"
    httpport    => httpd port, e.g., 80
    serveradmin => the administrator, e.g., "admin@mysite.com"
    transferlog => absolute path to the transfer log
    errorlog    => absolute path to the error log
    agentlog    => absolute path to the agent log
    refererlog  => absolute path to the referer log

This method should be called after the empty object is created (new(), see above). However, you can override the value(s) for any fields by calling this method even if the object is created with defined $httpd_conf and $virtual_host. (Convenient if you don't have any httpd server running on your machine but have to parse the log files transferred from elsewhere.)

All fields are optional, but at least one field should be specified (otherwise why use this method?).

When this method is called on an empty object and not all the fields are specified, the unspecified fields remain empty (so the corresponding methods cannot be used).

When this method is called from the already configured object (with new($httpd_conf[, $virtual_host])), the fields specified in this config() method will override the existing field values, and the rest of the fields inherit the pre-existing values.

NOTE 1: This method returns a newly configured object, so make sure to use the assignment operator to create the new object (see examples below).

NOTE 2: You cannot (re)configure CustomLog values. This is to prevent broken log formats, which would render the parsed results unusable.

Example 1

    # Create an empty object first
    $base = new Apache::ParseLog();
    # Configure the transfer and error fields only, for the files
    # transferred from your Web site hosting service
    $logs = "/home/webmaster/logs";
    $base = $base->config(transferlog => "$logs/transfer_log",
                          errorlog    => "$logs/error_log");

Example 2

    # Create an object with $httpd_conf
    $base = new Apache::ParseLog("/usr/local/httpd/conf/httpd.conf");
    # Overrides some fields
    $logs = "/usr/local/httpd/logs";
    $base = $base->config(transferlog => "$logs/old/trans_199807",
                          errorlog    => "$logs/old/error_199807",
                          agentlog    => "$logs/old/agent_199807",
                          refererlog  => "$logs/old/refer_199807");
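
The override rule described above (fields passed to config() win, all other fields keep their pre-existing values) behaves like a plain hash merge. The following is a minimal sketch of that rule only; merge_fields() and the sample values are hypothetical and are not part of Apache::ParseLog.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: later pairs override earlier ones, the rest are inherited.
sub merge_fields {
    my ($existing, %overrides) = @_;
    return (%$existing, %overrides);
}

# Values as they might have been read from httpd.conf (made up for this sketch).
my %from_conf = (
    servername  => 'www.mysite.com',
    httpport    => 80,
    transferlog => '/usr/local/httpd/logs/access_log',
);

my %merged = merge_fields(\%from_conf,
                          transferlog => '/usr/local/httpd/logs/old/trans_199807');

print "$_ => $merged{$_}\n" for sort keys %merged;
```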

*

serverroot();

    print $base->serverroot(), "\n";    

Returns a scalar containing the root of the Web server as specified in the httpd.conf file, or undef if the server root is not specified.

*

servername();

    print $base->servername(), "\n";

Returns a scalar containing the name of the Web server, or undef if server name is not specified.

*

httpport();

    print $base->httpport(), "\n";

Returns a scalar containing the port number used for the httpd, or undef if not specified. (By default, httpd uses port 80.)

*

serveradmin();

    print $base->serveradmin(), "\n";

Returns a scalar containing the name of the server administrator, or undef if not specified.

*

transferlog();

    die "$!\n" unless -e $base->transferlog();

Returns a scalar containing the absolute path to the transfer log file, or undef if not specified.

*

errorlog();

    die "$!\n" unless -e $base->errorlog();

Returns a scalar containing the absolute path to the error log file, or undef if not specified.

*

agentlog();

    die "$!\n" unless -e $base->agentlog();

Returns a scalar containing the absolute path to the agent log file, or undef if not specified.

*

refererlog();

    die "$!\n" unless -e $base->refererlog();

Returns a scalar containing the absolute path to the referer log file, or undef if not specified.

*

customlog();

    @customlog = $base->customlog();

Returns an array containing "nicknames" of the custom logs defined in the $httpd_conf.

*

customlogLocation($log_nickname);

    print $base->customlogLocation($name), "\n";

Returns a scalar containing the absolute path to the custom log $name. If the custom log $name does not exist, it will return undef.

This method should be used for debugging purposes only, since you can call getCustomLog() to parse the logs, making it unnecessary to manually open the custom log file in your own script.

*

customlogExists($log_nickname);

    if ($base->customlogExists($name)) {
        $customlog = $base->getCustomLog($name);
    }

Returns 1 if the custom log $name (e.g., common, combined) is defined in the $httpd_conf file and the log file exists, or 0 otherwise.

You do not have to call this method usually because this is internally called by the getCustomLog($name) method.

*

customlogFormat($log_nickname);

    print $base->customlogFormat($name), "\n";

Returns a scalar containing the "LogFormat" string for the custom log $name, as specified in $httpd_conf. This method is meant to be used internally, as well as for debugging purposes.

*

getTransferLog();

    $transferlog = $base->getTransferLog();

Returns an object through which to access the information parsed from the TransferLog file. See the "LOG OBJECT METHODS" below for methods to access the log information.

*

getRefererLog();

    $refererlog = $base->getRefererLog();

Returns an object through which to access the information parsed from the RefererLog file. See the "LOG OBJECT METHODS" below for methods to access the log information.

*

getAgentLog();

    $agentlog = $base->getAgentLog();

This method returns an object through which to access the information parsed from the AgentLog file. See the "LOG OBJECT METHODS" below for methods to access the log information.

*

getErrorLog();

    $errorlog = $base->getErrorLog();

This method returns an object through which to access the information parsed from the ErrorLog file. See the "LOG OBJECT METHODS" below for methods to access the log information.

*

getCustomLog($log_nickname);

    $customlog = $base->getCustomLog($name);

This method returns an object through which to access the information parsed from the CustomLog file $name. See the "LOG OBJECT METHODS" below for methods to access the log information.

LOG OBJECT METHODS

This section describes the methods available for the log object created by any of the following base object methods: getTransferLog(), getErrorLog(), getRefererLog(), getAgentLog(), and getCustomLog($log_nickname).

This section is divided into six subsections, each of which describes the available methods for a certain log object.

Note that all the methods for TransferLog, RefererLog, and AgentLog can be used for the object created with getCustomLog($name).

TransferLog/CustomLog Methods

The following methods are available for the TransferLog object (created by getTransferLog() method), as well as the CustomLog object that logs appropriate arguments to the corresponding LogFormat.

*

hit();

    %hit = $logobject->hit();

Returns a hash containing at least a key 'Total' with the total hit count as its value, and the file extensions (i.e., html, jpg, gif, cgi, pl, etc.) as keys with the hit count for each key as values.
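
As a rough illustration of the shape of this hash, the sketch below counts requests by file extension. It is not the module's code: count_by_extension() and the sample paths are hypothetical, and the handling of requests without an extension (counted under 'Total' only) is an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: tally requested paths by file extension.
sub count_by_extension {
    my @paths = @_;
    my %hit = (Total => 0);
    for my $path (@paths) {
        $hit{Total}++;
        (my $file = $path) =~ s/\?.*//s;       # drop any query string
        # Assumption: requests without an extension only count toward 'Total'.
        $hit{ lc $1 }++ if $file =~ /\.([A-Za-z0-9]+)\z/;
    }
    return %hit;
}

my %hit = count_by_extension(
    '/index.html', '/img/logo.gif', '/img/photo.gif',
    '/cgi-bin/search.cgi?q=perl', '/',
);

print "$_: $hit{$_}\n" for sort keys %hit;
```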

*

host();

    %host = $logobject->host();

Returns a hash containing host names (or IPs if names are unresolved) of the visitors as keys, and the hit count for each key as values.

*

topdomain();

    %topdomain = $logobject->topdomain();

Returns a hash containing topdomain names (com, net, etc.) of the visitors as keys, and the hit count for each key as values.

Note that if the hostname is unresolved and remains an IP address, the visitor will not be counted toward the value returned by this method (or by the secdomain() method described next).

*

secdomain();

    %secdomain = $logobject->secdomain();

Returns a hash containing secondary domain names (xxx.com, yyy.net, etc.) as keys, and the hit count for each key as values.

For the unresolved IPs, the same rule applies as the above topdomain() method.
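
The relationship between host names, top domains, and secondary domains can be sketched as follows. This is only an illustration under stated assumptions: domains() and the sample host names are hypothetical, and only IPv4-style addresses are treated as unresolved.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return (top domain, secondary domain) for a resolved
# host name, or an empty list for an unresolved IP address.
sub domains {
    my ($host) = @_;
    return if $host =~ /\A\d{1,3}(?:\.\d{1,3}){3}\z/;   # unresolved IPv4 address
    my @labels = split /\./, lc $host;
    return if @labels < 2;
    return ($labels[-1], join('.', @labels[-2, -1]));
}

my (%topdomain, %secdomain);
for my $host (qw(www.example.com mail.example.com proxy.example.net 192.0.2.10)) {
    my ($top, $sec) = domains($host) or next;   # IPs are not counted
    $topdomain{$top}++;
    $secdomain{$sec}++;
}

print "$_: $topdomain{$_}\n" for sort keys %topdomain;
print "$_: $secdomain{$_}\n" for sort keys %secdomain;
```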

*

login();

    %login = $logobject->login();

Returns a hash containing login names (authenticated user logins) of the visitors as keys, and the hit count for each key as values.

Log entries for non-authenticated files have a character "-" as the login name.

*

user();

    %user = $logobject->user();

Returns a hash containing user names (for access-controlled directories, refer to the access.conf file of the Apache server) of the visitors as keys, and the hit count for each key as values.

Non-access-controlled log entries have a character "-" as the user name.

*

hitbydate();

    %hitbydate = $logobject->hitbydate();

Returns a hash containing date (mm/dd/yyyy) when visitors visited the particular file (html, jpg, etc.) as keys, and the hit count for each key as values.

*

hitbytime();

    %hitbytime = $logobject->hitbytime();

Returns a hash containing time (00-23) each file was visited as keys, and the hit count for each key as values.

*

hitbydatetime();

    %hitbydatetime = $logobject->hitbydatetime();

Returns a hash containing date/time (mm/dd/yyyy-hh) as keys, and the hit count for each key as values.

*

visitorbydate();

    %visitorbydate = $logobject->visitorbydate();

Returns a hash containing date (mm/dd/yyyy) as keys, and the unique visitor count for each key as values.

*

visitorbytime();

    %visitorbytime = $logobject->visitorbytime();

Returns a hash containing time (00-23) as keys, and the unique visitor count for each key as values.

*

visitorbydatetime();

    %visitorbydatetime = $logobject->visitorbydatetime();

Returns a hash containing date/time (mm/dd/yyyy-hh) as keys, and the unique visitor count for each key as values.
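
The three key formats used by the "bydate", "bytime", and "bydatetime" methods (mm/dd/yyyy, 00-23, and mm/dd/yyyy-hh) can all be derived from the timestamp Apache writes to the log. The sketch below shows one way to do that; time_keys() is a hypothetical helper, not a method of this module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Map month abbreviations to two-digit month numbers.
my %month;
@month{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} =
    map { sprintf '%02d', $_ } 1 .. 12;

# Hypothetical helper: turn "[10/Oct/1998:13:55:36 -0700]" into the
# date, time, and date/time keys described above.
sub time_keys {
    my ($stamp) = @_;
    my ($dd, $mon, $yyyy, $hh) =
        $stamp =~ m{\A\[?(\d{2})/([A-Z][a-z]{2})/(\d{4}):(\d{2}):}
        or return;
    my $mm = $month{$mon} or return;
    my $date = "$mm/$dd/$yyyy";
    return ($date, $hh, "$date-$hh");
}

my ($date, $hour, $datetime) = time_keys('[10/Oct/1998:13:55:36 -0700]');
print "date: $date, hour: $hour, date/time: $datetime\n";
```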

*

method();

    %method = $logobject->method();

Returns a hash containing HTTP method (GET, POST, PUT, etc.) as keys, and the hit count for each key as values.

*

file();

    %file = $logobject->file();

Returns a hash containing the file names relative to the DocumentRoot of the server as keys, and the hit count for each key as values.

*

querystring();

    %querystring = $logobject->querystring();

Returns a hash containing the query string as keys, and the hit count for each key as values.

*

proto();

    %proto = $logobject->proto();

Returns a hash containing the protocols used (HTTP/1.0, HTTP/1.1, etc.) as keys, and the hit count for each key as values.

*

lstatus();

    %lstatus = $logobject->lstatus();

Returns a hash containing HTTP codes and messages (e.g. "404 Not Found") for the last status (i.e., when the httpd finishes processing that request) as keys, and the hit count for each key as values.

*

byte();

    %byte = $logobject->byte();

Returns a hash containing at least a key 'Total' with the total transferred bytes as its value, and the file extensions (i.e., html, jpg, gif, cgi, pl, etc.) as keys, and the transferred bytes for each key as values.

*

bytebydate();

    %bytebydate = $logobject->bytebydate();

Returns a hash containing date (mm/dd/yyyy) as keys, and the transferred bytes for each key as values.

*

bytebytime();

    %bytebytime = $logobject->bytebytime();

Returns a hash containing time (00-23) as keys, and the transferred bytes for each key as values.

*

bytebydatetime();

    %bytebydatetime = $logobject->bytebydatetime();

Returns a hash containing date/time (mm/dd/yyyy-hh) as keys, and the transferred bytes for each key as values.

ErrorLog Methods

Until the Apache version 1.2.x, each error log entry was just an error, meaning that there was no distinction between "real" errors (e.g., File Not Found, malfunctioning CGI, etc.) and non-significant errors (e.g., kill -1 the httpd processes, etc.).

Starting from the version 1.3.x, the Apache httpd logs the "type" of each error log entry, namely "error", "notice" and "warn".

If you use Apache 1.2.x, the errorbyxxx(), noticebyxxx(), and warnbyxxx() methods should not be used, because those methods are for 1.3.x only and will merely return an empty hash. The allbyxxx() methods will return the desired results.

The following methods are available for the ErrorLog object (created by getErrorLog() method).

*

count();

    %errors = $errorlogobject->count();

Returns a hash containing count for each type of messages logged in the error log file.

The keys and values are: 'Total' (total number of errors), 'error' (total number of errors of type "error"), 'notice' (total number of errors of type "notice"), 'warn' (total number of errors of type "warn"), 'dated' (total number of error entries with date logged), and 'nodate' (total number of error entries with no date logged). So:

    print "Total Errors: ", $errors{'Total'}, "\n";
    print "Total 1.3.x Errors: ", $errors{'error'}, "\n";
    print "Total 1.3.x Notices: ", $errors{'notice'}, "\n";
    print "Total 1.3.x Warns: ", $errors{'warn'}, "\n";
    print "Total Errors with date: ", $errors{'dated'}, "\n";
    print "Total Errors with no date: ", $errors{'nodate'}, "\n";

Note that with an ErrorLog file generated by an Apache version before 1.3.x, the values for 'error', 'notice', and 'warn' will be zero.
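
To make the six keys concrete, here is a minimal sketch of how such counts could be derived from error log lines. It is not the module's implementation: count_errors() is hypothetical, the sample messages are made up, and the line layout assumed is a bracketed date optionally followed by a bracketed type.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: classify error log lines by type and by presence of a date.
sub count_errors {
    my %count = map { $_ => 0 } qw(Total error notice warn dated nodate);
    for my $line (@_) {
        $count{Total}++;
        # A dated entry starts with e.g. "[Thu Oct  1 10:15:02 1998]".
        if ($line =~ /\A\[[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}\]/) {
            $count{dated}++;
        }
        else {
            $count{nodate}++;
        }
        # 1.3.x-style entries carry a bracketed type after the date.
        $count{$1}++ if $line =~ /\A(?:\[[^\]]*\] )?\[(error|notice|warn)\]/;
    }
    return %count;
}

# Made-up sample entries.
my %errors = count_errors(
    '[Thu Oct  1 10:15:02 1998] [error] File does not exist: /usr/local/httpd/htdocs/missing.html',
    '[Thu Oct  1 10:20:45 1998] [notice] httpd: caught SIGTERM, shutting down',
    '[Thu Oct  1 11:02:13 1998] [warn] child process 1234 still did not exit',
    '[Thu Oct  1 11:30:00 1998] access to /private failed for 192.0.2.4',
    'Premature end of script headers',
);

print "$_: $errors{$_}\n" for qw(Total error notice warn dated nodate);
```

The fourth sample line has a date but no type, as in a 1.2.x-style log, so it only contributes to 'Total' and 'dated'.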

*

allbydate();

    %allbydate = $errorlogobject->allbydate();

Returns a hash containing date (mm/dd/yyyy) when the error was logged as keys, and the number of error occurrences as values.

*

allbytime();

    %allbytime = $errorlogobject->allbytime();

Returns a hash containing time (00-23) as keys and the number of error occurrences as values.

*

allbydatetime();

    %allbydatetime = $errorlogobject->allbydatetime();

Returns a hash containing date/time (mm/dd/yyyy-hh) as keys and the number of error occurrences as values.

*

allmessage();

    %allmessage = $errorlogobject->allmessage();

Returns a hash containing error messages as keys and the number of occurrences as values.

*

errorbydate();

    %errorbydate = $errorlogobject->errorbydate();

Returns a hash containing date (mm/dd/yyyy) as keys and the number of error occurrences as values. For the Apache 1.3.x log only.

*

errorbytime();

    %errorbytime = $errorlogobject->errorbytime();

Returns a hash containing time (00-23) as keys and the number of error occurrences as values. For the Apache 1.3.x log only.

*

errorbydatetime();

    %errorbydatetime = $errorlogobject->errorbydatetime();

Returns a hash containing date/time (mm/dd/yyyy-hh) as keys and the number of error occurrences as values. For the Apache 1.3.x log only.

*

errormessage();

    %errormessage = $errorlogobject->errormessage();

Returns a hash containing error messages as keys and the number of occurrences as values. For the Apache 1.3.x log only.

*

noticebydate();

    %noticebydate = $errorlogobject->noticebydate();

Returns a hash containing date (mm/dd/yyyy) as keys and the number of notice occurrences as values. For the Apache 1.3.x log only.

*

noticebytime();

    %noticebytime = $errorlogobject->noticebytime();

Returns a hash containing time (00-23) as keys and the number of notice occurrences as values. For the Apache 1.3.x log only.

*

noticebydatetime();

    %noticebydatetime = $errorlogobject->noticebydatetime();

Returns a hash containing date/time (mm/dd/yyyy-hh) as keys and the number of notice occurrences as values. For the Apache 1.3.x log only.

*

noticemessage();

    %noticemessage = $errorlogobject->noticemessage();

Returns a hash containing notice messages as keys and the number of occurrences as values. For the Apache 1.3.x log only.

*

warnbydate();

    %warnbydate = $errorlogobject->warnbydate();

Returns a hash containing date (mm/dd/yyyy) as keys and the number of warn occurrences as values. For the Apache 1.3.x log only.

*

warnbytime();

    %warnbytime = $errorlogobject->warnbytime();

Returns a hash containing time (00-23) as keys and the number of warn occurrences as values. For the Apache 1.3.x log only.

*

warnbydatetime();

    %warnbydatetime = $errorlogobject->warnbydatetime();

Returns a hash containing date/time (mm/dd/yyyy-hh) as keys and the number of warn occurrences as values. For the Apache 1.3.x log only.

*

warnmessage();

    %warnmessage = $errorlogobject->warnmessage();

Returns a hash containing warn messages as keys and the number of occurrences as values. For the Apache 1.3.x log only.

RefererLog/CustomLog Methods

The following methods are available for the RefererLog object (created by getRefererLog() method), as well as the CustomLog object that logs %{Referer}i to the corresponding LogFormat.

*

referer();

    %referer = $logobject->referer();

Returns a hash containing the name of the web site the visitor comes from as keys, and the hit count for each key as values.

Note that the returned data from this method contains only the site name of the referer, e.g. "www.altavista.digital.com", so if you want to obtain the full details of the referer as well as the referred files, use refererdetail() method described below.

*

refererdetail();

Returns a hash containing the full URL of the referer as keys, and the hit count for each key as values.

The standard log format for the RefererLog is <referer -> URL>. With the CustomLog object, the object attempts to use the URL first, and if the URL is not logged, then the relative path, and then the absolute path to create the key for the returned hash. If none of the URL, relative path, or absolute path is logged, the object will use only the referer URL itself (without the referred files) as the key.
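
The difference between the site name reported by referer() and the full URL reported by refererdetail() can be illustrated with a short sketch. referer_site() and the sample referers below are hypothetical; the sketch assumes that only referers starting with "http://" or "https://" carry a usable site name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reduce a full referer URL to its site name,
# or return undef when the referer is not an HTTP URL.
sub referer_site {
    my ($referer) = @_;
    return lc $1
        if defined $referer && $referer =~ m{\Ahttps?://([^/:?#]+)}i;
    return;
}

# Made-up referers; "-" stands for a request without a referer.
my @referers = (
    'http://www.altavista.digital.com/cgi-bin/query?q=perl',
    'http://www.altavista.digital.com/',
    'http://www.mysite.com/index.html',
    '-',
);

my %referer;
for my $url (@referers) {
    my $site = referer_site($url);
    next unless defined $site;
    $referer{$site}++;
}

print "$_: $referer{$_}\n" for sort keys %referer;
```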

AgentLog/CustomLog Methods

This subsection describes the methods available for the AgentLog object (created by getAgentLog() method), as well as the CustomLog object that logs %{User-agent}i to the corresponding LogFormat.

*

uagent();

    %uagent = $logobject->uagent();

Returns a hash containing the user agent (the "full name", as you see in the log file itself) as keys, and the hit count for each key as values.

*

uaversion();

    %uaversion = $logobject->uaversion();

Returns a hash containing the most basic and simple information about the user agent (the first column in the agent log file, e.g. "Mozilla/4.06") as keys, and the hit count for each key as values. Useful to collect the information about the parser engine and its version, to determine which specs of HTML and/or JavaScript to deploy, for example.

*

browser();

    %browser = $logobject->browser();

Returns a hash containing the actual browsers (as logged in the file) as keys, and the hit count for each key as values.

For example, Netscape Navigator/Communicator will (still) be reported as "Mozilla/version", Microsoft Internet Explorer as "MSIE version", and so on.

*

platform();

    %platform = $logobject->platform();

Returns a hash containing the names of OS (and possibly its version, hardware architecture, etc.) as keys, and the hit count for each key as values.

For example, Solaris 2.6 on UltraSPARC will be reported as "SunOS 5.6 sun4u".

*

browserbyos();

    %browserbyos = $logobject->browserbyos();

Returns a hash containing the browser names with OS (in the form, browser (OS)) as keys, and the hit count for each key as values.

CustomLog Methods

This subsection describes the methods available only for the CustomLog object. See each method for what Apache directive is used for each returned result.

*

addr();

    %addr = $logobject->addr();

Returns a hash containing the IP addresses of the web site (instead of the ServerName) visited as keys, and the hit count for each key as values. (LogFormat %a)

*

filename();

    %filename = $logobject->filename();

Returns a hash containing the absolute paths to the files as keys, and the hit count for each key as values. (LogFormat %f)

*

hostname();

    %hostname = $logobject->hostname();

Returns a hash containing the hostnames of the visitors as keys, and the hit count for each key as values. (LogFormat %v)

*

ostatus();

    %ostatus = $logobject->ostatus();

Returns a hash containing HTTP codes and messages (e.g. "404 Not Found") for the original status (i.e., when the httpd starts processing that request) as keys, and the hit count for each key as values.

*

port();

    %port = $logobject->port();

Returns a hash containing the port used for the transfer as keys, and the hit count for each key as values (there will probably be only one key-value pair for each server). (LogFormat %p)

*

proc();

    %proc = $logobject->proc();

Returns a hash containing the process ID of the server used for each file transfer as keys, and the hit count for each key as values. (LogFormat %P)

*

sec();

    %sec = $logobject->sec();

Returns a hash containing the file names (either relative paths, absolute paths, or the URL, depending on your log format) as keys, and the maximum seconds it takes to finish the process as values. Thus, note that the values are not accumulated results, but rather the highest number of seconds it took to process the file. (LogFormat %T)
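
The "maximum, not accumulated" behavior described above can be sketched as follows. max_seconds() and the sample records are hypothetical; the point is only that each key keeps the largest value seen for it rather than a running sum.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: keep the highest number of seconds seen per file.
# Each record is an array reference of the form [file, seconds].
sub max_seconds {
    my @records = @_;
    my %sec;
    for my $rec (@records) {
        my ($file, $seconds) = @$rec;
        $sec{$file} = $seconds
            if !exists $sec{$file} || $seconds > $sec{$file};
    }
    return %sec;
}

my %sec = max_seconds(
    [ '/cgi-bin/search.cgi', 2 ],
    [ '/index.html',         0 ],
    [ '/cgi-bin/search.cgi', 7 ],
    [ '/cgi-bin/search.cgi', 3 ],
);

print "$_: $sec{$_}\n" for sort keys %sec;
```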

*

url();

    %url = $logobject->url();

Returns a hash containing the URLs (paths relative to the DocumentRoot) as keys, and the hit count for each key as values. (LogFormat %U)

Special Method

The special method described below, getMethods(), can be used with any of the log objects to extract the methods available for the calling object.

*

getMethods();

    @object_methods = $logobject->getMethods();

Returns an array containing the names of the available methods for that log object. Each of the elements in the array is the name of one of the methods described in this section.

By using this method, you can write a really simple Apache log parsing and reporting script, like so:

    #!/usr/local/bin/perl
    $|++; # flush buffer
    use Apache::ParseLog;
    # Construct the Apache::ParseLog object
    $base = new Apache::ParseLog("/usr/local/httpd/conf/httpd.conf");
    # Get the CustomLog object for "my_custom_log"
    $customlog = $base->getCustomLog("my_custom_log");
    # Get the available methods for the CustomLog object
    @methods = $customlog->getMethods();
    # Iterate through the @methods
    foreach $method (@methods) {
        print "$method log report\n";
        # Get the returned value for each method
        %{$method} = $customlog->$method();
        # Iterate through the returned hash
        foreach (sort keys %{$method}) {
            print "$_: ${$method}{$_}\n";
        }
        print "\n";
    }
    exit;

MISCELLANEOUS

This section describes some miscellaneous methods that might be useful.

Exported Methods

This subsection describes exported methods provided by the Apache::ParseLog module. (For the information about exported methods, see Exporter(3).)

Note that these exported methods can be called just like local (main package) subroutines.

*

countryByCode();

    %countryByCode = countryByCode();

Returns a hash containing country-code top-level domain names as keys and country names as values. Useful for creating a report on "hits" by country.

*

statusByCode();

    %statusByCode = statusByCode();

Returns a hash containing the status codes of the Apache HTTPD server, as defined by RFC2068, as keys and their meanings as values.

*

sortHashByValue(%hash);

    @sorted_keys = sortHashByValue(%hash);

Returns an array containing the keys of the %hash, sorted numerically by the values of the %hash in descending order.

Example

    # Get the custom log object
    $customlog = $base->getCustomLog("combined");
    # Get the log report on "file"
    %file = $customlog->file();
    # Sort the %file by hit counts, descending order
    @sorted_keys = sortHashByValue(%file);
    foreach (@sorted_keys) {
        print "$_: $file{$_}\n"; # print <file>: <hitcount>
    }
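
For readers curious how such a sort works, a function with the same behavior can be written in a few lines. This is a sketch, not the module's source: sort_hash_by_value() is a hypothetical name, and breaking ties alphabetically by key is an assumption made here to keep the output deterministic.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical equivalent: keys sorted numerically by value, descending.
# Ties are broken alphabetically by key (an assumption of this sketch).
sub sort_hash_by_value {
    my %hash = @_;
    return sort { $hash{$b} <=> $hash{$a} || $a cmp $b } keys %hash;
}

my %file = (
    '/index.html'    => 120,
    '/img/logo.gif'  => 300,
    '/about.html'    => 45,
    '/contact.html'  => 45,
);

my @sorted_keys = sort_hash_by_value(%file);
print "$_: $file{$_}\n" for @sorted_keys;
```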

EXAMPLES

The most basic, easiest way to create reports is presented as an example in the getMethods() section above, but the format of the output is pretty crude and less user-friendly.

Shown below are some other examples to use Apache::ParseLog.

Example 1: Basic Report

The example code below checks the TransferLog and ErrorLog generated by the Apache 1.2.x, and prints the reports to STDOUT. (To run this code, all you have to do is to change the $conf value.)

    #!/usr/local/bin/perl
    $|++;
    use Apache::ParseLog;

    $conf = "/usr/local/httpd/conf/httpd.conf"; 
    $base = new Apache::ParseLog($conf);

    print "TransferLog Report\n\n";
    $transferlog = $base->getTransferLog();

    %hit = $transferlog->hit();
    %hitbydate = $transferlog->hitbydate();
    print "Total Hit Counts: ", $hit{'Total'}, "\n";
    foreach (sort keys %hitbydate) {
        print "$_:\t$hitbydate{$_}\n"; # <date>: <hit counts>
    }
    $hitaverage = int($hit{'Total'} / scalar(keys %hitbydate));
    print "Average Daily Hits: $hitaverage\n\n";

    %byte = $transferlog->byte();
    %bytebydate = $transferlog->bytebydate();
    print "Total Bytes Transferred: ", $byte{'Total'}, "\n";
    foreach (sort keys %bytebydate) {
        print "$_:\t$bytebydate{$_}\n"; # <date>: <bytes transferred>
    }
    $byteaverage = int($byte{'Total'} / scalar(keys %bytebydate));
    print "Average Daily Bytes Transferred: $byteaverage\n\n";

    %visitorbydate = $transferlog->visitorbydate();
    %host = $transferlog->host();
    print "Total Unique Visitors: ", scalar(keys %host), "\n";
    foreach (sort keys %visitorbydate) {
        print "$_:\t$visitorbydate{$_}\n"; # <date>: <visitor counts>
    }
    $visitoraverage = int(scalar(keys %host) / scalar(keys %visitorbydate));
    print "Average Daily Unique Visitors: $visitoraverage\n\n";
    
    print "ErrorLog Report\n\n";
    $errorlog = $base->getErrorLog();

    %count = $errorlog->count();
    %allbydate = $errorlog->allbydate();
    print "Total Errors: ", $count{'Total'}, "\n";
    foreach (sort keys %allbydate) {
        print "$_:\t$allbydate{$_}\n"; # <date>: <error counts>
    }
    $erroraverage = int($count{'Total'} / scalar(keys %allbydate));
    print "Average Daily Errors: $erroraverage\n\n";

    exit;

Example 2: Referer Report

The RefererLog (or a CustomLog with the referer logged) contains the referer for every single file requested. This means that every time a page containing 10 images is requested, 11 lines are added to the RefererLog: one line for the actual referer (where the visitor comes from), and 10 lines for the images, each with the page that contains them as the referer. That is probably more than you want to know.

The example code below checks the CustomLog that contains the referer (among other things), and reports the names of the referer sites other than the local server itself.

    #!/usr/local/bin/perl
    $|++;
    use Apache::ParseLog;

    $conf = "/usr/local/httpd/conf/httpd.conf"; 
    $base = new Apache::ParseLog($conf);

    $localserver = $base->servername();

    $log = $base->getCustomLog("combined");
    %referer = $log->referer();
    @sortedkeys = sortHashByValue(%referer);

    print "External Referers Report\n";
    foreach (@sortedkeys) {
        # \Q...\E keeps the dots in the server name from acting as wildcards
        print "$_:\t$referer{$_}\n" unless m/\Q$localserver\E/i or m/^\-/;
    }

    exit;

Example 3: Access-Controlled User Report

Let's suppose that you have a directory tree on your site that is access-controlled by .htaccess or the like, and you want to check how frequently the section is used by the users.

    #!/usr/local/bin/perl
    $|++;
    use Apache::ParseLog;

    $conf = "/usr/local/httpd/conf/httpd.conf";
    $base = new Apache::ParseLog($conf);

    $log = $base->getCustomLog("common");
    %user = $log->user();

    print "Users Report\n";
    foreach (sort keys %user) {
        print "$_:\t$user{$_}\n" unless m/^-$/;
    }

    exit;

SEE ALSO

perl(1), perlop(1), perlre(1), Exporter(3)

BUGS

The reports on lesser-known browsers returned from the AgentLog methods are not always informative.

The data returned from the referer() method for RefererLog may be irrelevant if the referred files are not accessed via HTTP (i.e., the referer does not start with the "http://" string).

If the base object is created with the $virtualhost specified, unless the ServerAdmin and ServerName are specified within the <VirtualHost xxx> ... </VirtualHost>, those values specified in the global section of the httpd.conf are not shared with the $virtualhost.

TO DO

Increase the performance (speed).

VERSION

Apache::ParseLog 1.01 (10/01/1998).

AUTHOR

Apache::ParseLog was written and is maintained by Akira Hangai (akira@discover-net.net)

For the bug reports, comments, suggestions, etc., please email me.

COPYRIGHT

Copyright 1998, Akira Hangai. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

DISCLAIMER

This package is distributed in the hope that it will be useful for many web administrators/webmasters who are too busy to write their own programs to analyze the Apache log files. However, this package is so distributed WITHOUT ANY WARRANTY in that any use of the data generated by this package must be used at the user's own discretion, and the author shall not be held accountable for any results from the use of this package.
