Apache::Traffic - Tracks hits and bytes transferred on a per-user basis
    # Place this in your Apache's httpd.conf file
    PerlLogHandler Apache::Traffic
This module tracks the total number of hits and bytes transferred per day by the Apache web server, on a per-user basis. This allows for real-time statistics without having to parse the log files.
After installation, add this to your Apache's httpd.conf file and restart the server:
PerlLogHandler Apache::Traffic
The statistics are then available through the 'traffic' script, which is included in this distribution. See the section VIEWING STATISTICS for more details.
You need to have compiled mod_perl with the LogHandler hook in order to use this module. Additionally, the following modules are required:
o IPC::Shareable
o IPC::SysV
o DB_File
o Date::Parse
Your OS must also support SysV IPC (shared memory and semaphores). If this is not the case, this module will be useless to you.
To install this module, move into the directory where this file is located and type the following:
    perl Makefile.PL
    make
    make test
    make install
This will install the module into the Perl library directory.
Once installed, you will need to modify your web server's configuration file so it knows to use Apache::Traffic during the logging phase:

    PerlLogHandler Apache::Traffic

Restart your web server.
As of this writing, there is a problem with IPC::Shareable which will cause segmentation faults in httpd processes if Apache::Traffic is run long enough (at least this is the case under Linux). This distribution contains a patch named 'share.patch', which will fix the problem.
If Apache::Traffic does not appear to work correctly (look in your server's error_log for problems), make sure the semaphore and shared memory segments are not already allocated for another purpose. If this is the case, you can change the constants SHMKEY, SEMKEY, and DBPATH at the top of the Apache::Traffic module, and reinstall.
Each time a request is served, the Apache::Traffic log handler is called, which increments the byte and hit totals for the owner of the resource.
The owner of the resource is determined in the following way:
o If the Perl variable Owner has been set for the directory, its value is used. For example:

      <Directory /home/root/www/mark>
      PerlSetVar Owner mark
      </Directory>

  This would declare user mark as the owner of everything under the specified directory. The value can be either the username or UID of the user. This value can also be a fake user (i.e. a username which is not present in the passwd file). In this case, the username is stored (rather than the UID).

o If the request is to a virtual host, the owner of the document root is used.

o If neither of the above methods work, the owner of the file is used.
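The lookup order above can be sketched in plain Perl. This is only an illustration of the precedence, not the module's actual code: resolve_owner() and its named arguments (owner_var, is_virtual, docroot_uid, file_uid) are hypothetical, and the caller is assumed to have already gathered those values from the request.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Apache::Traffic. It only mirrors the
# precedence described above: Owner variable, then virtual host document
# root owner, then file owner.
sub resolve_owner {
    my (%arg) = @_;

    # 1. An explicit "PerlSetVar Owner ..." value wins.
    if (defined $arg{owner_var} && length $arg{owner_var}) {
        my $value = $arg{owner_var};
        return $value if $value =~ /^\d+$/;      # already a UID
        my $uid = getpwnam($value);              # scalar context returns the UID
        return defined $uid ? $uid : $value;     # fake user: keep the name
    }

    # 2. For a virtual host, use the owner of the document root.
    return $arg{docroot_uid}
        if $arg{is_virtual} && defined $arg{docroot_uid};

    # 3. Otherwise fall back to the owner of the file itself.
    return $arg{file_uid};
}

print resolve_owner(is_virtual => 1, docroot_uid => 500, file_uid => 600), "\n";
```

Returning the bare name for an unknown user matches the description of fake users above, where the username is stored instead of a UID.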
The hit and byte total information is stored in shared memory to minimize processing. On the first request of each day, all previous data in shared memory is automatically moved to permanent storage. This means that no more than one day's worth of information is ever stored in shared memory, and prevents performance degradation as data accumulates. This separation of data is transparent from the end-user perspective.
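The daily rollover can be pictured with a small, self-contained sketch. It is a simplification under stated assumptions: ordinary hashes stand in for the shared memory segment (%today) and the DBM file (%archive), no locking is shown, and the names record_hit() and day_start() are hypothetical rather than taken from the module.

```perl
use strict;
use warnings;

my %today;        # stand-in for shared memory: owner => { hits, bytes }
my %archive;      # stand-in for the DBM file:  day => { owner => { ... } }
my $current_day;  # day boundary (epoch seconds) that %today belongs to

# Normalize an epoch timestamp to the start of its UTC day.
sub day_start {
    my ($time) = @_;
    return $time - ($time % 86400);
}

# Record one request. On the first request of a new day, move the
# previous day's totals to the archive before counting.
sub record_hit {
    my ($owner, $bytes, $now) = @_;
    my $day = day_start($now);

    if (defined $current_day && $day != $current_day) {
        $archive{$current_day} = { %today };
        %today = ();
    }
    $current_day = $day;

    $today{$owner}{hits}++;
    $today{$owner}{bytes} += $bytes;
}

record_hit('mark', 100, 0);
record_hit('mark', 50,  10);
record_hit('mark', 7,   86400);   # first request of the next day
```

After the third call, %archive holds the first day's totals and %today holds only the new day's single hit, which is the "no more than one day in memory" property described above.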
If you would rather not have the data moved into the dbm file, you can set USE_DBM to 0 at the top of the Traffic.pm module and reinstall.
Shared memory segments are not preserved through reboots. If you reboot your machine multiple times a day, Apache::Traffic will be of questionable value to you. I run Linux, so of course, I only reboot when I've upgraded the OS. ;-) This area may be improved in the future (at least for orderly shutdowns).
A script named 'traffic' is included in this distribution, which allows you to view the totals for a given user. Note that this script will not run properly until Apache::Traffic has recorded at least one page request.
The basic syntax for the script is:
traffic [options] [username]
If username is not specified, the effective UID of the person running the script is used. By default, only data for the current day is displayed.
The following options are supported:
-start=starting_date
    Specifies the starting date that you wish to see data for. The date specifications can take any format supported by the Date::Parse module. If -end is not specified, all data between -start and the current day is displayed.

-end=ending_date
    Specifies the ending date that you wish to see data for.

-days=num_days
    Specifies the number of days you want to see information for relative to the value of -start (or the current day if -start is not specified). The value can be either positive or negative.

-user=username
    Specifies the user you want to see data for. Multiple -user specifications are allowed. The users can also be specified as non-option arguments. Both UIDs and usernames are allowed.

-all
    Displays all data present within the given time period.

-reverse
    If present, the information is sorted in descending order based on date.

-units=unit
    Specifies the unit to display transfer totals in. Acceptable values are 'Bytes', 'Kilobytes', 'Megabytes', or 'Gigabytes'. Only the first character of the unit need be specified. The default is Bytes.

-summary
    If -summary is present, aggregate totals for the period being viewed are displayed, rather than daily totals.

-n
    If the -n option is present, the report displays UIDs rather than converting them to usernames. In the case of a "fake" user, the username will still be displayed (which is a way to tell if a user is fake or not).

-remove
    If the -remove option is present, all data within the specified time period is permanently removed. Only root is allowed to perform this operation (see the SECURITY NOTES section though). The operation must be confirmed prior to being carried out.
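As an illustration of how a -units value might be interpreted, the sketch below converts a byte total using only the first character of the unit name. The function to_units() is hypothetical, and the 1024-based divisors are an assumption; the documentation above does not say whether the script uses powers of 1024 or of 1000.

```perl
use strict;
use warnings;

# ASSUMPTION: binary multiples (1 Kilobyte = 1024 Bytes). The traffic
# script may define these differently.
my %DIVISOR = (
    B => 1,
    K => 1024,
    M => 1024 ** 2,
    G => 1024 ** 3,
);

# Hypothetical helper: only the first character of the unit matters,
# so 'k', 'K', and 'Kilobytes' are all treated alike.
sub to_units {
    my ($bytes, $unit) = @_;
    $unit = 'Bytes' unless defined $unit;       # default unit
    my $key = uc substr($unit, 0, 1);
    die "unknown unit '$unit'\n" unless exists $DIVISOR{$key};
    return $bytes / $DIVISOR{$key};
}

printf "%.2f\n", to_units(3_145_728, 'Megabytes');
```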
If the supplied traffic script is not sufficient for your needs, you may access the raw data directly. The following functions are available for import into your scripts.
fetch([START], [END], [WANTUID], [ALL], [USER LIST])
    This function retrieves all data between START and END times, inclusive, for the users specified in USER LIST. Both START and END should be UTC timestamps. The function automatically normalizes the timestamps to be on day boundaries. If WANTUID is true, usernames are not looked up. If ALL is true, data for all users is returned and USER LIST is ignored.

    On success, the function returns a complex hash reference, which contains the requested data:

        use Apache::Traffic qw( fetch remove error );

        $ref = fetch(time, time, 0, 0, 'maurice');
        # Iterate over the keys only; a bare %$ref would also yield the values
        foreach $day (keys %$ref) {
            foreach $user (keys %{ $ref->{$day} }) {
                print scalar(gmtime($day)), " $user\n";
                print "  BYTES: $ref->{$day}{$user}{bytes}\n";
                print "  HITS:  $ref->{$day}{$user}{hits}\n\n";
            }
        }

    Note that the timestamps are stored internally in GM time, although START and END should be in local time. We do this so we don't have to worry about daylight savings.

    The function returns undef on error, in which case you can call the error() function to determine what went wrong.

remove([START], [END])
    This function removes all data between the START and END times, inclusive. The function returns true on success and undef on error, in which case you can call the error() function to determine what went wrong.

error()
    Returns a string describing the last error condition encountered.
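To show how the returned structure can be post-processed, here is a small sketch that collapses daily totals into one total per user, similar in spirit to what -summary reports. It assumes the day => user => { bytes, hits } shape used in the fetch() example above; summarize() and the sample data are hypothetical, and a hand-built hash stands in for a real fetch() result so the snippet runs without the module installed.

```perl
use strict;
use warnings;

# Hypothetical helper: sum bytes and hits per user across all days.
sub summarize {
    my ($ref) = @_;
    my %total;
    for my $day (keys %$ref) {
        for my $user (keys %{ $ref->{$day} }) {
            $total{$user}{bytes} += $ref->{$day}{$user}{bytes};
            $total{$user}{hits}  += $ref->{$day}{$user}{hits};
        }
    }
    return \%total;
}

# Made-up sample data in the same shape as a fetch() result.
my $sample = {
    0     => { maurice => { bytes => 1000, hits => 3 } },
    86400 => {
        maurice => { bytes => 500, hits => 2 },
        mark    => { bytes => 40,  hits => 1 },
    },
};

my $sum = summarize($sample);
for my $user (sort keys %$sum) {
    printf "%-10s %8d bytes %5d hits\n",
        $user, $sum->{$user}{bytes}, $sum->{$user}{hits};
}
```

With real data, the result of fetch() would be passed to summarize() in place of $sample, after checking for undef and calling error() if needed.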
By default, the shared memory segments, semaphores, and DBM file are created with permissions of 0644. However, these resources must be owned by whatever user the server runs as (normally user 'nobody'). This means that your users could create CGI scripts to play with the data. For this reason, the information maintained by Apache::Traffic should not be relied upon for auditing purposes, and is intended mainly for use in friendly environments.
Copyright (C) 1997, Maurice Aubrey <maurice@hevanet.com>. All rights reserved.
This module is free software; you may redistribute it and/or modify it under the same terms as Perl itself.
perl(1), mod_perl(3)