
Regexp::Log::Common - A regular expression parser for the Common Log Format

    my $foo = Regexp::Log::Common->new(
        format  => 'custom %date %request',
        capture => [qw( ts request )],
    );

    # the format() and capture() methods can be used to set or get
    $foo->format('custom %date %request %status %bytes');
    $foo->capture(qw( ts req ));

    # this is necessary to know in which order
    # we will receive the captured fields from the regexp
    my @fields = $foo->capture;

    # the all-powerful capturing regexp :-)
    my $re = $foo->regexp;

    while (<>) {
        my %data;
        @data{@fields} = /$re/;    # no need for /o, it's a compiled regexp

        # now munge the fields
        ...
    }

Regexp::Log::Common uses Regexp::Log as a base class, to generate regular expressions for performing the usual data munging tasks on log files that cannot be simply split().
This module computes regular expressions for parsing log files written in the Common Log Format. One example of this format is the log output generated by the httpd web server using the keyword 'common'.
The module also allows for the use of the Extended Common Log Format.
For more information on how to use this module, please see Regexp::Log.
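
To illustrate why such log lines "cannot be simply split()", the short sketch below splits a Common Log Format line on single spaces. It uses only core Perl and does not touch this module; the variable names are illustrative.

```perl
use strict;
use warnings;

my $line = '127.0.0.1 - - [19/Jan/2005:21:47:11 +0000] "GET /brum.css HTTP/1.1" 304 0';

# The format has 7 logical fields, but a naive split on spaces
# also cuts inside the bracketed date and the quoted request.
my @tokens = split / /, $line;

printf "%d tokens instead of 7 fields\n", scalar @tokens;
print "the date was cut into '$tokens[3]' and '$tokens[4]'\n";
```

Because the date and the request contain spaces of their own, a regular expression that understands the brackets and quotes is the more reliable tool.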

Enables simple parsing of log files created using the Common Log Format or the Extended Common Log Format, such as the logs generated by the httpd/Apache web server using the keyword 'common'.

    my $foo = Regexp::Log::Common->new( format => ':common' );

The Common Log Format is made up of several fields, each delimited by a single space.

Fields:

    remotehost rfc931 authuser [date] "request" status bytes

Example:

    127.0.0.1 - - [19/Jan/2005:21:47:11 +0000] "GET /brum.css HTTP/1.1" 304 0

For the above example:

    remotehost: 127.0.0.1
    rfc931: -
    authuser: -
    [date]: [19/Jan/2005:21:47:11 +0000]
    "request": "GET /brum.css HTTP/1.1"
    status: 304
    bytes: 0

Available capture fields:

* host
* rfc
* authuser
* date
** ts (date without the [])
* request
** req (request without the quotes)
* status
* bytes
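
The relationship between the outer fields (date, request) and their inner variants (ts, req) can be sketched with a hand-written pattern. This is only an illustration in core Perl: the pattern below is an assumption and is not the regexp that Regexp::Log::Common generates.

```perl
use strict;
use warnings;

# Hand-written pattern for illustration only; it is NOT the regexp
# produced by Regexp::Log::Common. Nested groups give both the
# delimited field (date, request) and its bare content (ts, req).
my $re = qr/^(\S+) (\S+) (\S+) (\[([^\]]+)\]) ("([^"]*)") (\d+) (\d+|-)$/;

# Field names in the order in which their opening parentheses appear
my @fields = qw( host rfc authuser date ts request req status bytes );

my $line = '127.0.0.1 - - [19/Jan/2005:21:47:11 +0000] "GET /brum.css HTTP/1.1" 304 0';

my %data;
@data{@fields} = $line =~ $re;

print "$_: $data{$_}\n" for @fields;
```

The hash slice assignment mirrors the idiom used in the synopsis, where the field order comes from the capture() method rather than a hard-coded list.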
    my $foo = Regexp::Log::Common->new( format => ':extended' );

The Extended Common Log Format is made up of several fields, each delimited by a single space.

Fields:

    remotehost rfc931 authuser [date] "request" status bytes "referer" "user_agent"

Example:

    127.0.0.1 - - [19/Jan/2005:21:47:11 +0000] "GET /brum.css HTTP/1.1" 304 0 "http://birmingham.pm.org/" "Mozilla/2.0GoldB1 (Win95; I)"

For the above example:

    remotehost: 127.0.0.1
    rfc931: -
    authuser: -
    [date]: [19/Jan/2005:21:47:11 +0000]
    "request": "GET /brum.css HTTP/1.1"
    status: 304
    bytes: 0
    "referer": "http://birmingham.pm.org/"
    "user_agent": "Mozilla/2.0GoldB1 (Win95; I)"

Available capture fields:

* host
* rfc
* authuser
* date
** ts (date without the [])
* request
** req (request without the quotes)
* status
* bytes
* referer
** ref (referer without the quotes)
* useragent
** ua (useragent without the quotes)
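
The extended format is the common format followed by two quoted fields. As a rough sketch of that layering, the example below composes two hand-written qr// fragments using named captures (Perl 5.10 or later). The capture names follow the fields listed above, but the patterns themselves are assumptions and not the module's own.

```perl
use strict;
use warnings;
use 5.010;    # named captures and %+

# Hand-written fragments for illustration; not the module's patterns.
my $common = qr/(?<host>\S+) (?<rfc>\S+) (?<authuser>\S+) \[(?<ts>[^\]]+)\] "(?<req>[^"]*)" (?<status>\d+) (?<bytes>\d+|-)/;

# Extended = common fields + quoted referer + quoted user agent
my $extended = qr/^$common "(?<ref>[^"]*)" "(?<ua>[^"]*)"$/;

my $line = '127.0.0.1 - - [19/Jan/2005:21:47:11 +0000] "GET /brum.css HTTP/1.1" 304 0 '
         . '"http://birmingham.pm.org/" "Mozilla/2.0GoldB1 (Win95; I)"';

# Copy the named captures out of %+ before any later match resets them
my %data;
%data = %+ if $line =~ $extended;

print "$_: $data{$_}\n" for qw( host ts req status bytes ref ua );
```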

There are no known bugs at the time of this release. However, if you spot a bug or are experiencing difficulties that are not explained within the POD documentation, please submit a bug to the RT system (see link below). It would help greatly if you are able to pinpoint problems or even supply a patch.
Fixes are dependent upon their severity and my availability. Should a fix not be forthcoming, please feel free to (politely) remind me by sending an email to barbie@cpan.org.
RT: http://rt.cpan.org/Public/Dist/Display.html?Name=Regexp-Log-Common

Regexp::Log

Thanks to BooK for initially putting the idea into my head, and to the thread on a Perl message board that asked for help with the very problem this module solves.

b - beta
d - Developer
p - Perl only
O - Object oriented
p - Standard Perl

Barbie <barbie@cpan.org> for Miss Barbell Productions, http://www.missbarbell.co.uk

Copyright © 2005-2007 Barbie for Miss Barbell Productions. This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, using the Artistic License.
The full text of the licenses can be found in the Artistic file included with this distribution, or in the perlartistic file that is part of the Perl installation, in the 5.8.1 release or later.