NAME

Net::Inspect::L7::HTTP - guesses and handles HTTP traffic

SYNOPSIS

 my $req = Net::Inspect::L7::HTTP::Request::Simple->new(..);
 my $http = Net::Inspect::L7::HTTP->new($req);
 my $guess = Net::Inspect::L5::GuessProtocol->new;
 $guess->attach($http);
 ...

DESCRIPTION

This class extracts HTTP requests from TCP connections. It provides all hooks required for Net::Inspect::L4::TCP and is usually used together with it. It provides the guess_protocol hook so it can be used with Net::Inspect::L5::GuessProtocol.

The attached flow is usually a Net::Inspect::L7::HTTP::Request::* object.

Hooks provided:

guess_protocol($guess,$dir,$data,$eof,$time,$meta)
new_connection($meta) - this returns an object for the connection
$connection->in($dir,$data,$eof,$time)

Processes new data and returns number of bytes processed.

$data is the data as a string. In some cases $data can be [ 'gap' => $len ], i.e. only the information that there would be $len bytes of data, without the data itself being submitted. Gaps should only be submitted inside request and response bodies, and only if the attached layer can handle them in its in_request_body and in_response_body methods.

Gaps in other places are not allowed, because all other data are needed to determine where requests, responses and data are located inside the connection.
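
Because $data can be either a plain string or a gap marker, code that accounts for bytes has to handle both forms. The following is a minimal sketch of such a check; data_length is a hypothetical helper and not part of Net::Inspect.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Net::Inspect): number of bytes
# represented by $data, whether it is a plain string or a gap marker
# of the form [ 'gap' => $len ].
sub data_length {
    my ($data) = @_;
    return $data->[1] if ref($data) eq 'ARRAY' && $data->[0] eq 'gap';
    return length($data);
}

print data_length("GET / HTTP/1.1\r\n"), "\n";   # plain string
print data_length([ gap => 4096 ]), "\n";        # gap marker
```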

$connection->fatal($reason,$dir,$time)

Hooks called:

new_request(\%meta,$conn)

This should return a request object. The reference to the connection object is given in case the request object needs to call fatal to end the connection.

The function should not keep a strong reference to $conn. It should store at most a weak reference, otherwise memory might leak.
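
One way to follow this advice is Scalar::Util::weaken. The sketch below is an illustration only: My::Request is a hypothetical class, and a plain hash reference stands in for the real connection object.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

{
    # Hypothetical request class, for illustration only.
    package My::Request;

    sub new {
        my ($class, %args) = @_;
        my $self = { %args };
        return bless $self, $class;
    }

    # Hook called for each new request on a connection.
    sub new_request {
        my ($self, $meta, $conn) = @_;
        my $req = ref($self)->new(meta => $meta, conn => $conn);
        # Keep only a weak reference, so the request object does
        # not keep the connection alive.
        Scalar::Util::weaken($req->{conn});
        return $req;
    }
}

my $factory = My::Request->new;
my $conn    = {};    # stand-in for the real connection object
my $req     = $factory->new_request({}, $conn);
print "connection reference is weak\n" if isweak($req->{conn});
```

Since the reference is weak, $req->{conn} becomes undef once the connection is destroyed, so it should be checked before calling fatal on it.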

$request->in_request_header($header,$time,\%hdr_meta)

Called when the full request header is read. $header is the string of the header.

%hdr_meta contains information extracted from the header:

method - method of request
url - url, as given in request
version - version of HTTP spoken in request
fields - (key => \@values) hash of header fields
junk - invalid data found in header fields part
content_length - length of request body
chunked - true if body uses transfer encoding chunked
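
As an illustration of how these keys might be used, the following sketch logs one line per request. My::HeaderLogger is a hypothetical class and the %hdr_meta values passed in at the end are hand-made sample data, not output of the module. The lookup is case-insensitive because this document does not state how field names are normalized.

```perl
use strict;
use warnings;

{
    # Hypothetical hook class, for illustration only.
    package My::HeaderLogger;

    sub new {
        my $class = shift;
        my $self  = { log => [] };
        return bless $self, $class;
    }

    sub in_request_header {
        my ($self, $header, $time, $meta) = @_;
        # fields maps each field name to an array ref of values
        my ($host_key) = grep { lc($_) eq 'host' } keys %{ $meta->{fields} };
        my $host = $host_key ? $meta->{fields}{$host_key}[0] : '-';
        my $body = $meta->{chunked}                ? 'chunked'
                 : defined $meta->{content_length} ? "$meta->{content_length} bytes"
                 :                                   'unknown';
        push @{ $self->{log} },
            "$meta->{method} $meta->{url} host=$host body=$body";
        return;
    }
}

# Hand-made sample call; normally the HTTP layer calls the hook.
my $logger = My::HeaderLogger->new;
$logger->in_request_header(
    "POST /upload HTTP/1.1\r\nHost: example.com\r\nContent-Length: 11\r\n\r\n",
    time(),
    {
        method  => 'POST',
        url     => '/upload',
        version => '1.1',
        fields  => { host => ['example.com'], 'content-length' => [11] },
        junk    => '',
        content_length => 11,
        chunked        => 0,
    },
);
print "$_\n" for @{ $logger->{log} };
```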

$request->in_response_header($header,$time,\%hdr_meta)

Called when the full response header is read. $header is the string of the header.

%hdr_meta contains information extracted from the header:

version - version of HTTP spoken in response
code - status code from response
reason - reason given for response code
fields - (key => \@values) hash of header fields
junk - invalid data found in header fields part
content_length - length of response body if known, else undef
chunked - true if body uses transfer encoding chunked

$request->in_request_body($data,$eobody,$time)

Called for a chunk of data of the request body. $eobody is true if this is the last chunk of the request body. If the request body is empty, the method is called once with ''. If no body exists because of CONNECT or HTTP Upgrade, in_data is called instead of in_request_body.

$data can be [ 'gap' => $len ] if the input to this layer were gaps.

$request->in_response_body($data,$eobody,$time)

Called for a chunk of data of the response body. $eobody is true if this is the last chunk of the response body. If the response body is empty, the method is called once with ''. If no body exists because of CONNECT or HTTP Upgrade, in_data is called instead of in_response_body.

$data can be [ 'gap' => $len ] if the input to this layer were gaps.
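
A body hook therefore has to expect both plain strings and gap markers, and it should treat $eobody as the end-of-body signal. The sketch below counts response body bytes; My::BodyCounter is a hypothetical class and the calls at the end simulate what the HTTP layer would do.

```perl
use strict;
use warnings;

{
    # Hypothetical hook class, for illustration only.
    package My::BodyCounter;

    sub new {
        my $class = shift;
        my $self  = { bytes => 0, gaps => 0, done => 0 };
        return bless $self, $class;
    }

    sub in_response_body {
        my ($self, $data, $eobody, $time) = @_;
        if (ref $data) {
            # [ 'gap' => $len ]: bytes known to exist but not delivered
            $self->{bytes} += $data->[1];
            $self->{gaps}  += $data->[1];
        } else {
            $self->{bytes} += length $data;
        }
        $self->{done} = 1 if $eobody;
        return;
    }
}

# Simulated calls: two data chunks with a gap in between.
my $counter = My::BodyCounter->new;
$counter->in_response_body('hello ',        0, 0);
$counter->in_response_body([ gap => 100 ],  0, 0);
$counter->in_response_body('world',         1, 0);
printf "%d bytes (%d in gaps), complete=%d\n",
    @{$counter}{qw(bytes gaps done)};
```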

$request->in_chunk_header($dir,$header,$time)

Called with the chunk header for chunked encoding, before the chunk data. Usually one is only interested in the content and not in the chunk framing, so this method is typically left empty.

$request->in_chunk_trailer($dir,$trailer,$time)

Called with the chunk trailer for chunked encoding, after in_response_body/in_request_body was called with $eobody true. Usually one is only interested in the content and not in the chunk framing, so this method is typically left empty.
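
If the chunk framing is of interest after all, the chunk size can be read from the chunk header. The sketch below assumes that $header holds the chunk-size line as it appears on the wire (hexadecimal size, optional extensions, line end), which this document does not spell out. chunk_size is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: extract the chunk size from a chunk header.
# Assumes the wire format "<hex-size>[;extensions]\r\n", possibly
# preceded by the line end of the previous chunk.
sub chunk_size {
    my ($header) = @_;
    return $header =~ /^\s*([0-9a-fA-F]+)/ ? hex($1) : undef;
}

for my $header ("1a\r\n", "\r\n400;name=val\r\n", "0\r\n") {
    my $size = chunk_size($header);
    print defined $size ? "chunk of $size bytes\n" : "no size found\n";
}
```

A size of 0 marks the last chunk, after which the trailer follows.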

$request->in_data($dir,$data,$eof,$time)

Called for any data after a successful CONNECT or Upgrade (e.g. WebSockets). $dir is 0 for data from client, 1 for data from server.

$request->in_junk($dir,$data,$eof,$time)

Will be called for legally ignored junk (empty lines) in front of request or response body. $dir is 0 for data from client, 1 for data from server.

$request->fatal($reason,$dir,$time)

Called on fatal errors, mostly protocol irregularities.

Methods suitable for overwriting:

new_request(\%meta)

The default implementation just calls new_request on the attached flow.

Helpful methods

$connection->dump_state

Collects the state of the open connections. If called in non-void context (defined wantarray), it returns the message, otherwise it outputs the message via xdebug.

$connection->offset($dir)

Returns the current offset in the data stream, i.e. the position just behind the data already forwarded through the in_* methods.

$connection->open_requests(@index)

In array context returns the objects for the open requests, in scalar context the number of open requests. If @index is given, only the specified objects are returned: index -1 is the object currently receiving response data, while index 0 is the object currently receiving request data (both are the same unless pipelining is used).

LIMITS

100 Continue, 101 Upgrade are not yet implemented.