NAME

HTTP::Caching - The RFC 7234 compliant brains to do caching right

VERSION

Version 0.13

SYNOPSIS

    my $chi_cache = CHI->new(
        driver          => 'File',
        root_dir        => '/tmp/HTTP_Caching',
        file_extension  => '.cache',
        l1_cache        => {
            driver          => 'Memory',
            global          => 1,
            max_size        => 1024*1024
        }
    );
    
    my $ua = LWP::UserAgent->new();
    
    my $http_caching = HTTP::Caching->new(
        cache         => $chi_cache,
        cache_type    => 'private',
        forwarder     => sub { return $ua->request(shift) }
    );
    
    my $rqst = HTTP::Request->new( GET => 'http://example.com' );
    
    my $resp = $http_caching->make_request( $rqst );
    

DEPRECATION WARNING !!!

This module is going to be completely redesigned!!!

It was planned to be only the brains, but unfortunately it has become an implementation.

The future version will answer two questions:

may_store
may_reuse

Those are currently implemented as private methods.

Please contact the author if you rely on this module directly, to prevent breakage.

Sorry for any inconvenience.

ADVICE

Please use LWP::UserAgent::Caching or LWP::UserAgent::Caching::Simple.

NOTE

You can suppress the message by setting the environment variable HTTP_CACHING_DEPRECATION_WARNING_HIDE.

DESCRIPTION

This module tries to provide caching for HTTP responses based on RFC 7234 Hypertext Transfer Protocol (HTTP/1.1): Caching.

Basically, it goes through the following steps:

  • For a presented request, it will check with the cache whether a suitable response is available AND whether it can be served or needs to be revalidated with an upstream server.

  • If there was no response available at all, or none were suitable, the (modified) request will simply be forwarded.

  • Depending on the status code of the response it gets back, it will do one of the following:

    200 OK

    it will update the cache and serve the response as is

    304 Not Modified

    the cached version is valid, update the cache with new header info and serve the cached response

    500 Server Error

    in general, this is an error that is passed on to the caller; however, in some cases it could be fine to serve a (stale) cached response

The above is an over-simplified version of the RFC.
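
The status-code handling described above can be sketched roughly as follows. This is a simplified illustration and not the module's actual implementation: it uses plain hash references instead of HTTP::Response objects, and the function name handle_upstream_response and the stale_ok option are assumptions made for this example only.

```perl
use strict;
use warnings;

# Hypothetical helper: decide what to serve after forwarding a request.
# $cache is a plain hash ref; responses are hash refs with 'code',
# 'headers' and 'body' keys (stand-ins for HTTP::Response objects).
sub handle_upstream_response {
    my ( $cache, $key, $upstream, %opts ) = @_;
    my $stored = $cache->{$key};
    my $code   = $upstream->{code};

    if ( $code == 200 ) {
        # new response: update the cache and serve it as is
        $cache->{$key} = $upstream;
        return $upstream;
    }

    if ( $code == 304 && $stored ) {
        # cached version is still valid: refresh its headers and serve it
        my %merged = ( %{ $stored->{headers} }, %{ $upstream->{headers} } );
        $stored->{headers} = \%merged;
        return $stored;
    }

    if ( $code >= 500 && $stored && $opts{stale_ok} ) {
        # upstream error: optionally fall back to a (stale) stored response
        return $stored;
    }

    # anything else is passed on to the caller unchanged
    return $upstream;
}

my %cache;
my $first = handle_upstream_response( \%cache, 'http://example.com/',
    { code => 200, headers => { ETag => '"v1"' }, body => 'hello' } );

my $second = handle_upstream_response( \%cache, 'http://example.com/',
    { code => 304, headers => { Date => 'Tue, 01 Jan 2030 00:00:00 GMT' } } );

print "$second->{code} $second->{body}\n";    # prints "200 hello"
```

Whether a stale response may really be served on a 5XX depends on the cache-control directives involved; the stale_ok flag only stands in for that decision.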

CONSTRUCTORS

new

    my $http_caching = HTTP::Caching->new(
        cache                 => $chi_cache,
        cache_type            => 'private',
        cache_control_request => 'max-age=86400, min-fresh=60',
        forwarder             => sub { return $ua->request(shift) }
    );

Constructs a new HTTP::Caching object that knows how to find cached responses and will forward if needed.

ATTRIBUTES

cache

The cache must be an object that implements two methods:

sub set ($key, $data)

to store data in the cache

sub get ($key)

to retrieve the data stored under the key

This can be as simple as a hash, like we use in the tests:

    use Test::MockObject;
    
    my %cache;
    my $mocked_cache = Test::MockObject->new;
    $mocked_cache->mock( set => sub { $cache{$_[1]} = $_[2] } );
    $mocked_cache->mock( get => sub { return $cache{$_[1]} } );

It is very convenient, however, to use CHI, which implements both required methods and also has the option to use an L1 cache to speed things up even more. See the SYNOPSIS for an example.
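
If neither Test::MockObject nor CHI is at hand, any small class with these two methods should do. The following is a minimal sketch; the package name My::HashCache is made up for this example and nothing in it is specific to HTTP::Caching.

```perl
use strict;
use warnings;

# Hypothetical in-memory cache: anything with set() and get() will do.
package My::HashCache;

sub new {
    my ($class) = @_;
    return bless { store => {} }, $class;
}

# store data under a key
sub set {
    my ( $self, $key, $data ) = @_;
    $self->{store}{$key} = $data;
    return $data;
}

# retrieve the data stored under a key (undef when there is none)
sub get {
    my ( $self, $key ) = @_;
    return $self->{store}{$key};
}

package main;

my $cache = My::HashCache->new;
$cache->set( 'answer' => 42 );
print $cache->get('answer'), "\n";    # prints 42
```

Such an object lives only as long as the process does, so it is mostly useful for tests and short-lived scripts.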

cache_type

This must be either 'private' or 'public'. For most LWP::UserAgent instances, it can be 'private', as the cache will probably not be shared with other processes on the same machine. If this module is being used on the server side in a Plack::Middleware, then the cache will be used by all clients connecting to the server, and thus should be set to 'public'.

Responses to authenticated requests should not be held in public caches, and neither should responses that specifically have their Cache-Control header-field set to 'private'.
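
The two rules above can be sketched as a simple check. This is only an illustration under simplifying assumptions: headers are plain hash refs with lower-cased field names, the function name allowed_in_cache is hypothetical, and the exceptions that RFC 7234 Section 3.2 allows for authenticated requests are deliberately ignored.

```perl
use strict;
use warnings;

# Hypothetical check, mirroring the two rules above for a public cache.
sub allowed_in_cache {
    my ( $cache_type, $rqst_headers, $resp_headers ) = @_;

    # a private cache is not restricted by these two rules
    return 1 if $cache_type eq 'private';

    # responses to authenticated requests stay out of public caches
    return 0 if exists $rqst_headers->{authorization};

    # and so do responses explicitly marked 'private'
    my $cache_control = $resp_headers->{'cache-control'} // '';
    my @directives = map { lc $_ } split /\s*,\s*/, $cache_control;
    return 0 if grep { /^private(?:=|$)/ } @directives;

    return 1;
}

print allowed_in_cache( 'public', {}, { 'cache-control' => 'max-age=60' } )
    ? "store\n"
    : "skip\n";    # prints "store"
```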

cache_control_request

A string that contains the Cache-Control header-field settings that will be sent by default with each request, so you do not have to set them each time. See RFC 7234 Section 5.2.1 for the list of available cache-control directives.
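
To make the format of such a string concrete, here is a small sketch that splits it into its directives. The function parse_cache_control is hypothetical and not part of this module; it also does not handle quoted values that contain commas.

```perl
use strict;
use warnings;

# Hypothetical parser: turn a Cache-Control string into a hash of directives.
# Directives without an argument (like 'no-cache') get the value 1.
sub parse_cache_control {
    my ($field_value) = @_;
    my %directives;
    for my $part ( split /\s*,\s*/, $field_value ) {
        $part =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        next unless length $part;
        my ( $name, $value ) = split /=/, $part, 2;
        $value =~ s/^"(.*)"$/$1/ if defined $value;    # strip optional quotes
        $directives{ lc $name } = defined $value ? $value : 1;
    }
    return \%directives;
}

my $defaults = parse_cache_control('max-age=86400, min-fresh=60');
print "$_ => $defaults->{$_}\n" for sort keys %$defaults;
# prints:
#   max-age => 86400
#   min-fresh => 60
```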

cache_control_response

Like the above, but those will be set for each response. This is useful for server side caching. See RFC 7234 Section 5.2.2.

forwarder

This CodeRef must be a callback function that accepts a HTTP::Request and returns a HTTP::Response. Since this module does not know how to do a request itself, it will use the forwarder. It will be used to send off validation requests with If-None-Match and/or If-Modified-Since header-fields. If there is no stored response, it will send the original full request (with the extra directives from cache_control_request).

Failing to return a HTTP::Response might cause the module to die or generate a response itself with status code 502 Bad Gateway.
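
As an illustration of what a validation request carries, the sketch below derives the conditional header-fields from the validators of a stored response. It works on plain hash refs and the helper name validation_headers is an assumption for this example; how the module builds these requests internally may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: derive the conditional header-fields for a validation
# request from the validators of a stored response.
sub validation_headers {
    my ($stored_headers) = @_;
    my %conditional;

    # an entity tag is sent back in If-None-Match
    $conditional{'If-None-Match'} = $stored_headers->{'ETag'}
        if defined $stored_headers->{'ETag'};

    # a modification date is sent back in If-Modified-Since
    $conditional{'If-Modified-Since'} = $stored_headers->{'Last-Modified'}
        if defined $stored_headers->{'Last-Modified'};

    return \%conditional;
}

my $conditional = validation_headers(
    {
        'ETag'          => '"abc123"',
        'Last-Modified' => 'Tue, 15 Nov 1994 12:45:26 GMT',
    }
);
print "$_: $conditional->{$_}\n" for sort keys %$conditional;
```

A forwarder that receives such a request would typically get a 304 Not Modified back when the stored response is still valid.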

METHODS

make_request

This is the only public method provided and it takes a HTTP::Request. As described above, it might have to forward the (modified) request through the CodeRef in the forwarder attribute.

It will return a HTTP::Response, either from the cache or a newly retrieved one. This might be a HTTP response with a 500 Error message.

In other cases it might die and let the caller know what was wrong, or send another 5XX Error.

ABOUT CACHING

Reading RFC 7234 Section 2, Overview of Cache Operation, it becomes clear that a cache can hold multiple responses for the same URI. Caches that conform to CHI, and many others, typically use key / value storage. This becomes a problem, because the URI alone cannot be used as the key to the various responses.

This is solved by creating an intermediate meta-dictionary, which is stored with the URI as key. Each response is simply stored under a unique key, and these keys are used as the entries in the dictionary.

The meta-dictionary entries hold the (relevant) request and response headers, so that it is quicker to figure out which entry can be used. Otherwise we would have to read the entire responses to analyze them.
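
The layout described above could look roughly like the sketch below. It is an assumption-laden illustration, not the module's storage format: a plain hash stands in for the cache, a counter stands in for unique key generation, and the names store_response and candidate_keys are made up.

```perl
use strict;
use warnings;

my %cache;          # stand-in for any key / value cache
my $counter = 0;    # hypothetical source of unique keys

# Store a response under a unique key and register that key in the
# meta-dictionary that is kept under the URI.
sub store_response {
    my ( $uri, $rqst_headers, $response ) = @_;

    my $resp_key = 'resp-' . ++$counter;
    $cache{$resp_key} = $response;

    my $meta = $cache{$uri} || {};
    $meta->{$resp_key} = {
        rqst_headers => $rqst_headers,
        resp_headers => $response->{headers},
    };
    $cache{$uri} = $meta;

    return $resp_key;
}

# Find stored responses for a URI by looking at the meta data only,
# without reading the stored responses themselves.
sub candidate_keys {
    my ( $uri, $accept_language ) = @_;
    my $meta = $cache{$uri} || {};
    return grep {
        ( $meta->{$_}{rqst_headers}{'Accept-Language'} // '' ) eq $accept_language
    } sort keys %$meta;
}

store_response( 'http://example.com/', { 'Accept-Language' => 'en' },
    { headers => { 'Vary' => 'Accept-Language' }, body => 'Hello' } );
store_response( 'http://example.com/', { 'Accept-Language' => 'nl' },
    { headers => { 'Vary' => 'Accept-Language' }, body => 'Hallo' } );

my ($key) = candidate_keys( 'http://example.com/', 'nl' );
print $cache{$key}{body}, "\n";    # prints "Hallo"
```

Only the small meta-dictionary is read to select a candidate; the full response is fetched from the cache once a matching key has been found.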