NAME
Feed::Pipe - Pipe Atom/RSS feeds through UNIX-style high-level filters
SYNOPSIS
use Feed::Pipe;
my $pipe = Feed::Pipe
->new(title => "Mah Bukkit")
->cat( qw(1.xml 2.rss 3.atom) )
->grep(sub{$_->title =~ /lolrus/i })
->sort
->head
;
my $feed = $pipe->as_atom_obj; # returns XML::Atom::Feed
# Add feed details such as author and self link. Then...
print $feed->as_xml;
DESCRIPTION
This module is a Feed model that can mimic the functionality of
standard UNIX pipe and filter style text processing tools. Instead of
operating on lines from text files, it operates on entries from Atom
(or RSS) feeds. The idea is to provide a high-level tool set for
combining, filtering, and otherwise manipulating bunches of Atom data
from various feeds.
Yes, you could do this with Yahoo Pipes. Until they decide to take it
down, or start charging for it. And if your code is guaranteed to have
Internet access.
Also, you could probably do it with Plagger, if you're genius enough to
figure out how.
CONSTRUCTOR
To construct a feed pipe, call new(%options), where the keys of
%options correspond to any of the method names described under ACCESSOR
METHODS. If you do not need to set any options, cat may also be called
on a class and will return an instance.
my $pipe = Feed::Pipe->new(title => 'Test Feed');
FILTER METHODS
cat(@feeds)
my $pipe = Feed::Pipe->new(title => 'Test')->cat(@feeds);
# This also works:
my $pipe = Feed::Pipe->cat(@feeds);
Combine entries from each feed listed, in the order received, into a
single feed. RSS feeds will automatically be converted to Atom before
their entries are added. (NOTE: Some data may be lost in the
conversion. See XML::Feed.)
If called as a class method, will implicitly call new with no options
to return an instance before adding the passed @feeds.
Values passed to cat may be an instance of Feed::Pipe, XML::Atom::Feed,
XML::Feed, or URI, a reference to a scalar variable containing the XML
to parse, or a filename that contains the XML to parse. URI objects
will be dereferenced and fetched, and the result parsed.
Returns the feed pipe itself so that you can chain method calls.
grep(sub{})
# Keeps all entries with the word "Keep" in the title
my $pipe = Feed::Pipe
->cat($feed)
->grep( sub { $_->title =~ /Keep/ } )
;
Filters the list of entries to those for which the passed function
returns true. If no function is passed, the default is to keep entries
which have content (or a summary). The function should test the entry
object aliased in $_ which will be a XML::Atom::Entry.
Returns the feed pipe itself so that you can chain method calls.
head(Int $limit=10)
Output $limit entries from the top of the feed, where $limit defaults
to 10. If your entries are sorted in standard reverse chronological
order, this will pull the $limit most recent entries.
Returns the feed pipe itself so that you can chain method calls.
map(\&mapfunction)
# Converts upper CASE to lower case in each entry title.
my $pipe = Feed::Pipe
->cat($feed)
->map( sub { $_->title =~ s/CASE/case/; return $_; } )
;
Constructs a new list of entries composed of the return values from
mapfunction. The mapfunc must return one or more XML::Atom::Entry
objects, or an empty list. Within the mapfunction $_ will be aliased to
the XML::Atom::Entry it is visiting.
Returns the feed pipe itself so that you can chain method calls.
reverse()
Returns the feed with entries sorted in the opposite of the input
order. This is just for completeness, you could easily do this with
sort instead.
sort(sub{})
# Returns a feed with entries sorted by title
my $pipe = Feed::Pipe
->cat($feed)
->sort(sub{$_[0]->title cmp $_[1]->title})
;
Sort the feed's entries using the comparison function passed as the
argument. If no function is passed, sorts in standard reverse
chronological order. The sort function should be as described in Perl's
sort, but using $_[0] and $_[1] in place of $a and $b, respectively.
The two arguments will be XML::Atom::Entry objects.
Returns the feed pipe itself so that you can chain method calls.
tail(Int $limit=10)
Output $limit entries from the end of the feed, where $limit defaults
to 10. If your entries are sorted in standard reverse chronological
order, this will pull the $limit oldest entries.
Returns the feed pipe itself so that you can chain method calls.
ACCESSOR METHODS
NOTE: These methods are not filters. They do not return the feed pipe
and must not be used in a filter chain (except maybe at the end).
title
Human readable title of the feed. Defaults to "Combined Feed".
id
A string conforming to the definition of an Atom ID. Defaults to a
newly generated UUID.
updated
A DateTime object representing when the feed should claim to have been
updated. Defaults to "now".
OTHER METHODS
NOTE: These methods are not filters. They do not return the feed pipe
and must not be used in a filter chain (except maybe at the end).
as_atom_obj
Returns the XML::Atom::Feed object represented by the feed pipe.
as_xml
Serialize the feed object to an XML (Atom 1.0) string and return the
string. Equivalent to calling $pipe->as_atom_obj->as_xml. NOTE: The
current implementation does not guarantee that the resultant output
will be valid Atom. In particular, you are likely to be missing
required author and link elements. For the moment, you should use
as_atom_obj and manipulate the feed-level elements as needed if you
require validatable output.
count
Returns the number of entries in the feed.
entries
Returns the list of XML::Atom::Entry objects in the feed.