The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

App::Wubot::Guide::FeedFu - example fu for handling feeds

DESCRIPTION

This document gives some examples of the range of operations that can be performed on a feed.

FILTERING

I follow twitter, but I am not a fan of the tweets of the form "I'm at x location". So I filter those out with the following rule:

  - name: ignore
    condition: subject matches ^I'm at
    last_rule: 1

I follow the lifehacker RSS feed, but I know in advance that I'm not going to be interested in any articles that refer to windows 8 or sports.

  - name: ignore
    condition: subject imatches windows 8|sports
    last_rule: 1

A more complex example can be found in App::Wubot::Guide::GettingStarted.

COMBINING FEEDS

I follow a lot of RSS feeds. I often like to combine multiple feeds together into a single feed. For example, I follow the 'tekgear' and 'thinkgeek' RSS feeds, and I combine them together into a 'shop' feed.

In the configuration for both RSS feeds, I simply set the 'mailbox' to 'shop'.

  - name: categorize
    plugin: SetField
    config:
      set:
        mailbox: shop

Then I point my RSS feeder to the combined feed.

  http://myhost:3000/atom/shop.xml

SPLITTING AN INCOMING FEED INTO MULTIPLE OUTGOING FEEDS

I monitor my email inbox with the 'Mbox' plugin. I also get e-mails from Jira in my inbox, but I like to route those off to a 'jira' mailbox rather than to my mbox inbox. The following reactor rule would find emails that have JIRA in the subject and set the target mailbox to 'jira'.

      - name: jira
        condition: subject matches JIRA
        plugin: SetField
        config:
          field: mailbox
          value: jira

STRIPPING IMAGES

I hate that some RSS feeds contain advertisments or other images that are not related to the content. You can strip them out by using the ImageStrip reactor.

For example, the RSS feed from slashdot has image buttons to share an article on facebook or twitter. If I really want to do that, I'll click off to the article.

  ---
  url: http://rss.slashdot.org/Slashdot/slashdot
  delay: 1h

  react:

    - name: body image remover
      condition: contains body
      plugin: ImageStrip
      config:
        field: body

FETCHING THE BODY

I hate that a lot of RSS feeds have started only providing 100 or less characters of the article. This requires you to click off to the website to get the content. Using wubot, you can fetch the body and trim out the bits that are not interesting.

One example is the efoodalert RSS feed which provides information about food recalls.

  ---
  url: http://efoodalert.wordpress.com/feed/
  delay: 1h

  react:

    - name: get full body
      condition: contains body
      rules:

        - name: fetch body
          plugin: WebFetch
          config:
            field: body
            url_field: link

        - name: capture body contents
          plugin: CaptureData
          config:
            field: body
            regexp: '^.*(<div id="content">.*)<div class="postinfo'

FEED CONVERSIONS

Wubot can be used to transform a feed from one feed type to another. For example, data coming in from any of the following forms can be routed back out to any of the other forms:

  - RSS/Atom
  - mbox/maildir
  - IRC
  - logfiles

For example, if you wanted to receive an RSS feed and write the articles out to an mbox, you could configure a monitor for the RSS feed:

  ---
  url: http://rss.slashdot.org/Slashdot/slashdot
  delay: 1h
  react:
    - name: categorize
      plugin: SetField
      config:
        set:
          mailbox: news

Then a reactor rule could be used to write it back out in Maildir format:

  rules:
    - name: notify maildir
      plugin: Maildir
      condition: mailbox equals news
      config:
        path: /usr/home/wu/mail
        mailbox: news

Now you can read your RSS feeds in mutt.