App::Wubot::Guide::FeedFu - example fu for handling feeds
This document gives some examples of the range of operations that can be performed on a feed.
I follow twitter, but I am not a fan of the tweets of the form "I'm at x location". So I filter those out with the following rule:
- name: ignore condition: subject matches ^I'm at last_rule: 1
I follow the lifehacker RSS feed, but I know in advance that I'm not going to be interested in any articles that refer to windows 8 or sports.
- name: ignore condition: subject imatches windows 8|sports last_rule: 1
A more complex example can be found in App::Wubot::Guide::GettingStarted.
I follow a lot of RSS feeds. I often like to combine multiple feeds together into a single feed. For example, I follow the 'tekgear' and 'thinkgeek' RSS feeds, and I combine them together into a 'shop' feed.
In the configuration for both RSS feeds, I simply set the 'mailbox' to 'shop'.
- name: categorize plugin: SetField config: set: mailbox: shop
Then I point my RSS feeder to the combined feed.
http://myhost:3000/atom/shop.xml
I monitor my email inbox with the 'Mbox' plugin. I also get e-mails from Jira in my inbox, but I like to route those off to a 'jira' mailbox rather than to my mbox inbox. The following reactor rule would find emails that have JIRA in the subject and set the target mailbox to 'jira'.
- name: jira condition: subject matches JIRA plugin: SetField config: field: mailbox value: jira
I hate that some RSS feeds contain advertisments or other images that are not related to the content. You can strip them out by using the ImageStrip reactor.
For example, the RSS feed from slashdot has image buttons to share an article on facebook or twitter. If I really want to do that, I'll click off to the article.
--- url: http://rss.slashdot.org/Slashdot/slashdot delay: 1h react: - name: body image remover condition: contains body plugin: ImageStrip config: field: body
I hate that a lot of RSS feeds have started only providing 100 or less characters of the article. This requires you to click off to the website to get the content. Using wubot, you can fetch the body and trim out the bits that are not interesting.
One example is the efoodalert RSS feed which provides information about food recalls.
--- url: http://efoodalert.wordpress.com/feed/ delay: 1h react: - name: get full body condition: contains body rules: - name: fetch body plugin: WebFetch config: field: body url_field: link - name: capture body contents plugin: CaptureData config: field: body regexp: '^.*(<div id="content">.*)<div class="postinfo'
Wubot can be used to transform a feed from one feed type to another. For example, data coming in from any of the following forms can be routed back out to any of the other forms:
- RSS/Atom - mbox/maildir - IRC - logfiles
For example, if you wanted to receive an RSS feed and write the articles out to an mbox, you could configure a monitor for the RSS feed:
--- url: http://rss.slashdot.org/Slashdot/slashdot delay: 1h react: - name: categorize plugin: SetField config: set: mailbox: news
Then a reactor rule could be used to write it back out in Maildir format:
rules: - name: notify maildir plugin: Maildir condition: mailbox equals news config: path: /usr/home/wu/mail mailbox: news
Now you can read your RSS feeds in mutt.
To install App::Wubot, copy and paste the appropriate command in to your terminal.
cpanm
cpanm App::Wubot
CPAN shell
perl -MCPAN -e shell install App::Wubot
For more information on module installation, please visit the detailed CPAN module installation guide.