# Writing rawdog plugins ## Introduction As provided, rawdog provides a fairly small set of features. In order to make it do more complex jobs, rawdog can be extended using plugin modules written in Python. This document is intended for developers who want to extend rawdog by writing plugins. Extensions work by registering hook functions which are called by various bits of rawdog's core as it runs. These functions can modify rawdog's internal state in various interesting ways. An arbitrary number of functions can be attached to each hook; they are called in the order they were attached. Hook functions take various arguments depending on where they're called from, and returns a boolean value indicating whether further functions attached to the same hook should be called. The "plugindirs" config option gives a list of directories to search for plugins; all Python modules found in those directories will be loaded by rawdog. In practice, this means that you need to call your file something ending in ".py" to have it recognised as a plugin. ## The plugins module All plugins should import the `rawdoglib.plugins` module, which provides the functions for registering and calling hooks, along with some utilities for plugins. Many plugins will also want to import the `rawdoglib.rawdog` module, which contains rawdog's core functionality, much of which is reusable. ### rawdoglib.plugins.attach_hook(hook_name, function) The attach_hook function adds a hook function to the hook of the given name. ### rawdoglib.plugins.Box The Box class is used to pass immutable types by reference to hook functions; this allows several plugins to modify a value. It contains a single `value` attribute for the value it is holding. ## Hooks Most hook functions are called with "rawdog" and "config" as their first two arguments; these are references to the aggregator's Rawdog and Config objects. If you need a hook that doesn't currently exist, please contact me. The following hooks are supported: ### startup(rawdog, config) Run when rawdog starts up, after the state file and config file have been loaded, but before rawdog starts processing command-line arguments. ### shutdown(rawdog, config) Run just before rawdog saves the state file and exits. ### config_option(config, name, value) * name: the option name * value: the option value Called when rawdog encounters a config file option that it doesn't recognise. The rawdoglib.rawdog.parse_* functions will probably be useful when dealing with config options. You can raise ValueError to have rawdog print an appropriate error message. You should return False from this hook if name is an option you recognise. Note that using config.log in this hook will probably not do what you want, because the verbose flag may not yet have been turned on. ### config_option_arglines(config, name, value, arglines) * name: the option name * value: the option value * arglines: a list of extra indented lines given after the option (which can be used to supply extra arguments for the option) As config_option for options that can handle extra argument lines. If the options you are implementing should not have extra arguments, then use the config_option hook instead. ### output_filter(rawdog, config, articles) * articles: the mutable list of Article objects Called before rawdog sorts the list of articles to write. This hook can be used to remove articles that shouldn't be written. ### output_sort(rawdog, config, articles) * articles: the mutable list of Article objects Called after rawdog has sorted the list of articles to write. This hook can be used to reorder (or completely resort) the list of articles to write. ### output_write(rawdog, config, articles) * articles: the mutable list of Article objects Called immediately before output_sorted_filter; this hook is here for backwards compatibility, and should not be used in new plugins. ### output_sorted_filter(rawdog, config, articles) * articles: the mutable list of Article objects Called after rawdog sorts the list of articles to write, but before it removes duplicate and excessively old articles. This hook can be used to implement alternate duplicate-filtering methods. If you return False from this hook, then rawdog will not do its usual duplicate-removing filter pass. ### output_write_files(rawdog, config, articles, article_dates) * articles: the mutable list of Article objects * article_dates: a dictionary mapping Article objects to the dates that were used to sort them Called when rawdog is about to write its output to files. This hook can be used to implement alternative output methods. If you return False from this hook, then rawdog will not write any output itself (and the later output_ hooks will thus not be called). ### output_items_begin(rawdog, config, f) * f: a writable file object (__items__) Called before rawdog starts expanding the items template. This set of hooks can be used to implement alternative date (or other section) headings. ### output_items_heading(rawdog, config, f, article, date) * f: a writable file object (__items__) * article: the Article object about to be written * date: the Article's date for sorting purposes Called before each item is written. If you return False from this hook, then rawdog's normal time-based section headings will not be written. ### output_items_end(rawdog, config, f) * f: a writable file object (__items__) Called after all items are written. ### output_bits(rawdog, config, bits) * bits: a dictionary of template parameters Called before expanding the main template. This hook can be used to add extra template parameters. ### output_item_bits(rawdog, config, feed, article, bits) * feed: the Feed containing this article * article: the Article being templated * bits: a dictionary of template parameters Called before expanding the item template for an article. This hook can be used to add extra template parameters. ### pre_update_feed(rawdog, config, feed) * feed: the Feed about to be updated Called before a feed is updated. This hook can be used to perform extra actions before fetching a feed. ### post_update_feed(rawdog, config, feed, seen_articles) * feed: the Feed about to be updated * seen_articles: a boolean indicating whether any articles were read from the feed Called after a feed is updated. ### article_seen(rawdog, config, article, ignore) * article: the Article that has been received * ignore: a Boxed boolean indicating whether to ignore the article Called when an article is received from a feed. This hook can be used to modify or ignore incoming articles. ### article_updated(rawdog, config, article, now) * article: the Article that has been updated * now: the current time Called after an article has been updated (when rawdog receives an article from a feed that it already has). ### article_added(rawdog, config, article, now) * article: the Article that has been added * now: the current time Called after a new article has been added. ### article_expired(rawdog, config, article, now) * article: the Article that will be expired * now: the current time Called before an article is expired. ### fill_template(template, bits, result) * template: the template string to fill * bits: a dictionary of template arguments * result: a Boxed Unicode string for the result of template expansion Called whenever template expansion is performed. If you set the value inside result to something other than None, then rawdog will treat that value as the result of template expansion (rather than performing its normal expansion process); you can thus use this hook either for manipulating template parameters, or for replacing the template system entirely. ### clean_html(config, html, baseurl, inline) * html: a Boxed Unicode string containing the HTML being cleaned * baseurl: the URL at which the HTML was originally found * inline: a boolean indicating whether the output should be inline HTML or a block element Called whenever HTML is being sanitised by rawdog (after its existing HTML sanitisation processes). You can use this to implement extra sanitisation passes. You'll need to update the boxed value with the new, cleaned string. ### add_urllib2_handlers(rawdog, config, feed, handlers) * feed: the Feed to which the request will be made * handlers: the mutable list of urllib2 *Handler objects that will be passed to feedparser Called before feedparser is used to fetch feed content. This hook can be used to add additional urllib2 handlers to cope with unusual protocol requirements; use `handlers.append` to add extra handlers. ### feed_fetched(rawdog, config, feed, feed_data, error, non_fatal) * feed: the Feed that has just been fetched * feed_data: the data returned from feedparser.parse * error: the error string if an error occurred, or None if no error occurred * non_fatal: if error is not None, a boolean indicating whether the error was fatal Called after feedparser has been called to fetch the feed. This hook can be used to manipulate the received feed data or implement custom error handling. ## Examples ### backwards.py This is probably the simplest useful example plugin: it reverses the sort order of the output. import rawdoglib.plugins def backwards(rawdog, config, articles): articles.reverse() return False rawdoglib.plugins.attach_hook("output_sort", backwards) ### option.py This plugin shows how to handle a config file option. import rawdoglib.plugins def option(config, name, value): if name == "myoption": print "Test plugin option:", value return False else: return True rawdoglib.plugins.attach_hook("config_option", option)