- rawdog 2.1

Fix a character encoding problem with format=text feeds.

Add proxyuser and proxypassword options for feeds, so that you can use
per-feed proxies requiring HTTP Basic authentication (patch from Jon
Nelson).

- rawdog 2.0

Update to feedparser 3.3. This meant reworking some of rawdog's
internals; state files from old versions will no longer work with rawdog
2.0 (and external programs that manipulate rawdog state files will also
be broken). The new feedparser provides a much nicer API, and is
significantly more robust; several feeds that previously caused
feedparser internal errors or Python segfaults now work fine.

Add an --upgrade option to import state from rawdog 1.x state files into
rawdog 2.x. To upgrade from 1.x to 2.x, you'll need to perform the
following steps after installing the new rawdog:
  - cp -R ~/.rawdog ~/.rawdog-old
  - rm ~/.rawdog/state
  - rawdog -u
  - rawdog --upgrade ~/.rawdog-old ~/.rawdog (to copy the state)
  - rawdog -w
  - rm -r ~/.rawdog-old (once you're happy with the new version)

Keep track of a version number in the state file, and complain if you
use a state file from an incompatible version.

Remove support for the old option syntax ("rawdog update write").

Remove workarounds for early 1.x state file versions.

Save the state file in the binary pickle format, and use cPickle instead
of pickle so it can be read and written more rapidly.

Add hideduplicates and allowduplicates options to attempt to hide
duplicate articles (based on patch from Grant Edwards).

Fix a bug when sorting feeds with no titles (found by Joseph Reagle).

Write the updated state file more safely, to reduce the chance that
it'll be damaged or truncated if something goes wrong while it's being
written (requested by Tim Bishop).

Include feedfinder, and add a -a|--add option to add a feed to the
config file.

Correctly handle dates with timezones specified in non-UTC locales
(reported by Paul Tomblin and Jon Lasser).

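As an illustration of the safer state file write noted above, one common
approach is to write to a temporary file and then rename it over the old
one. This is only a sketch in modern Python, not rawdog's actual code;
the function name save_state_atomically and the ".state-" prefix are
hypothetical.

```python
import os
import pickle
import tempfile


def save_state_atomically(state, path):
    """Write `state` to `path` via a temporary file and a rename.

    Hypothetical helper: illustrates the general technique only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the rename stays
    # on a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".state-")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(state, f, protocol=pickle.HIGHEST_PROTOCOL)
            f.flush()
            os.fsync(f.fileno())
        # Swap the complete new file into place in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # The old state file is untouched; remove the partial new one.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the write fails part-way, the previous state file is left as it was,
which is the property the change above is after.
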
When a feed's URL changes, as indicated by a permanent HTTP redirect,
automatically update the config file and state.

- rawdog 1.13

Handle OverflowError with parsed dates (patch from Matthew Scott).

- rawdog 1.12

Add "sortbyfeeddate" option for planet pages (requested by David
Dorward).

Add "currentonly" option (patch from Chris Cutler).

Handle nested CDATA blocks in feed XML and HTML correctly in feedparser.

- rawdog 1.11

Add __num_items__ and __num_feeds__ to the page template, and __url__ to
the item template (patch from Chris Cutler).

Add "daysections" and "timesections" options to control whether to split
items up by day and time (based on patch from Chris Cutler).

Add "tidyhtml" option to use mx.Tidy to clean feed-provided HTML.

Remove the <p> wrapping __description__ from the default item template,
and make rawdog add <p>...</p> around the description only if it doesn't
start with a block-level element (which isn't perfect, but covers the
majority of problem cases). If you have a custom item template and want
rawdog to generate a better approximation to valid HTML, you should
change "<p>__description__</p>" to "__description__".

HTML metacharacters in links are now encoded correctly in generated HTML
("foo?a=b&c=d" as "foo?a=b&amp;c=d").

Content type selection is now performed for all elements returned from
the feed, since some Blogger v5 feeds cause feedparser to return
multiple versions of the title and link (reported by Eric Cronin).

- rawdog 1.10

Add "ignoretimeouts" option to silently ignore timeout errors.

Fix SSL and socket timeouts on Python 2.3 (reported by Tim Bishop).

Fix entity encoding problem with HTML sanitisation that was causing
rawdog to throw an exception upon writing with feeds containing
non-US-ASCII characters in attribute values (reported by David Dorward,
Dmitry Mark and Steve Pomeroy).

Include MANIFEST.in in the distribution (reported by Chris Cutler).

- rawdog 1.9

Add "clear: both;" to item, time and date styles, so that items with
floated images in don't extend into the items below them.

Changed how rawdog selects the feeds to update; --verbose now shows only
the feeds being updated.

rawdog now uses feedparser 2.7.6, which adds date parsing and limited
sanitisation of feed-provided HTML; I've removed rawdog's own
date-parsing (including iso8601.py) and relative-link-fixing code in
favour of the more-capable feedparser equivalents.

The persister module in rawdoglib is now licensed under the LGPL
(requested by Giles Radford).

Made the error messages that listed the state dir reflect the -b setting
(patch from Antonin Kral).

Treat empty titles, links or descriptions as if they weren't supplied at
all, to cope with broken feeds that specify "" (patch from Michael
Leuchtenburg).

Make the expiry age configurable; previously it was hard-wired to 24
hours. Setting this to a larger value is useful if you want to have a
page covering more than a day's feeds.

Time specifications in the config file can now include a unit; if no
unit is specified it'll default to minutes or seconds as appropriate to
maintain compatibility with old config files.

Boolean values can now be specified as "true" or "false" (or "1" or "0"
for backwards compatibility). rawdog now gives useful errors rather than
Python exceptions for bad values. (Based on suggestions by Tero
Karvinen.)

Added datetimeformat option so that you can display feed and article
times differently from the day and time headings, and added some
examples including ISO 8601 format to the config file (patch from Tero
Karvinen).

Forcing a feed to be updated with -f now clears its ETag and
Last-Modified, so it should always be refetched from the server.

Short-form XML tags in RSS (such as <element/>) are now handled
correctly.

Numeric entities in RSS encoded content are now handled correctly.

- rawdog 1.8

Add format=text feed option to handle broken feeds that make their
descriptions unescaped text.

Add __hash__ and unlinked titles to item templates, so that you can use
multiple config files to build a summary list of item titles (for use in
the Mozilla sidebar, for instance). (Requested by David Dorward.)

Add the --verbose argument (and the "verbose" option to match); this
makes rawdog show what it's doing while it's running.

Add an "include" statement in config files that can be used to include
another config file.

Add feed options to select proxies (contributed by Neil Padgen). This is
straightforward for Python 2.3, but 2.2's urllib2 has a bug which
prevents ProxyHandlers from working; I've added a workaround for now.

- rawdog 1.7

Fix code in iso8601.py that caused a warning with Python 2.3.

- rawdog 1.6

Config file lines are now split on arbitrary strings of whitespace, not
just single spaces (reported by Joseph Reagle).

Include a link to the rawdog home page in the default template.

Fix the --dir argument: -d worked fine, but the getopt call was missing
an "=" (reported by Gregory Margo).

Relative links (href and src attributes) in feed-provided HTML are now
made absolute in the output.
(The feed validator will complain about feeds with relative links in,
but there are quite a few out there.)

Item templates are now supported, making it easier to customise item
appearance (requested by a number of users, including Giles Radford and
David Dorward). In particular, note that __feed_hash__ can be used to
apply a CSS style to a particular feed.

Simple conditions are supported in templates: __if_x__ .. __endif__ only
expands to its contents if x is not empty. These conditions cannot be
nested.

PyXML's iso8601 module is now included so that rawdog can parse dates in
feeds.

- rawdog 1.5

Remove some debugging code that broke timeouts.

- rawdog 1.4

Fix option-compatibility code (reported by BAM).

Add HTTP basic authentication support (which means modifying feedparser
again).

Print a more useful error if the statefile can't be read.

- rawdog 1.3

Reverted the "retry immediately" behaviour from 1.2, since it causes
denied or broken feeds to get checked every time rawdog is run.

Updated feedparser to 2.5.3, which now returns the XML encoding used.
rawdog uses this information to convert all incoming items into Unicode,
so multiple encodings are now handled correctly. Non-ASCII characters
are encoded using HTML numeric character references (since this allows
me to leave the HTML charset as ISO-8859-1; it's non-trivial to get
Apache to serve arbitrary HTML files with the right Content-Type, and
using <meta http-equiv> won't override HTTP headers).

Use standard option syntax (i.e. "--update --write" instead of "update
write"). The old syntax will be supported until 2.0.

Error output from reading the config file and from --update now goes to
stderr instead of stdout.

Made the socket timeout configurable (which also means the included copy
of feedparser isn't modified any more).

Added --config option to read an additional config file; this lets you
have multiple output files with different options.

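The numeric character reference encoding mentioned earlier in the 1.3
notes can be illustrated with Python's standard "xmlcharrefreplace"
error handler. This is a sketch in modern Python rather than rawdog's
own code, and the function name to_ascii_html is hypothetical.

```python
def to_ascii_html(text):
    """Replace every non-ASCII character with a numeric reference.

    ASCII characters, including "&" and "<", pass through unchanged,
    so this is not a substitute for HTML escaping.
    """
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

For example, to_ascii_html("caf\u00e9") gives "caf&#233;", which renders
the same whatever charset the page is served with.
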
Allow "outputfile -" to write the output to stdout; useful if you want
to have cron mail the output to you rather than putting it on a web
page.

Added --show-template option to show the template currently in use (so
you can customise it yourself), and "template" config option to allow
the user to specify their own template.

Added --dir option for people who want two lots of rawdog state (for two
sets of feeds, for instance).

Added "maxage" config option for people who want "only items added in
the last hour", and made it possible to disable maxarticles by setting
it to 0.

- rawdog 1.2

Updated feedparser to 2.5.2, which fixes a bug that was making rawdog
handle content incorrectly in Echo feeds, handles more content encoding
methods, and returns HTTP status codes. (I've applied a small patch to
correct handling of some Echo feeds.)

Added useful messages for different HTTP status codes and HTTP timeouts.
Since rawdog reads a config file, it can't automatically update
redirected feeds, but it will now tell you about them. Note that for
"fatal" errors (anything except a 2xx response or a redirect), rawdog
will now retry the feed next time it's run.

Prefer "content" over "content_encoded", and fall back correctly if no
useful "content" is found.

- rawdog 1.1

rawdog now preserves the ordering of articles in the RSS when a group of
articles are added at the same time.

Updated rawdog URL in setup.py, since it now has a web page.

Updated rssparser to feedparser 2.4, and added very preliminary support
for the "content" element it can return (for Echo feeds).

- rawdog 1.0

Initial stable release.