Unicode and Django RSS Framework

Unicode issues are the most annoying thing about Django. Here is one workaround for a bug in Django's RSS framework.

I have migrated my Ma.gnolia bookmarks and Flickr photos into this site. Both services have tags with what Django devs call “funky characters”, that is, non-ASCII characters. Getting these into the database unchanged was a pain in the butt in itself, but after that I wanted to make my own feeds for both Ma.gnolia and Flickr tags with Django's wonderful syndication framework. It turns out that the framework doesn't play well with URLs that have funky characters.

The problem is in the feed class, which automatically adds the appropriate ‘http://’ prefix to any URLs that need it. On creation, the feed object is passed the request object, whose unencoded path attribute raises an uncaught exception when there are funky characters in the URL. Adding the site domain to the path before passing it to the feed class circumvents the problem.
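To make the workaround easier to follow, here is a minimal sketch of the kind of prefixing logic described above. It is not Django's actual code: the function name `add_domain_sketch` and the feed path used in the example are assumptions made for illustration, and the sketch is written for Python 3. It only shows why a path that already starts with the domain passes through unchanged.

```python
def add_domain_sketch(domain, url):
    """Hypothetical helper: prefix site-relative URLs, leave absolute ones alone."""
    if url.startswith(("http://", "https://")):
        # Already absolute, so no prefixing step is applied.
        return url
    return "http://%s%s" % (domain, url)


# Assumed example path; the real feed URL layout is not shown in this post.
relative = add_domain_sketch("www.unessa.net", "/feeds/valokuvatagi/järvi/")
absolute = add_domain_sketch(
    "www.unessa.net", "http://www.unessa.net/feeds/valokuvatagi/järvi/"
)
```

Both calls end up with the same absolute URL, but only the first one goes through the prefixing branch.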

This is my (stripped down) feeds view:

    from django.contrib.syndication.views import feed

    def my_feeds(request, url):
        from unessanet.links.feeds import *
        from unessanet.photos.feeds import *

        unessanet_feed_dict = {
            'linkit': LatestBookmarks,
            'valokuvat': LatestPhotos,
            'valokuvatagi': PhotosForTag,
        }

        # Fixes a bug in syndication framework
        request.path = 'http://www.unessa.net' + request.path
        return feed(request, url, unessanet_feed_dict)

Now the feeds render properly. Almost.

A feed with an unquoted URL does not validate. It may work, but it doesn’t validate. To fix this, just escape the URL with the quote function found in the urllib module.

This is my feed class for photo tags:

    from urllib import quote

    from django.contrib.syndication.feeds import Feed
    from django.core.exceptions import ObjectDoesNotExist

    # PhotoTag is the site's own tag model; its import is omitted here.

    class PhotosForTag(Feed):

        description_template = "feeds/latest_photos_description.html"
        title_template = "feeds/latest_photos_title.html"

        def get_object(self, bits):
            # Expect exactly one URL bit: the tag name
            if len(bits) != 1:
                raise ObjectDoesNotExist
            tag = bits[0]
            return PhotoTag.objects.get(tag=tag)

        def title(self, obj):
            return "Unessa.net Valokuvat: %s" % obj.tag

        def link(self, obj):
            # Quote the url so the feed validates
            return 'http://www.unessa.net/valokuvat/tagit/%s/' % quote(obj.tag)

        def description(self, obj):
            return "Unessa.net Valokuvat: %s" % obj.tag

        def items(self, obj):
            # The ten first public photos for this tag
            return obj.flickrphoto_set.filter(is_public=True)[:10]

Note that the quoted part of the URL must be Unicode, otherwise you’ll end up with a broken URL. But after these fixes, the feeds work as expected, with or without funky characters.
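As a rough illustration of the quoting step and of why the encoding matters, here is a small standalone sketch. It assumes Python 3, where `quote` lives in `urllib.parse` rather than `urllib`, and it uses a made-up tag, `järvi`; the helper name `tag_link` is likewise only for illustration.

```python
from urllib.parse import quote

BASE = "http://www.unessa.net/valokuvat/tagit/%s/"


def tag_link(tag):
    # quote() percent-encodes the UTF-8 bytes of the string by default.
    return BASE % quote(tag, safe="")


# "järvi" is a hypothetical tag used only for this example.
utf8_link = tag_link("järvi")

# The same tag quoted as Latin-1 yields a different URL, which a site
# expecting UTF-8 would not resolve to the same tag.
latin1_link = BASE % quote("järvi", safe="", encoding="latin-1")

print(utf8_link)
print(latin1_link)
```

The two printed links differ only in how `ä` is escaped, which is enough to make one of them point at the wrong (or a nonexistent) tag page.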

I really, really hope that Django will be converted to use nothing but Unicode strings before the long-awaited 1.0 release.