Don't encode and decode when copying nearly-unicode objects.
author Adam Sampson <ats@offog.org>
Sat, 21 Feb 2009 18:36:38 +0000
committer Adam Sampson <ats@offog.org>
Sat, 21 Feb 2009 18:36:38 +0000
Why? Because BeautifulSoup's string class has a broken encode
implementation... and it appears that unicode() is smart enough to
return a new plain unicode object, rather than reusing the original,
when it's given an instance of a subclass anyway.
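
A minimal sketch of the behaviour this change relies on. FakeNavigableString
is a hypothetical stand-in with a deliberately broken encode, not
BeautifulSoup's real class, and text_type and to_plain_text are illustrative
names that do not appear in rawdog. The text_type alias is only there so the
snippet runs on both Python 2 (unicode) and Python 3 (str).

```python
try:
    text_type = unicode  # Python 2, which rawdog targeted at the time
except NameError:
    text_type = str      # Python 3 equivalent of the unicode type


class FakeNavigableString(text_type):
    """Hypothetical text subclass whose encode method is broken."""

    def encode(self, *args, **kwargs):
        # Simulate a misbehaving encode implementation.
        raise RuntimeError("broken encode")


def to_plain_text(value):
    """Convert a text-subclass instance to an exact text object.

    Calling the base type directly avoids the encode/decode round trip,
    so a broken encode on the subclass is never invoked.
    """
    return text_type(value)


wrapped = FakeNavigableString(u"caf\xe9")
plain = to_plain_text(wrapped)

# The result is an exact text object, not the subclass instance.
assert type(plain) is text_type
assert plain is not wrapped
assert plain == u"caf\xe9"
```

Since the conversion never calls encode, the broken method on the subclass is
not triggered, and the result is a plain object of the base text type.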

rawdoglib/rawdog.py

index 0880cf56c458fc62cb2a50209e21c3cf68ffd7c1..988aacdef404998e07e4da1e1e70d39d0628a7f7 100644
@@ -290,7 +290,7 @@ def ensure_unicode(value, encoding):
                # This is a subclass of unicode (e.g.  BeautifulSoup's
                # NavigableString, which is unpickleable in some versions of
                # the library), so force it to be a real unicode object.
-               return value.encode("UTF-8").decode("UTF-8")
+               return unicode(value)
        elif isinstance(value, dict):
                d = {}
                for (k, v) in value.items():