Re: [evolution-patches] Fix for Evolution 1.2 shortcut migration



> > > > Strangely, you need to feed libxml2 with locale-encoded strings (and not
> > > > UTF-8 ones) for shortcuts.xml migration from 1.2 (otherwise you'll end
> > > > with very ugly strings, which is quite obvious in French).
> > > 
> > > Hmm, I don't understand why this patch works at all...  The current
> > > shortcut code always sets the data in the libxml tree as UTF-8.  So why
> > > would getting old UTF-8 data and converting it to locale generate a
> > > valid tree?  (It's also the opposite of what the rest of the XML fixing
> > > code in that file does...)
> > 
> > I don't understand either :)) Maybe libxml2 switched to locale encoding
> > when reading shortcuts.xml and then expected all strings to be in locale
> > encoding.. 
> 
> > <item name="R&#195;&#169;sum&#195;&#169;"...
> 
> Right. The problem is that libxml1 wrote out the UTF8 wrong (storing
> each *byte* of the UTF8-encoded string as a separate entity instead of
> storing each *character* as its own entity). So when you read it into
> libxml2, each byte of UTF8 encoding becomes a separate character and you
> end up with "RÃ©sumÃ©".
> 
> Converting it to locale encoding isn't the right fix though; you
> essentially want to convert to iso-8859-1 regardless of what the locale
> encoding is (because that reverses the translation above: the "Ã"s
> become 0xC3, and the "©"s become 0xA9, and then when you hand the data
> back to libxml, it sees "0x52 0xC3 0xA9 0x73 0x75 0x6D 0xC3 0xA9", which
> is the UTF-8 encoding of "Résumé").
> 
> But it would be less confusing to just do the transformation by hand,
> since you don't really mean "convert from utf-8 to iso-8859-1", you just
> mean "replace each multibyte utf-8 character with the corresponding
> single-byte value".

Hmm, I'm not so sure about that: it will work for strings that fit in
ISO-8859-1 and were badly encoded by libxml1 (i.e. French), but I'm not
sure it will work for strings outside ISO-8859-1 (like Chinese...)
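
One way to check this is a small Python sketch of the by-hand
transformation. It assumes the libxml1 bug is exactly as described above
(one numeric entity per UTF-8 byte). All function names here are
hypothetical, not Evolution or libxml API, and the real fix would be
written in C.

```python
import re


def decode_numeric_entities(s):
    """Expand decimal entities like &#195; into the character with that code."""
    return re.sub(r"&#(\d+);", lambda m: chr(int(m.group(1))), s)


def mangle_like_libxml1(text):
    """Assumed model of the bug: every UTF-8 byte becomes its own character."""
    return "".join(chr(b) for b in text.encode("utf-8"))


def repair_bytewise(text):
    """Treat each character (U+0000..U+00FF) as one raw byte, then decode UTF-8.

    If the string is not of that form, return it unchanged (a design choice
    of this sketch, so already-correct strings are left alone).
    """
    try:
        return bytes(ord(ch) for ch in text).decode("utf-8")
    except ValueError:  # code point > 0xFF, or bytes that are not valid UTF-8
        return text


# The attribute value quoted earlier in the thread.
print(repair_bytewise(decode_numeric_entities("R&#195;&#169;sum&#195;&#169;")))
# prints: Résumé

# Round trip for a Latin-1 string and a Chinese one.
for sample in ("Résumé", "收件箱"):
    assert repair_bytewise(mangle_like_libxml1(sample)) == sample
```

If that model of the bug is right, every mangled character is at most
U+00FF whatever the original script was, so the same byte-for-character
mapping should cover Chinese too. This has not been tried against real
1.2 shortcuts.xml files.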

-- 
Frederic Crozat <fcrozat mandrakesoft com>
Mandrakesoft



