"Malformed UTF-8 character" warnings in gtkdoc-fixxref



Anders recently filed a bug report that would be worth looking at before the next release:
   http://bugzilla.gnome.org/show_bug.cgi?id=101221

If you have a newer Perl (5.8 definitely, maybe 5.6 too) and are using a UTF-8 locale, you will get many "Malformed UTF-8 character" warnings from perl. This is because the HTML output currently uses latin1, which doesn't validate as UTF-8. (Unless I am mistaken, this would occur for any input that didn't validate against the locale's encoding, even without a UTF-8 locale.)
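To illustrate the latin1 point, here is a minimal sketch (not code from gtkdoc-fixxref; the helper name is_valid_utf8 is made up for this example). It checks whether a byte string is well-formed UTF-8 using utf8::decode, which should be available from Perl 5.8 on:

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the given bytes are well-formed UTF-8, else 0.
sub is_valid_utf8 {
    my ($bytes) = @_;
    my $copy = $bytes;                  # utf8::decode works in place, so decode a copy
    return utf8::decode($copy) ? 1 : 0;
}

my $latin1 = "caf\xE9";       # "cafe" with e-acute in latin1: the accent is the single byte 0xE9
my $utf8   = "caf\xC3\xA9";   # the same word encoded as UTF-8

printf "latin1 bytes valid as UTF-8: %s\n", is_valid_utf8($latin1) ? "yes" : "no";
printf "UTF-8 bytes valid as UTF-8:  %s\n", is_valid_utf8($utf8)   ? "yes" : "no";
```

The latin1 string fails the check because 0xE9 announces a multi-byte sequence that never arrives, which is the kind of input that triggers the warning.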

This problem can be worked around by telling perl to treat the input data as byte data rather than character data by adding the "use bytes;" pragma. With this change, gtkdoc-fixxref runs without warnings for me.
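As a rough illustration of what the pragma changes (a standalone sketch, not the gtkdoc-fixxref patch itself), "use bytes;" is lexically scoped and switches built-ins such as length from character semantics to byte semantics inside its block:

```perl
use strict;
use warnings;

my $smiley = "\x{263A}";   # one character; perl stores it internally as 3 UTF-8 bytes

my $chars = length($smiley);   # character semantics: counts characters
my $bytes;
{
    use bytes;                 # byte semantics until the end of this block
    $bytes = length($smiley);  # counts the underlying bytes instead
}

print "characters: $chars, bytes: $bytes\n";
```

Putting the pragma at the top of a script applies it to the whole file, which is presumably what the workaround would do.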

I don't know that much about perl (especially these esoteric parts), so I would appreciate a second opinion on this (e.g. on compatibility with different perl versions). For reference, SpamAssassin seems to use the following:
   eval "use bytes";

I guess that form might be for backward compatibility reasons. I haven't tested it with gtk-doc, though.
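One caveat about the string-eval form, sketched below as a standalone example rather than anything taken from SpamAssassin or gtk-doc: it loads the bytes module at run time if it exists, so a perl without the module does not abort at compile time. Because the pragma is lexically scoped to the eval, though, it does not appear to switch the surrounding code to byte semantics. The variable names here are made up for illustration.

```perl
use strict;
use warnings;

# Try to load bytes.pm at run time; a perl without it just leaves $have_bytes at 0.
my $have_bytes = (eval "use bytes; 1") ? 1 : 0;

my $smiley = "\x{263A}";   # one character, three bytes internally

# The pragma's scope ended with the eval, so this is still a character count.
my $char_len = length($smiley);

# The module's functions can still be called explicitly once it is loaded.
my $byte_len = $have_bytes ? bytes::length($smiley) : undef;

print "have bytes: $have_bytes, length: $char_len\n";
print "bytes::length: $byte_len\n" if defined $byte_len;
```

If byte semantics are wanted throughout a script, a plain "use bytes;" at file scope is the form that provides them.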

James.

--
Email: james daa com au              | Linux.conf.au   http://linux.conf.au/
WWW: http://www.daa.com.au/~james/ | Jan 22-25 Perth, Western Australia.




