"Malformed UTF-8 character" warnings in gtkdoc-fixxref

From: James Henstridge <james daa com au>
To: gtk-doc-list gnome org, Anders Carlsson <andersca gnu org>
Subject: "Malformed UTF-8 character" warnings in gtkdoc-fixxref
Date: Sun, 15 Dec 2002 22:43:07 +0800

Anders recently filed a bug report that would be worth looking at before the next release:

   http://bugzilla.gnome.org/show_bug.cgi?id=101221

If you have a newer Perl (5.8 definitely, maybe 5.6 too) and are using a UTF-8 locale, you will get many "Malformed UTF-8 character" warnings from perl. This is because the HTML output currently uses latin1, which doesn't validate as UTF-8 (unless I am mistaken, this would occur for any input that didn't validate against the locale's encoding, even without a UTF-8 locale).

This problem can be worked around by telling perl to treat the input data as byte data rather than character data by adding the "use bytes;" pragma. With this change, gtkdoc-fixxref runs without warnings for me.
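
To illustrate what byte semantics means, here is a minimal sketch that is not taken from gtkdoc-fixxref. The sample string and variable names are made up for the example. It only shows that "use bytes;" is lexically scoped and makes built-ins such as length() work on bytes instead of characters:

```perl
use strict;
use warnings;

# Hypothetical sample: "caf" followed by e-acute (U+00E9).
my $text = "caf\x{e9}";
utf8::upgrade($text);        # force perl's internal UTF-8 representation

my $chars = length($text);   # character semantics: 4 characters

my $bytes;
{
    use bytes;               # byte semantics apply inside this block only
    $bytes = length($text);  # counts the internal UTF-8 bytes: 5
}

print "chars=$chars bytes=$bytes\n";   # prints "chars=4 bytes=5"
```

Placing the pragma near the top of a script applies it to the rest of the file, which is presumably what the workaround amounts to.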

I don't know that much about perl (especially these esoteric parts), so I would appreciate a second opinion about this (e.g. compatibility with different perl versions). For reference, SpamAssassin seems to use the following:

   eval "use bytes";

I guess this might be for backward compatibility reasons. I haven't tested this with gtk-doc though.
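
A hedged sketch of what the string-eval form does, under the assumption that its purpose is to cope with perls that lack bytes.pm. The variable name $have_bytes is made up for the example. Because the "use" is compiled only when the eval runs, a missing pragma produces an error in $@ rather than a compile-time failure of the whole script:

```perl
use strict;
use warnings;

# String eval defers loading to run time, so a perl without bytes.pm
# fails inside the eval instead of aborting compilation of the script.
my $have_bytes = eval "use bytes; 1" ? 1 : 0;

if ($have_bytes) {
    print "bytes pragma available\n";
} else {
    print "bytes pragma missing: $@";
}
```

One caveat worth checking before relying on this: the pragma is lexically scoped, so as far as I can tell a "use bytes" inside a string eval takes effect only within the eval itself and does not switch the surrounding code to byte semantics. If that is right, the eval form avoids the load failure on old perls but may not silence the warnings the way a plain "use bytes;" does.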


James.

--
Email: james daa com au            | Linux.conf.au   http://linux.conf.au/
WWW: http://www.daa.com.au/~james/ | Jan 22-25 Perth, Western Australia.

