Re: Encoding problem in Gtk2::Ex::PodViewer



2008/7/26 Ben Staude <sben1783 yahoo de>:
> using the PodViewer widget to display help, I have an encoding problem: I
> cannot get the widget to correctly display utf8 text.

It's a bug in Gtk2::Ex::PodViewer::Parser:
- $viewer->load($file) opens $file in UTF-8 mode
  (open FH, "<:utf8", $file)
- reads it line by line and concatenates the lines into $data
- calls parse_from_string($data)

Unfortunately the file's contents in $data are no longer
marked as Unicode - Devel::Peek::Dump() is invaluable
here, because it shows whether a Perl string has the
UTF8 flag set or not. Because the flag is missing, $data is
upgraded to Unicode byte by byte when passed to Gtk2, so the
two-byte UTF-8 sequence for e.g. a umlaut is interpreted as
two characters encoded in Latin-1.
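
To see the misinterpretation in isolation, here is a minimal sketch
that needs neither Gtk2 nor the PodViewer. It assumes that
utf8::upgrade() is a fair stand-in for what happens to an unflagged
string on its way into Gtk2; the variable names are made up for
illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The UTF-8 encoding of "a umlaut" (U+00E4), held as a plain byte
# string without the UTF8 flag.
my $bytes = "\xC3\xA4";

# Assumption: upgrading stands in for what the binding does with an
# unflagged string. Each byte is taken as one Latin-1 character.
my $upgraded = $bytes;
utf8::upgrade($upgraded);

# Decoding the same bytes as UTF-8 gives the single intended character.
my $decoded = decode('UTF-8', $bytes);

printf "upgraded: %d characters, decoded: %d character(s)\n",
    length($upgraded), length($decoded);
```

The upgraded copy has two characters (the mojibake pair), the decoded
copy has one.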

The reason that $data has the UTF8 flag turned off
is that all of Gtk2::Ex::PodViewer::Parser operates
under "use bytes". The lines are read correctly from
your file: after <FH>, $_ has the UTF8 flag on. But appending
it to the variable $data (which is in the lexical
scope of "use bytes") "loses" the flag (cf. man perlunicode).
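
The effect can be reproduced without the module. The following is
only a sketch of the mechanism, not the parser's actual code: $line
stands in for a line read through the :utf8 layer, and the bare block
stands in for the module's "use bytes" scope.

```perl
use strict;
use warnings;

# Stand-in for a line read through the :utf8 layer: one character
# (a umlaut) with the UTF8 flag on.
my $line = "\x{e4}";
utf8::upgrade($line);

my $data = '';
{
    use bytes;          # same pragma as in the parser module
    $data .= $line;     # appends the raw internal bytes
}

print "line: ", (utf8::is_utf8($line) ? "flag on" : "flag off"), "\n";
print "data: ", (utf8::is_utf8($data) ? "flag on" : "flag off"),
      ", length ", length($data), "\n";
```

I would expect $data to come out with the flag off and a length of 2,
i.e. the two UTF-8 bytes treated as two separate characters.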

There's probably a reason for "use bytes" in Gtk2::Ex::PodViewer::Parser,
but it should be restricted to the places where it's really needed.
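
As a rough illustration of what "restricted" could look like (I don't
know what the module actually needs byte semantics for, so
byte_length() below is a purely hypothetical helper): keep the pragma
inside the one sub that needs it, and build the string outside of it.

```perl
use strict;
use warnings;

# Hypothetical helper: the only place that needs byte semantics.
sub byte_length {
    my ($string) = @_;
    use bytes;                 # in effect until the end of this sub
    return length($string);
}

my $line = "\x{e4}";
utf8::upgrade($line);          # stand-in for a line read via :utf8

my $data = '';
$data .= $line;                # outside "use bytes": flag is kept

printf "characters: %d, bytes: %d, flag %s\n",
    length($data), byte_length($data),
    utf8::is_utf8($data) ? "on" : "off";
```

Here $data stays a one-character string with the UTF8 flag on, while
byte_length() still sees its two bytes.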

Cheers, Roderich


