Re: Issues with Gtk2 dialogs and UTF8 data.
- From: Roderich Schupp <roderich schupp googlemail com>
- To: Tadej Borovšak <tadeboro gmail com>
- Cc: gtk-perl-list gnome org
- Subject: Re: Issues with Gtk2 dialogs and UTF8 data.
- Date: Mon, 20 Dec 2010 14:47:09 +0100
On Mon, Dec 20, 2010 at 1:06 PM, Tadej Borovšak <tadeboro gmail com> wrote:
> binmode $input, ':utf8';
That's crucial here. I suspect the following:
- your script has "use utf8;", hence literal strings with non-ASCII
  characters have the internal UTF-8 flag turned ON, hence using such a
  string in a Gtk2::Label works as expected
- your file is indeed UTF-8 encoded, but you didn't open it in :utf8 mode
  as above; hence a string read from the file has the internal UTF-8 flag
  turned OFF (though it contains the "correct" byte sequences)
- concatenating a string with the UTF-8 flag on with another with the flag
  off causes the non-UTF-8 string to be "upgraded to UTF-8", which means
  - its bytes are interpreted as being in Latin-1 (!) and all non-ASCII
    bytes are converted to the corresponding UTF-8 multibyte sequences
  - the UTF-8 flag is turned on
  - e.g. a UTF-8 encoded "LATIN SMALL LETTER U WITH DIAERESIS"
    (U+00FC) will be interpreted as two Latin-1 chars 0xC3 and 0xBC,
    which represent codepoints U+00C3 "LATIN CAPITAL LETTER A WITH TILDE"
    and U+00BC "VULGAR FRACTION ONE QUARTER", resp., so the label shows
    "Ã¼" instead of "ü"
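
Here is a small sketch of the effect described above. It is not taken
from your script: $from_file is a hypothetical stand-in for a line read
without an :utf8 layer, and $literal stands in for a "use utf8" literal
(I use a \x{...} escape so the example does not depend on the encoding
of the source file). The codepoints() helper is made up for printing.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for a line read from a UTF-8 file without an
# :utf8 layer: the two raw bytes encoding U+00FC, UTF-8 flag off.
my $from_file = "\xC3\xBC";

# Stand-in for a literal under "use utf8": U+0161 does not fit in one
# byte, so Perl stores this string with the UTF-8 flag on.
my $literal = "\x{161}: ";

# Concatenation upgrades $from_file as if its bytes were Latin-1.
my $broken = $literal . $from_file;

# Decoding first turns the two bytes into the single character U+00FC.
my $fixed = $literal . decode('UTF-8', $from_file);

# Illustrative helper: list the codepoints of a string.
sub codepoints {
    return join ' ', map { sprintf('U+%04X', ord($_)) } split //, $_[0];
}

print codepoints($broken), "\n";   # U+0161 U+003A U+0020 U+00C3 U+00BC
print codepoints($fixed),  "\n";   # U+0161 U+003A U+0020 U+00FC
```

Setting the layer on the handle with binmode (or in open) before reading
has the same effect as the explicit decode() here, since the strings
then arrive already decoded.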
Cheers, Roderich