Re: Invalid UTF8 string passed to pango_layout_set_text()



> I am trying add contents to clist from a file which contains filenames,

I assume this is on Linux or some other POSIX system? File names in
POSIX are not required to be in any consistent character set or
encoding; a file name is just a string of bytes. Especially on
older file systems, or on multiprotocol network file servers that have
a history of being mounted on machines running different operating
systems, with users using assorted locales, it is not uncommon to come
across inconsistent charsets in file names.

You can use g_filename_display_name() to reliably get a UTF-8
"display" form of a file name. This might then contain one or several
instances of the "Unicode replacement character" if GLib can't figure
out the charset/encoding of the file name.

If you have knowledge of the history of the site, you could write some
code that tries some known legacy charsets/encodings if the name isn't
in UTF-8 (in case UTF-8 is what is currently generally used at the
site, as one might hope). Possibly even with some mapping table like
"this user's home directory is likely to contain file names in Hebrew
in ISO-8859-8, this user's in Russian in KOI8-R", etc. But anyway,
in the general case you are basically out of luck: there is no
reliable way to tell which encoding a given file name is in.

--tml


