Re: How to keep UTF-8 characters, but escape non-UTF-8 byte sequence to hex codes in ASCII



At 02:40 AM 12/5/2006, Peter Lund wrote:
On Mon, 2006-12-04 at 18:22 -0800, Daniel Yek wrote:


At 07:29 AM 12/1/2006, Peter Lund wrote:
>On Thu, 2006-11-30 at 15:46 -0800, Daniel Yek wrote:
>
> > Well, with g_utf8_validate(), it is trivial to implement a function that
> > escapes non-UTF-8 bytes to hex. However, I then found out that TreeView, or
> > more likely Pango, would unescape the %xx sequence (undo my attempt to help
> > it) and choke!??!
>
>What if you double the percent signs?

Was that supposed to work? I tried it, and it didn't.

I just made a test (a simple modification of a program I'm hacking on).

Inserting strings like "abcæø%F3æø" into a GtkTreeStore works fine. GtkTreeView displays them precisely as expected.

(In other words, "escaping" the %-sign with another %-sign is completely unnecessary.)

I think something is going wrong at your end, unfortunately :(

Thanks much, Peter! You got it completely right. I wasn't aware of some code that un-escapes the string before displaying it. (No wonder international characters were already displaying correctly even before I attempted to handle them.) That was the last obstacle. Glad to have it resolved.

In my opinion, it would be great if g_filename_to_utf8(), or better a new variant of it, automatically escaped bytes that are not representable in UTF-8, so that GLib would just work with whatever file names exist on the system (especially since a Linux system will happily work with file names in any character set). Application developers could then choose whether supporting file names with unknown encoding is more important than allowing file names that contain a raw %xx ASCII sequence. That is just one item on my wish list.
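
For reference, here is a minimal sketch of the kind of escaping discussed in this thread, written in plain C so that it stands alone. The helper names escape_invalid_utf8() and utf8_seq_len() are hypothetical, and the hand-written check is only a stand-in for g_utf8_validate(); in a GLib program one would more likely call g_utf8_validate() and escape the byte at the position it reports. The uppercase %XX output format is also an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a UTF-8 validity check.
 * Returns the length (1..4) of the valid UTF-8 sequence that starts at s,
 * or 0 if the bytes at s do not form one. n (>= 1) is the bytes remaining. */
static size_t utf8_seq_len(const unsigned char *s, size_t n)
{
    unsigned char c = s[0];
    unsigned long cp;
    size_t len, i;

    if (c < 0x80)
        return 1;                                   /* plain ASCII */
    else if (c >= 0xC2 && c <= 0xDF) { len = 2; cp = c & 0x1F; }
    else if (c >= 0xE0 && c <= 0xEF) { len = 3; cp = c & 0x0F; }
    else if (c >= 0xF0 && c <= 0xF4) { len = 4; cp = c & 0x07; }
    else
        return 0;          /* stray continuation byte or invalid lead byte */

    if (n < len)
        return 0;                                   /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                               /* not a continuation */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, UTF-16 surrogates, and values past U+10FFFF. */
    if (len == 3 && cp < 0x800) return 0;
    if (len == 4 && cp < 0x10000) return 0;
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
    if (cp > 0x10FFFF) return 0;
    return len;
}

/* Hypothetical helper: copies valid UTF-8 sequences through unchanged and
 * writes every other byte as %XX. Returns a newly allocated, NUL-terminated
 * string (caller frees), or NULL if allocation fails. */
static char *escape_invalid_utf8(const char *in, size_t n)
{
    const unsigned char *s = (const unsigned char *)in;
    char *out;
    size_t i = 0, o = 0;

    assert(in != NULL);
    out = malloc(n * 3 + 1);          /* worst case: every byte is escaped */
    if (out == NULL)
        return NULL;

    while (i < n) {
        size_t len = utf8_seq_len(s + i, n - i);
        if (len > 0) {
            memcpy(out + o, s + i, len);            /* keep valid UTF-8 */
            o += len;
            i += len;
        } else {
            sprintf(out + o, "%%%02X", (unsigned)s[i]);  /* escape one byte */
            o += 3;
            i += 1;
        }
    }
    out[o] = '\0';
    return out;
}
```

Under these assumptions, the eight bytes 'a' 'b' 'c' C3 A6 F3 C3 B8 come out as "abcæ%F3ø". A name that already contains a literal "%F3" passes through unchanged, so the two cases cannot be told apart afterwards unless '%' itself is escaped too, which is the trade-off mentioned above.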


--
Daniel Yek



