Re: How to keep UTF-8 characters, but escape non-UTF-8 byte sequence to hex codes in ASCII



On Thu, 2006-11-30 at 15:46 -0800, Daniel Yek wrote:
Well, with g_utf8_validate(), it is trivial to implement a function that 
escapes non-UTF-8 bytes to hex. However, I then found out that TreeView, or 
more likely Pango, would unescape the %xx sequences (undoing my attempt to help 
it) and choke!??!

Use another escape-representation character?
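
For illustration, here is a minimal sketch of that idea: valid UTF-8 is kept as-is, and each invalid byte is written out as an ASCII hex escape with a prefix other than "%". It is in Python rather than C, and it relies on Python's own UTF-8 decoder to locate the bad bytes instead of g_utf8_validate() (which, in C, reports the position of the first invalid byte through its end pointer). The function name escape_invalid_utf8 and the "\x" prefix are assumptions chosen for the example, not anything from GLib or Pango, and whether a given prefix is left alone by the widget would still need to be checked.

```python
def escape_invalid_utf8(data: bytes, fmt: str = "\\x{:02X}") -> str:
    """Decode `data` as UTF-8, replacing each invalid byte with a hex escape.

    `fmt` is a hypothetical escape format applied to every offending byte;
    the default produces text such as \\xFF.
    """
    out = []
    pos = 0
    while pos < len(data):
        try:
            # Everything from `pos` onward is valid: keep it and stop.
            out.append(data[pos:].decode("utf-8"))
            break
        except UnicodeDecodeError as err:
            # err.start and err.end are offsets relative to data[pos:].
            bad_start = pos + err.start
            bad_end = pos + err.end
            # The bytes before the error are valid UTF-8, so keep them.
            out.append(data[pos:bad_start].decode("utf-8"))
            # Escape each offending byte individually.
            out.extend(fmt.format(b) for b in data[bad_start:bad_end])
            pos = bad_end
    return "".join(out)


if __name__ == "__main__":
    # One stray 0xFF byte in otherwise valid UTF-8.
    print(escape_invalid_utf8(b"caf\xc3\xa9 \xff ok"))
```

The result contains only valid characters, so it can be handed to anything that insists on well-formed UTF-8. A different prefix can be tried by passing another fmt, for example fmt="<{:02X}>".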

More random thoughts:
Is there a way to ask Pango to render illegal UTF-8 bytes as the more 
pleasant rectangle with hex number in it (as in the case when the font is 
not installed), rather than printing out cryptic messages on the terminal?

No; the rectangle-with-hex isn't Pango's devising, it's an actual font
(most likely DejaVu Sans) which contains that glyph as a fallback.
Because illegal UTF-8 non-sequences don't correspond to a codepoint,
there is no character to display and so no glyph to fall back to.

Ed
