Re: How to keep UTF-8 characters, but escape non-UTF-8 byte sequence to hex codes in ASCII



At 12:35 PM 12/1/2006, Edward Catmur wrote:
On Thu, 2006-11-30 at 15:46 -0800, Daniel Yek wrote:
> Well, with g_utf8_validate(), it is trivial to implement a function that
> escapes non-UTF-8 bytes to hex. However, I then found out that TreeView, or
> more likely Pango, would unescape the %xx sequences (undoing my attempt to
> help it) and choke!

Use another escape-representation character?
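
As a rough illustration of that idea (not the code discussed in this
thread, which would use g_utf8_validate() in C), here is a minimal Python
sketch that keeps valid UTF-8 and rewrites each invalid byte as \xNN
instead of %NN. The handler name "hexescape" and the function
escape_invalid_utf8() are made up for this example, and the backslash
marker is only an assumption about what the display layer leaves alone.

```python
import codecs


def _hex_escape(err):
    """Decode error handler: replace undecodable bytes with \\xNN text."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    # One \xNN group per offending byte; decoding resumes at err.end.
    return "".join("\\x%02X" % b for b in bad), err.end


# "hexescape" is a hypothetical handler name chosen for this sketch.
codecs.register_error("hexescape", _hex_escape)


def escape_invalid_utf8(data: bytes) -> str:
    """Return text with valid UTF-8 kept and invalid bytes hex-escaped."""
    return data.decode("utf-8", errors="hexescape")


if __name__ == "__main__":
    print(escape_invalid_utf8(b"caf\xc3\xa9 \xff ok"))
```

Literal backslashes already present in the input are not escaped here, so
the result is meant for display only and cannot be reversed reliably.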

> More random thoughts:
> Is there a way to ask Pango to render illegal UTF-8 bytes as the more
> pleasant rectangle with hex number in it (as in the case when the font is
> not installed), rather than printing out cryptic messages on the terminal?

No; the rectangle-with-hex isn't Pango's devising, it's an actual font
(most likely DejaVu Sans) which contains that glyph as a fallback.
Because illegal UTF-8 non-sequences don't correspond to a codepoint,
there is no character to display and so no glyph to fall back to.

Thanks! That was informative.


--
Daniel Yek


Ed



