[gmime-devel] Unescaped Unicode
- From: Robert Schroll <rschroll gmail com>
- To: gmime-devel-list gnome org
- Subject: [gmime-devel] Unescaped Unicode
- Date: Wed, 20 Feb 2013 23:11:04 -0300
Hi Jeff,
Thanks for the prompt fixes! I have another question about FilterHTML,
but this one isn't a bug, I swear. As I understand it, FilterHTML will
either escape non-ASCII Unicode characters as &#uuuu; or convert them to
question marks. Is there some way to let them stay encoded as UTF-8, or
even to let all bytes through without checking for Unicode validity? If
not, should there be?
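
To make the two behaviours concrete, here is a rough Python sketch. It is
not GMime code and does not reproduce FilterHTML's actual logic; both
function names are made up for illustration. The first function escapes
only the HTML-significant characters and leaves non-ASCII text alone
(the behaviour being asked for); the second also turns every non-ASCII
character into a numeric character reference.

```python
import html


def escape_keep_utf8(text: str) -> str:
    """Escape only &, < and >; non-ASCII characters pass through untouched."""
    return html.escape(text, quote=False)


def escape_to_ascii(text: str) -> str:
    """Escape &, < and >, then replace each non-ASCII character with a
    decimal numeric character reference so the result is pure ASCII."""
    escaped = html.escape(text, quote=False)
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

With these, `escape_keep_utf8("café <b>")` keeps the `é` as is, while
`escape_to_ascii("café <b>")` emits it as `&#233;`.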
I ask because I'm trying to use FilterHTML to take a plain text email
from the wire and convert it to HTML for display. There's no need to
worry about getting the output into a 7-bit encoding, so the escaping
doesn't really help. It actually gets in the way a bit: I'm also
trying to put <blockquote> tags around the quoted sections. I do this by
marking the quoted sections with a flag as I'm unwrapping the flowed
text before sending it through FilterHTML, and then adding the HTML tags
afterwards based on the flags. My two top choices for the flag would be
a private Unicode control character or a byte invalid in Unicode. The
second is completely ruled out; the first is still feasible, but is less
clear with the escaping going on. (FWIW, I'm using 0x7f, the DEL
character, as the flag right now. I'm not expecting any emails to have
this in them, but still....)
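
The flag-then-tag approach described above can be sketched roughly as
follows, in Python rather than against the GMime API. Everything here is
an assumption made for illustration: the function names are hypothetical,
`html.escape` merely stands in for FilterHTML, the flag is the
private-use code point U+E000 instead of 0x7f, and nested quote depth is
ignored.

```python
import html

# Assumption: a private-use code point that never occurs in real mail.
QUOTE_FLAG = "\ue000"


def flag_quoted(text: str) -> str:
    """Replace the leading '>' of each quoted line with the flag."""
    out = []
    for line in text.split("\n"):
        if line.startswith(">"):
            out.append(QUOTE_FLAG + line[1:].lstrip(" "))
        else:
            out.append(line)
    return "\n".join(out)


def stand_in_filter(text: str) -> str:
    """Stand-in for the HTML filter: escapes &, < and > only,
    so the flag character survives unchanged."""
    return html.escape(text, quote=False)


def add_blockquotes(escaped: str) -> str:
    """Wrap each run of flagged lines in <blockquote> and drop the flags."""
    parts = []
    in_quote = False
    for line in escaped.split("\n"):
        quoted = line.startswith(QUOTE_FLAG)
        if quoted and not in_quote:
            parts.append("<blockquote>")
        elif in_quote and not quoted:
            parts.append("</blockquote>")
        in_quote = quoted
        parts.append(line[1:] if quoted else line)
    if in_quote:
        parts.append("</blockquote>")
    return "\n".join(parts)
```

Chaining the three steps, `add_blockquotes(stand_in_filter(flag_quoted(text)))`,
only works if the middle step leaves the flag intact. A filter that
rewrites non-ASCII characters would alter the flag, and the last step
would then have to look for its escaped form instead.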
Of course, if I'm going about this in a completely wrong way, please say so.
Thanks,
Robert