Re: [gmime-devel] Unescaped Unicode



On 02/21/2013 12:24 AM, Jeffrey Stedfast wrote:
> Maybe it would make sense to have gmime's html filter optionally add
> blockquotes instead of doing things the way you're doing it?

Perhaps, but we're trying to make sure this works well with format=flowed emails, so this isn't as simple as changing the <font color=""> tags to <blockquote>s.

> Can you give me an example of the input and what you want as the output?
> Do you want to nest the <blockquote>'s according to the line's citation
> depth?

Here's a simple example. I'll use ~ to represent space so you can better see what's going on.

>~This~is~a~line~of~flowed~text~
>~that~has~been~wrapped.
>~But~this~is~a~new~line.
~>~This~looks~like~a~quote,~but~
it~isn't.

should become:

<blockquote>This is a line of flowed text that has been wrapped.
But this is a new line.</blockquote>
&gt; This looks like a quote, but it isn't.
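
A minimal Python sketch of the unwrapping step in the example above, assuming the RFC 3676 rules with DelSp=no: leading ">" characters give the quote depth, one leading space is removed as space stuffing, and a trailing space marks a soft line break. The function name unflow is hypothetical and this is not the Geary or GMime code.

```python
def unflow(text):
    """Return a list of (quote_depth, paragraph) pairs from format=flowed text."""
    paragraphs = []
    pending = None  # (depth, partial text) of a paragraph still being joined
    for raw in text.split("\n"):
        # Quote depth is the number of leading '>' characters.
        stripped = raw.lstrip(">")
        depth = len(raw) - len(stripped)
        # Undo space stuffing: drop a single leading space, if present.
        if stripped.startswith(" "):
            stripped = stripped[1:]
        # A soft break cannot continue across a change in quote depth.
        if pending is not None and pending[0] != depth:
            paragraphs.append(pending)
            pending = None
        body = (pending[1] if pending else "") + stripped
        # A trailing space marks a soft break; "-- " (signature separator) never flows.
        if stripped.endswith(" ") and stripped != "-- ":
            pending = (depth, body)
        else:
            paragraphs.append((depth, body))
            pending = None
    if pending is not None:
        paragraphs.append(pending)
    return paragraphs
```

Fed the five input lines above (with ~ replaced by real spaces), this yields two paragraphs at depth 1 and one at depth 0 whose text still begins with a literal ">".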

> Not sure how big the can of worms is that my big mouth is offering to
> implement, but if it's not too difficult maybe I can add that feature ;-)

You can take a look at our implementation-in-progress, which uses a filter on either side of the FilterHTML. Prior to the FilterHTML, FilterFlowed (https://github.com/rschroll/geary/blob/plaintext/src/engine/rfc822/rfc822-gmime-filter-flowed.vala) undoes line wrapping and space stuffing, and converts quote symbols to 0x7f. After FilterHTML, FilterBlockquote (https://github.com/rschroll/geary/blob/plaintext/src/engine/rfc822/rfc822-gmime-filter-blockquotes.vala) removes the 0x7f flags and inserts <blockquote>s appropriately. This works fine, unless the email has lines starting with 0x7f. (Which it won't, so why am I worrying?)
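
The marker idea can be sketched in Python as follows. This is an illustration only, not the Vala filters linked above: html.escape stands in for FilterHTML, and the names MARK, mark_quotes and add_blockquotes are hypothetical. The input is assumed to be already unwrapped into (depth, text) pairs.

```python
import html

MARK = "\x7f"  # stand-in quote flag; HTML escaping leaves it untouched


def mark_quotes(paragraphs):
    """Encode (depth, text) pairs as lines prefixed with one MARK per quote level."""
    return "\n".join(MARK * depth + text for depth, text in paragraphs)


def add_blockquotes(escaped):
    """Strip MARK prefixes and open or close <blockquote>s as the depth changes."""
    lines = []
    current = 0
    for line in escaped.split("\n"):
        body = line.lstrip(MARK)
        depth = len(line) - len(body)
        if depth < current:
            # Close the finished quote levels at the end of the previous line.
            lines[-1] += "</blockquote>" * (current - depth)
        opening = "<blockquote>" * max(depth - current, 0)
        lines.append(opening + body)
        current = depth
    # Close anything still open at the end of the text.
    lines[-1] += "</blockquote>" * current
    return "\n".join(lines)


def render(paragraphs):
    """Pre-filter, escape (stand-in for FilterHTML), then post-filter."""
    marked = mark_quotes(paragraphs)
    return add_blockquotes(html.escape(marked, quote=False))
```

The sketch shares the weakness mentioned above: a message line that really starts with 0x7f is indistinguishable from a flagged quote and would be wrapped in a <blockquote>.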

As you see, the latter is relatively simple, but the former is non-trivial. Either we'd need all of that in FilterHTML, or FilterHTML would need a flag to indicate quote levels separate from >. Actually, the second solution might be feasible. FilterHTML could gain an optional "quote_marker" flag, defaulting to ">", so it would work automatically with unprocessed text. But people like me could set it to be something else and do our own preprocessing.
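
To make the proposal concrete, here is a small Python sketch of how a configurable marker might be consumed. The quote_marker parameter is the hypothetical option suggested above, not an existing GMime setting, and split_citation is an invented name.

```python
def split_citation(line, quote_marker=">"):
    """Return (depth, rest): the number of leading quote_marker prefixes and the remainder."""
    if not quote_marker:
        raise ValueError("quote_marker must not be empty")
    depth = 0
    while line.startswith(quote_marker):
        line = line[len(quote_marker):]
        depth += 1
    return depth, line
```

With the default, split_citation(">> nested") reports depth 2, so unprocessed text keeps working. With quote_marker="\x7f", a line beginning with a literal ">" is reported at depth 0 and left for normal escaping.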

Does that make sense?

Thanks,
Robert

