Re: [evolution-patches] patch to revert the fix for bug #42170 (fixes bug #46331)



Not Zed wrote:

I don't agree with this patch.  The problem is the header is invalid
anyway since it's got 8 bit data in it.


yea, but so was the other header :-)


Having pango abort and display nothing on bad text is a bigger problem
than a badly displayed, bad text.


I'll agree that we should not feed non-UTF-8 to etable, so that we don't
get an abort() in pango, but that doesn't mean we have to sacrifice
being able to handle raw iso-8859-1 headers (though I agree they are
invalid, and normally I wouldn't feel bad about them not working, but
the original patch was wrong...).


I guess something is wrong with the other patch, maybe something is
re-encoding utf8 twice or something, or his locale default is utf8, in
which case it *was* doing the right thing ...  But something needs to be
there to address 42170.


header_decode_word() was converting word tokens into UTF-8, but the code
that calls header_decode_word() expected raw word tokens. That code was
later converting the result to UTF-8 again with header_decode_string(),
which is where it *should* be converted to UTF-8...
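
To illustrate the double-conversion problem, here is a minimal standalone
sketch in plain C. latin1_to_utf8() is a hypothetical helper, not the
actual Camel code, and it assumes the raw bytes are iso-8859-1. Running it
once over the byte 0xE9 (e-acute) gives the correct two-byte sequence
C3 A9; running it again over that output gives the four bytes C3 83 C2 A9,
which is the kind of garbling a second conversion would produce:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not Camel code): treat every input byte as an
 * iso-8859-1 character and write its UTF-8 encoding to out.
 * Returns the number of bytes written (not counting the NUL),
 * or -1 if out is too small. */
static int
latin1_to_utf8 (const char *in, char *out, size_t outsize)
{
	const unsigned char *p = (const unsigned char *) in;
	size_t n = 0;

	assert (in != NULL && out != NULL);

	for (; *p; p++) {
		if (*p < 0x80) {
			/* us-ascii byte: copied unchanged */
			if (n + 1 >= outsize)
				return -1;
			out[n++] = (char) *p;
		} else {
			/* 8bit byte: becomes a two-byte UTF-8 sequence */
			if (n + 2 >= outsize)
				return -1;
			out[n++] = (char) (0xC0 | (*p >> 6));
			out[n++] = (char) (0x80 | (*p & 0x3F));
		}
	}

	out[n] = '\0';
	return (int) n;
}
```

Calling latin1_to_utf8() on its own output is the mistake being shown
here: the second pass cannot know its input is already UTF-8.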

since lewing's original bug was just that local-part tokens of
addr-specs would be left in a non-UTF-8 state and thus break pango... I
suggest that we either convert addr->str to UTF-8 when we're done
constructing it *or* just replace 8bit chars with '?' in the
addr-spec string.

so perhaps something like this:

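/* addr is assumed to be the GString holding the assembled addr-spec */
if (!g_utf8_validate (addr->str, addr->len, NULL)) {
   unsigned char *ptr;

   /* not valid UTF-8: replace each 8bit byte with '?' so that
    * only us-ascii is left for pango to render */
   ptr = (unsigned char *) addr->str;
   while (*ptr) {
      if (*ptr >= 128)
         *ptr = '?';
      ptr++;
   }
}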


the only reason I suggest '?' (or '_' or 'x' or something) rather than
trying to convert to UTF-8 is that addr-specs don't allow anything but
us-ascii anyway, so the address is already invalid (100% guarantee that
it is spam). Now, someone might argue that hostnames in the future
will be UTF-8, well... we've already got that covered - the replacement
only happens when the string fails UTF-8 validation, so a token that is
*not* in UTF-8 is still invalid and valid UTF-8 is left alone.

if people really want, we could try to convert to UTF-8. I don't really
care much one way or the other... just that I don't feel it is worth it.

Jeff



On Fri, 2003-07-25 at 06:40, Jeffrey Stedfast wrote:
I think a better approach to bug #42170 might be to check the parsed
addr->str when we're done and do charset conversions there instead? or
maybe just replace those invalid 8bit chars with a '?' or something?
it's 100% invalid... those can't be real email addresses, so no sense
trying to "make it work".

Jeff







