Re: [evolution-patches] patch to revert the fix for bug #42170 (fixes bug #46331)



looks ok.

i guess the only thing that might concern me is that there's some other
case not handled, but i guess it can be fixed if it comes up as an
issue.

 Z

On Fri, 2003-07-25 at 14:37, Jeffrey Stedfast wrote:
> ok, new patch (one that fixes both bugs)
> 
> now for an explanation on how name strings were being
> double-utf-8-encoded:
> 
> header_decode_word() converts all rfc822 word strings into UTF-8 before
> returning them.
> 
> header_decode_mailbox() calls header_decode_word() to grab 'word' by
> 'word' to construct the 'name' portion of the amil address. Each 'word'
> that it gets, it passes to header_decode_string() so that if the word is
> an rfc2047 encoded word, it will get decoded and converted to UTF-8
> before being appended to the GString *name variable. However,
> header_decode_string *also* converts raw 8bit strings into UTF-8, this
> is how the name strings were double-utf-8-encoded.
> 
> Since we don't need header_decode_word() to convert to UTF-8 for any
> place other than in header_decode_mailbox() for that one specific bug
> (42170 - which is a bug about how addr-spec strings are not sanitised to
> UTF-8, thus allowing spammers to send us invalid 8bit addr-specs that
> cause pango to crash), I think a better solution for bug #42170 would be
> to wait until we've finished parsing out the addr-spec string and
> putting it all together into GString *addr and then checking that it is
> valid UTF-8... if it is not, then convert it to UTF-8.
> 
> I think this is the cleaner solution...
> 
> Jeff
> 
> On Fri, 2003-07-25 at 13:22, Jeffrey Stedfast wrote:
> > Not Zed wrote:
> > 
> > >I don't agree with this patch.  The problem is the header is invalid
> > >anyway since it's got 8 bit data in it.
> > >
> > 
> > yea, but so was the other header :-)
> > 
> > >
> > >Having pango abort and display nothing on bad text is a bigger problem
> > >than a badly displayed, bad text.
> > >
> > 
> > I'll agree that we should not feed non-UTF-8 to etable so that we don't
> > get an abort() in pango, but that doesn't mean that we have to sacrifice
> > being able to handle raw iso-8859-1 headers (tho I agree they are
> > invalid and I normally wouldn't feel bad about it not working, but the
> > original patch was wrong...).
> > 
> > >
> > >I guess something is wrong with the other patch, maybe something is
> > >re-encoding utf8 twice or something, or his locale default is utf8, in
> > >which case it *was* doing the right thing ...  But something needs to be
> > >there to address 42170.
> > >
> > 
> > header_decode_word() was converting word tokens into UTF-8 and the code
> > that was using header_decode_word() expected word-tokens. This code was
> > later converting it to UTF-8 again with the header_decode_string() which
> > is where it *should* be converted to UTF-8...
> > 
> > since lewing's original bug was just that local-part tokens of
> > addr-spec's would be left in a non-UTF-8 state and thus break pango... I
> > suggest that we just convert the addr->str to UTF-8 when we're done
> > constructing it *or* we just replace 8bit chars with '?' in the
> > addr-spec string.
> > 
> > so perhaps something like this:
> > 
> > if (!g_utf8_validate (addr->str, addr->len)) {
> >     unsigned char *ptr;
> > 
> >     ptr = addr->str;
> >     while (*ptr) {
> >        if (*ptr >= 128)
> >           *ptr = '?';
> >        ptr++;
> >     }
> > }
> > 
> > 
> > the only reason I suggest '?' (or '_' or 'x' or something) rather than
> > trying to convert to UTF-8 is that addr-specs don't allow anything but
> > us-ascii anyway, so the address is already invalid (100% guarentee that
> > it is spam). Now, someone might argue that since hostnames in the future
> > will be UTF-8, well... we've already got that covered - the token is
> > *not* in UTF-8 and so again it is still invalid.
> > 
> > if people really want, we could try and convert to utf-8. I don't really
> > care much one way or the other...just that I don't feel it is worth it.
> > 
> > Jeff
> > 
> > >
> > >
> > >On Fri, 2003-07-25 at 06:40, Jeffrey Stedfast wrote:
> > >  
> > >
> > >>I think a better approach to bug #42170 might be to check the parsed
> > >>addr->str when we're done and do charset conversions there instead? or
> > >>maybe just replace those invalid 8bit chars with a '?' or something?
> > >>it's 100% invalid... those can't be real email addresses, so no sense
> > >>trying to "make it work".
> > >>
> > >>Jeff
> > >>    
> > >>
> > >
> > >  
> > >
> > 
> > 
> > 
> > 
> > _______________________________________________
> > Evolution-patches mailing list
> > Evolution-patches lists ximian com
> > http://lists.ximian.com/mailman/listinfo/evolution-patches




[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]