Re: [gmime-devel] Issue with header decoding (continues)



Hey,

I found a small memory leak in rfc2047_decode_tokens(): a g_free() is missing in the else clause that calls charset_convert().

Attaching patch.
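
For readers without the GMime source at hand, here is a minimal standalone sketch of the ownership pattern behind this kind of leak. It assumes a converter that returns a heap-allocated string and an append routine that copies its input. `struct strbuf`, `strbuf_append_len()`, `convert_copy()` and `append_converted()` are hypothetical stand-ins written for this illustration, not GLib or GMime APIs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer, standing in for a GString. */
struct strbuf {
	char *data;
	size_t len;
	size_t cap;
};

/* Append n bytes of s to buf, copying them; returns 0 on success. */
static int
strbuf_append_len (struct strbuf *buf, const char *s, size_t n)
{
	assert (buf != NULL && s != NULL);

	if (buf->len + n + 1 > buf->cap) {
		size_t cap = (buf->len + n + 1) * 2;
		char *tmp = realloc (buf->data, cap);

		if (tmp == NULL)
			return -1;

		buf->data = tmp;
		buf->cap = cap;
	}

	memcpy (buf->data + buf->len, s, n);
	buf->len += n;
	buf->data[buf->len] = '\0';
	return 0;
}

/* Hypothetical converter: returns a malloc'd string owned by the caller. */
static char *
convert_copy (const char *in, size_t n)
{
	char *out = malloc (n + 1);

	if (out == NULL)
		return NULL;

	memcpy (out, in, n);
	out[n] = '\0';
	return out;
}

/* Convert, append, then release the temporary string. */
static int
append_converted (struct strbuf *buf, const char *in, size_t n)
{
	char *str = convert_copy (in, n);
	int rc;

	if (str == NULL)
		return -1;

	rc = strbuf_append_len (buf, str, n); /* copies the bytes */
	free (str); /* without this, every call leaks the temporary */
	return rc;
}
```

Because the append copies the bytes, the temporary is no longer needed once the append returns, and the caller is the only place that can free it.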

Best regards 


On Thu, Dec 22, 2011 at 2:46 AM, Jeffrey Stedfast <fejj gnome org> wrote:
Thanks for pointing this out!

You don't actually need to check whether it's ASCII in a loop; that's already known by that point in the code, since the code above has already checked for it.

All that was needed was:

token->is_8bit = ascii ? 0 : 1;

Anyway, I've fixed this in git master.

Jeff


On 12/21/2011 11:22 AM, evil legacy wrote:
Hey, thanks for the quick patch!

I think I found a small problem with the new tokenized decoding.
When trying to decode a broken header that has no encoding, e.g.:

Subject: ×××ע× ××ש× ××ס××× ×××× ש× ××× ×ע×ר

tokenize_rfc2047_text() does

                        if ((token = rfc2047_token_new_encoded_word (word, n)))

and if the word isn't an encoded-word, token->encoding is not set, but nothing checks whether the word is non-ASCII either, so is_8bit is never set.
Without token->encoding or token->is_8bit, rfc2047_decode_tokens() falls back to copying the data as is,
which in some cases makes g_mime_utils_header_decode_text() return a non-UTF-8 string.

A quick fix I found is to do the ASCII check when the word isn't encoded, and to set is_8bit if it isn't ASCII.

In tokenize_rfc2047_text(), line 2136:

                        } else {
                                /* append the lwsp and atom tokens */
                                if (lwsp != NULL) {
                                        tail->next = lwsp;
                                        tail = lwsp;
                                }

                                token = rfc2047_token_new (word, n);
                                tail->next = token;
                                tail = token;

                                ascii = TRUE;
                                while (n--)
                                        ascii = ascii && is_ascii (*word++);

                                if (!ascii)
                                        token->is_8bit = 1;

                                encoded = FALSE;
                        }
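
For readers following along without the source, here is a minimal standalone sketch of the idea: scan an unencoded word once and record whether it contains any 8-bit bytes. `struct token_sketch`, `all_ascii()` and `token_sketch_new()` are hypothetical names invented for this illustration; they are not GMime's types or functions, and the real tokenizer may already know the answer by the time it creates the token.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the token type; only the fields discussed here. */
struct token_sketch {
	const char *text;
	size_t length;
	unsigned int is_8bit : 1;
};

/* Return 1 if every byte in [s, s + n) has the high bit clear. */
static int
all_ascii (const char *s, size_t n)
{
	const unsigned char *p = (const unsigned char *) s;
	size_t i;

	for (i = 0; i < n; i++) {
		if (p[i] & 0x80)
			return 0;
	}

	return 1;
}

/* Build a token for an unencoded word, flagging 8-bit content. */
static struct token_sketch
token_sketch_new (const char *word, size_t n)
{
	struct token_sketch token;

	assert (word != NULL);

	token.text = word;
	token.length = n;
	token.is_8bit = all_ascii (word, n) ? 0 : 1;
	return token;
}
```

With the flag set, a decoder can route such a word through charset conversion instead of copying the raw bytes into the result.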

Regards,
Eddie

On Mon, Dec 19, 2011 at 1:00 AM, Jeffrey Stedfast <fejj gnome org> wrote:
The attached patch should fix it if applied to the latest gmime from git master.

A quick test has it passing all of the unit tests (e.g. test-mime), so it's probably good to go.

I don't normally like to land such massive patches in a stable cycle, so if you could test this out on your messages and see how it works in the wild, that'd be great.

This patch should also handle cases where base64 and/or quoted-printable data was split between encoded-word tokens (which addresses another feature request I've gotten a few times now).


Jeff

On 12/18/2011 09:02 AM, evil legacy wrote:
Hi, 

I came across another header decoding problem when dealing with badly split UTF-8 headers, e.g.:

=?utf-8?B?16nXoteV158g15PXldek16cgR0FSTUlOINei150gR1BTINee15XXkdeg15Qg15XXktc=?=  =?utf-8?B?nSDXkNeo16DXpyDXkNeV16TXoNeq15kg157XoteV16gg157XqdeV15HXlyE=?='

It looks like someone split a UTF-8 string incorrectly, leaving "half" a character in each part.

g_mime_utils_header_decode_text/phrase splits the header into words and decodes each word separately. Since the charset is UTF-8, iconv isn't used and the string is validated with this loop:

while (!g_utf8_validate (p, len, (const char **) &p)) {
        len = declen - (p - (char *) decoded);
        *p = '?';
}

Because the original string was split badly, the 'half' characters are replaced with '?'.

I'm attaching a patch that moves the UTF-8 validation to the end of g_mime_utils_header_decode_text/phrase, where the decoded words have already been combined.
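
To make the failure mode concrete, here is a small standalone sketch. It assumes the bytes around the split are `D7 92 D7` at the end of the first encoded-word and `9D 20 D7 90` at the start of the second; I worked these out by hand from the Base64 text quoted above, so treat them as illustrative. `utf8_valid()` is a simplified validator written for this example, not g_utf8_validate(), and `utf8_valid_joined()` is likewise a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed bytes on either side of the split (hand-decoded, illustrative). */
static const unsigned char first_tail[] = { 0xd7, 0x92, 0xd7 };
static const unsigned char second_head[] = { 0x9d, 0x20, 0xd7, 0x90 };

/* Simplified UTF-8 check: lead/continuation structure, overlong forms,
 * surrogates and the U+10FFFF upper bound. Returns 1 if valid. */
static int
utf8_valid (const unsigned char *s, size_t n)
{
	size_t i = 0;

	while (i < n) {
		unsigned char c = s[i];
		unsigned long cp;
		size_t need, k;

		if (c < 0x80) {
			i++;
			continue;
		} else if ((c & 0xe0) == 0xc0) {
			need = 1;
			cp = c & 0x1f;
		} else if ((c & 0xf0) == 0xe0) {
			need = 2;
			cp = c & 0x0f;
		} else if ((c & 0xf8) == 0xf0) {
			need = 3;
			cp = c & 0x07;
		} else {
			return 0; /* stray continuation byte or invalid lead byte */
		}

		if (n - i - 1 < need)
			return 0; /* sequence truncated at the end of the buffer */

		for (k = 1; k <= need; k++) {
			if ((s[i + k] & 0xc0) != 0x80)
				return 0;
			cp = (cp << 6) | (s[i + k] & 0x3f);
		}

		if ((need == 1 && cp < 0x80) || (need == 2 && cp < 0x800) ||
		    (need == 3 && (cp < 0x10000 || cp > 0x10ffff)) ||
		    (cp >= 0xd800 && cp <= 0xdfff))
			return 0;

		i += need + 1;
	}

	return 1;
}

/* Validate two decoded chunks only after joining them. */
static int
utf8_valid_joined (const unsigned char *a, size_t alen,
                   const unsigned char *b, size_t blen)
{
	unsigned char buf[64];

	assert (alen + blen <= sizeof (buf));
	memcpy (buf, a, alen);
	memcpy (buf + alen, b, blen);
	return utf8_valid (buf, alen + blen);
}
```

Each chunk fails validation on its own (the first ends with a dangling lead byte, the second starts with a stray continuation byte), while the joined buffer passes. That is why validating after the words are combined avoids the spurious '?' replacements.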

Best Regards

On Sat, Dec 17, 2011 at 6:49 PM, Jeffrey Stedfast <fejj gnome org> wrote:
Hi,

I've just released GMime 2.4.29 and 2.6.2 with your fix (and other similar fixes).


Jeff

On 12/14/2011 01:26 PM, evil legacy wrote:
Hi, 

After more debugging, I found that the problem occurs when iconv (cd, NULL, NULL, &outbuf, &outleft) tries to flush its state to outbuf, but outbuf isn't big enough to hold the result.
This little patch to the charset_convert function seems to fix this problem (works for me):

<patch>

diff --git a/gmime/gmime-utils.c b/gmime/gmime-utils.c
index ca32b61..093deee 100644
--- a/gmime/gmime-utils.c
+++ b/gmime/gmime-utils.c
@@ -1553,7 +1553,15 @@ charset_convert (iconv_t cd, const char *inbuf, size_t inleft, char **outp, size
                }
        } while (inleft > 0);
        
-       iconv (cd, NULL, NULL, &outbuf, &outleft);
+       while (iconv (cd, NULL, NULL, &outbuf, &outleft) == (size_t) -1)
+               if (errno == E2BIG) {
+                       outlen += 16;
+                       rc = (size_t) (outbuf - out);
+                       out = g_realloc (out, outlen + 1);
+                       outleft = outlen - rc;
+                       outbuf = out + rc;
+               }
+
        *outbuf++ = '\0';
        
        *outlenp = outlen;

</patch>
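
For anyone who wants to try the idea outside GMime, here is a self-contained sketch of the same approach using plain POSIX iconv: grow the output buffer on E2BIG both while converting and while flushing. `convert_and_flush()` and `grow_output()` are hypothetical names for this example, not GMime functions, and the sketch simply gives up on invalid input rather than substituting characters.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

#define GROW_STEP 16

/* Enlarge the output buffer by GROW_STEP bytes, keeping the write position. */
static int
grow_output (char **out, char **outbuf, size_t *outlen, size_t *outleft)
{
	size_t used = (size_t) (*outbuf - *out);
	char *tmp = realloc (*out, *outlen + GROW_STEP + 1);

	if (tmp == NULL)
		return -1;

	*out = tmp;
	*outlen += GROW_STEP;
	*outbuf = tmp + used;
	*outleft = *outlen - used;
	return 0;
}

/* Convert `in` from charset `from` to charset `to`. Returns a malloc'd
 * buffer followed by one NUL byte, or NULL on failure. */
static char *
convert_and_flush (const char *to, const char *from, const char *in,
                   size_t inleft, size_t *outlenp)
{
	char *inbuf = (char *) in;
	size_t outlen = inleft, outleft = inleft;
	char *out, *outbuf;
	iconv_t cd;

	assert (in != NULL && outlenp != NULL);

	if ((cd = iconv_open (to, from)) == (iconv_t) -1)
		return NULL;

	if ((out = outbuf = malloc (outlen + 1)) == NULL) {
		iconv_close (cd);
		return NULL;
	}

	/* Convert the input, growing the buffer whenever it fills up. */
	while (inleft > 0) {
		if (iconv (cd, &inbuf, &inleft, &outbuf, &outleft) == (size_t) -1) {
			if (errno != E2BIG || grow_output (&out, &outbuf, &outlen, &outleft) != 0)
				goto fail;
		}
	}

	/* Flush any pending shift state; this can also need more room. */
	while (iconv (cd, NULL, NULL, &outbuf, &outleft) == (size_t) -1) {
		if (errno != E2BIG || grow_output (&out, &outbuf, &outlen, &outleft) != 0)
			goto fail;
	}

	*outbuf = '\0';
	*outlenp = (size_t) (outbuf - out);
	iconv_close (cd);
	return out;

fail:
	free (out);
	iconv_close (cd);
	return NULL;
}
```

One deliberate choice in this sketch: the flush loop bails out on any error other than E2BIG, so it cannot spin if iconv keeps failing for some other reason.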

Best Regards
_______________________________________________
gmime-devel-list mailing list
gmime-devel-list gnome org
http://mail.gnome.org/mailman/listinfo/gmime-devel-list




--
map{map{$a=unpack"C",$_;map{$c=$a-ord;print$_ x$c and goto"a"if$c>0}("Z",
" ");a:}split//;print"\n"}(q{&[%[%`#[%["},q{&[$[![$[%["[%["},q{&[#[#[#[%[
"[%["},q{&["[%["`#a"},q{[%["a"[([%["},q{[%["[%["[([%["},q{!_#[%["[([%["})




From 27e059a565ba12d5f170b09f0e900ea1a62146c5 Mon Sep 17 00:00:00 2001
From: quatrix <evil legacy gmail com>
Date: Wed, 4 Jan 2012 03:03:45 +0200
Subject: [PATCH] minor memory leak fix in rfc2047_decode_tokens()

---
 gmime/gmime-utils.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/gmime/gmime-utils.c b/gmime/gmime-utils.c
index 2d83356..6ffeb75 100644
--- a/gmime/gmime-utils.c
+++ b/gmime/gmime-utils.c
@@ -2265,6 +2265,7 @@ rfc2047_decode_tokens (rfc2047_token *tokens, size_t buflen)
 				g_mime_iconv_close (cd);
 				
 				g_string_append_len (decoded, str, len);
+				g_free (str);
 				
 #if w(!)0
 				if (ninval > 0) {
-- 
1.7.0.2


