Re: [gmime-devel] Issue with header decoding (continues)



Hi, 

I came across another header decoding problem when dealing with badly split UTF-8 headers, e.g.:

=?utf-8?B?16nXoteV158g15PXldek16cgR0FSTUlOINei150gR1BTINee15XXkdeg15Qg15XXktc=?=  =?utf-8?B?nSDXkNeo16DXpyDXkNeV16TXoNeq15kg157XoteV16gg157XqdeV15HXlyE=?=

It looks like someone split a UTF-8 string incorrectly, leaving "half" of a multi-byte character in each encoded word.

g_mime_utils_header_decode_text/phrase split the header into words and decode each word separately. Since the charset is UTF-8, iconv isn't used, and each decoded word is validated on its own with this loop:

while (!g_utf8_validate (p, len, (const char **) &p)) {
	len = declen - (p - (char *) decoded);
	*p = '?';
}

Because the original string was split in the middle of a character, the two "half" characters are each replaced with '?'.

I'm attaching a patch that moves the UTF-8 validation to the end of g_mime_utils_header_decode_text/phrase, where the decoded words have already been combined.
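
To illustrate the difference between validating each word and validating the joined result, here is a minimal standalone sketch. In the header above, the decoded bytes of the first encoded word end with 0xD7 and those of the second begin with 0x9D 0x20, which together form one two-byte character followed by a space. The names utf8_seq_len, sanitize_utf8 and split_char_demo are hypothetical and not part of GMime, and the validator is a simplified stand-in for g_utf8_validate (it checks only lead and continuation bytes, not overlong forms or surrogates).

```c
#include <assert.h>
#include <stddef.h>

/* Length of the UTF-8 sequence starting at s (n bytes available),
 * or 0 if the bytes do not start a complete sequence.
 * Simplified assumption: only lead/continuation bit patterns are checked. */
size_t
utf8_seq_len (const unsigned char *s, size_t n)
{
	size_t need, i;

	if (n == 0)
		return 0;
	if (s[0] < 0x80)
		return 1;
	else if ((s[0] & 0xE0) == 0xC0)
		need = 2;
	else if ((s[0] & 0xF0) == 0xE0)
		need = 3;
	else if ((s[0] & 0xF8) == 0xF0)
		need = 4;
	else
		return 0;	/* stray continuation byte or invalid lead byte */

	if (need > n)
		return 0;	/* sequence is truncated */
	for (i = 1; i < need; i++)
		if ((s[i] & 0xC0) != 0x80)
			return 0;
	return need;
}

/* Replace every byte that does not begin a complete sequence with '?'.
 * Returns the number of bytes replaced. */
size_t
sanitize_utf8 (unsigned char *buf, size_t len)
{
	size_t i = 0, replaced = 0, n;

	while (i < len) {
		n = utf8_seq_len (buf + i, len - i);
		if (n == 0) {
			buf[i++] = '?';
			replaced++;
		} else {
			i += n;
		}
	}
	return replaced;
}

/* The split character from the header above: 0xD7 | 0x9D 0x20 */
void
split_char_demo (void)
{
	unsigned char first[]  = { 0xD7 };		/* tail of word 1 */
	unsigned char second[] = { 0x9D, 0x20 };	/* head of word 2 */
	unsigned char joined[] = { 0xD7, 0x9D, 0x20 };

	/* validated separately, both halves are lost */
	assert (sanitize_utf8 (first, sizeof first) == 1);
	assert (sanitize_utf8 (second, sizeof second) == 1);
	assert (first[0] == '?' && second[0] == '?');

	/* validated after joining, nothing is replaced */
	assert (sanitize_utf8 (joined, sizeof joined) == 0);
	assert (joined[0] == 0xD7 && joined[1] == 0x9D);
}
```

Calling split_char_demo() from a test program exercises both cases; the per-word path loses the character, while the joined path keeps it intact.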

Best Regards

On Sat, Dec 17, 2011 at 6:49 PM, Jeffrey Stedfast <fejj gnome org> wrote:
Hi,

I've just released GMime 2.4.29 and 2.6.2 with your fix (and other similar fixes).


Jeff

On 12/14/2011 01:26 PM, evil legacy wrote:
Hi, 

After more debugging, I found that the problem occurs when iconv (cd, NULL, NULL, &outbuf, &outleft) tries to flush the buffer to outbuf, but outbuf isn't big enough to hold it.
This little patch to the charset_convert function seems to fix this problem (works for me):

<patch>

diff --git a/gmime/gmime-utils.c b/gmime/gmime-utils.c
index ca32b61..093deee 100644
--- a/gmime/gmime-utils.c
+++ b/gmime/gmime-utils.c
@@ -1553,7 +1553,15 @@ charset_convert (iconv_t cd, const char *inbuf, size_t inleft, char **outp, size
                }
        } while (inleft > 0);
        
-       iconv (cd, NULL, NULL, &outbuf, &outleft);
+       while (iconv (cd, NULL, NULL, &outbuf, &outleft) == (size_t) -1)
+               if (errno == E2BIG) {
+                       outlen += 16;
+                       rc = (size_t) (outbuf - out);
+                       out = g_realloc (out, outlen + 1);
+                       outleft = outlen - rc;
+                       outbuf = out + rc;
+               }
+
        *outbuf++ = '\0';
        
        *outlenp = outlen;

</patch>
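
The grow-and-retry idea can also be sketched outside of GMime. The following is only an illustration under stated assumptions: convert_and_flush, grow and flush_demo are hypothetical names, plain malloc/realloc stand in for g_realloc, and the output buffer starts deliberately tiny so that E2BIG is actually hit. The demo uses UTF-16LE purely because it is widely available; it is not a stateful encoding, so there the final flush succeeds on the first call and the E2BIG path is exercised by the main conversion loop instead.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>

/* Grow *out by `extra` bytes, preserving the current write position.
 * Returns 0 on success, -1 if realloc fails (*out stays valid). */
static int
grow (char **out, size_t *outlen, char **outbuf, size_t *outleft, size_t extra)
{
	size_t used = (size_t) (*outbuf - *out);
	char *tmp = realloc (*out, *outlen + extra + 1);

	if (tmp == NULL)
		return -1;
	*out = tmp;
	*outlen += extra;
	*outbuf = tmp + used;
	*outleft = *outlen - used;
	return 0;
}

/* Convert in[0..inleft) and flush the converter state.
 * Returns a malloc'd, NUL-terminated buffer (NULL on failure) and
 * stores the number of converted bytes in *resultlen. */
char *
convert_and_flush (iconv_t cd, const char *in, size_t inleft, size_t *resultlen)
{
	size_t outlen = 4;			/* tiny on purpose */
	char *out = malloc (outlen + 1);	/* +1 for the terminator */
	char *outbuf = out;
	size_t outleft = outlen;
	char *inbuf = (char *) in;

	if (out == NULL)
		return NULL;

	while (inleft > 0) {
		if (iconv (cd, &inbuf, &inleft, &outbuf, &outleft) != (size_t) -1)
			continue;
		if (errno != E2BIG || grow (&out, &outlen, &outbuf, &outleft, 16) != 0) {
			free (out);
			return NULL;
		}
	}

	/* flush any pending shift sequence, growing the buffer if needed */
	while (iconv (cd, NULL, NULL, &outbuf, &outleft) == (size_t) -1) {
		if (errno != E2BIG || grow (&out, &outlen, &outbuf, &outleft, 16) != 0) {
			free (out);
			return NULL;
		}
	}

	*outbuf = '\0';
	*resultlen = (size_t) (outbuf - out);
	return out;
}

void
flush_demo (void)
{
	iconv_t cd = iconv_open ("UTF-16LE", "UTF-8");
	size_t n = 0;
	char *r;

	assert (cd != (iconv_t) -1);
	r = convert_and_flush (cd, "abc", 3, &n);
	assert (r != NULL && n == 6);	/* three 2-byte code units */
	free (r);
	iconv_close (cd);
}
```

One deliberate difference from the loop in the patch: the sketch gives up on any error other than E2BIG, so the flush loop cannot spin if iconv keeps failing for a different reason.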

Best Regards




--
map{map{$a=unpack"C",$_;map{$c=$a-ord;print$_ x$c and goto"a"if$c>0}("Z",
" ");a:}split//;print"\n"}(q{&[%[%`#[%["},q{&[$[![$[%["[%["},q{&[#[#[#[%[
"[%["},q{&["[%["`#a"},q{[%["a"[([%["},q{[%["[%["[([%["},q{!_#[%["[([%["})
From 68f19d84b092bb6bb8f0e4497e64c0c0df452f45 Mon Sep 17 00:00:00 2001
From: quatrix <evil legacy gmail com>
Date: Sun, 18 Dec 2011 15:43:50 +0200
Subject: [PATCH] fixed utf8 validation for split (broken) utf8 headers

---
 gmime/gmime-utils.c |   32 ++++++++++++++++++++------------
 1 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/gmime/gmime-utils.c b/gmime/gmime-utils.c
index 91a9779..d28cdea 100644
--- a/gmime/gmime-utils.c
+++ b/gmime/gmime-utils.c
@@ -1820,17 +1820,8 @@ rfc2047_decode_word (const char *in, size_t inlen)
 		*p = '\0';
 	
 	/* slight optimization? */
-	if (!g_ascii_strcasecmp (charset, "UTF-8")) {
-		p = (char *) decoded;
-		len = declen;
-		
-		while (!g_utf8_validate (p, len, (const char **) &p)) {
-			len = declen - (p - (char *) decoded);
-			*p = '?';
-		}
-		
+	if (!g_ascii_strcasecmp (charset, "UTF-8"))
 		return g_strndup ((char *) decoded, declen);
-	}
 	
 	if (!charset[0] || (cd = g_mime_iconv_open ("UTF-8", charset)) == (iconv_t) -1) {
 		w(g_warning ("Cannot convert from %s to UTF-8, header display may "
@@ -1857,6 +1848,19 @@ rfc2047_decode_word (const char *in, size_t inlen)
 	return buf;
 }
 
+char *
+validated_utf8(const char *decoded, size_t declen)
+{
+    char *p     = (char *) decoded;
+    size_t len  = declen;
+
+    while (!g_utf8_validate (p, len, (const char **) &p)) {
+        len = declen - (p - (char *) decoded);
+        *p = '?';
+    }
+
+    return decoded;
+}
 
 /**
  * g_mime_utils_header_decode_text:
@@ -1881,6 +1885,7 @@ g_mime_utils_header_decode_text (const char *text)
 	size_t nlwsp, n;
 	gboolean ascii;
 	char *decoded;
+	size_t declen;
 	GString *out;
 	
 	if (text == NULL)
@@ -1988,9 +1993,10 @@ g_mime_utils_header_decode_text (const char *text)
 	}
 	
 	decoded = out->str;
+	declen  = out->len;
 	g_string_free (out, FALSE);
 	
-	return decoded;
+	return validated_utf8(decoded, declen);
 }
 
 
@@ -2017,6 +2023,7 @@ g_mime_utils_header_decode_phrase (const char *phrase)
 	size_t nlwsp, n;
 	gboolean ascii;
 	char *decoded;
+	size_t declen;
 	GString *out;
 	
 	if (phrase == NULL)
@@ -2122,9 +2129,10 @@ g_mime_utils_header_decode_phrase (const char *phrase)
 	}
 	
 	decoded = out->str;
+	declen  = out->len;
 	g_string_free (out, FALSE);
 	
-	return decoded;
+	return validated_utf8(decoded, declen);
 }
 
 
-- 
1.7.0.2


