[gmime-devel] Another malformed email address not parsed by gmime



I encountered another malformed email address that is parsed by
Thunderbird and Outlook but not by GMime. It's:

"Biznes=?ISO-8859-2?Q?_?=INTERIA.PL"=?ISO-8859-2?Q?_?=<biuletyny firma interia pl>

It seems to be generated by some PHP script, but such addresses are
common. The issue here is that the RFC 2047 encoded-words are not
separated from the surrounding words by whitespace. I'm attaching a
patch that handles that.

-- 
Damian Pietras

http://www.linuxprogrammingblog.com
Index: gmime/gmime-utils.c
===================================================================
--- gmime/gmime-utils.c	(revision 5767)
+++ gmime/gmime-utils.c	(working copy)
@@ -1986,6 +1986,7 @@
 	const char *lwsp, *text;
 	size_t nlwsp, n;
 	gboolean ascii;
+	gboolean in_phrase;
 	char *decoded;
 	GString *out;
 	
@@ -2003,8 +2004,30 @@
 		
 		text = inptr;
 		if (is_atom (*inptr)) {
-			while (is_atom (*inptr))
+			if (!strncmp(inptr, "=?", 2)) {
+				in_phrase = TRUE;
+				inptr += 2;
+			}
+			else
+				in_phrase = FALSE;
+
+			while (is_atom (*inptr)) {
+
+				/* Handle phrases that are not separated from
+				 * other phrases. This is a case for some
+				 * broken mailers.
+				 */
+				if (in_phrase) {
+					if (!strncmp(inptr, "?=", 2)) {
+						inptr += 2;
+						break;
+					}
+				}
+				else if (!strncmp(inptr, "=?", 2))
+					break;
+
 				inptr++;
+			}
 			
 			n = (size_t) (inptr - text);
 			if (is_rfc2047_encoded_word (text, n)) {
@@ -2033,6 +2056,12 @@
 			
 			ascii = TRUE;
 			while (*inptr && !is_lwsp (*inptr)) {
+
+				/* Handle phrases that are not separated from
+				 * other words. */
+				if (!strncmp(inptr, "=?", 2))
+					break;
+
 				ascii = ascii && is_ascii (*inptr);
 				inptr++;
 			}
Index: tests/test-mime.c
===================================================================
--- tests/test-mime.c	(revision 5766)
+++ tests/test-mime.c	(working copy)
@@ -191,6 +191,10 @@
 	{ "\"=?ISO-8859-2?Q?TEST?=\" <p p org>",
 	  "TEST <p p org>",
 	  "TEST <p p org>" },
+	{ "\"Biznes=?ISO-8859-2?Q?_?=INTERIA.PL\"=?ISO-8859-2?Q?_?=<biuletyny firma interia pl>",
+	  "\"Biznes INTERIA.PL \" <biuletyny firma interia pl>",
+	  "\"Biznes INTERIA.PL\" <biuletyny firma interia pl>",
+	}
 };
 
 static void

