RE: [Evolution] Charset for mail messages: ANSI_X3.4-1968?

From: Dan Winship <danw ximian com>
To: Not Zed <notzed ximian com>
Cc: evolution helixcode com
Subject: RE: [Evolution] Charset for mail messages: ANSI_X3.4-1968?
Date: 12 Jun 2001 19:11:56 +0500

> > x-user-defined, etc. And you will be able to override incorrect
> > encodings for display, and change the encoding for specific messages in
> > the composer when you don't want to use the default.


> Override?  Oh?  And just how do you plan to do that?  The camel api is
> utf-8 ...


You still know the original charset from the mime headers though. So
converting from utf8 to that should get you back the original content,
and then you can convert that back to utf8 using the user-specified
source charset.

(Or if the mime part had no charset or an invalid charset specified,
then you don't even need to do the first conversion.)
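
As a rough illustration of that round trip, here is a minimal sketch that
uses POSIX iconv directly instead of Camel's stream filters. The helper
names (convert_strict, reinterpret_utf8) are hypothetical and not part of
Camel, and the sketch assumes the platform's iconv knows the charset names
passed in.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert in[0..in_len) from charset `from` to
 * charset `to`. Returns a malloc'd, NUL-terminated buffer (length in
 * *out_len), or NULL if the charsets are unknown or the conversion
 * hit an error or a non-reversible substitution. */
static char *
convert_strict (const char *to, const char *from,
                const char *in, size_t in_len, size_t *out_len)
{
        iconv_t cd;
        char *out, *inp, *outp;
        size_t cap, in_left, out_left, rc, flush_rc;

        assert (to && from && in && out_len);

        cd = iconv_open (to, from);
        if (cd == (iconv_t) -1)
                return NULL;

        cap = in_len * 4 + 4;           /* generous for the charsets used here */
        out = malloc (cap + 1);
        if (!out) {
                iconv_close (cd);
                return NULL;
        }

        inp = (char *) in;
        outp = out;
        in_left = in_len;
        out_left = cap;

        rc = iconv (cd, &inp, &in_left, &outp, &out_left);
        /* Flush any shift state (a no-op for stateless charsets). */
        flush_rc = iconv (cd, NULL, NULL, &outp, &out_left);
        iconv_close (cd);

        if (rc != 0 || flush_rc == (size_t) -1 || in_left != 0) {
                free (out);
                return NULL;
        }

        *out_len = cap - out_left;
        out[*out_len] = '\0';
        return out;
}

/* Hypothetical helper: `utf8` was produced by decoding the original
 * bytes as `declared`; undo that, then decode the same bytes as
 * `actual`. Returns malloc'd UTF-8, or NULL on failure. */
static char *
reinterpret_utf8 (const char *utf8, const char *declared, const char *actual)
{
        size_t raw_len, fixed_len;
        char *raw, *fixed;

        /* Step 1: UTF-8 -> declared charset recovers the original bytes. */
        raw = convert_strict (declared, "UTF-8", utf8, strlen (utf8), &raw_len);
        if (!raw)
                return NULL;

        /* Step 2: original bytes -> UTF-8, read as the user's charset. */
        fixed = convert_strict ("UTF-8", actual, raw, raw_len, &fixed_len);
        free (raw);
        return fixed;
}
```

For example, byte 0xA4 in a part labelled ISO-8859-1 shows up as the
currency sign (UTF-8 C2 A4). Calling reinterpret_utf8 ("\xC2\xA4",
"ISO-8859-1", "ISO-8859-15") recovers the 0xA4 byte and decodes it as the
euro sign (UTF-8 E2 82 AC). When the part has no usable charset, only the
second step would apply.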

I wrote a "camel_mime_part_override_charset" that does this, but I'm not
sure that's the right way/place to handle it. (It's good to have it as a
mutator, because then if you do "override charset" and then reply,
you'll be replying to the charset-overridden copy, which is what you
want.) The attached diff isn't quite right anyway, because you want it
to fail and return an error if the conversion can't succeed losslessly.
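
One way to detect a lossy conversion, sketched here with plain POSIX iconv
rather than the Camel charset filter: treat any iconv() result other than 0
as a failure, since an error return or a non-zero count of non-reversible
conversions both mean data was lost. The function name converts_losslessly
is hypothetical, and the exact behaviour for unconvertible characters
varies between iconv implementations.

```c
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the NUL-terminated `text`, encoded
 * in charset `from`, converts to charset `to` with no errors and no
 * substituted characters; returns 0 otherwise. */
static int
converts_losslessly (const char *to, const char *from, const char *text)
{
        iconv_t cd;
        char *out, *inp, *outp;
        size_t in_left, out_left, rc;

        cd = iconv_open (to, from);
        if (cd == (iconv_t) -1)
                return 0;               /* unknown charset pair */

        in_left = strlen (text);
        out_left = in_left * 4 + 4;     /* generous scratch space */
        out = malloc (out_left);
        if (!out) {
                iconv_close (cd);
                return 0;
        }

        inp = (char *) text;
        outp = out;
        rc = iconv (cd, &inp, &in_left, &outp, &out_left);

        iconv_close (cd);
        free (out);

        /* (size_t) -1 is an error (for example EILSEQ); any other
         * non-zero value counts non-reversible conversions. */
        return rc == 0 && in_left == 0;
}
```

A caller could run this check before touching the part's content and
return an error instead of installing a damaged copy. For instance, the
euro sign in UTF-8 (E2 82 AC) converts cleanly to ISO-8859-15 but not to
ISO-8859-1.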

-- Dan

Index: camel-mime-part.c
===================================================================
RCS file: /cvs/gnome/evolution/camel/camel-mime-part.c,v
retrieving revision 1.118
diff -u -r1.118 camel-mime-part.c
--- camel-mime-part.c   2001/05/16 18:23:15     1.118
+++ camel-mime-part.c   2001/06/12 14:08:39
@@ -808,4 +808,60 @@
                        camel_object_unref (CAMEL_OBJECT (medium->content));
                medium->content = NULL;
        }
+}
+
+/**
+ * camel_mime_part_override_charset
+ * @part: a CamelMimePart
+ * @charset: the character set to reinterpret @part in
+ *
+ * Sets @part's charset to @charset and re-interprets its content into
+ * that charset from whatever it was before.
+ **/
+void
+camel_mime_part_override_charset  (CamelMimePart *part, const char *charset)
+{
+       CamelDataWrapper *wrapper;
+       CamelStream *mem;
+       CamelStreamFilter *filter_stream;
+       CamelMimeFilter *charenc;
+       const char *old_charset;
+
+       g_return_if_fail (header_content_type_is (part->content_type, "text", "*"));
+
+       wrapper = camel_data_wrapper_new ();
+       camel_data_wrapper_set_mime_type_field (wrapper, part->content_type);
+
+       mem = camel_stream_mem_new ();
+       filter_stream = camel_stream_filter_new_with_stream (mem);
+
+       /* If the data was converted to UTF-8 before, we have to undo
+        * that to get the original data back.
+        */
+       old_charset = header_content_type_param (part->content_type, "charset");
+       if (old_charset && g_strcasecmp (old_charset, "us-ascii") &&
+           g_strcasecmp (old_charset, "utf-8")) {
+               charenc = (CamelMimeFilter *)camel_mime_filter_charset_new_convert ("utf-8", old_charset);
+               if (charenc) {
+                       camel_stream_filter_add (filter_stream, charenc);
+                       camel_object_unref (CAMEL_OBJECT (charenc));
+               }
+               /* else we don't recognize the charset, in which case it
+                * wouldn't have gotten translated before
+                */
+       }
+
+       /* Now re-convert to UTF-8 with the correct encoding. */
+       charenc = (CamelMimeFilter *)camel_mime_filter_charset_new_convert (charset, "utf-8");
+       camel_stream_filter_add (filter_stream, charenc);
+       camel_object_unref (CAMEL_OBJECT (charenc));
+
+       camel_data_wrapper_write_to_stream (camel_medium_get_content_object (CAMEL_MEDIUM (part)), CAMEL_STREAM (filter_stream));
+       camel_object_unref (CAMEL_OBJECT (filter_stream));
+
+       camel_data_wrapper_construct_from_stream (wrapper, mem);
+       camel_object_unref (CAMEL_OBJECT (mem));
+
+       camel_medium_set_content_object (CAMEL_MEDIUM (part), wrapper);
+       header_content_type_set_param (part->content_type, "charset", charset);
 }

Follow-Ups:
- RE: [Evolution] Charset for mail messages: ANSI_X3.4-1968?
  - From: Not Zed

References:
- RE: [Evolution] Charset for mail messages: ANSI_X3.4-1968?
  - From: Nerijus Baliunas
- RE: [Evolution] Charset for mail messages: ANSI_X3.4-1968?
  - From: Dan Winship
- RE: [Evolution] Charset for mail messages: ANSI_X3.4-1968?
  - From: Not Zed
