Re: [gmime-devel] gmime-devel-list Digest, Vol 77, Issue 8

From: Yuval Peduel <ypeduel oath com>
To: "gmime-devel-list gnome org" <gmime-devel-list gnome org>
Subject: Re: [gmime-devel] gmime-devel-list Digest, Vol 77, Issue 8
Date: Mon, 14 Aug 2017 17:11:58 +0000 (UTC)

Will try. Thank you.

On Monday, August 14, 2017 9:55 AM, "gmime-devel-list-request gnome org" <gmime-devel-list-request gnome org> wrote:

Today's Topics:

1. Re: determining encodings (Jeffrey Stedfast)

2. Re: gmime 3.0 installation on centos 7 (Jeffrey Stedfast)

----------------------------------------------------------------------

Message: 1

Date: Mon, 14 Aug 2017 16:25:16 +0000

From: Jeffrey Stedfast <jestedfa microsoft com>

To: Yuval Peduel <ypeduel yahoo-inc com>, "gmime-devel-list gnome org" <gmime-devel-list gnome org>

Subject: Re: [gmime-devel] determining encodings

Message-ID: <7B76926F-8C92-4B5B-AEE7-1AC46C26B223 microsoft com>

Content-Type: text/plain; charset="utf-8"

Hi Yuval,

It is the correct method to use; however, you need to give it the list of charsets that it should attempt to try.

What you need to do is:

static const char *charsets[] = { "big5", "shift-jis", "euc-jis", "cp1255", NULL };

options = g_mime_parser_options_clone (NULL);

g_mime_parser_options_set_fallback_charsets (options, charsets);

Then pass those options into decode_8bit().
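
To make the idea of a fallback list concrete, here is a minimal sketch in plain C that uses POSIX iconv rather than GMime. It is not GMime's implementation. The helper names try_charset and decode_with_fallbacks are hypothetical, and the sketch simply returns the first charset that converts the whole input cleanly, whereas a real decoder may rank candidates differently. Which charset names iconv accepts depends on the platform.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert `len` bytes of `in` from `charset` to UTF-8.
 * Returns a malloc'd NUL-terminated string, or NULL if the charset is
 * unknown to iconv or the input is not valid in that charset. */
char *try_charset(const char *charset, const char *in, size_t len)
{
    iconv_t cd = iconv_open("UTF-8", charset);
    if (cd == (iconv_t) -1)
        return NULL;

    size_t cap = len * 4 + 1;          /* generous bound for UTF-8 output */
    char *out = malloc(cap);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *inp = (char *) in;           /* iconv takes a non-const pointer */
    char *outp = out;
    size_t inleft = len;
    size_t outleft = cap - 1;

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);

    if (rc == (size_t) -1 || inleft != 0) {
        free(out);                     /* invalid or truncated sequence */
        return NULL;
    }
    *outp = '\0';
    return out;
}

/* Hypothetical helper: walk a NULL-terminated charset list and return the
 * first clean conversion. On return, *used names the charset that worked,
 * or is NULL if none did. The caller frees the result. */
char *decode_with_fallbacks(const char **charsets, const char *in,
                            size_t len, const char **used)
{
    assert(charsets != NULL && used != NULL);

    for (size_t i = 0; charsets[i] != NULL; i++) {
        char *out = try_charset(charsets[i], in, len);
        if (out != NULL) {
            *used = charsets[i];
            return out;
        }
    }
    *used = NULL;
    return NULL;
}
```

In a first-match scheme like this the order of the list matters: a single-byte charset such as cp1255 accepts nearly any byte sequence, so stricter multibyte charsets should come before it.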

Hope that helps,

Jeff

From: gmime-devel-list <gmime-devel-list-bounces gnome org> on behalf of Yuval Peduel via gmime-devel-list <gmime-devel-list gnome org>

Reply-To: Yuval Peduel <ypeduel yahoo-inc com>

Date: Wednesday, August 9, 2017 at 2:00 PM

To: "gmime-devel-list gnome org" <gmime-devel-list gnome org>

Subject: [gmime-devel] determining encodings

Most messages with subjects and From: headers using characters outside the ASCII set now use the RFC-2047 encoding to keep the actual bytes in the message "7-bit safe". But there are still a significant number of messages coming in which use national encodings: big5 from China, Taiwan, and Singapore; EUC-JIS and shift-JIS from Japan; cp1255 from Israel; etc.

What is the best way to convert these strings into UTF-8?

Since these contain 8-bit characters, I tried using g_mime_utils_decode_8bit with a NULL encoding, assuming it would determine the best one to use. But in my test, this didn't work at all. My test consisted of:

- starting with one UTF-8 string for each of 4 encodings, the equivalent of

- "Happy New Year" in Chinese (big5)

- "Good Morning" for shift-JIS

- "Good Evening" for EUC-JIS

- "Peace unto you" for cp1255

- I converted the UTF-8 to a byte sequence using the corresponding encoding.

- I then fed the four resulting byte sequences to g_mime_utils_decode_8bit and wrote out the results

I confirmed that the inputs to g_mime_utils_decode_8bit were correctly encoded by decoding each of them with its proper charset.
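
The encode-then-verify step of a test like this can be sketched with POSIX iconv. This is only an illustration, not the code used in the test above: the helper names convert and round_trips are hypothetical, and the charset names that iconv recognises vary by platform.

```c
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert `len` bytes of `in` from charset `from`
 * to charset `to`. Returns a malloc'd, NUL-terminated buffer and stores
 * its length in *out_len, or returns NULL on any conversion problem. */
char *convert(const char *from, const char *to,
              const char *in, size_t len, size_t *out_len)
{
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t) -1)
        return NULL;

    size_t cap = len * 4 + 16;         /* generous output bound */
    char *out = malloc(cap + 1);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *inp = (char *) in;
    char *outp = out;
    size_t inleft = len;
    size_t outleft = cap;

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc == 0)                       /* flush any pending shift state */
        rc = iconv(cd, NULL, NULL, &outp, &outleft);
    iconv_close(cd);

    if (rc != 0) {                     /* error or lossy substitution */
        free(out);
        return NULL;
    }
    *out_len = (size_t) (outp - out);
    out[*out_len] = '\0';
    return out;
}

/* Returns 1 if the UTF-8 string survives UTF-8 -> charset -> UTF-8
 * unchanged, 0 otherwise. */
int round_trips(const char *charset, const char *utf8)
{
    size_t enc_len = 0, dec_len = 0;
    char *enc = convert("UTF-8", charset, utf8, strlen(utf8), &enc_len);
    if (enc == NULL)
        return 0;

    char *dec = convert(charset, "UTF-8", enc, enc_len, &dec_len);
    int ok = dec != NULL
          && dec_len == strlen(utf8)
          && memcmp(dec, utf8, dec_len) == 0;

    free(enc);
    free(dec);
    return ok;
}
```

The intermediate buffer returned by the first convert call is the byte sequence one would then hand to the decoder under test.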

So:

1. is g_mime_utils_decode_8bit the right tool for the job? I assume it works properly when one actually knows the encoding, but when one doesn't?

2. if so, how should I be using it, because:

output_ptr = g_mime_utils_decode_8bit(NULL, input_ptr, input_length);