Re: [Evolution-hackers] i18 things

On Tue, 2003-04-01 at 00:12, Alex Jiang wrote:
> Thank you for your comments.
> >On Mon, 2003-03-31 at 03:23, Alex Jiang wrote:
> >  
> >
> >>Hi, 
> >>
> >>I am now working on i18n issues of evolution.
> >>
> >>There are several things I think we should do to improve the i18n
> >>performace of evolution.
> >>First, c the patch for bug 40519. which make sure whenever we call
> >>iconv, there is a 
> >>charset(locale_charset for default), so a uniform manner is applied. 
> >>    
> >>
> >
> >I don't like this patch.
> >  
> >
> OK , I  c what you mean. However, I've checked almost all the e_iconv 
> calles in gal/evolution,
> I will try to not touch e_iconv... things, but add necessary things on the
> application-level. (I mean, add more if (charset == NULL), charset = 
> e_iconv_locale_charset())

this should already be handled everywhere in camel afaik.

> >
> >1) US-ASCII does not ever need to be passed in to iconv(), and so
> >therefor e_iconv_charset_name() is perfectly fine returning NULL.
> >  
> >
> Yes, since the funtion returns NULL, there shall be hidden problem.
> There are codes like "
> if (iconv_open(to, from) == (iconv_t)-1) {
> g_warning("%s%s ...", to, from);
> }", which may lead to core dumps.
> It's kind of  style or philosophy things, however, I respect your 
> original style, and I shall make
> patches on application only.

ok, then e_iconv_open() needs to check that the resulting to/from are
not NULL, and if they are - return -1 and set errno appropriately.

> >2) (e_iconv_init): The second parameter of setlocale() from NULL to "".
> >
> >does passing NULL cause it to crash on Solaris? or what is the reasoning
> >for this change? If it is to prevent a crash or something, I'm fine with
> >it.
> >  
> >
> In fact, I don't understand why call setlocale with NULL. In manual, the 
> consequent behavior
> is not defined.


       char *setlocale(int category, const char * locale);

       The  setlocale() function is used to set or query the pro­
       gram's current locale.

       If locale is not NULL, the  program's  current  locale  is
       modified  according  to the arguments.  The argument cate­
       gory determines  which  parts  of  the  program's  current
       locale should be modified.

looks like the behaviour is defined to me ;-)

since we don't want to change the LC_ALL value, we pass in NULL as the
second argument. passing in "" (like your patch does) will clobber the
old value. this is not what we want.

> >3) I don't see why you want to always set the locale charset. if the
> >default locale charset is just going to be US-ASCII (which is what it
> >HAS to be), what difference does it make? Just leave it as NULL.
> >
> >4) I also don't think we should do a default locale_lang. First, we
> >don't even use it yet - but that is irrelevant. The reason I left it
> >NULL is that it will be useful in the composer in that we will clearly
> >see that the user does not have a locale language setup - and therefor
> >we should skip setting a lang on the message (or prompt the user or
> >something, which we should maybe do anyway).
> >  
> >
> OK for (3) and (4)
> >5) No need to have e_iconv_charset_guess() so that e_iconv_init() can
> >parse it and construct a iconv-friendly name for it because it should
> >ALREADY be in whatever form the system's iconv() will want. Besides,
> >e_iconv_open() will always pass both to/from charsets into
> >e_iconv_charset_name() and so even if we use the locale_charset value,
> >it will still be transformed into the iconv-friendly charset name.
> >  
> >
> I am to just make it more elegant(each occurence of camel lib call(), 
> will always generate iconv_friendly
> charset name. while, I don't think it's nessary fro e_iconv_open() to 
> call e_iconv_charset_name()
> again. However, again I respect your paradigm.

it's extremely little overhead and acts as a "just in case".

> >>Secondly, display charset should be stored in gconf(for each
> >>folder(vfolder?)). Or, when 
> >>we change the display charset,  functions like 
> >>process_header()/header_decode_string() will 
> >>not be affected(and it's not easy to pass the chosen charset to them),
> >>which may lead 
> >>to bug 40521 . i.e. those multibyte subject/sender... which not
> >>following rfc2047, will not 
> >>be readable(But this happens very often)
> >>    
> >>
> >
> >I don't understand your proposal here?
> >
> OKay, when there is a multibyted subject or sender ...(broken mailer), 
> even we chose a  right display
>  charset, the mail will be reparsed. However, the mail header, processed 
> by process_header()...
> will not be affected by the rechose charset.
> As to NotZed's comment, the subject is already wrongly converted from a 
> wrong charset to utf-8,
> what can I do with display. The best and most elegant way is to pass 
> display charset to everywhere,
> but perhaps that is worse.

the message is not re-parsed when you change the display charset.

what you will actually want to do is not use
camel_mime_message_get_subject() when display the subjct string, but
instead call camel_medium_get_header() to get the Subject header, and
then decode yourself. this is yuck, but it would work.

> >>Third, when we open evolution twice in different locale, the summary
> >>file will not be rebuilt.
> >> So, improperly converted subject/title will remain unreadable. One
> >>possibly runnable way is 
> >>to make summary in a "mbox.ev-summary.gb2312","mbox.ev-summary.en"
> >>way. But the final 
> >>solution is that we be ablet ot change the summary file when we change
> >>the charset of a mail 
> >>(c the next item) or of a folder.
> >>    
> >>
> >
> >No. This is The Wrong Way (tm) to fix it. the mbox.ev-summary files
> >store everything in UTF-8 so there only ever has to be ONE.
> >
> When converting from other encoded charset to UTF-8,  the charset is 
> wrong. the suffix is
> just a suggestion of using what charset to convert to UTF-8.
> It's an intermedia solution, but I hope you understand what I mean.

I know what you mean, but it is not the right fix. as notzed says, this
is definetely not the way to do it. the Right Fix would be:

- invent a way to do better auto-detection of charsets given the raw
8bit text... but as well all well know, this is a very difficult task
and is still not 100% reliable.

the only other thing we can do is ignore it and say "fix the other
mailers" which is the Real Fix (tm).

> >
> >The only solution here is to ban people from using broken crap mailers
> >that don't set the charset(s) correctly. :-)
> >
> : -). Dear Jeffery, many many mailers do not follow rfc2047 in process 
> headers. Including some
> mail servers used by IBM and other big companies. This is so often, in 
> us, mutlibyte charset
> contries.

I'm fully aware.

> >it should be possible to change the code to render the message header
> >block in the "preview pane" according to the user's picked charset (if
> >it doesn't already) by grabbing the raw headers from the mesage object I
> >suppose? kinda yuck, but oh well.
> >
> >changing the message-list is not doable.
> >  
> >
> OK. but the summary file is just to speedup the listing things...

it's for more than that...

> >>Fouth, we should be able to madatorily change the charset of a single
> >>mail. so perhaps a UI 
> >>method will be introduces first(e.g. when we change the charset in the
> >>mail window, the mail's 
> >>charset will be changed, while we change in the folder view, folder
> >>charset will be affected). 
> >>Perhaps, to keep the wholesome of the summary file, a user-defined tag
> >>in the CamelMessageInfo 
> >>struture will be used to define the charset of a single mail.
> >>    
> >>
> >
> >I don't really like this idea...
> >  
> >
> Even malformed mails are using different charsets. Storing mandatory 
> charset somewhere is a sooner or
> later thing.

nah, eventually all the other mailers will be fixed :-)


Jeffrey Stedfast
Evolution Hacker - Ximian, Inc.
fejj ximian com  -

[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]