Re: [evolution-patches] IDN patch for evolution head
- From: Not Zed <notzed ximian com>
- To: Suresh Chandrasekharan <Suresh Chandrasekharan Eng Sun COM>
- Cc: suresh chandrasekharan sun com, evolution-patches ximian com
- Subject: Re: [evolution-patches] IDN patch for evolution head
- Date: Tue, 06 Jan 2004 13:00:49 +1030
On Mon, 2004-01-05 at 12:34 -0800, Suresh Chandrasekharan wrote:
> utf8_host is used only to save calls to idnkit everytime, which is
> expensive, and I thought camel library would be better off if we are having
How expensive?
> utf8_host (which can be used for only display purposes) and a host which is
> always ASCII compatible.
>
> These are not interchangeable as low level functions need ACE, but for
> display purposes utf8 is used. ACE will be cryptic and will be difficult for
> humans to remember, something like xn--wurywe8jekrew for a meaningful UTF-8
> hostname.
>
> >Is there any reason the host wouldn't ALWAYS be in utf8 or compatible
> >format (i.e. isn't an ascii hostname also a utf8 one?).
Well, let me word it another way.
Is there any reason the DECODED host would ever NOT be in utf8 or
compatible format.
i.e. a plain ascii hostname is the same when encoded in ace right?
So why bother keeping around both?
> >
> >> + (e_uri_copy) Added an additional parameter called host_offset, used
> >> + in replacing the original UTF-8 hostname by the ACE encoded or vice
> >> + versa.
> >
> >You did? Where?
> *******
> uri_copy->host_offset = uri->host_offset;
that isn't a parameter, but anyway ...
> In e-url.h host_offset is used and this is used as argument to
> replace_str function, like
> *uri_string = replace_str (*uri_string, uri->host_offset, uri->host,
> idnh);
when i searched for host_offset i'd edited this out so i couldn't find
where it was used.
> used when the utf8 hostname needed to be replaced by ace, host_offset
> points to the start position of the host within the URL.
Anyway i think the replace_str stuff shouldn't be used.
> ******
>
> >> + (e_uri_to_string) Fixed the bug causing the removal of the '/'
> >> + seperating the path name from the rest of url when EUri is
> >converted
> >> + back to string.
> >
> >Again, you did, where?
> *********
> According to rodo feedback that a path should always be started with a '/' I
> fixed this in
>
> + if (path < quest && path != end && path[0] == '/' ) {
> + uri->path = g_strndup (path, quest - path);
> + uri_decode (uri->path);
> + uri_string = end;
> + }
> +
> but forgot to update the ChangeLog comments.
> *******
>
> >
> >> + The folowing functions are added,
> >> +
> >> + (e_uri_encode_host) Pass in a UTF-8 host name and and ACE name will
> >be
> >> + returned. Made robust by additional checks for encoded url etc.
> >> +
> >> + (e_uri_decode_host) Pass in an ACE host name and the equivalent UTF-
> >8
> >> + hostname will be returned. This functions is made robust by
> >additonal
> >> + checks and it can take in UTF-8 string, encoded url with %hex codes.
> >
> >How many of these checks are already made by idnkit?
> >
> >> + (e_uri_replace_with_decd_host) Replaces the UTF-8 hostname part of
> >the
> >> + URI with ACE encoding extra memory if need will be allocated in
> >place.
> >> +
> >> + (e_uri_replace_with_encd_host) Replaces the ACE hostname part of
> >the
> >> + URI with UTF-8 encoding extra memory if need will be allocated in
> >place.
> *********
> Actually this should be
> e_uri_replace_with_decoded_host/e_uri_replace_with_encoded_host
> These are not redundant, the difference between this and
> e_uri_decode_host/e_uri_encode_host is that the former takes in full URL's and
> replaces the hostname part, the latter expects hostnames itself as argument.
>
> This is needed because when we are dealing with these conversions in the code
> everywhere, and is used much more than other e_uri functions in the file.
Well whatever, camel wont be calling any e_uri methods, so work out a
way to do it some other way.
> **********
> >I think these calls are redundant. The api is weird, not really used
> >elsehwere in evolution, the code is messy, and it doesn't save much in
> >all of the callers of the code (and makes some callers code more
> >complex).
> >
> >> + (e_uri_gnome_show_idn_url) Call browser with a supplied URL having
> >> + UTD-8 hostname.
> *******
> True, I did a name change as per rodo's suggestion but forgot to update
> ChangeLog.
>
> About the other thing,
>
> Decoded URL -> URL's with hostname in UTF-8 format ( Or if the user does not use
> UTF-8 format, in ASCII )
>
> So you will pass ACE hostname as an argument to 'decode' functions
>
> Encoded URL -> URL's with hostname in ACE format.
>
> So you will pass UTF-8 as an argument to 'encode' functions.
>
> When you're using a normal ASCII hostname both are the same, but if you're using
> a multibyte hostname, then ACE will be totally different than UTF8 format,
> something like 'xn-fhjgdjfjhg', this when displayed to the user does not make
> much sense, so conversion to UTF8 is needed.
>
> /* URL. */
> url = e_dialog_editable_get (priv->url);
> e_uri_replace_with_encoded_host (&url);
>
> Here user is getting a utf8 string out of e_dialog_editable_get, but
> need to convert the hostname part ACE before passing it down to low level funcs.
Sure, but the api is bad.
You should be using something like:
uri = e_url_new_utf8(url);
g_free(url);
url = e_url_to_string(uri);
where e_url_to_string() produces an encoded uri.
or even
aceuri = e_uri_encoded_uri(uri);
but the replace_uri api thing is messy and ugly.
> Here bear in mind that in addition to the hostnames in
> mail/calender/LDAP etc, the hostnames which constitutes a part of URL also need
> to be converted.
>
> So these functions are needed.
The functionality is needed, the api above is dreadful.
> ***********
>
>
> >This doesn't match the code, i presume you mean e_uri_show().
> >
> >Are you sure this is required, shouldn't any encoded url's already be in
> >encoded format?
> >
> >> + (replace_str) This static function is used to replace an oldstr
> >> + at offset oldstr_offset witha newstr, the additional memory needed
> >> + id allocated within.
>
> ****
>
> Maybe I can rename the funcs like
> e_uri_replace_with_encoded_host ---> e_uri_utf8_to_ace
> and
> e_uri_replace_with_decoded_host ---> e_uri_ace_to_utf8, then it be much more
> obvious.
>
> I did a cache for avoiding the idnkit call if not needed.
Yes, but how expensive is it?
> **********
> >Again i think this is overly complex and solving a problem which
> >doesn't need to be solved.
> >
> >> + (cache_coded_host) A functions which will store and retrieve
> >> + ace_name/utf8_name pairs. Used to reduce the overhead in frequent
> >> + idnkit function calls. Thread safe.
> ****
> What's the issue with cache ? Obviously cache will save much overhead, idn
> functions are expensive with many checks/code coversions performed to make sure
> the utf8 name is a valid hostname compared with the cache mechanism.
It adds complexity which might not be needed. You say its expensive,
but how expensive?
> *****
> >Is this cache really required? How slow are the idnkit functions? My
> >guess is they would be more than fast enough to only use them on
> >demand.
> >
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]