Re: [evolution-patches] IDN patch for evolution head



On Mon, 2004-01-05 at 12:34 -0800, Suresh Chandrasekharan wrote:
> 	utf8_host is used only to save calls to idnkit everytime, which is 
> expensive, and I thought camel library would be better off if we are having 

How expensive?

> utf8_host (which can be used for only display purposes) and a host which is 
> always ASCII compatible.
> 	
> 	These are not interchangeable as low level functions need ACE, but for 
> display purposes utf8 is used. ACE will be cryptic and will be difficult for 
> humans to remember, something like xn--wurywe8jekrew for a meaningful UTF-8 
> hostname.
> 
> >Is there any reason the host wouldn't ALWAYS be in utf8 or compatible
> >format (i.e. isn't an ascii hostname also a utf8 one?).

Well, let me word it another way.

Is there any reason the DECODED host would ever NOT be in utf8 or
compatible format.

i.e. a plain ascii hostname is the same when encoded in ace right?

So why bother keeping around both?

> >
> >> +	(e_uri_copy) Added an additional parameter called host_offset, used
> >> +	in replacing the original UTF-8 hostname by the ACE encoded or vice 
> >> +	versa. 
> >
> >You did?  Where?
> *******
> uri_copy->host_offset = uri->host_offset;

that isn't a parameter, but anyway ...


> 	In e-url.h host_offset is used and this is used as argument to 
> replace_str function, like 
> 	*uri_string = replace_str (*uri_string, uri->host_offset, uri->host, 
> idnh);

when i searched for host_offset i'd edited this out so i couldn't find
where it was used.

> 	used when the utf8 hostname needed to be replaced by ace, host_offset 
> points to the start position of the host within the URL.

Anyway i think the replace_str stuff shouldn't be used.

	
> ******
> 
> >> +	(e_uri_to_string) Fixed the bug causing the removal of the '/' 
> >> +	seperating the path name from the rest of url when EUri is
> >converted 
> >> +	back to string.
> >
> >Again, you did, where?
> *********
> According to rodo feedback that a path should always be started with a '/' I 
> fixed this in 
> 
> +       if (path < quest && path != end && path[0] == '/' ) {
> +               uri->path = g_strndup (path, quest - path);
> +               uri_decode (uri->path);
> +               uri_string = end;
> +       } 
> +
>  but forgot to update the ChangeLog comments.
> *******
> 
> >
> >> +	The folowing functions are added, 
> >> +
> >> +	(e_uri_encode_host) Pass in a UTF-8 host name and and ACE name will
> >be
> >> +	returned. Made robust by additional checks for encoded url etc. 
> >> +
> >> +	(e_uri_decode_host) Pass in an ACE host name and the equivalent UTF-
> >8
> >> +	hostname will be returned. This functions is made robust by
> >additonal
> >> +	checks and it can take in UTF-8 string, encoded url with %hex codes.
> >
> >How many of these checks are already made by idnkit?
> >
> >> +	(e_uri_replace_with_decd_host) Replaces the UTF-8 hostname part of
> >the 
> >> +	URI with ACE encoding extra memory if need will be allocated in
> >place.
> >> +
> >> +	(e_uri_replace_with_encd_host) Replaces the ACE hostname part of
> >the 
> >> +	URI with UTF-8 encoding extra memory if need will be allocated in
> >place.
> *********
> Actually this should be 
> e_uri_replace_with_decoded_host/e_uri_replace_with_encoded_host
> These are not redundant, the difference between this and 
> e_uri_decode_host/e_uri_encode_host is that the former takes in full URL's and 
> replaces the hostname part, the latter expects hostnames itself as argument.
> 
> This is needed because when we are dealing with these conversions in the code 
> everywhere, and is used much more than other e_uri functions in the file.

Well whatever, camel wont be calling any e_uri methods, so work out a
way to do it some other way.

> ********** 
> >I think these calls are redundant.  The api is weird, not really used
> >elsehwere in evolution, the code is messy, and it doesn't save much in
> >all of the callers of the code (and makes some callers code more
> >complex).
> >
> >> +	(e_uri_gnome_show_idn_url) Call browser with a supplied URL having 
> >> +	UTD-8 hostname.
> *******
> True, I did a name change as per rodo's suggestion but forgot to update 
> ChangeLog.
> 
> About  the other thing,
> 
> Decoded URL -> URL's with hostname in UTF-8 format ( Or if the user does not use 
> UTF-8 format, in ASCII )
> 
> So you will pass ACE hostname as an argument to 'decode' functions
> 
> Encoded URL -> URL's with hostname in ACE format.
> 
> So you will pass UTF-8 as an argument to 'encode' functions.
> 
> When you're using a normal ASCII hostname both are the same, but if you're using 
> a multibyte hostname, then ACE will be totally different than UTF8 format, 
> something like 'xn-fhjgdjfjhg', this when displayed to the user does not make 
> much sense, so conversion to UTF8 is needed.
> 
>   	/* URL. */ 
>         url = e_dialog_editable_get (priv->url); 
>         e_uri_replace_with_encoded_host (&url);
>         
>         Here user is getting a utf8 string out of e_dialog_editable_get, but 
> need to convert the hostname part ACE before passing it down to low level funcs.

Sure, but the api is bad.

You should be using something like:

uri = e_url_new_utf8(url);
g_free(url);
url = e_url_to_string(uri);

where e_url_to_string() produces an encoded uri.

or even
aceuri = e_uri_encoded_uri(uri);

but the replace_uri api thing is messy and ugly.
        
>         Here bear in mind that in addition to the hostnames in 
> mail/calender/LDAP etc, the hostnames which constitutes a part of URL also need 
> to be converted.
>         
>         So these functions are needed.

The functionality is needed, the api above is dreadful.


        
> ***********
>        
> 
> >This doesn't match the code, i presume you mean e_uri_show().
> >
> >Are you sure this is required, shouldn't any encoded url's already be in
> >encoded format?
> >
> >> +	(replace_str) This static function is used to replace an oldstr
> >> +	at offset oldstr_offset witha newstr, the additional memory needed
> >> +	id allocated within.
> 
> ****
> 
> Maybe I can rename the funcs like 
> e_uri_replace_with_encoded_host ---> e_uri_utf8_to_ace 
> and 
> e_uri_replace_with_decoded_host ---> e_uri_ace_to_utf8, then it be much more 
> obvious.
> 
> I did a cache for avoiding the idnkit call if not needed.

Yes, but how expensive is it?

> **********
> >Again i think this is overly complex and solving a problem which
> >doesn't need to be solved.
> >
> >> +	(cache_coded_host) A functions which will store and retrieve
> >> +	ace_name/utf8_name pairs. Used to reduce the overhead in frequent
> >> +	idnkit function calls. Thread safe.
> ****
> What's the issue with cache ? Obviously cache will save much overhead, idn 
> functions are expensive with many checks/code coversions performed to make sure 
> the utf8 name is a valid hostname compared with the cache mechanism.

It adds complexity which might not be needed.  You say its expensive,
but how expensive?

> *****
> >Is this cache really required?  How slow are the idnkit functions?  My
> >guess is they would be more than fast enough to only use them on
> >demand.
> >





[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]