Re: I/O and text representation in gtk+ 1.3.x



My opinion is that code conversion should be done at the boundary of
the program - convert to UTF-8 when you read in a file (or receive
data from a network stream, etc.), convert back to the appropriate
encoding as you write out.
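
To make the "convert at the boundary" idea a bit more concrete, here
is a rough sketch of the input side. It assumes plain POSIX iconv()
rather than anything in GLib, and to_utf8_at_boundary() is a made-up
name for illustration, not an existing API:

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert len bytes of `in` from `from_codeset`
 * to UTF-8.  Returns a malloc'ed, NUL-terminated string that the
 * caller frees, or NULL on failure (unknown codeset, invalid input,
 * or out of memory). */
char *
to_utf8_at_boundary (const char *in, size_t len, const char *from_codeset)
{
  assert (in != NULL && from_codeset != NULL);

  iconv_t cd = iconv_open ("UTF-8", from_codeset);
  if (cd == (iconv_t) -1)
    return NULL;

  size_t cap = len * 2 + 16;          /* initial guess, grown on demand */
  char *out = malloc (cap + 1);
  if (out == NULL)
    {
      iconv_close (cd);
      return NULL;
    }

  char *inp = (char *) in;            /* iconv() takes a non-const pointer */
  size_t inleft = len;
  char *outp = out;
  size_t outleft = cap;

  while (inleft > 0)
    {
      if (iconv (cd, &inp, &inleft, &outp, &outleft) != (size_t) -1)
        break;                        /* all input consumed */

      if (errno != E2BIG)             /* invalid or truncated input */
        {
          free (out);
          iconv_close (cd);
          return NULL;
        }

      /* Output buffer full: double it and continue where we stopped. */
      size_t used = (size_t) (outp - out);
      cap *= 2;
      char *bigger = realloc (out, cap + 1);
      if (bigger == NULL)
        {
          free (out);
          iconv_close (cd);
          return NULL;
        }
      out = bigger;
      outp = out + used;
      outleft = cap - used;
    }

  /* UTF-8 is stateless, so no final flush call is needed here. */
  *outp = '\0';
  iconv_close (cd);
  return out;
}
```

A reader would call this once on the bytes it has just read, for
example to_utf8_at_boundary (buf, n, "ISO-8859-1"), and keep only the
UTF-8 result around internally. The output side would be the same
loop with the two codeset arguments to iconv_open() swapped.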

I'd like to have some functionality in GLib for reading and writing
files (or other streams) with conversion, but I don't have a good
enough feeling for exactly how this should be done yet to actually
go ahead and add something.

I don't think putting conversions in individual widgets is a good
thing to do - it could quickly become a maintenance headache,
and I think a program should have a consistent encoding for all
its internal strings.


The file selector and filenames are a particularly sticky case.
My personal opinion is that having filenames in an untagged
locale-specific encoding is completely broken:

 - It causes horrible problems if people do migrate to UTF-8
   locales, or otherwise change the locale they use.

 - Multiple people may be accessing filenames who use different
   locale settings.

 - Such filenames cause problems if you create tarballs of
   files and send them elsewhere.

Etc.

I don't think too many people would disagree with the theoretical
brokenness of untagged locale-specific filenames, but unfortunately
we may need to deal with them. Currently, we do so reluctantly:

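gchar*
g_filename_to_utf8 (const gchar *string, GError **error)
{
#ifdef G_OS_WIN32
  /* Windows: always convert from the locale encoding */
  return g_locale_to_utf8 (string, error);
#else
  /* Opt-in for filenames stored in the locale encoding */
  if (getenv ("G_BROKEN_FILENAMES"))
    return g_locale_to_utf8 (string, error);

  /* Otherwise treat the filename as UTF-8 already and just copy it */
  return g_strdup (string);
#endif
}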

Though we may need to change that before we ship GLib-2.0 :-(.


In any case, the operation of converting between filenames
and UTF-8 is encapsulated in this fashion, and the fileselector
already uses this to convert the filenames it displays.

When you call gtk_file_selection_get_filename() it converts the
filename back from UTF-8 to something you can use when
calling open(), stat(), etc.
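
Roughly, the reverse direction looks like the sketch below. This is
not the GLib code: filename_from_utf8_sketch() is a hypothetical name,
it uses POSIX iconv() and nl_langinfo() in place of the GLib locale
functions, it ignores the Windows branch, and it assumes the program
called setlocale (LC_ALL, "") at startup.

```c
#include <assert.h>
#include <iconv.h>
#include <langinfo.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not a GLib function: turn a UTF-8 filename
 * into the bytes to hand to open()/stat(), following the same rule
 * as the g_filename_to_utf8() shown above.
 * Returns 0 on success, -1 on failure. */
int
filename_from_utf8_sketch (const char *utf8, char *out, size_t outsize)
{
  assert (utf8 != NULL && out != NULL && outsize > 0);
  size_t len = strlen (utf8);

  if (getenv ("G_BROKEN_FILENAMES") == NULL)
    {
      /* On-disk names are assumed to be UTF-8 already: plain copy. */
      if (len >= outsize)
        return -1;
      memcpy (out, utf8, len + 1);
      return 0;
    }

  /* On-disk names are assumed to be in the locale's codeset. */
  iconv_t cd = iconv_open (nl_langinfo (CODESET), "UTF-8");
  if (cd == (iconv_t) -1)
    return -1;

  char *inp = (char *) utf8;        /* iconv() takes a non-const pointer */
  size_t inleft = len;
  char *outp = out;
  size_t outleft = outsize - 1;     /* keep room for the terminating NUL */

  /* A stateful locale encoding would also need a final flush call;
   * that is left out of this sketch. */
  size_t rc = iconv (cd, &inp, &inleft, &outp, &outleft);
  iconv_close (cd);
  if (rc == (size_t) -1)
    return -1;                      /* unrepresentable name or buffer too small */

  *outp = '\0';
  return 0;
}
```

The point is only that the application sees one call and never has to
know which of the two cases it is in.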

So, I think the fileselector is already OK in this regard,
and I think it is about the only widget that should be handling
anything other than UTF-8.

Regards,
                                        Owen

tajima <hidetoshi tajima eng sun com> writes:

> Hi,
> 
> Thanks for the answers to my question regarding gettext and UTF-8 switching
> of gtk+. Now I understand that the plan is to convert all the .po files to
> UTF-8. This sounds good.
> 
> Now I want to ask again about file and other I/O operations in general.
> 
> As it is probably much more difficult to convert all I/O stream data to UTF-8
> than in the .po case, we still have to perform encoding conversion, either
> in gtk+ itself or in each single application, when getting text contents
> through an input stream to display on a widget, or when writing the text
> buffer of a widget into an output stream.
> 
> I think code conversion is done in pretty much the same way for all I/O.
> 
> For example, file open and save will be:
>  open
>  	- get filename
>  	- get a file encoding somehow
>  	- g_convert file contents to UTF-8 and display
>  save
>  	- get filename
>  	- get a file encoding somehow
>  	- g_convert text buffer of UTF-8 to the file encoding
>  	- write to the file
> 
> How about adding code conversion handling to regular gtk+ widgets such as
> gtkfileselbox? Just having encoding selection would be fairly simple.



