GLib file name encoding (Was: Please join me...)



Allin Cottrell writes:
Is there a "dummies' guide" to the filename encoding issues 
anywhere?

Well, the short guide is that file names that GLib or GTK+ gives you
(g_dir_read_name(), file chooser, etc.) and that you give to GLib or
GTK+ (g_file_test() etc.) are in UTF-8 on Windows, and if you want to
handle files using UTF-8 names, it's easiest to use the gstdio
wrappers (g_fopen(), g_stat() etc.).

g_filename_to_utf8() and g_filename_from_utf8() are now effectively
strdup() on Windows. g_locale_to_utf8() and g_locale_from_utf8() work
as before.

The README in GLib says:

* GLib 2.6 introduces the concept of 'GLib filename encoding', which is the
  on-disk encoding on Unix, but UTF-8 on Windows. All GLib functions
  returning or accepting pathnames have been changed to expect
  filenames in this encoding, and the common POSIX functions dealing
  with pathnames have been wrapped. These wrappers are declared in the
  header <glib/gstdio.h> which must be included explicitly; it is not
  included through <glib.h>.

[yeah, can't seem to decide whether to use "filename" or "pathname"]

  On current (NT-based) Windows versions, where the on-disk file names
  are Unicode, these wrappers use the wide-character API in the C
  library. Thus applications can handle file names containing any
  Unicode characters through GLib's own API and its POSIX wrappers,
  not just file names restricted to characters in the system codepage.

  To keep binary compatibility with applications compiled against
  older versions of GLib, the Windows DLL still provides entry points
  with the old semantics using the old names, and applications
  compiled against GLib 2.6 will actually use new names for the
  functions. This is transparent to the programmer.

  When compiling against GLib 2.6, applications intended to be
  portable to Windows must take the UTF-8 file name encoding into
  consideration, and use the gstdio wrappers to access files whose
  names have been constructed from strings returned from GLib.

And GTK+'s README says much the same:

  On Windows, filenames passed to GTK+ should always be in UTF-8, as
  in GLib 2.6. This is different than in previous versions of GTK+
  where the system codepage was used. As in GLib, for DLL ABI
  stability, applications built against previous versions of GTK+ will
  use entry points providing the old semantics.

  When compiling against GTK+ 2.6, applications intended to be
  portable to Windows must take the UTF-8 file name encoding into
  consideration, and use the gstdio wrappers to access files whose 
  names have been constructed from strings returned from GTK+ or GLib.

--tml




