Re: Glib: a Win32 discussion



Hi Kean et al.,
At 06.04.2011 23:34, Kean Johnston wrote:
Everyone,

WARNING: long, detailed message. If you don't care about Win32, move on.

I would like to start a discussion on making changes to Glib for
improved Win32 support. These changes will eliminate many of the
pitfalls that usually accompany that platform, and better live up to the
"mission statement" of Glib (which isn't really a mission statement, but
instead is an indication to users of what it is trying to achieve - "It
works on many UNIX-like platforms, Windows, OS/2 and BeOS"). These are
just ideas, and I appreciate any and all feedback.

Before even discussing the changes I'd like to propose, it is important
to discuss the actual problems the changes are trying to solve.
Completely agreed, but ...

I am not
making any changes just for change's sake. First and foremost of those
changes is the complete and total avoidance of the C runtime DLL,
msvcrt.dll. There are many good reasons for doing this.
... here you are already jumping to conclusions. Getting rid of C runtime (CRT) usage on win32 could indeed be a solution, but for me there are as many good reasons not to do this. And from my experience porting Gtk+, Dia and GIMP over the last ten years, quite a few of the problems you want to solve will remain, or just be shifted to another level.

First, none of
the modern MS compilers allow you to directly target using msvcrt.dll
(which is present on all Windows systems) and instead force you to use
compiler-specific versions of the C runtime, which you then have the
obligation to either install yourself, or ensure are installed.
The 'official' C runtime for GLib and even Windows is msvcrt.dll. It gets updated with the system nowadays, and I agree it is very unfortunate that Microsoft's recent compilers do not allow targeting it directly.

It is
possible to craft a set of tools (using a mixture of the driver
development kit, platform development kit, Visual Studio and a large
number of hacks) that will allow modern MS compilers to target
msvcrt.dll, but doing so is extraordinarily complicated and one tiny
mis-step along the way causes things to fail in subtle ways.
This failing in subtle ways comes in part from questionable assumptions made during the API design of the libraries involved. And it will only vanish completely if all libraries in an application use the same CRT.

An
alternative is to use and support only MinGW and MSYS, and this is
really tempting, except that using these tools you can NOT avoid
dependence on msvcrt.dll.
But the dependency on the CRT is not the problem, if there is only one of them in the process.

There are severe limitations imposed by using
it, not least of which is the fact all file descriptors, the heap and
other information is local to each DLL.
This 'each DLL' is each CRT, isn't it? So no problem if there is only one.

Unlike the UNIX world where a
shared object uses the same malloc as an application (under almost all
circumstances),
It can be, and usually is, but there is no guarantee. A good API provides matching allocation and deallocation functions, e.g.
 - GLib: g_malloc and g_free
 - libxml: xmlMalloc and xmlFree
 - cairo: cairo_create and cairo_destroy
 - Gtk+: g_object_new and g_object_unref

an object created in one DLL cannot be freed by another,
or a file descriptor opened in one cannot be closed in another.
Exactly, not using matching allocation/deallocation functions is asking for trouble, but this is not at all win32 specific.
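
To illustrate the pairing idea, here is a minimal sketch of a library that keeps allocation and deallocation on its own side of the API boundary. The names Buffer, buffer_new and buffer_free are hypothetical and not taken from any of the libraries listed above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical library type; callers only ever handle a pointer to it. */
typedef struct Buffer {
    char  *data;
    size_t len;
} Buffer;

/* Allocate inside the library, so the library's own allocator (and
 * therefore its own CRT) is the one that owns the memory. */
Buffer *buffer_new(const char *text)
{
    assert(text != NULL);

    Buffer *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;

    b->len = strlen(text);
    b->data = malloc(b->len + 1);
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    memcpy(b->data, text, b->len + 1);
    return b;
}

/* Release with the matching function from the same library; the caller
 * never passes the pointer to its own free(). */
void buffer_free(Buffer *b)
{
    if (b == NULL)
        return;
    free(b->data);
    free(b);
}
```

As long as every Buffer obtained from buffer_new() goes back through buffer_free(), it does not matter which CRT the calling module was linked against.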

The only
way to ensure this does not happen is to guarantee that every single
library you build with uses the exact same CRT, and that it is shared.
This is very difficult to achieve.

The other way to achieve this is good API design as outlined above. Bad APIs leak implementation details like the underlying runtime objects.

By avoiding the CRT altogether and using native Windows functions that
use handles, and using a memory allocator other than malloc, all of
these problems go away.
This is only true if there are no other libraries involved which still assume that e.g. int file descriptors are valid system-wide. If you want to combine zlib and GLib like this:

  int fd = g_open (filename, O_RDONLY, 0);
  gzFile zf = gzdopen(fd,"rb");

it will still fail.

Glib already makes this possible by providing
its own malloc wrappers, but the current Win32 implementation is the
same as the UNIX one in that it is just a very thin layer on top of
malloc.
Yes, both assume the happy day scenario of only having one CRT in process.

The first thing that should change is making g_malloc and
friends use the HeapAlloc function, and ensure that
g_mem_is_system_malloc() always returns false. This is really easy to do
and shouldn't upset the apple-cart too much.

This would break a lot more code than is currently broken. But you can already do this by using g_mem_set_vtable() in your application.
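
g_mem_set_vtable() takes a table of allocator function pointers (GMemVTable). The following is only a self-contained sketch of that idea; MemVTable, mem_set_vtable, mem_alloc and mem_free are hypothetical names, not GLib API, and the counting allocator merely stands in for whatever an application would plug in (on Win32 that could be HeapAlloc/HeapFree).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical table of allocator hooks, loosely modelled on the idea
 * behind GMemVTable. None of these names are GLib API. */
typedef struct {
    void *(*alloc)(size_t n_bytes);
    void  (*release)(void *mem);
} MemVTable;

/* By default, fall through to the C runtime's malloc/free. */
static const MemVTable default_vtable = { malloc, free };
static const MemVTable *current_vtable = &default_vtable;

/* The application installs its own hooks once, before allocating. */
void mem_set_vtable(const MemVTable *vtable)
{
    assert(vtable != NULL && vtable->alloc != NULL && vtable->release != NULL);
    current_vtable = vtable;
}

/* Library-side wrappers: all allocations go through the installed table. */
void *mem_alloc(size_t n_bytes)
{
    return current_vtable->alloc(n_bytes);
}

void mem_free(void *mem)
{
    if (mem != NULL)
        current_vtable->release(mem);
}

/* Example replacement allocator that counts live blocks. */
static long live_blocks = 0;

static void *counting_alloc(size_t n_bytes)
{
    void *p = malloc(n_bytes);
    if (p != NULL)
        live_blocks++;
    return p;
}

static void counting_release(void *mem)
{
    live_blocks--;
    free(mem);
}

const MemVTable counting_vtable = { counting_alloc, counting_release };

long mem_live_blocks(void)
{
    return live_blocks;
}
```

The point of doing this per application rather than inside the library is that the application knows which other libraries share its process, and can decide whether replacing the allocator is safe.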

Memory allocation is the easy bit. HeapAlloc maps nicely onto malloc.
Not really if there are libraries with CRT dependencies involved.

There are other bits that are more problematic. The next easiest to
solve is stat(). Gio already makes a token gesture at wrapping stat()
but its wrapping is a bit TOO thin.
Agreed, but changing this looks completely independent of the question whether to use the CRT or not.

Not only is the stat structure
problematic under Windows, it is problematic under UNIX too, as many
stat structures, even directly out of sys/stat.h, change according to
pre-processor defines (LFS). Ideally, we should create a GStatBuf
structure that is defined only in terms of portable data types that are
like-sized on all systems. We should ensure that the file sizes are
always 64 bits, and that the timestamps are always 32-bits, using
1-1-1970 as the epoch (standard UNIX time).
Not using time_t looks like a step backward to me.

[...]Wrapping stat can also fix the
problem of having the uid_t / gid_t types differ not only from system to
system, but also change on the SAME system depending on defines. We can
force these to always be 32-bit. Similarly, ino_t.

Really? On win64 a HANDLE is already 64-bit, and it is just coincidence that Microsoft does not use the upper 32 bits yet.

The biggest and most challenging thing I would like to propose is the
addition of gstdio. Currently there is no attempt to provide a portable
standard IO set of functions, and these are notorious for subtle
differences. For example, does snprintf() return the number of
characters that would be required if N were unlimited? We need not
develop a stdio package from scratch, not by any means, but can instead
base it off glibc's stdio, which is also LGPL'ed. We can modify the
internals when necessary to use glib functions (for example, use
g_malloc instead of malloc). This package can be easily modified to
return a Glib-abstracted FILE structure (GFILE). On systems where stdio
is known to have a certain behaviour this whole package can be a very
thin wrapping, but the behaviour of all of the functions should be
identical on all platforms Glib supports. Obviously fstat() would become
g_fstat() and conform to the unified stat structure mentioned above.

As outlined above this is not going to work for applications still using the CRT directly, or using libraries like cairo, libxml, libz, libpng etc. which do not depend on GLib.
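
On the snprintf() question raised above: C99 specifies that snprintf() returns the number of characters that would have been written had the buffer been large enough, whereas Microsoft's older _snprintf() returns a negative value on truncation and may leave the buffer unterminated. A small sketch of code relying on the C99 behaviour follows; the helper names required_size_for_int and format_truncated are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the buffer size, including the terminating NUL, needed to
 * format an int. Relies on C99 semantics: with a zero-sized buffer,
 * snprintf() writes nothing and returns the length it would need. */
size_t required_size_for_int(int value)
{
    int n = snprintf(NULL, 0, "%d", value);
    assert(n >= 0);
    return (size_t)n + 1;
}

/* Copies text into a fixed-size buffer and reports truncation.
 * Returns 1 if the output did not fit (or an error occurred), else 0.
 * Under C99 the buffer is always NUL-terminated when size > 0. */
int format_truncated(char *buf, size_t size, const char *text)
{
    int n = snprintf(buf, size, "%s", text);
    return n < 0 || (size_t)n >= size;
}
```

Code written like this breaks silently on a runtime with the other return convention, which is the kind of subtle difference a unified wrapper would have to hide.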

Last, but by no means least, is the reliance on "compiled" files, like
compiled schemas (or in the case of Gtk, icon caches). On UNIX systems
where things are installed in a universally-accessible location, this
isn't a problem, but on Win32, where multiple applications could all
include their own private copies of the DLL's, this is a problem.
From my understanding this problem was solved a long time ago by providing g_get_system_config_dirs() and g_win32_get_system_data_dirs_for_module().

Given that runtime relocation is no longer win32 specific, the set of GLib APIs could be improved, though.

Fixing
this is a bit tricky but very doable. Windows does provide two places
that are predictable and universally accessible: the registry, and
%ProgramData%. The registry is a poor choice except perhaps for locating
files inside %ProgramData%. The registry is slow, and also imposes some
severe limitations on key sizes etc. This can be very easily addressed
by compiling Glib with LIBDIR set to something like
"%ProgramData%\Glib\2.x" and ensuring that functions like
g_file_get_contents() or g_open() all call ExpandEnvironmentStrings() on
Win32. This is also a relatively small change and doesn't change any
existing API's (although on Win32 it will have a behaviour change).

For me it appears to be very useful to have application-specific library configurations accompanying application-specific library versions. It is the responsibility of application developers to decide whether they want global or application (version) specific configuration.

I don't know if there has been a discussion on a Glib 3.0, but perhaps
all of this could be the basis for one (especially with the addition of
gstdio). I am volunteering to spear-head all of this work. I don't have
write access to anything but if there isn't already a 3.0 branch, and
people like / agree / support the above, perhaps we can change that and
I can start work.

As outlined above, I think there are still design (mission statement) questions to answer before that. Getting rid of CRT usage in GLib is not
a viable option for me. Improving existing APIs to reduce pitfalls is.

One last thing, since it has proven to be a source of considerable
incompatibility, and that's the reliance on D-Bus. I think it should
remain possible to use dbus if you want it, if your application really
needs it, but to have relatively (from my position of ignorance)
unrelated things like gapplication absolutely rely on it is a mistake,
as far as I can see. dbus is really not appropriate for, or required
for, many applications that would otherwise want to use Glib/Gtk.
Completely agreed. I already have a quite hacked version of gapplicationimpl-win32.c to build the Gtk+ stack without dbus. I have yet to find a design document describing GApplication's goals, which would allow selecting the right win32 APIs to base the inter-process communication on.

[...]

Thank you for your time reading this, and I welcome comments and debate.

HTH,
	Hans

-------- Hans "at" Breuer "dot" Org -----------
Tell me what you need, and I'll tell you how to
get along without it.                -- Dilbert

