Re: _wstat on Windows (actually stat stuff in general)



I disagree that we don't have an ABI to maintain on Windows.  I think
people on Windows are somewhat likely to download precompiled binary
versions of our DLLs for use with developing their software (since the
build process is so complicated).  We can easily introduce extremely
difficult-to-debug situations for people who assumed that the DLL was
binary-compatible with the old one.
This is true. On Windows the code is actually fairly well insulated from random size changes based on macros. The only problem is that Tor chose the "wrong" stat structure, IMHO. These days, files > 4GB are common. Since GLib is a platform tool, you would rightfully expect to be able to write, say, an archival tool that could compress big files, and this currently isn't possible. Or you may want to write a backup tool and preserve timestamps. Also not currently possible. Of course said applications could just use other APIs or specify their own stat structures and not use g_stat() at all, but in that case what is the purpose of having g_stat() at all? Or any of the other gstdio.h wrappers? The purpose (I believe) is to ensure that if you use those wrappers they will behave the same way on all platforms GLib has been ported to. This is largely the case but it does break down on the fringes.

While I mostly agree with this, it's only true in the case that both the
code calling g_stat() and the code inspecting its result are always in
the same codebase.
And therein lies the EXACT reason why a well-defined stat structure, with data types wide enough to cover all cases, is such a requirement.

library with no code changes will change the ABI of the library).  I'm
not sure there are any cases of this, but it's something to be aware of.
Certainly, and we could do some rather trivial things to insulate against that. Call the structure something else (although as you mentioned GStatBuf is sufficiently new that I don't think we'd have a problem). Or announce the breakage prominently. We can work around the ABI change.

This means that there is an awful lot of valid existing code (including
within GLib itself) that looks like this:

{
   struct stat buf;

   g_stat (filename,&buf);
}

which (if I understand your proposal correctly) would be badly broken.
That code would of course change to
{
  GStatBuf buf;

  g_stat (filename, &buf);
}

The code wouldn't be broken at all. In fact it would be less broken. If, for example, GLib wasn't compiled with _FILE_OFFSET_BITS=64 then internally, all of its usage of g_stat() can only deal with 31-bit file sizes. User code using GLib compiled WITH that set will support file sizes with 63 bits.
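
To make the 31-bit versus 63-bit point concrete, here is a minimal sketch. The helper names size_fits() and off_t_value_bits() are made up for illustration and are not GLib API, and I'm assuming off_t is either 32 or 64 bits wide, which holds on the platforms discussed in this thread. On a 32-bit Linux build the second helper reports 31 without -D_FILE_OFFSET_BITS=64 and 63 with it; on most 64-bit systems it reports 63 either way.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical helper: can a file size be stored in a signed field
 * that has `value_bits` bits available for the magnitude? */
int size_fits(int64_t size, int value_bits)
{
    if (size < 0)
        return 0;
    if (value_bits >= 63)
        return 1;
    return size < (INT64_C(1) << value_bits);
}

/* Number of value bits in off_t for THIS translation unit. The result
 * depends on how the file was compiled, which is exactly the problem
 * when a library and its caller are built with different settings. */
int off_t_value_bits(void)
{
    int bits = (int)(sizeof(off_t) * CHAR_BIT) - 1;
    assert(bits == 31 || bits == 63); /* assumption stated in the lead-in */
    return bits;
}
```

A 5GB file fails size_fits() with 31 value bits and passes with 63, so the same g_stat() call can succeed in user code and fail inside a library built differently.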

Almost all of the functions currently "wrapped" in gstdio.h are problematic with LFS. On Linux they are currently just macros. Changing those to functions won't break any existing code on Linux because those symbols aren't even in libglib-2.0.so. But providing a consistent, doesn't-change-with-macros interface that can *become* the GLib ABI is useful. That code can be constructed inside GLib such that it is always compiled with LFS in mind. For example, we can ensure that g_open() always calls open64 or whatever it's called on the system in question. By rigidly defining GStatBuf to use fields that are identically sized on all platforms, we make g_stat() more useful. Heck, it even becomes possible to share binary dumps of the thing on like-endian machines should you want to do that.
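
Something like the following is what I have in mind for "always compiled with LFS". This is a rough POSIX-only sketch, not the actual gstdio.c: sketch_open() and sketch_seek() are hypothetical names, and I'm using the _FILE_OFFSET_BITS=64 route rather than calling open64 directly, since open64 is not available everywhere. The seek helper is only there to show a 64-bit offset crossing the library boundary with a fixed-width type.

```c
/* Must come before any system header so this file is always LFS-aware,
 * no matter how the calling application was compiled. */
#define _FILE_OFFSET_BITS 64

#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Refuse to build if the wrappers would silently get a 32-bit off_t. */
_Static_assert(sizeof(off_t) == 8, "this file must be built with a 64-bit off_t");

/* Hypothetical wrapper: a real function, so its behaviour cannot change
 * with macros defined in the caller's translation unit. */
int sketch_open(const char *path, int flags, int mode)
{
    assert(path != NULL);
    return open(path, flags, (mode_t)mode);
}

/* Hypothetical wrapper: the offset is int64_t in the signature, so the
 * ABI is the same whether or not the caller uses LFS. */
int64_t sketch_seek(int fd, int64_t offset, int whence)
{
    return (int64_t)lseek(fd, (off_t)offset, whence);
}
```

The caller never sees off_t at all, so there is nothing for a macro to change underneath it.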

In a nutshell, if gstdio.[ch] were slightly tweaked to be actual functions and not veneer macro wrappers, and they all took suitably sized arguments, then the code becomes that much more portable, easier to debug and less surprising. I'd also add the various seek functions, as they too are problematic because they take a file offset; that doesn't break any ABI either, it just adds a new one. I think the *only* platform affected by the changes I am proposing is Windows (or any UNIX system that defines G_STDIO_NO_WRAP_ON_UNIX), and that only for g_stat(), which can be easily worked around. But at the end of it we will have a completely consistent API across all platforms. The *only* thorny question, really, is what width we make the st_?time fields: 32-bit or 64-bit (or, as on MacOS, both)? And if we make them 64-bit, what exactly does that represent? Nanoseconds since the Epoch? I think the easiest way by far is to have those fields defined thus in the structure:

  gint32  st_atime, st_mtime, st_ctime;
  gint64  st_atime64, st_mtime64, st_ctime64;

That supports the vast majority of the code out there that is UNIX-centric and supports the notion of a 31-bit time field measured as seconds since the epoch of Jan 1 1970. But code that wants higher precision can use the 64-bit variants on systems that provide it and simply multiply out the 32-bit ones to give usable values on those that don't.
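
As a sketch of the "multiply out" fallback, assuming for the sake of the example that the 64-bit fields hold nanoseconds since the Epoch (that is still an open question in this thread): only the six time field names come from the proposal. The SketchStatBuf type, its st_size and padding members, and sketch_fill_times() are hypothetical, and I use the stdint.h types in place of gint32/gint64 so the example stands alone.

```c
#include <stdint.h>

/* Hypothetical fixed-width layout. Explicit padding keeps the size
 * identical on ABIs that align 64-bit integers to 4 or to 8 bytes. */
typedef struct {
    int64_t st_size;                            /* assumed 64-bit size field */
    int32_t st_atime, st_mtime, st_ctime;       /* seconds since the Epoch */
    int32_t pad_;                               /* explicit padding */
    int64_t st_atime64, st_mtime64, st_ctime64; /* assumed: nanoseconds since the Epoch */
} SketchStatBuf;

_Static_assert(sizeof(SketchStatBuf) == 48, "layout must not vary by platform");

/* Widen before multiplying so the product cannot overflow 32 bits. */
static int64_t seconds_to_ns(int32_t seconds)
{
    return (int64_t)seconds * INT64_C(1000000000);
}

/* On a system that only reports whole seconds, fill the 64-bit fields
 * by multiplying out the 32-bit ones. */
void sketch_fill_times(SketchStatBuf *buf, int32_t atime_s,
                       int32_t mtime_s, int32_t ctime_s)
{
    buf->st_atime = atime_s;
    buf->st_mtime = mtime_s;
    buf->st_ctime = ctime_s;
    buf->st_atime64 = seconds_to_ns(atime_s);
    buf->st_mtime64 = seconds_to_ns(mtime_s);
    buf->st_ctime64 = seconds_to_ns(ctime_s);
}
```

A system with real sub-second timestamps would store them in the 64-bit fields directly and truncate to whole seconds for the 32-bit ones.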

Note that on Windows the 64-bit time fields are just more seconds since the Epoch, with an upper limit of Dec 31 23:59:59 3000. If we decide that the 64-bit time field is really nanoseconds since the Epoch (a much more usable value IMHO) then it can represent dates up to some time in the year 2262. I don't think it matters what it represents as long as we define it. If GLib only breaks in the year 2262 I'm quite Ok with that :-)
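
For anyone who wants to check the 2262 figure, here is the arithmetic as a small sketch. The function name is made up, and I approximate a year as the mean Gregorian year of 365.2425 days, so this gives the calendar year only, not the exact date.

```c
#include <stdint.h>

/* Approximate calendar year in which a counter of seconds since
 * Jan 1 1970 reaches `max_seconds`. */
int last_representable_year(int64_t max_seconds)
{
    /* 365.2425 days * 86400 seconds per day */
    const int64_t seconds_per_year = INT64_C(31556952);
    return 1970 + (int)(max_seconds / seconds_per_year);
}

/* Largest number of whole seconds a signed 64-bit nanosecond count holds. */
int64_t max_seconds_in_int64_ns(void)
{
    return INT64_MAX / INT64_C(1000000000);
}
```

Feeding it max_seconds_in_int64_ns() gives 2262, and feeding it INT32_MAX gives 2038, which is the familiar limit of the 31-bit seconds field.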

Kean

