Re: _wstat on Windows (actually stat stuff in general)
- From: Kean Johnston <kean johnston gmail com>
- To: Ryan Lortie <desrt desrt ca>
- Cc: gtk-devel-list gnome org
- Subject: Re: _wstat on Windows (actually stat stuff in general)
- Date: Wed, 28 Sep 2011 10:00:48 +0200
This is true. On Windows the code is actually fairly well insulated from
random size changes based on macros. The only problem is Tor chose the
"wrong" stat structure, IMHO. These days, files > 4GB are common. GLib
is a platform tool, so you would rightfully expect to be able to write,
say, an archival tool that can compress big files, but that currently
isn't possible. Or you may want to write a backup tool that preserves
timestamps. Also not currently possible. Of course, said applications could
just use other APIs or define their own stat structures and not use
g_stat() at all, but in that case what is the purpose of having g_stat(),
or any of the other gstdio.h wrappers, at all? The purpose (I believe) is to
ensure that if you use those wrappers they will behave the same way on all
platforms GLib has been ported to. This is largely the case, but it does
break down on the fringes.
> I disagree that we don't have an ABI to maintain on Windows. I think
> people on Windows are somewhat likely to download precompiled binary
> versions of our DLLs for use with developing their software (since the
> build process is so complicated). We can easily introduce extremely
> difficult-to-debug situations for people who assumed that the DLL was
> binary-compatible with the old one.
And therein lies the EXACT reason why having a well defined stat structure
with data types wide enough to cover all cases is such a requirement.
> While I mostly agree with this, it's only true in the case that both the
> code calling g_stat() and the code inspecting its result are always in
> the same codebase.
Certainly, and we could do some rather trivial things to insulate against
that. Call the structure something else (although as you mentioned GStatBuf
is sufficiently new that I don't think we'd have a problem). Or announce
the breakage prominently. We can work around the ABI change.
> [...] library with no code changes will change the ABI of the library). I'm
> not sure there are any cases of this, but it's something to be aware of.
> This means that there is an awful lot of valid existing code (including
> within GLib itself) that looks like this:
>
>     struct stat buf;
>     g_stat (filename, &buf);
>
> which (if I understand your proposal correctly) would be badly broken.

That code would of course change to:

    GStatBuf buf;
    g_stat (filename, &buf);
The code wouldn't be broken at all. In fact it would be less broken. If,
for example, GLib wasn't compiled with _FILE_OFFSET_BITS=64, then
internally all of its usage of g_stat() can only deal with 31-bit file
sizes. User code compiled WITH that macro set will support 63-bit file
sizes.
Almost all of the functions currently "wrapped" in gstdio.h are problematic
with LFS. On Linux they are currently just macros. Changing those to real
functions won't break any existing code on Linux, because those symbols
aren't even in libglib-2.0.so. But providing a consistent,
doesn't-change-with-macros interface that can *become* the GLib ABI is
useful. That code can be constructed inside GLib such that it is always
compiled with LFS in mind. For example, we can ensure that g_open() always
calls open64, or whatever it's called on the system in question. By rigidly
defining GStatBuf to use fields that are identically sized on all platforms,
we make g_stat() more useful. Heck, it even becomes possible to share binary
dumps of the thing between like-endian machines, should you want to do that.
In a nutshell, if gstdio.[ch] were slightly tweaked to contain actual
functions and not veneer macro wrappers, and they all took suitably sized
arguments, then the code becomes that much more portable, easier to debug,
and less surprising. I'd also add the various seek functions, as they too
are problematic because they take a file offset; that doesn't break any
ABI either, it just adds a new one. I think the *only* platform affected by
the changes I am proposing is Windows (or any UNIX system that defines
G_STDIO_NO_WRAP_ON_UNIX), and then only for g_stat(), which can be easily
worked around. But at the end of it we will have a completely consistent
API across all platforms. The *only* thorny question, really, is what width
we make the st_?time fields: 32-bit or 64-bit (or, as on MacOS, both)?
And if we make them 64-bit, what exactly does that value represent?
Nanoseconds since the Epoch? I think the easiest way by far is to define
those fields thus in the structure:

    gint32 st_atime, st_mtime, st_ctime;
    gint64 st_atime64, st_mtime64, st_ctime64;
That supports the vast majority of the code out there that is UNIX-centric
and assumes a 31-bit time field measured in seconds since the epoch of
Jan 1 1970. Code that wants higher precision can use the 64-bit variants
on systems that provide them, and simply multiply out the 32-bit ones to
give usable values on those that don't.
Note that on Windows the 64-bit time fields are just more seconds since the
Epoch, with an upper limit of Dec 31 23:59:59 3000. If we decide that the
64-bit time field really is nanoseconds since the Epoch (a much more usable
value IMHO), then it can represent dates up to some time in the year 2262. I
don't think it matters what it represents, as long as we define it. If GLib
only breaks in the year 2262, I'm quite OK with that :-)