Re: Lacking of a ref-counted string.

From: "Havoc Pennington" Date: 23/08/2008 23:58
> On Sat, Aug 23, 2008 at 8:29 AM, Freddie Unpenstein
>> This issue comes up repeatedly, and each time the response is to ask for
>> proof that it would make things better. How about the opposite, where's the
>> proof that it would make things worse?
> In software, everything starts with proof it makes things worse,
> because changing code definitely takes time and introduces bugs ... I
> mean that in all seriousness. This is especially true for API changes
> that ripple through hundreds or even thousands of third-party apps,
> which is what most GLib changes are. So if you do a pros and cons list
> on any GLib change, there is a big con, right up front, 100% known.

Yes, of that I am well aware.  That's always the first answer that people throw out.  Personally, I like to make it the last, because even if you don't go with the proposal, there's a good chance some interesting ideas worth pursuing will come out in the meantime.  It just annoys me when someone responds to a perfectly good proposal with that alone.

> Anyway... on ref-counted strings, I don't really remember this thread
> happening before (archive link?) so I'm not sure it comes up
> repeatedly.

I'm partial to this topic myself, which is probably why I take note when it comes around every so often.  It tends to get side-swiped rather than tackled head-on, more as a means to another end.  But the idea of improving strings in general has popped up three times just recently.  There was the issue about avoiding the need to sanity-check strings, there are discussions about encodings, etc.  A concerted push for some kind of enhanced string API would make dealing with these things intelligently a whole lot more realistic.

> I'm not sure anyone on this thread has seriously proposed changing the
> whole stack of GTK APIs to use refcounted strings - certainly nobody
> has really written up what that would mean in detail, and tried to
> analyze pros and cons. If you're proposing this, I would say step back
> and explain what the actual proposal is, before addressing the
> objections ;-)

I am sure people have hinted at it, and been shot down with the "too hard" argument right off the bat.  At that point they don't take it as far as a serious proposal.  But we have a push to seal the APIs going on, and there are mutterings of changing the entire GTK stack to be more flexible because it's hitting fundamental limitations that are blocking some new ideas.  Those changes are of the same scope as the one I'd like to see seriously proposed.  It's about time someone DID seriously propose it, with that one guaranteed "too hard" issue put aside for the moment, to figure out whether it IS worth doing in its own right, and THEN whether it can be done.  Because personally, I believe it could be done progressively.

My proposal is ref-counting and copy-on-write-if-unshared semantics.  I got used to it in a scripting environment, and then when I finally understood exactly how they were doing it, I realised it's exactly the same practice as I've been using for over a decade all over the place, both in my own stuff and in other languages, etc.
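A minimal sketch of what that could look like in plain C. The RcStr type and the rc_str_* names are hypothetical, not GLib API, and the reading of the semantics is an assumption: a holder writes in place when it has the only reference, and gets a private copy when the buffer is shared. The count is a plain integer, so this is not thread-safe.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical ref-counted string; not part of GLib. */
typedef struct {
    unsigned refs;   /* number of holders (plain int, not atomic) */
    size_t   len;    /* length in bytes, excluding the NUL */
    char     data[]; /* NUL-terminated bytes follow the header */
} RcStr;

/* Allocate a new string with a single reference. Returns NULL on OOM. */
static RcStr *rc_str_new(const char *text)
{
    size_t len = strlen(text);
    RcStr *s = malloc(sizeof *s + len + 1);
    if (s == NULL)
        return NULL;
    s->refs = 1;
    s->len = len;
    memcpy(s->data, text, len + 1);
    return s;
}

/* Take an additional reference instead of copying the bytes. */
static RcStr *rc_str_ref(RcStr *s)
{
    assert(s != NULL && s->refs > 0);
    s->refs++;
    return s;
}

/* Drop one reference; the last holder frees the buffer. */
static void rc_str_unref(RcStr *s)
{
    assert(s != NULL && s->refs > 0);
    if (--s->refs == 0)
        free(s);
}

/* Exchange the caller's reference for one that is safe to modify.
 * Unshared: the same pointer comes back and nothing is copied.
 * Shared: a private copy is returned and the caller's reference to
 * the original is released. On OOM, returns NULL and the caller
 * still holds its original reference. */
static RcStr *rc_str_make_writable(RcStr *s)
{
    assert(s != NULL && s->refs > 0);
    if (s->refs == 1)
        return s;
    RcStr *copy = rc_str_new(s->data);
    if (copy != NULL)
        s->refs--;   /* refs was > 1, so the original stays alive */
    return copy;
}
```

The point of the last function is that the cost of a copy is only paid by code that actually modifies a string someone else is still holding; everything else just moves a pointer and bumps a counter.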

>> Exactly how much slower would GTK get if it had to ref-count instead of copy strings everywhere
> If you're talking about converting existing APIs to refcounted
> strings, that's a very different proposal from just adding some kind
> of refcounted string feature. It would break thousands of apps, or
> else duplicate hundreds of API entry points ...

GTK 3.  You start by changing the suitable API portions, and wrapping the corresponding string arguments with a type-casting #define, and simply reverting it back on the inside.  That's a fairly tedious change but isn't likely to break very much.  Then you start to actually change things as you go, starting with just the widgets being actively worked on, and work your way out from there.  Once you have an opaque "string" entity, you can start to seriously look at the pros and cons of various enhancements.  Until then it's just farting in the wind, if you'll forgive the expression.
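One way to read that first step, as a rough sketch only. XString, X_STR, X_STR_RAW, Label and label_set_text are all made-up names for illustration, not anything in GTK or GLib. At this stage the "opaque" string is still an ordinary C string underneath; the macro only changes the type the compiler sees at the API boundary.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical opaque string type: callers only ever see the
 * forward declaration, never a definition. */
typedef struct XString XString;

/* Stage one: an XString pointer is really still a char pointer.
 * X_STR wraps a plain C string at the call site, X_STR_RAW reverts
 * it inside the library. This relies on the pointer surviving the
 * round trip through the cast, which holds on mainstream platforms
 * but is only meant as a transitional trick. */
#define X_STR(s)     ((const XString *)(s))
#define X_STR_RAW(s) ((const char *)(s))

/* Stand-in for a widget that keeps its own copy of a string. */
typedef struct {
    char text[64];
} Label;

/* The entry point now takes the opaque type, but internally it
 * still does exactly what it did before: copy the bytes. */
static void label_set_text(Label *label, const XString *text)
{
    const char *raw = X_STR_RAW(text);
    strncpy(label->text, raw, sizeof label->text - 1);
    label->text[sizeof label->text - 1] = '\0';
}
```

A call site would then read label_set_text(&label, X_STR("OK")). Once every call goes through the wrapper, the representation behind XString can be changed without touching the callers again.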

Heck, even go ahead and duplicate those API entry points, and phase out the old ones in GTK 4.  Not EVERY string needs to be ref-counted, as I said in my previous message.  Just the ones that get held in an object and given back or passed on to another one later.  Several of the major updates planned are going to have to do much the same radical API change themselves, though on a slightly smaller scope.  Again, what I really want at this point is for someone who knows enough about this kind of experimentation to get off the "too hard" track long enough to get some real idea of how it would affect things once the transition is complete.

I've done this kind of API change in my own code, where I'm building a new widget that will have strings hanging around all of its own, and I've found it generally to be a good thing.  But it's pointless if you're just extending an existing widget, and most of the strings are going to be kept by the parent object un-counted.  Doing it in my own code is one thing, trying to dig into the heart of the beast is a little beyond my skills at this point in time.

>> How much more complicated is it for bindings (most of which use ref-counted strings themselves) to wrap a reference to a
>> string instead of wrapping a whole new copy of the string.
> This one I can answer: most bindings would have to copy the strings
> into a native string type just as they do now. A few, maybe Vala and
> C++, could conceivably avoid the copy. So refcounted strings would not
> matter much for bindings in general but might help the C-like
> bindings.

Okay, yes, they would have to copy them eventually.  In the meantime, I'd expect most of them to have the ability to hold a reference to a GTK string and copy it only when actually needed.  So if they're merely moving a value from one widget property to another, there's still no need to ever copy the string.  Heck, many of them could likely leave the string in its original GTK form.  Any scripting language that uses immutable strings should be able to simply grab the raw string pointer and copy it into their script-side value, holding a reference to keep it stable.
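The property-to-property case can be sketched like this. Everything here is hypothetical (RcStr, Widget, widget_set_title and friends are stand-ins, not GTK API), and the count is not atomic. The sketch only shows that moving a title between two widgets shares one buffer rather than duplicating it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical ref-counted string, kept as small as possible. */
typedef struct {
    unsigned refs;
    char     data[];
} RcStr;

static RcStr *rc_str_new(const char *text)
{
    size_t len = strlen(text);
    RcStr *s = malloc(sizeof *s + len + 1);
    if (s == NULL)
        return NULL;
    s->refs = 1;
    memcpy(s->data, text, len + 1);
    return s;
}

static RcStr *rc_str_ref(RcStr *s) { s->refs++; return s; }

static void rc_str_unref(RcStr *s)
{
    if (--s->refs == 0)
        free(s);
}

/* Stand-in for a widget with one string property. */
typedef struct {
    RcStr *title;
} Widget;

/* The widget takes its own reference; no bytes are copied.
 * The new value is referenced before the old one is released,
 * so setting a widget's title to itself is safe. */
static void widget_set_title(Widget *w, RcStr *title)
{
    RcStr *old = w->title;
    w->title = title ? rc_str_ref(title) : NULL;
    if (old)
        rc_str_unref(old);
}

/* Returns a borrowed pointer; the widget keeps its reference. */
static RcStr *widget_get_title(const Widget *w)
{
    return w->title;
}

/* What a binding might do when a script moves a value from one
 * widget to another: the string never leaves its original form. */
static void copy_title(const Widget *from, Widget *to)
{
    widget_set_title(to, widget_get_title(from));
}
```

A binding that needs the bytes in its own native string type would still copy at that point, but only then, and only once.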

>> The last argument I often hear against ref-counted strings, is thread-safety.
> I thought people were using that as an argument _for_ refcounted
> strings. (Though I agree with your sentiment that GLib's approach to
> threads is not to make each individual data structure transparently
> thread safe.)

It is possible I misread that bit...  I've only used threading a handful of times, mostly just to wrap some library that insists on blocking, so I haven't paid it a lot of attention.


>> But even just ref-counting alone would help, and even if it hurts efficiency a little, I fully believe it would be worth it.
> While I don't know what you're proposing in detail, I can't imagine
> efficiency is the issue. Huge API changes would be a much, much more
> significant factor.

I'm a former assembler programmer who's stuck using an over-worked and under-resourced computer (read: painfully slow).  Efficiency is always on my mind.  ;)

> The other potentially obvious factor is
> programming convenience; refcounted strings in the GTK APIs would make
> some things easier, but other things (such as passing in a string
> literal) harder.

Same problem that was faced with i18n.  All of a sudden string literals just weren't quite so literal any more.  Anything that takes a string and then throws it away, doesn't need to be bothered with ref-counting.  So many uses of string literals will remain untouched anyhow.

> Illustrating a particular proposal by "porting" some
> sample apps might demo these tradeoffs.

Yes, but that would say absolutely nothing about the overall benefit.  I use ref-counted strings sprinkled through my own apps, but it's pointless to ref-count a string that isn't going to be shared with anything but a widget that doesn't understand ref-counted strings, so the opportunities to do that within an app are somewhat limited.

