Re: string return result conventions



On Mon, 2008-09-15 at 08:59 +0000, Luke Kenneth Casson Leighton wrote:
> tim, thank you for responding.
> 
> >> therefore it's important for me to find out what glib / gobject memory
> >> conventions are, for strings.
> >
> > Strings queried through the property interface, e.g.:
> >
> > gchar *string = NULL;
> > g_object_get (label, "label", &string, NULL);
> >
> > are always duplicated and need to be freed by the caller,
> > because the returned string might have been dynamically
> > constructed in the object's property getter (i.e. it might not
> > correspond to memory actually allocated for object member storage).
> 
>  ok - in this situation, fortunately we have control over that.  the
> property getter is entirely auto-generated.  the code review of the
> new webkit glib/gobject bindings brought to light the webkit
> convention of not imposing any "memory freeing" of e.g. strings on
> users of the library.  use of refcounting is done on c++ objects, for
> example.
> 
> the strings in webkit are unicode (libicu).  _at the moment_ i'm
> alloc-sprintf'ing strings - all of them - into utf-8 return results.

Why is a sprintf involved here? g_utf16_to_utf8() will convert a UTF-16
string into a UTF-8 string that can be freed with g_free().

> it was recommended to me that i create a string pool system, to keep a
> record of strings created, and, at convenient times, destroy them all
> (reminds me of apache pools and samba talloc).  exactly when is
> "convenient" is yet to be determined, which is the bit i'm not too
> keen on :)
> 
> looking at the auto-generated code in pywebkitgtk, i'm seeing use of
> PyString_FromString which copies arguments passed to it - there are a
> number of functions which return strings (not property-getters) - so
> there's definitely memory leaks (hurrah).
> 
> clearly, the best overall thing would be to actually return the
> unicode strings themselves rather than convert them (needlessly?) to
> utf-8.
> 
> if that's not possible to do, what would you recommend, in this situation?

Just return newly allocated UTF-8 strings. It's going to be a little bit
inconvenient, with some risk of leakage, for people using your API from
C, but that's the way it works out.

Even if you were writing from scratch in GLib, allocating all returns
might be the right approach. It's pretty mystical if get_node_name()
returns a const char * but get_text_content() returns a char *.

(We've made some effort to avoid "get" names in the GTK+ stack for
things that return allocated strings, but that doesn't work if you are
mapping the DOM.)

Trying to play tricks where the string returned magically gets freed
sometime in the future at an undefined time will definitely cause
problems.

- Owen



