Re: Couple of items

From: Owen Taylor <otaylor redhat com>
To: Tim Janik <timj gtk org>
Cc: Gtk+ Developers <gtk-devel-list gnome org>
Subject: Re: Couple of items
Date: Wed, 27 Mar 2002 19:20:16 -0500 (EST)
Tim Janik <timj gtk org> writes:

> On Wed, 27 Mar 2002, Owen Taylor wrote:
> 
> >  * Messages at a severity less than g_warning() [info,message,debug]
> >    go to stdout rather than stderr which makes them entirely useless,
> >    especially for libraries. (Libraries never should interfere with
> >    stdout.)
> 
> g_message() goes to stderr, exactly because it should be used instead
> of g_printerr(), so that programs can catch messages of a certain
> library and the user knows what module produced the output.

OK, the old code was a little unclear on the issue, and I've been
confused on this for a while. 

But that doesn't change the fact that g_info() and g_debug() are
useless and evil in going to stdout.
 
[...]

> > > > As for your points about escape codes and binary data:
> > > > 
> > > >  * I don't really understand what you mean by escape codes. Printing out 
> > > >    characters not in your locale is not generally going to work.
> > > 
> > > you can use escape codes to switch character sets, e.g. to get IBM style
> > > |
> > > +- corners, --- lines, double lines, etc...
> > > |
> > 
> > This is something that you'd only ever want to do if you were sure that
> > your output was going to a terminal. I'd argue that if you want that, you
> > shouldn't be using g_print(), g_printerr(), but probably open ("/dev/tty", ...)
> 
> regardless of your argument (which i'd share for theoretical purposes), in
> practice that's what people do with printf() and people who use GLib do this
> kind of stuff (as well as putting out binary data which is arguably questionable
> also) with g_print().

OK, I've made a quick survey through my entire GNOME2 build tree (about 2.6 million
lines of GLib2 based source code.)

In that code base, there are ~2600 uses of g_print.

The vast majority of these uses are debug or tests/examples spew (code
that is not enabled at all in production builds). Most of this debug
spew is ASCII only... so it doesn't matter. In a some cases, the debug
spew includes text from widgets (UTF-8), strings marked for
translation (UTF-8), err->message strings from (UTF-8) or filenames
(UTF-8 unless you set G_BROKEN_FILENAMES).

Beyond that:

 * there are about 100 uses (mostly in bonobo-activation) of g_print
   on text intended for viewing of the user; in 80 of these cases, the text
   is untranslated and presumably ASCII. In the other 20, the text is marked
   for translation, so is going to most likely be UTF-8 and need conversion
   for proper display.

 * There are about 90 uses of g_printerr(), 50 of them being on translated
   strings.

 * There are exactly two programs that use g_print() for text not intended
   for the user:

     pango/pango/querymodules.c - doesn't expect translation, but unless you
      have non-ascii filenames in the path below $sysconfdir, isn't going
      to break anything.

     gnome-control-center/root-manager/root-manager.c - doesn't expect
      translation, translation does cause problems in one case.

      (Checked the original 'usermode' version of this, and the inappropriate
       uses of g_print() were removed when it was converted to GLib-2.0)

There are no uses of g_print() on binary data or with TTY escapes.

So, approximate tallies, would be:
                                        No Convert             Convert

  Work properly                       |     1900        |      2550
--------------------------------------+-----------------+--------------------
  Sort of broken, but doesn't matter  |      500        |       30
--------------------------------------------------------+--------------------
  Broken                              |      70         |        1

  
(Where I'm pretty sure about the last line.)

Plus:

 * If we go the "no convert" route, then we have a gaping hole in our
   API, which, to fix, we have to introduce strangely named functions
   (Since g_print()/g_printerr() are already taken for something useless.)

   And we either have to do this in 2.0.x, and introduce dependencies
   on dot versions (something to be avoided if possible), or we have
   to wait 2-3 months for 2.2.

 * If we go the "convert" route, there is no problem at all, since if
   you want fprintf(stderr/stdout, ...) you just use those.

So, I really think we should go the route of making these functions convert.

Yes, we'll have to have a note in the GLib release announcements about
this, and yes, there is some pain involved. 

But we did include a note in the GLib-2.0.0 README about this
particular issue, so it's not out-of-the-blue.

> > > > We are going to have a chance of hurting some people over either way.
> > > > I'd really prefer to take the route that leaves us with useful 
> > > > g_print() and g_printerr() functions rather than useless g_print()
> > > > and g_printerr() functison.
> > > > 
> > > > Maybe we should hold off on this patch until after 2.0.1, so we can get some
> > > > confirmation that people aren't using g_print() for binary data, but that
> > > > worries me too. Every day we have a GTK+ out there with buggy g_log() functions
> > > > is a day where someone may write:
> > > > 
> > > >  str = g_utf8_to_locale (message);
> > > >  g_warning ("%s", str);
> > > > 
> > > > [ Or, mostly likely, get it wrong and write g_warning (str); ]
> > > 
> > > i'm not sure you got my note correctly. the functions in question are only
> > > g_print() and g_printerr(). everything that goes through g_logv() (i.e. warnings,
> > > errors, etc...) is converted to the current locale. note that the conversion
> > > occours in the g_log default handler though, which means, if you install
> > > a user handler for a specific log domain to catch log messages, you're still
> > > passed the original unconverted UTF-8 string.
> > 
> > No, I understood what the situation is. What I'm saying is, is:
> > 
> >  - I'd like to make everything (g_log() and g_print*()) convert.
> 
> i agree that g_log() should convert to locale and that there should be stdout
> and stderr print variants that do. however, i think we shouldn't break completely
> valid uses of the current g_print() and g_printerr() which are based on the
> contract of printf() mimicking.

And we leave most completely valid uses of g_print() and g_printerr()
*broken* if we don't convert.
 
> >  - If we are worried about g_print() and binary streams, maybe we should
> >    put the entire patch in after we release 2.0.1 so people have some
> >    time to fool around with it before we release.
> 
> don't throw out the baby with the bath-water. there're lots of important
> fixes and cleanups in CVS now, the conversion in question are just a few lines
> in g_log_default_handler():
> 
>       if (g_get_charset (&charset))
>         g_string_append (gstring, message);     /* charset is UTF-8 already */
>       else
>         {
>           if (!g_utf8_validate (message, -1, NULL))
>             {
>               g_string_append (gstring, "[Invalid UTF-8] ");
>               g_string_append (gstring, message);
>             }
>           else
>             {
>               string = g_convert_with_fallback (message, -1, charset, "UTF-8",
>                                                 ".", NULL, NULL, NULL);
>               g_string_append (gstring, string);
>               g_free (string);
>             }
>         }
> 
> that had to be changed. though i don't think there's a point in reverting
> that part, other than to use this as a threat in order to push conversion
> for g_print() and g_printerr() without further discussion.

Basically, I see the gmessages.c changes being in three parts:

 - Cleanups and fixes. Big, and not critical. (It's certainly better
   code now, but there were not problems that people were running into
   with the old code.)

 - Conversion for g_logv(). Small and critical.

 - Conversion for g_print[err](). Small and critical.

I don't see a lot of point in doing the first, and not doing the second two.
I think it will be confusing to do the second and not do the third, then
do the third later. So, I'm hoping we can get consensus on the third
part.

Doesn't look like I'm going to get 2.0.1 out tonight anyways, so I'll work
on other parts of it, and wait for your response in the.

Regards,
                                        Owen
Follow-Ups:
- Re: Couple of items
  - From: Thomas Leonard
- Re: Couple of items
  - From: Tim Janik
References:
- Re: Couple of items
  - From: Tim Janik
[Date Prev][Date Next] [Thread Prev][Thread Next] [Thread Index] [Date Index] [Author Index]