Re: Some profiling


On Thu, Jan 29, 2004 at 04:39:16AM +0100, Tim Janik wrote:
> > stefan@luna1:~/src/beast-plain/beast/tests$ perftest
> > 1.371361 seconds for 10000 invocations => 7292.026062 invocations/second, 0.000137 sec per invocation
> > So the first assumption I had was that it's the context switching (between
> > the threads) itself that is so slow. But writing two programs, one that
> > uses select and one that uses pthread conditions, to just test context
> > switching, didn't confirm this. The timings on my system for 1000000 round
> > trips (switches back and forth) are:
> i'm not sure i understand why you did this in the first place.
> were these complete stand-alone programs? [...]
> so all in all, with perftest benching procedure invocations,
> you get a good mix of condition and select() waits, almost
> 50:50 i'd guess.

Well, yes, these were standalone programs, and I assumed that maybe one of
the methods of doing a context switch between two threads would be superior,
but that wasn't the case.

> >  - build glib-2.2.3 from scratch with -pg compiler option
> >  - build sfi, bse and tests with -pg compiler option
> >  - make a list of all object files required by perftest
> hm, interesting, how did you do that?

I ran find . -name '*.o' on the source trees, then created four files called
objlist.glib, objlist.sfi, objlist.bse and objlist.perftest, commented out the
lines that were not relevant, and linked everything together with a small
shell script.

I attached the files in case anybody wants to reproduce the results. However,
they will probably need to be changed over time.

> > I think it
> > might also be worthwhile to check how to reduce the stress we put on the type
> > system (i.e. by trying to avoid redundant checks/conversions of GValues in
> > the critical path).
> depends on what's redundant here, have you had an actual look at the
> checks in the path and can point out specific spots that are redundant?

I am working on that now.

> > At least I don't see why ~100 g_type_is_a calls per
> > procedure invocation should be really necessary.
> i think i do see where those come from. for the most part, functions do:
> value_set_int (GValue *v, int i)
> {
>   g_return_if_fail (G_VALUE_HOLDS_INT (v));
>   . . .
> }
> which is a type_check_is_value_type_U() and a g_type_is_a() each time.
> a similar performance bottleneck popped up in gtk programs with
> object instance checks, up to a point, where we switched to inlined
> type checks for GCC, which gave in a bit of the type safety being
> checked for. we could do something similar for values, please redo
> your benchmarks with the following macro definition applied (best is
> to put this in your local copy of gtype.h):
> #ifdef  __GNUC__
> #  define _G_TYPE_CVH(vl, gt)             (G_GNUC_EXTENSION ({ \
>   GValue *__val = (GValue*) vl; GType __t = gt; gboolean __r; \
>   if (__val && __val->g_type == __t) \
>     __r = TRUE; \
>   else \
>     __r = g_type_check_value_holds (__val, __t); \
>   __r; \
> }))
> #else  /* !__GNUC__ */
> #  define _G_TYPE_CVH(vl, gt)             (g_type_check_value_holds ((GValue*) vl, gt))
> #endif /* !__GNUC__ */
> and if you could make both available, that'd be great ;)
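
To illustrate what the macro above buys, here is a small stand-alone sketch
of the same fast-path idea. It deliberately does not use GLib: MockType,
MockValue, mock_value_holds and the tiny parent table are hypothetical
stand-ins for GType, GValue and g_type_check_value_holds, and a static
inline function replaces the GCC statement expression. The point is only
that an exact type match is answered by one comparison, and the out-of-line
check runs just for derived types or mismatches.

```c
#include <stddef.h>

/* Hypothetical stand-ins for GType and GValue. */
typedef unsigned int MockType;
typedef struct { MockType type; } MockValue;

enum {
    MOCK_TYPE_INVALID = 0,
    MOCK_TYPE_INT     = 1,
    MOCK_TYPE_ENUM    = 2,
    MOCK_TYPE_MY_ENUM = 3,   /* derived from MOCK_TYPE_ENUM */
    MOCK_N_TYPES
};

/* Parent of each type; 0 (MOCK_TYPE_INVALID) marks a root type. */
static const MockType mock_parent[MOCK_N_TYPES] = {
    0, 0, 0, MOCK_TYPE_ENUM
};

/* Counts how often the expensive out-of-line check was needed. */
static unsigned long slow_path_calls = 0;

/* Slow path: walk up the parent chain, like an is-a check would. */
static int mock_value_holds_slow(const MockValue *v, MockType t)
{
    slow_path_calls++;
    if (v == NULL || v->type >= MOCK_N_TYPES)
        return 0;
    for (MockType cur = v->type; cur != MOCK_TYPE_INVALID; cur = mock_parent[cur])
        if (cur == t)
            return 1;
    return 0;
}

/* Fast path: an exact match needs a single comparison and no call. */
static inline int mock_value_holds(const MockValue *v, MockType t)
{
    if (v != NULL && v->type == t)
        return 1;
    return mock_value_holds_slow(v, t);
}
```

With this layout a check such as mock_value_holds(&v, MOCK_TYPE_INT) on an
int value never reaches the slow path, which is the common case in setters
like value_set_int().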

Ok, you can grab the profiles at:

Note that I now know that gprof does not handle threads. Thus, while I think
the call counts in the first profile I made were correct, the sampling
probably happened only in the first thread, so I don't think the percentages
were correct. Anyway, I made profiles with and without the preloading library
that I found (

I am not sure whether they are correct, as they indicate a much longer run
time than the program actually took. Maybe that's just because gprof doesn't
know about the threads, and the rest is still usable.

   Cu... Stefan
  -* Stefan Westerfeld, (PGP!), Hamburg/Germany
     KDE Developer, project infos at *-         

