[Evolution-hackers] Crazy Bug in Evolution? Glib? GTKHTML? GCC?



Ok, first off, I am an AMD-64 Gentoo user.  I know nobody cares about our
bug reports, but I've reached the end of my rope so I am looking for
help.  As you'll see, my rope is pretty long. :)

I'm reporting this here because, so far, I have only seen it with the new
Evolution 2.8.  The rest of the stack is GCC 4.1.1, Glib 2.12.3,
gtkhtml 3.12, gdk-pixbuf 0.22 and GTK+ 2.10.3.

First, the symptoms.  I open a message in LKML to read and Evolution
seg-faults.

It isn't just any message, but messages with a long CC line, which are
common on LKML.  It has to be long enough to make Evolution want to draw
the little plus sign next to the CC line.

Long research with GDB revealed a few things.

Deep inside GTKHTML and the GDK pixbuf loader there are Glib closures
being invoked.  The crash happens when closure_invoke_notifiers is
called for POST_NOTIFY with a NULL closure.

At first I thought some code must be writing over the closure value by a
buffer overrun.  But no, that is not it.

As best I can tell, the trick is that, at least with my compiler, the
closure variable is stashed in register RBX (amd-64, remember).  The
generated assembly expects RBX to still hold that value after the call
to marshal(closure, ...) returns (glib's gclosure.c:490).

marshal is a function pointer, and tracing it down the call stack seems
endless.  The backtrace was over 30 frames deep before I gave up.  This
glib closures code is crazy.

When it gets back up to g_closure_invoke again and completes
marshal(closure,...), the RBX register is set to 0.  I verified that
this really is the problem by using GDB to set the register back to the
value which closure had before the marshal call.  Result: a perfect
little plus image next to CC.
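
For anyone who wants to repeat that check, something along these lines
at the GDB prompt should do it.  This is a sketch rather than a
transcript of the session: $saved_rbx is just a convenience variable
name picked for illustration, and it assumes the breakpoint at
gclosure.c:490 stops before the marshal call has executed.

break gclosure.c:490
continue
# stopped on the line that calls marshal(closure, ...)
set $saved_rbx = $rbx
next
# back in g_closure_invoke; compare the register with the saved value
print/x $rbx
print/x $saved_rbx
# if RBX came back as 0, restore it and carry on
set $rbx = $saved_rbx
continue

If another breakpoint fires inside the marshal call, "next" will stop
there first, so it may need repeating until control is back in
g_closure_invoke.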

So my best guess is that one of the functions *somewhere* in that mess
isn't following the AMD-64 calling convention, which requires a callee
to hand RBX back to its caller unchanged.
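
One possible way to narrow down the culprit, sketched here as an
untested idea: break on the very first instruction of a function in the
backtrace, remember RBX, let the function return, and compare.
"suspect_function" is a placeholder for whichever frame is being
checked, not a real symbol, and $rbx_on_entry is an arbitrary
convenience variable name.

# the leading * puts the breakpoint before the function prologue runs
break *suspect_function
continue
set $rbx_on_entry = $rbx
# run until suspect_function returns to its caller
finish
# prints 1 if RBX was preserved, 0 if it was clobbered
print $rbx == $rbx_on_entry

Repeating this on frames further down the backtrace should, in
principle, lead to the innermost function that returns with a different
RBX than it was given.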

I'm sharing this with y'all in hopes that someone has seen the same
thing, has a burst of divine inspiration, has already fixed it, or
something similar.  If any of you Evolution people are also, or talk to,
the Glib and GTKHTML people, feel free to share this. :)

I'll probably put this on the GCC lists myself after trying a few more
variations of compilation options.  It takes a while to rebuild all that
code.

If anyone does see a crash while reading LKML messages with extra long
CC lines, here is my GDB script.  It could be helpful.

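# Breakpoint 1: stop at main, then set the remaining breakpoints
file /usr/bin/evolution
break main
run --disable-crash-dialog
# Breakpoint 2: the gtkhtml URL-request callback
break html_engine_url_requested_cb
# Breakpoint 3: the marshal(closure, ...) call; leave it off for now
break gclosure.c:490
disable 3
# When breakpoint 2 is hit, arm breakpoint 3 and skip its next 5 hits
commands 2
  silent
  enable 3
  ignore 3 5
  continue
end
continue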

-- 
Zan Lynx <zlynx acm org>
