Re: compiler optimization causing issues with glib
- From: Mike <puffy taco gmail com>
- To: Simon McVittie <simon mcvittie collabora co uk>
- Cc: gtk-devel-list gnome org
- Subject: Re: compiler optimization causing issues with glib
- Date: Fri, 1 Jun 2012 11:17:22 -0500
>>>>> I am using glib 2.26.0
>>>> [...]
>>>>>
>>>>> The application I'm having issues with uses gdbus
>>>> [...]
>>>>> - Thread stuck in futex wait inside kernel
>>>
>>> When I've attached GDB (or done SysRq), it's the gdbus thread that is stuck.
>>>
>>>> If your application is calling GDBus APIs from more than one thread
>>>> (it's unclear from your message whether it is or not), you should
>>>> certainly upgrade. GDBus earlier than 2.32 has known thread-safety bugs,
>>>> notably <https://bugzilla.gnome.org/show_bug.cgi?id=665211>.
>>>
>>> I use g_dbus_connection_emit_signal in the callback from
>>> g_child_watch_add. Does that occur in the main loop? Does the gdbus
>>> thread ever call back in to my code, or does it also go through the
>>> main loop?
>>>
>>>> If you can't upgrade all the way to 2.32 right now for whatever reason,
>>>> 2.28 or 2.30, or perhaps even a later 2.26.x version, would still be an
>>>> improvement: for instance,
>>>> <https://bugzilla.gnome.org/show_bug.cgi?id=651268> and
>>>> <https://bugzilla.gnome.org/show_bug.cgi?id=662100> were both fixed
>>>> since 2.28.0.
>>>
>>> I did attempt 2.28.0, same issue. I'll just have to upgrade my host
>>> tools and build 2.32. Still of course unclear whether I'm looking at
>>> a kernel or userspace issue. The part that bothers me the most is
>>> that simply sending a SIGCHLD to the gdbus thread causes it to wake up
>>> and function.
>>
>> So I upgraded to 2.32.3. I seem to have the original problem and more
>> unexplainable errors (there's a bit of time between each message, a
>> second or so):
>>
>> (process:494): GLib-GObject-CRITICAL **: g_object_unref: assertion
>> `G_IS_OBJECT (object)' failed
>>
>> (process:494): GLib-GObject-CRITICAL **: g_object_unref: assertion
>> `G_IS_OBJECT (object)' failed
>>
>> (process:494): GLib-GObject-CRITICAL **: g_object_unref: assertion
>> `G_IS_OBJECT (object)' failed
>>
>> (process:494): GLib-GObject-CRITICAL **: g_object_unref: assertion
>> `G_IS_OBJECT (object)' failed
>>
>> (process:494): GLib-GIO-CRITICAL **: g_dbus_message_new_method_reply:
>> assertion `G_IS_DBUS_MESSAGE (method_call_message)' failed
>>
>> (process:494): GLib-GIO-CRITICAL **: g_dbus_message_set_body:
>> assertion `G_IS_DBUS_MESSAGE (message)' failed
>>
>> (process:494): GLib-GIO-CRITICAL **: g_dbus_connection_send_message:
>> assertion `G_IS_DBUS_MESSAGE (message)' failed
>>
>> At the end is when it segfaulted. GDB claimed is crashed here
>>
>> #0 g_dbus_method_invocation_return_value_internal
>> (invocation=0x28a00, parameters=0xbeb248d4, fd_list=0x0) at
>> gdbusmethodinvocation.c:357
>> #1 0x00008fc4 in handle_method_call (connection=0x1d800,
>> sender=0x2a238 ":1.11", object_path=0x2a8e0 "/",
>> interface_name=0x28538 "org.example.Interface", method_name=0x2a770
>> "Execute",
>> parameters=0x18878, invocation=0x28a00, user_data=0x0) at main.c:166
>> #2 0x400fa470 in validate_and_maybe_schedule_method_call
>> (connection=0x2a238, message=0x28a00, registration_id=100472,
>> subtree_registration_id=<value optimized out>, interface_info=0x2a6f0,
>> vtable=0x402e3330, main_context=0x8, user_data=0x40210de4) at
>> gdbusconnection.c:4733
>> #3 0x402127ec in g_main_dispatch (context=0x0) at gmain.c:2539
>> #4 g_main_context_dispatch (context=0x0) at gmain.c:3075
>> #5 0x40215068 in g_main_context_iterate (context=0x19b70, block=1,
>> dispatch=1, self=<value optimized out>) at gmain.c:3146
>> #6 0x402152d0 in g_main_loop_run (loop=0x1d2e8) at gmain.c:3340
>> #7 0x00009154 in main () at main.c:234
>>
>> The parameters variable is pointing at the stack -- no idea why as I
>> recorded the value sent to that function and it definitely is not that
>> one. I'm recompiling with "-O0" to see if I can trap it with better
>> debug.
>>
>> I'm setting up a laptop with a newer glib version and will run the
>> same test there (of course, that is x86 not the target arch which is
>> arm), just to make sure. I may end up recompiling everything just to
>> make sure I don't have someone odd laying around.
>
> I attached a very simple example program that is causing me issues.
> The issue seems to somehow relate to calling fork. If I don't call
> fork, the program seems to run forever (where forever means I've run
> for a few hours). If I call fork, eventually (on the order of a
> couple minutes) I will either segv or hang on a futex on the gdbus
> thread. I've reduced it down to this from calling g_spawn_async just
> to keep it simple. This is running on an arm9, linux 2.6.33.20 and
> glib 2.32.3.
>
> Now, I'm running the same application on an x86 machine with linux 3.2
> and some version of glib 2.32. I have experienced no problems there
> (couple days runtime).
>
>
> This is my test script:
>
> #!/bin/sh
>
> while [ true ]; do
> dbus-send --session --print-reply --dest=com.example.UpgradeServer
> --type=method_call / com.example.UpgradeInterface.ExecuteCommand
> done
>
>
> Any ideas on this? Or does this seem more like a compiler or kernel issue?
I should also add that I added code as a test that verified that in
the method call, calling G_IS_DBUS_MESSAGE on invocation->message was
still true. This is the first GLib-GIO-CRITICAL error I generally
get. It was always true first thing in the method call, but if I
check it after forking, it will be false on the iteration that it
crashed. The only thing in between is a call (in the original app) to
g_spawn_async. So, somehow it gets trashed.
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]