Re: compiler optimization causing issues with glib



>>>>> I am using glib 2.26.0
>>>> [...]
>>>>>
>>>>> The application I'm having issues with uses gdbus
>>>> [...]
>>>>> - Thread stuck in futex wait inside kernel
>>>
>>> When I've attached GDB (or done SysRq), it's the gdbus thread that is stuck.
>>>
>>>> If your application is calling GDBus APIs from more than one thread
>>>> (it's unclear from your message whether it is or not), you should
>>>> certainly upgrade. GDBus earlier than 2.32 has known thread-safety bugs,
>>>> notably <https://bugzilla.gnome.org/show_bug.cgi?id=665211>.
>>>
>>> I use g_dbus_connection_emit_signal in the callback from
>>> g_child_watch_add.  Does that occur in the main loop?  Does the gdbus
>>> thread ever call back in to my code, or does it also go through the
>>> main loop?
>>>
>>>> If you can't upgrade all the way to 2.32 right now for whatever reason,
>>>> 2.28 or 2.30, or perhaps even a later 2.26.x version, would still be an
>>>> improvement: for instance,
>>>> <https://bugzilla.gnome.org/show_bug.cgi?id=651268> and
>>>> <https://bugzilla.gnome.org/show_bug.cgi?id=662100> were both fixed
>>>> since 2.28.0.
>>>
>>> I did attempt 2.28.0, same issue.  I'll just have to upgrade my host
>>> tools and build 2.32.  Still of course unclear whether I'm looking at
>>> a kernel or userspace issue.  The part that bothers me the most is
>>> that simply sending a SIGCHLD to the gdbus thread causes it to wake up
>>> and function.
>>
>> So I upgraded to 2.32.3.  I seem to have the original problem and more
>> unexplainable errors (there's a bit of time between each message, a
>> second or so):
>>
>> (process:494): GLib-GObject-CRITICAL **: g_object_unref: assertion
>> `G_IS_OBJECT (object)' failed
>>
>> (process:494): GLib-GObject-CRITICAL **: g_object_unref: assertion
>> `G_IS_OBJECT (object)' failed
>>
>> (process:494): GLib-GObject-CRITICAL **: g_object_unref: assertion
>> `G_IS_OBJECT (object)' failed
>>
>> (process:494): GLib-GObject-CRITICAL **: g_object_unref: assertion
>> `G_IS_OBJECT (object)' failed
>>
>> (process:494): GLib-GIO-CRITICAL **: g_dbus_message_new_method_reply:
>> assertion `G_IS_DBUS_MESSAGE (method_call_message)' failed
>>
>> (process:494): GLib-GIO-CRITICAL **: g_dbus_message_set_body:
>> assertion `G_IS_DBUS_MESSAGE (message)' failed
>>
>> (process:494): GLib-GIO-CRITICAL **: g_dbus_connection_send_message:
>> assertion `G_IS_DBUS_MESSAGE (message)' failed
>>
>> At the end is when it segfaulted.  GDB claimed it crashed here:
>>
>> #0  g_dbus_method_invocation_return_value_internal
>> (invocation=0x28a00, parameters=0xbeb248d4, fd_list=0x0) at
>> gdbusmethodinvocation.c:357
>> #1  0x00008fc4 in handle_method_call (connection=0x1d800,
>> sender=0x2a238 ":1.11", object_path=0x2a8e0 "/",
>> interface_name=0x28538 "org.example.Interface", method_name=0x2a770
>> "Execute",
>>    parameters=0x18878, invocation=0x28a00, user_data=0x0) at main.c:166
>> #2  0x400fa470 in validate_and_maybe_schedule_method_call
>> (connection=0x2a238, message=0x28a00, registration_id=100472,
>> subtree_registration_id=<value optimized out>, interface_info=0x2a6f0,
>>    vtable=0x402e3330, main_context=0x8, user_data=0x40210de4) at
>> gdbusconnection.c:4733
>> #3  0x402127ec in g_main_dispatch (context=0x0) at gmain.c:2539
>> #4  g_main_context_dispatch (context=0x0) at gmain.c:3075
>> #5  0x40215068 in g_main_context_iterate (context=0x19b70, block=1,
>> dispatch=1, self=<value optimized out>) at gmain.c:3146
>> #6  0x402152d0 in g_main_loop_run (loop=0x1d2e8) at gmain.c:3340
>> #7  0x00009154 in main () at main.c:234
>>
>> The parameters variable is pointing at the stack -- no idea why as I
>> recorded the value sent to that function and it definitely is not that
>> one.  I'm recompiling with "-O0" to see if I can trap it with better
>> debug.
>>
>> I'm setting up a laptop with a newer glib version and will run the
>> same test there (of course, that is x86, not the target arch, which is
>> arm), just to make sure.  I may end up recompiling everything just to
>> make sure I don't have something odd lying around.
>
> I attached a very simple example program that is causing me issues.
> The issue seems to somehow relate to calling fork.  If I don't call
> fork, the program seems to run forever (where forever means I've run
> for a few hours).  If I call fork, eventually (on the order of a
> couple minutes) I will either segv or hang on a futex on the gdbus
> thread.  I've reduced it down to this from calling g_spawn_async just
> to keep it simple.  This is running on an arm9, linux 2.6.33.20 and
> glib 2.32.3.
>
> Now, I'm running the same application on an x86 machine with linux 3.2
> and some version of glib 2.32.  I have experienced no problems there
> (couple days runtime).
>
>
> This is my test script:
>
> #!/bin/sh
>
> while true; do
>        dbus-send --session --print-reply --dest=com.example.UpgradeServer \
>                --type=method_call / com.example.UpgradeInterface.ExecuteCommand
> done
>
>
> Any ideas on this?  Or does this seem more like a compiler or kernel issue?

I should also add that, as a test, I added code to the method call
handler that runs G_IS_DBUS_MESSAGE on invocation->message.  That
assertion failing is generally the first GLib-GIO-CRITICAL error I
get.  The check always passes at the top of the handler, but on the
iteration that crashes, it fails when repeated after forking.  The
only thing in between is a call (in the original app) to
g_spawn_async.  So, somehow the message gets trashed.

