Re: gconfd process eating CPU
- From: Brian Cameron <Brian Cameron Sun COM>
- To: Mark McLoughlin <markmc redhat com>
- Cc: gconf-list gnome org
- Subject: Re: gconfd process eating CPU
- Date: Mon, 10 Jan 2005 18:21:10 -0600
Mark:
Firstly, you've never mentioned what version of GConf, ORBit2, glib
etc. this is. I see its using liblinc, which means its quite old, but
I've no idea what code to read if I don't know exactly how old.
Right. Yes, we are seeing this problem with our GNOME 2.0 distribution which
included linc 0.7.0 and ORBit 2.4.3.
On Tue, 2005-01-04 at 14:21 -0600, Brian Cameron wrote:
To get more information, I then ran "truss -fald -u a.out -u ld -u '*::'".
The "-u a.out" tells truss to report on functions within the binary
executable
Right, so this stuff is useful and gives an idea of what is going on.
I assume you've snipped the log this short because it just repeats
after this?
Yes.
Anyway, what's happening is that the main loop seems to have a big
bunch of LincSources, but each source is for a connection that has been
closed by the client. For whatever reason, these sources aren't being
removed as soon as gconfd-2 detects that connection has been closed by
the other end. So, you're talking about a bug in ORBit2 somewhere.
Let me narrate ...
Thanks for this, it was a helpful introduction to the code.
[...]
4335/1 1: 0.9887 -> liblinc:linc_source_check(0x99708, 0x2, 0x1, 0x0)
4335/1 1: 0.9890 <- liblinc:linc_source_check() = 1
4335/1 1: 0.9892 -> libglib-2.0:g_ptr_array_add(0x32f20, 0x99708, 0x3b, 0x0)
4335/1 1: 0.9894 -> libglib-2.0:g_ptr_array_maybe_expand(0x32f20, 0x1, 0x0, 0x0)
4335/1 1: 0.9897 <- libglib-2.0:g_ptr_array_maybe_expand() = 0x32f20
4335/1 1: 0.9899 <- libglib-2.0:g_ptr_array_add() = 0x32f20
4335/1 1: 0.9901 -> liblinc:linc_source_check(0x97d60, 0x4, 0x3, 0x0)
4335/1 1: 0.9904 <- liblinc:linc_source_check() = 1
4335/1 1: 0.9906 -> libglib-2.0:g_ptr_array_add(0x32f20, 0x97d60, 0x3b, 0x0)
4335/1 1: 0.9908 -> libglib-2.0:g_ptr_array_maybe_expand(0x32f20, 0x1, 0x0, 0x0)
4335/1 1: 0.9911 <- libglib-2.0:g_ptr_array_maybe_expand() = 0x32f20
4335/1 1: 0.9913 <- libglib-2.0:g_ptr_array_add() = 0x32f20
4335/1 1: 0.9915 -> liblinc:linc_source_check(0x97f28, 0x4, 0x3, 0x0)
[snip same repeated several times]
-> Here we have a big bunch of sources saying data is pending which
seems odd
[...]
4335/1 1: 1.0161 <- libglib-2.0:g_main_context_check() = 1
4335/1 1: 1.0163 -> libglib-2.0:g_main_context_dispatch(0x35e20, 0x0, 0x1, 0x0)
4335/1 1: 1.0166 -> libglib-2.0:g_main_dispatch(0x35e20, 0x0, 0x0, 0x0)
4335/1 1: 1.0169 -> libglib-2.0:g_source_callback_ref(0x986e0, 0x31, 0x0, 0x0)
4335/1 1: 1.0171 <- libglib-2.0:g_source_callback_ref() = 0x986e0
4335/1 1: 1.0173 -> libglib-2.0:g_source_callback_get(0x986e0, 0x99708, 0xffbfecc8, 0xffbfeccc)
4335/1 1: 1.0176 <- libglib-2.0:g_source_callback_get() = 0x986e0
4335/1 1: 1.0178 -> liblinc:linc_source_dispatch(0x99708, 0xff1e3768, 0x34900, 0x1000)
-> Now we're dispatching the first source that reported pending data
4335/1 1: 1.0185 -> liblinc:linc_connection_io_handler(0x0, 0x1, 0x34900, 0x3b)
4335/1 1: 1.0189 -> libgobject-2.0:g_type_check_instance_cast(0x34900, 0x50, 0x0, 0x0)
4335/1 1: 1.0191 <- libgobject-2.0:g_type_check_instance_cast() = 0x34900
4335/1 1: 1.0194 -> libgobject-2.0:g_object_ref(0x34900, 0x50, 0x0, 0x2)
4335/1 1: 1.0196 -> libgobject-2.0:g_type_check_instance_is_a(0x34900, 0x50, 0x0, 0x0)
4335/1 1: 1.0199 <- libgobject-2.0:g_type_check_instance_is_a() = 1
4335/1 1: 1.0201 <- libgobject-2.0:g_object_ref() = 0x34900
4335/1 1: 1.0203 -> liblinc:d_printf(0xff1e5d04, 0xa, 0x3985c, 0x2)
4335/1 1: 1.0206 <- liblinc:d_printf() = 0xff1e5d04
4335/1 1: 1.0208 -> libORBit-2:giop_connection_handle_input(0x34900, 0xa, 0x3985c, 0x2)
4335/1 1: 1.0210 -> libgobject-2.0:g_object_ref(0x34900, 0x0, 0x0, 0x0)
4335/1 1: 1.0213 -> libgobject-2.0:g_type_check_instance_is_a(0x34900, 0x50, 0x0, 0x0)
4335/1 1: 1.0215 <- libgobject-2.0:g_type_check_instance_is_a() = 1
4335/1 1: 1.0217 <- libgobject-2.0:g_object_ref() = 0x34900
4335/1 1: 1.0220 -> liblinc:linc_connection_read(0x34900, 0x14418c, 0x0, 0x0)
4335/1 1: 1.0222 -> liblinc:d_printf(0xff1e5bb0, 0x0, 0xa, 0x0)
4335/1 1: 1.0224 <- liblinc:d_printf() = 0xff1e5bb0
4335/1 1: 1.0228 <- liblinc:linc_connection_read() = 0
-> Here we read the connection and find that it returns zero. That
means the other side of the connection has closed. Looking at
recent ORBit code I'd expect linc_connection_state_changed() to
be called and we'd close our side and remove the source
This makes sense, although looking at the code for link_connection_io_handler
(both the older code and the current CVS code), I don't see how
link_connection_state_changed could get called in this scenario. We know that
in order for the handle_input function to be called that the condition value
passed into link_connection_io_handler must contain LINC_IN_CONDS (G_IO_PRI or
G_IO_IN according to linc-private.h).
I don't see any logic that changes the condition to include LINK_ERR_CONDS
when we read zero length. link_connection_read simply returns 0 when !len is
true. And giop_connection_handle_input just returns TRUE when the
link_connection_read returns 0. The return code isn't checked in
link_connection_io_handler.
I note that all the logic in link_connection_io_handler after the call
to handle_input is only is entered if the condition contains G_IO_OUT or
LINK_ERR_CONDS. I notice that if if falls into this logic that
the link_connection_state_changed function gets called, but don't see
how it will do so (since condition seems to still be G_IO_PRI or G_IO_IN
as mentioned above).
It seems only link_connection_state_changed_T_R, queue_flattened_T_R,
link_connection_flush_write_queue_T_R, and link_connection_writev set
LINK_ERR_CONDS. But I don't see any place where any of these
functions would be called when condition is LINC_IN_CONDS.
Am I missing something? Should link_connection_state_changed get called
somewhere else besides the link_connection_io_handler function? Or does
the condition value get modified in a way that I am missing?
4335/1 1: 1.0230 -> libgobject-2.0:g_object_unref(0x34900, 0x14418c, 0x0, 0x0)
4335/1 1: 1.0233 -> libgobject-2.0:g_type_check_instance_is_a(0x34900, 0x50, 0x0, 0x0)
4335/1 1: 1.0235 <- libgobject-2.0:g_type_check_instance_is_a() = 1
4335/1 1: 1.0238 <- libgobject-2.0:g_object_unref() = 0x34900
-> We're unreffing the LincConnection, but there's nothing about
removing the source. This looks like we're just releasing the
ref we took at the start of the function.
[...]
Anyway, that should point you in the right direction to debug the
problem.
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]