Re: finally caught an error on crash
- From: Jack <ostroffjh users sourceforge net>
- To: balsa-list gnome org
- Cc:
- Subject: Re: finally caught an error on crash
- Date: Tue, 05 May 2020 16:11:50 -0400
On 2020.05.05 16:01, Albrecht Dreß wrote:
On 04.05.20 23:01, Jack via balsa-list wrote:
This continues to happen at a low, but still annoying frequency.
The line number in the error is now 247, having been 249 for a
while, presumably just due to other edits to that file. As far as I
can remember, it happens after deleting message(s) from the inbox
and/or moving message(s) to different mailboxes, in case that
points to any possibilities.
Hmmm, the error message (failed assertion msgno <=
mbox->msgno_2_msg_info->len) “smells” like a corrupted GPtrArray;
maybe caused by a race between two threads, when messages are moved
and/or deleted. I'm not really familiar with the mailbox code
(Peter? Pawel? Are you?), so this is a wild guess, though. The
failing function seems to be message_info_from_msgno() which is
called in several places; do you have a backtrace from a crash?
I agree on where it happens, but no, I have not yet caught the crash
while running under a debugger.
Given the recent changes (in the cleanup-logging branch) I'm
wondering if I might just for my own use add some additional debug
statements, either in the function where the g-assert fails, or else
before the calls to that function.
As running in gdb is not always feasible, this may actually be
helpful. Just add g_debug() statements…
I finally did this yesterday - in fact putting a call right before
every call to message_info_from_msgno(). There are a LOT of those
calls, sometimes running through all messages in INBOX in succession,
and sometimes even the same call two or three times in a row. I can
easily send a sample log. But no crash yet since doing that.
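For illustration, a check of the kind described above could look roughly like the following. This is only a sketch in plain C, not Balsa code: `fake_mbox`, `log_msgno_check()` and the `DEBUG_LOG` macro are hypothetical stand-ins (in Balsa the real `g_debug()` and the real mailbox structure would be used), and it assumes that message numbers start at 1.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in for GLib's g_debug(); assumption: in Balsa one would call
 * g_debug() directly instead of printing to stderr. */
#define DEBUG_LOG(...) fprintf(stderr, __VA_ARGS__)

/* Hypothetical, simplified mailbox holding only the value that the
 * failing assertion inspects. */
typedef struct {
    size_t len;   /* plays the role of mbox->msgno_2_msg_info->len */
} fake_mbox;

/* Log msgno and the array length, and report whether msgno lies in the
 * range 1..len. Calling this right before the real lookup leaves a log
 * line just ahead of any failed assertion. */
int
log_msgno_check(const char *caller, const fake_mbox *mbox, size_t msgno)
{
    int ok = (msgno >= 1 && msgno <= mbox->len);

    DEBUG_LOG("%s: msgno %zu, len %zu%s\n",
              caller, msgno, mbox->len, ok ? "" : " (OUT OF RANGE)");
    return ok;
}
```

Passing `__func__` as the `caller` argument at each call site would show which of the many callers was the last one before a crash.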
Any quick pointers (or even just a pointer to an example line or
two) would be helpful, and also advice on which debug domain to use
(or even to set for current testing) or how to go about adding a
new domain.
If you use the new branch, the log domain is set (line 50) to
#define G_LOG_DOMAIN "mbox-mbox"
so running balsa by calling
G_MESSAGES_DEBUG=mbox-mbox /path/to/balsa
would print /only/ messages from this domain.
I actually realized that that domain seems to cover all the code in
that file (and likewise for some of the others), so I didn't need to
define the domain, just set the env var.
It might also be helpful (assuming my guess of a race condition is
correct) to add debug messages to libbalsa/mailbox.c, namely in
libbalsa_lock_mailbox() and libbalsa_unlock_mailbox() (and
probably more), like
g_debug("%s: mbox %s", __func__, libbalsa_mailbox_get_name(mailbox));
That file didn't have its own log domain, which is bad. I just
pushed a tiny change to the branch setting it to
#define G_LOG_DOMAIN "mailbox"
The above call should now read
G_MESSAGES_DEBUG="mailbox mbox-mbox" /path/to/balsa
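As a rough illustration of how a space-separated domain list like the one above selects messages, here is a small sketch in plain C. It is not GLib's implementation (whose matching details may differ between versions); `domain_enabled()` is a hypothetical helper that only mimics the documented behaviour that a domain is printed when it appears in the list, or when the list contains `all`.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if `domain` is one of the space-separated entries in
 * `debug_env`, or if `debug_env` contains the entry "all"; 0 otherwise.
 * Whole entries are compared, so "mbox" does not match "mbox-mbox". */
int
domain_enabled(const char *debug_env, const char *domain)
{
    size_t dlen;
    const char *p;

    if (debug_env == NULL || domain == NULL)
        return 0;

    dlen = strlen(domain);
    p = debug_env;
    while (*p != '\0') {
        size_t tok;

        while (*p == ' ')          /* skip separators */
            p++;
        tok = strcspn(p, " ");     /* length of the next entry */
        if (tok == 0)
            break;                 /* only trailing spaces were left */
        if ((tok == 3 && strncmp(p, "all", 3) == 0) ||
            (tok == dlen && strncmp(p, domain, dlen) == 0))
            return 1;
        p += tok;
    }
    return 0;
}
```

With the list `"mailbox mbox-mbox"`, both domains named in this thread are enabled and any other domain stays silent.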
I just pulled your changes, and merged into my debugging branch. I'll
try to add that later today.
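The lock/unlock tracing suggested above might be sketched as follows. This is a self-contained illustration using POSIX threads, not the libbalsa code: `traced_mbox`, `traced_lock()`, `traced_unlock()` and `DEBUG_LOG` are hypothetical names, and the real libbalsa_lock_mailbox() may use a different locking scheme. Logging the thread id next to the mailbox name is what would let interleaved lock and unlock lines from two threads be told apart.

```c
#include <pthread.h>
#include <stdio.h>

/* Stand-in for GLib's g_debug(); assumption: Balsa would use g_debug(). */
#define DEBUG_LOG(...) fprintf(stderr, __VA_ARGS__)

/* Hypothetical mailbox with just a name and a lock. */
typedef struct {
    const char     *name;
    pthread_mutex_t mutex;
    int             lock_count;   /* only modified while mutex is held */
} traced_mbox;

void
traced_lock(traced_mbox *mbox)
{
    pthread_mutex_lock(&mbox->mutex);
    mbox->lock_count++;
    /* Casting pthread_t to an integer is not portable, but is good
     * enough for a debugging aid on common platforms. */
    DEBUG_LOG("%s: mbox %s, thread %lu\n", __func__, mbox->name,
              (unsigned long) pthread_self());
}

void
traced_unlock(traced_mbox *mbox)
{
    /* Log before releasing, so the line is written while the lock is
     * still held and cannot be overtaken by the next locker's message. */
    DEBUG_LOG("%s: mbox %s, thread %lu\n", __func__, mbox->name,
              (unsigned long) pthread_self());
    mbox->lock_count--;
    pthread_mutex_unlock(&mbox->mutex);
}
```

As noted elsewhere in this thread, the extra output can itself change the timing, so a quiet log does not prove the race is gone.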
Finding the source of a race (which, again, is just a guess) is
always /very/ difficult, and adding debug messages may actually
change the internal timing of the application, effectively hiding the
bug…
Heisenbug, huh?
Hope this helps,
I think this confirms my thinking, and pushes me along the right path.
Albrecht.
Jack
------quoted attachment------
_______________________________________________
balsa-list mailing list
balsa-list gnome org
https://mail.gnome.org/mailman/listinfo/balsa-list