Re: Rescanning IMAP folders
- From: Dave Cridland <dave cridland net>
- To: Philip Van Hoof <spam pvanhoof be>
- Cc: tinymail-devel-list gnome org
- Subject: Re: Rescanning IMAP folders
- Date: Tue, 13 Feb 2007 13:34:56 +0000
On Tue Feb 13 11:39:54 2007, Philip Van Hoof wrote:
Rescanning an IMAP folder has two purposes:
-> Updating the flags
-> Removing the expunged messages from both the cache and the summary
The second part is implemented badly. That's because IMAP has two things
to identify one message: its sequence number and its UID.
The sequence number MUST be the index of the array of the summary. The
UID is a field like any other per item.
Almost... The mailbox is always ordered by UID. Like every other item
except FLAGS (discounting ANNOTATE for the minute) it's also
immutable.
An expunge will happen like this: "* 2 EXPUNGE". This means that the
message with sequence 2 in the current folder has been expunged.
What will happen with the Push-Email implementation (the IMAP IDLE) is
that message #2 will simply be removed from the summary.
But if there was no IMAP IDLE for that folder, the only way to
synchronize both is to check one by one whether the sequence (the index
of the array + 1, as the IMAP sequence starts at 1 and C arrays start at
0) and the UID match.
No, you do exactly the same thing. IDLE or not IDLE should always be
treated identically. (For a client - for a server, it's a little
different).
There's one case where you need to be careful, and that's when you're
in IDLE, and want to send a command with sequence numbers in - then
you have to say DONE and wait for the IDLE to complete before you
send the command.
Imagine an expunge of message #1. What will happen:
SEQUENCE: 1 2 3 4 5
INDEX: 0 1 2 3 4
UID: 1 2 4 5 6
Server expunged #1
SEQUENCE: 1 2 3 4
INDEX: 0 1 2 3 4
UID: 1 2 4 5 6
No, that's wrong - you know that the first message has been removed,
so that has to be UID 1 that's removed.
But was this really necessary? Not really, we can re-calculate the local
summary to avoid the refetching of all in imap_update_summary, right?
If you treat the mailbox as a 1-indexed array of messages ordered by
UID, then you just remove the entry the EXPUNGE tells you to.
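
A minimal sketch of that idea in Python, assuming the local summary is
just a list of UIDs kept in ascending order. The name apply_expunge is
made up for illustration and is not taken from tinymail or Polymer.

```python
def apply_expunge(uids, seq):
    """Handle "* <seq> EXPUNGE": drop the entry at 1-based position seq.

    uids is the local summary, a list of UIDs in ascending order.
    Returns the UID that was removed.
    """
    if not 1 <= seq <= len(uids):
        raise ValueError(f"sequence number {seq} out of range 1..{len(uids)}")
    # IMAP sequence numbers start at 1, list indexes at 0.
    return uids.pop(seq - 1)


summary = [1, 2, 4, 5, 6]
removed = apply_expunge(summary, 1)   # "* 1 EXPUNGE"
print(removed, summary)
```

Because the list shrinks straight away, every later message moves down
one sequence number, which is what the server assumes when it sends the
next EXPUNGE.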
If any expunge was detected, the condstore implementation isn't used.
The condstore implementation doesn't cope with expunges. So it ain't
used for those situations. Which is why imap_rescan is ALWAYS important
enough to be *very* optimized.
Gah. Sort of.
If you're connected, then there's no need to do any scanning at all -
you get the EXPUNGE, you remove that message, you're done.
(That's not to say rescanning on SELECT shouldn't be carefully
optimized, though).
The detection of whether expunges happened checks the last local UID
with "UIDNEXT - 1" and compares the EXISTS with the local size. Only if
CONDSTORE is activated, "LAST LOCAL UID == REMOTE UIDNEXT - 1" and
"LOCAL SIZE == EXISTS" do I assume that condstore can be used. Else
imap_rescan is used. Is that a correct assumption?
No... Not quite. Look at how Polymer (or rather, the IPL) does it. At
the moment, it's synchronous code, so it's very easy to follow.
http://svn.dave.cridland.net/svn/projects/infotrope/python/infotrope/imap.py
It's line 3148, mailbox_reselected() - the code below it is all
rather heavily used too - down to about line 3392 is all about
keeping the basic UID mapping in sync.
Loosely, on SELECT, I cache:
A) EXISTS
B) UIDNEXT
And I set C to 0. Every time I witness an EXPUNGE, I increment C.
On the next SELECT, I look at the new value of EXISTS (D) and UIDNEXT
(E).
If (E == B), then you know no new messages have arrived. We'll call
this condition F.
If (A - C - D) == (B - E), then we know that no messages have been
expunged without us noticing. We'll make this condition G.
So, if F is true, then we'll move onto looking at HIGHESTMODSEQ, and
if we've seen those changes, we can avoid doing a FETCH at all. Neat,
eh?
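
The bookkeeping above can be sketched as a balance of counts: the new
EXISTS should equal the old EXISTS, minus the expunges we witnessed,
plus the UIDs handed out since. This is only a sketch under an
assumption: it treats every step of UIDNEXT as exactly one new message,
which servers usually do but are not required to. SelectState and
classify_reselect are hypothetical names, not Polymer's.

```python
from dataclasses import dataclass


@dataclass
class SelectState:
    exists: int             # A: EXISTS at the previous SELECT
    uidnext: int            # B: UIDNEXT at the previous SELECT
    expunges_seen: int = 0  # C: EXPUNGE responses witnessed since then


def classify_reselect(cached, new_exists, new_uidnext):
    """Return (F, G) for a fresh SELECT reporting D=new_exists, E=new_uidnext."""
    # F: UIDNEXT has not moved, so nothing new was delivered.
    no_new_messages = new_uidnext == cached.uidnext
    # Assumption: one new message per UIDNEXT step.
    arrived = new_uidnext - cached.uidnext
    expected_exists = cached.exists - cached.expunges_seen + arrived
    # G: the counts balance, so no expunge happened behind our back.
    no_missed_expunges = new_exists == expected_exists
    return no_new_messages, no_missed_expunges


cached = SelectState(exists=10, uidnext=20, expunges_seen=2)
print(classify_reselect(cached, 8, 20))
print(classify_reselect(cached, 7, 20))
```

If UIDNEXT ever jumps by more than the number of messages delivered,
the expected count overshoots, G comes out false, and the client falls
back to the slower UID scan rather than trusting a stale summary.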
If G is not true, then we have expunges somewhere. The fastest way to
update these is to use "UID SEARCH ALL", or, with ESEARCH, "UID
SEARCH RETURN () ALL". This can still be quite big with huge
mailboxes, though. Luckily for us, users are rather predictable, and
almost all EXPUNGE events happen within the last few messages, so I
find them by looking at the UIDs corresponding to the sequence
numbers for the last messages I knew about, which generally finds
them. (If not, the code gets scary).
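
For the brute-force case, a rough sketch of reconciling the local
summary against the UID list a "UID SEARCH ALL" would return. Parsing
the server response is left out, and reconcile_with_uid_search is a
name invented here.

```python
def reconcile_with_uid_search(local_uids, server_uids):
    """Split the local summary into UIDs still on the server and UIDs gone.

    local_uids:  UIDs we hold, ascending.
    server_uids: UIDs reported by "UID SEARCH ALL", any order.
    Returns (kept, expunged), both in the original local order.
    """
    surviving = set(server_uids)
    kept = [uid for uid in local_uids if uid in surviving]
    expunged = [uid for uid in local_uids if uid not in surviving]
    return kept, expunged


kept, expunged = reconcile_with_uid_search([1, 2, 4, 5, 6], [2, 4, 6, 7])
print(kept, expunged)
```

UIDs the server has and we don't (7 here) are new mail and still need
an ordinary FETCH; this only deals with the expunge side.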
You could easily enough skip this, if you wanted, since when you do a
FETCH, you'll get back both the UID and the sequence number. If those
match what you thought they were going to be, you know you haven't
missed any expunges up to that point.
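
That check might look something like this, assuming the FETCH
responses have already been parsed into (sequence number, UID) pairs.
first_mismatch is a hypothetical helper, not part of any library.

```python
def first_mismatch(local_uids, fetched):
    """Return the lowest fetched sequence number whose UID disagrees with
    the local summary, or None if every pair we can check lines up.

    local_uids: UIDs we hold, ascending; position i is sequence i + 1.
    fetched:    iterable of (seq, uid) pairs taken from FETCH responses.
    """
    for seq, uid in sorted(fetched):
        if seq > len(local_uids):
            # Beyond what we knew about: a new message, nothing to compare.
            continue
        if local_uids[seq - 1] != uid:
            return seq
    return None


local = [1, 2, 4, 5, 6]
print(first_mismatch(local, [(4, 5), (5, 6)]))
print(first_mismatch(local, [(4, 6)]))
```

A result of None only vouches for sequence numbers up to the highest
one checked; a mismatch at seq means at least one expunge was missed at
or before that position.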
Note that I do process "* seq EXPUNGE" in IDLE (they'll simply get
removed from the local summary by using the sequence-1 as array index,
and this seems to work correctly).
It'll work correctly whenever you get an EXPUNGE.
IDLE only actually does one thing - because it's a command, you can
get EXPUNGEs. Otherwise it's the same as a long-running NOOP.
I added the author of Polymer/Telomer in CC (Dave). That's because I
will most likely take a look at his source code for this too.
Jolly good. See if you can spot the bug. ;-)
Dave.
--
Dave Cridland - mailto:dave cridland net - xmpp:dwd jabber org
- acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
- http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade