Re: Rescanning IMAP folders

From: Dave Cridland <dave cridland net>
To: Philip Van Hoof <spam pvanhoof be>
Cc: tinymail-devel-list gnome org
Subject: Re: Rescanning IMAP folders
Date: Tue, 13 Feb 2007 13:34:56 +0000

On Tue Feb 13 11:39:54 2007, Philip Van Hoof wrote:

Rescanning an IMAP folder has two purposes:

-> Updating the flags
-> Removing the expunged messages from both the cache and the summary

The second part is implemented badly. That's because IMAP has two things
to identify one message:

Its sequence number, and its UID.
The sequence number MUST be the index of the array of the summary. The
UID is a field like any other per item.

Almost... The mailbox is always ordered by UID. Like every other item except FLAGS (discounting ANNOTATE for the minute) it's also immutable.

An expunge will happen like this: "* 2 EXPUNGE". This means that the
message with sequence 2 in the current folder has been expunged.
What will happen with the Push-Email implementation (the IMAP IDLE) is
that simply message #2 will be removed from the summary.

But if there was no IMAP IDLE for that folder, the only way to
synchronize both is to check one by one whether the sequence (the index
of the array + 1, as the IMAP sequence starts at 1 and C arrays start at
0) and the UID match.

No, you do exactly the same thing. IDLE or not IDLE should always be treated identically. (For a client - for a server, it's a little different).

There's one case where you need to be careful, and that's when you're in IDLE, and want to send a command with sequence numbers in - then you have to say DONE and wait for the IDLE to complete before you send the command.

Imagine an expunge of message #1. What will happen:

SEQUENCE: 1 2 3 4 5
INDEX:    0 1 2 3 4
UID:      1 2 4 5 6

Server expunged #1

SEQUENCE: 1 2 3 4
INDEX:    0 1 2 3 4
UID:      1 2 4 5 6

No, that's wrong - you know that the first message has been removed, so that has to be UID 1 that's removed.

But was this really necessary? Not really, we can re-calculate the local summary to avoid the refetching of all in imap_update_summary, right?

If you treat the mailbox as a 1-indexed array of messages ordered by UID, then you just remove the entry the EXPUNGE tells you to.
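
As a minimal sketch of that rule (not code from tinymail or Polymer; the function name and the plain list of UIDs standing in for the summary are assumptions made for illustration), handling "* n EXPUNGE" comes down to removing one entry by position:

```python
def apply_expunge(uids: list[int], seq: int) -> int:
    """Remove the message at 1-based sequence number `seq`.

    `uids` is the local summary reduced to its UIDs, kept in mailbox
    (ascending UID) order. Returns the UID that was removed.
    """
    if not 1 <= seq <= len(uids):
        raise ValueError(f"EXPUNGE {seq} out of range for {len(uids)} messages")
    # Sequence numbers start at 1, list indexes at 0.
    return uids.pop(seq - 1)


summary = [1, 2, 4, 5, 6]
removed = apply_expunge(summary, 1)   # server sent "* 1 EXPUNGE"
print(removed, summary)               # 1 [2, 4, 5, 6]
```

Because every later message shifts down by one as soon as the entry is removed, a second "* 1 EXPUNGE" immediately afterwards would remove UID 2.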

If any expunge was detected, the condstore implementation isn't used.
The condstore implementation doesn't cope with expunges. So it ain't
used for those situations. Which is why imap_rescan is ALWAYS important
enough to be *very* optimized.

Gah. Sort of.

If you're connected, then there's no need to do any scanning at all - you get the EXPUNGE, you remove that message, you're done.

(That's not to say rescanning on SELECT shouldn't be carefully optimized, though).

The detection of whether expunges happened checks the last local UID
with "UIDNEXT - 1" and compares the EXISTS with the local size. Only if
CONDSTORE is activated, "LAST LOCAL UID == REMOTE UIDNEXT - 1" and
"LOCAL SIZE == EXISTS" do I assume that condstore can be used. Else
imap_rescan is used. Is that a correct assumption?

No... Not quite. Look at how Polymer (or rather, the IPL) does it. At the moment, it's synchronous code, so it's very easy to follow.

http://svn.dave.cridland.net/svn/projects/infotrope/python/infotrope/imap.py

It's line 3148, mailbox_reselected() - the code below it is all rather heavily used too - down to about line 3392 is all about keeping the basic UID mapping in sync.

Loosely, on SELECT, I cache:

A) EXISTS
B) UIDNEXT

And I set C to 0. Every time I witness an EXPUNGE, I increment C.

On the next SELECT, I look at the new value of EXISTS (D) and UIDNEXT (E).

If (E == B), then you know no new messages have arrived. We'll call this condition F.

If (A - C - D) == (B - E), then we know that no messages have been expunged. We'll make this condition G.

So, if F is true, then we'll move onto looking at HIGHESTMODSEQ, and if we've seen those changes, we can avoid doing a FETCH at all. Neat, eh?
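
The bookkeeping above can be sketched roughly as follows. This is an illustration only, not the IPL code: the class, the function and the returned labels are made-up names. It takes the locally known message count to be A - C, so G is checked as D - (A - C) == E - B, which also assumes the server hands out UIDs consecutively. It further assumes that the HIGHESTMODSEQ shortcut needs both F and G to hold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MailboxState:
    exists: int                          # A: EXISTS at the last SELECT
    uidnext: int                         # B: UIDNEXT at the last SELECT
    expunges_seen: int = 0               # C: EXPUNGEs witnessed since then
    highestmodseq: Optional[int] = None  # last HIGHESTMODSEQ seen, if any


def classify_reselect(old: MailboxState, new_exists: int, new_uidnext: int,
                      new_highestmodseq: Optional[int] = None) -> str:
    """Decide what a fresh SELECT (D = new_exists, E = new_uidnext) requires."""
    known = old.exists - old.expunges_seen      # messages we still hold locally
    arrived = new_uidnext - old.uidnext         # UIDs handed out since then
    f_no_new = new_uidnext == old.uidnext       # condition F
    g_no_missed_expunges = (new_exists - known) == arrived  # condition G

    if not g_no_missed_expunges:
        return "resync-uids"        # expunges happened while we were away
    if f_no_new:
        if (old.highestmodseq is not None
                and new_highestmodseq == old.highestmodseq):
            return "nothing-to-fetch"   # no arrivals, no flag changes
        return "fetch-flags"
    return "fetch-new"


old = MailboxState(exists=10, uidnext=11, expunges_seen=2, highestmodseq=40)
print(classify_reselect(old, 8, 11, 40))   # nothing-to-fetch
print(classify_reselect(old, 7, 11, 40))   # resync-uids
```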

If G is not true, then we have expunges somewhere. The fastest way to update these is to use "UID SEARCH ALL", or, with ESEARCH, "UID SEARCH RETURN () ALL". This can still be quite big with huge mailboxes, though. Luckily for us, users are rather predictable, and almost all EXPUNGE events happen within the last few messages, so I find them by looking at the UIDs corresponding to the sequence numbers for the last messages I knew about, which generally finds them. (If not, the code gets scary).
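
A rough sketch of both approaches, under stated assumptions: reconcile_full() takes the UID list a "UID SEARCH ALL" would return, and reconcile_tail() takes a hypothetical probe(first_seq, last_seq) callable standing in for a "FETCH first:last (UID)" round trip. Neither name comes from Polymer, and the window size is arbitrary. The tail version relies on the mailbox being ordered by UID: if the server's UID at some sequence number equals the local one, nothing before that point can have been expunged.

```python
from typing import Callable, List, Optional


def reconcile_full(local_uids: List[int], server_uids: List[int]) -> List[int]:
    """Keep only the local UIDs the server still lists."""
    alive = set(server_uids)
    return [uid for uid in local_uids if uid in alive]


def reconcile_tail(local_uids: List[int], server_exists: int,
                   probe: Callable[[int, int], List[int]],
                   window: int = 50) -> Optional[List[int]]:
    """Look for expunges among the last `window` or so known messages.

    Returns the corrected local UID list, or None when the expunges lie
    further back and the caller should fall back to a full comparison.
    """
    if not local_uids or server_exists == 0:
        return []
    # Anchor: a sequence number valid both locally and on the server.
    anchor = max(1, min(len(local_uids), server_exists) - window)
    server_tail = probe(anchor, server_exists)
    if anchor > 1 and server_tail[0] != local_uids[anchor - 1]:
        return None   # something before the anchor was expunged
    # Everything before the anchor is known to be aligned; fix the rest.
    return local_uids[:anchor - 1] + reconcile_full(local_uids[anchor - 1:],
                                                    server_tail)


local = list(range(1, 101))                            # UIDs 1..100
server = [u for u in local if u not in (98, 100)]      # two recent expunges
fixed = reconcile_tail(local, len(server),
                       lambda a, b: server[a - 1:b], window=10)
print(fixed == server)   # True
```

UIDs in the probed range that are not known locally (new arrivals) are simply ignored here; fetching them is a separate step.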

You could easily enough skip this, if you wanted, since when you do a FETCH, you'll get back both the UID and the sequence number. If those match what you thought they were going to be, you know you haven't missed any expunges up to that point.
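
That check might look something like this, assuming the FETCH responses have already been parsed into (sequence number, UID) pairs in ascending order. The function name and the input shape are illustrative, not taken from any library.

```python
from typing import Iterable, List, Optional, Tuple


def first_mismatch(local_uids: List[int],
                   fetched: Iterable[Tuple[int, int]]) -> Optional[int]:
    """Return the first sequence number whose UID differs from the local
    summary, or None if every fetched pair matches what we expected."""
    for seq, uid in fetched:
        if seq > len(local_uids):
            continue   # a message we have never seen; nothing to compare
        if local_uids[seq - 1] != uid:
            return seq   # an expunge was missed at or before this point
    return None


local = [1, 2, 4, 5, 6]
print(first_mismatch(local, [(1, 1), (2, 2), (3, 4)]))   # None
print(first_mismatch(local, [(1, 1), (2, 4)]))           # 2
```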

Note that I do process "* seq EXPUNGE" in IDLE (they'll simply get
removed from the local summary by using the sequence-1 as array index,
and this seems to work correctly).

It'll work correctly whenever you get an EXPUNGE.

IDLE only actually does one thing - because it's a command, you can get EXPUNGEs. Otherwise it's the same as a long-running NOOP.

I added the author of Polymer/Telomer in CC (Dave). That's because I
will most likely take a look at his source code for this too.

Jolly good. See if you can spot the bug. ;-)

Dave.
--
Dave Cridland - mailto:dave cridland net - xmpp:dwd jabber org
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
