Re: Rescanning IMAP folders



On Tue Feb 13 14:11:20 2007, Philip Van Hoof wrote:
On Tue, 2007-02-13 at 13:34 +0000, Dave Cridland wrote:
> On Tue Feb 13 11:39:54 2007, Philip Van Hoof wrote:
> > An expunge will happen like this: "* 2 EXPUNGE". This means that the
> > message with sequence 2 in the current folder has been expunged.
> >
> > What will happen with the Push-Email implementation (the IMAP IDLE) is
> > that simply message #2 will be removed from the summary.
> >
> > But if there was no IMAP IDLE for that folder, the only way to
> > synchronize both is to check one by one whether the sequence (the index
> > of the array + 1, as the IMAP sequence starts at 1 and C arrays start at
> > 0) and the UID match.
>
> No, you do exactly the same thing. IDLE or not IDLE should always be
> treated identically. (For a client - for a server, it's a little
> different).

Oh, yes. It's the same code that does this. So the same thing happens.


Good - you can get EXPUNGE only when commands that don't have sequence numbers in the arguments are running.

EXISTS and FETCH can come at any time at all, even when no commands are running. (I think Oryx's server, Archiveopteryx, does this, but it's probably the only one.)


> There's one case where you need to be careful, and that's when you're
> in IDLE, and want to send a command with sequence numbers in - then
> you have to say DONE and wait for the IDLE to complete before you
> send the command.
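
The rule quoted above (leave IDLE before sending anything that carries sequence numbers) might be sketched roughly as follows. This is only an illustration: the `Session` class, its `send_line`/`read_line` callbacks and the tag format are hypothetical names, not taken from any client discussed in this thread.

```python
class Session:
    """Minimal IDLE bookkeeping; the transport callbacks are hypothetical."""

    def __init__(self, send_line, read_line):
        self.send_line = send_line   # callable taking one line without CRLF
        self.read_line = read_line   # callable returning the next server line
        self.idle_tag = None         # tag of the outstanding IDLE, if any
        self._counter = 0

    def _next_tag(self):
        self._counter += 1
        return "A%04d" % self._counter

    def start_idle(self):
        self.idle_tag = self._next_tag()
        self.send_line(self.idle_tag + " IDLE")

    def leave_idle(self):
        """Send DONE and collect every line up to the tagged completion."""
        seen = []
        if self.idle_tag is None:
            return seen
        self.send_line("DONE")
        while True:
            line = self.read_line()
            if line.startswith(self.idle_tag + " "):
                break                # the IDLE command has completed
            seen.append(line)        # e.g. "* 2 EXPUNGE"
        self.idle_tag = None
        return seen

    def send_command(self, text):
        # Refuse to send while IDLE is outstanding, so sequence numbers
        # in `text` cannot race with an unseen EXPUNGE.
        if self.idle_tag is not None:
            raise RuntimeError("still in IDLE: call leave_idle() first")
        tag = self._next_tag()
        self.send_line(tag + " " + text)
        return tag
```

The caller would apply any EXPUNGE found in the lines returned by `leave_idle()` before computing the sequence numbers it passes to `send_command()`.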
>
> > Imagine an expunge of message #1. What will happen:
> >
> > SEQUENCE: 1 2 3 4 5
> > INDEX:    0 1 2 3 4
> > UID:      1 2 4 5 6
> >
> > Server expunged #1
> >
> > SEQUENCE: 1 2 3 4
> > INDEX:    0 1 2 3 4
> > UID:      1 2 4 5 6
>
> No, that's wrong - you know that the first message has been removed,
> so that has to be UID 1 that's removed.

Oh but I wanted to illustrate the local state when remotely an expunge happened before locally things are synchronized. It's the last series of
numbers that is the end-result after synchronization. In-between the
first two series nothing at the client has happened yet.
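
For illustration, the expunge of message #1 discussed above could be applied to a local UID array like this. It is a minimal sketch: `apply_expunge` is a made-up helper name, and the list stands in for whatever summary structure the client really keeps.

```python
def apply_expunge(uids, seq):
    """Remove the message with 1-based IMAP sequence number `seq` from the
    local UID list (0-based index seq - 1) and return the removed UID."""
    if not 1 <= seq <= len(uids):
        raise ValueError("sequence number %d out of range" % seq)
    return uids.pop(seq - 1)


local = [1, 2, 4, 5, 6]            # UIDs at sequence numbers 1..5
removed = apply_expunge(local, 1)  # server sent "* 1 EXPUNGE"
# removed is UID 1; local is now [2, 4, 5, 6], so sequence 1 maps to UID 2
```

Because every later message shifts down by one, a second "* 1 EXPUNGE" would remove UID 2 next.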


Ah...

From an IMAP perspective, these things are essentially synchronized, and the client can only break this if it uses sequence numbers in a command other than at the beginning of a pipeline.

So if a message is EXPUNGEd, and you give the server no opportunity to tell you (ie, you don't use IDLE and you don't issue a command), then you can still refer to that message with its sequence number. (You'll quite possibly find an empty message, but you can change its flags quite happily).

> > If any expunge was detected, the condstore implementation isn't used.
> > The condstore implementation doesn't cope with expunges. So it ain't
> > used for those situations. Which is why imap_rescan is ALWAYS important
> > enough to be *very* optimized.
>
> Gah. Sort of.
>
> If you're connected, then there's no need to do any scanning at all -
> you get the EXPUNGE, you remove that message, you're done.

Right, that's how it currently works. Yet it will upon selecting a new
folder still check for changes (because only the current folder is
selected and I noticed that only for the current folder events happen
while in IDLE).


Yes, that's right. Even with NOTIFY, you'll only get information that some messages have been expunged in other folders, not which. (But a server doing NOTIFY probably does QRESYNC, which hands you a list of messages which have been expunged when you SELECT).

You can use multiple connections to monitor multiple mailboxes, or else you can just get really good at synchronization.

> Loosely, on SELECT, I cache:
>
> A) EXISTS
> B) UIDNEXT
>
> And I set C to 0. Every time I witness an EXPUNGE, I increment C.
>
> On the next SELECT, I look at the new value of EXISTS (D) and UIDNEXT
> (E).
>
> If (E == B), then you know no new messages have arrived. We'll call
> this condition F.

Errm, but this only works for when IDLE is available? And what about
changes to a folder that wasn't selected? Or changes that have happened
while the client was offline?


This is post-SELECT synchronization, so deals with the last two questions. IDLE has nothing at all to do with this.


> If (A - D + C) == (B - E), then we know that no messages have been
> expunged. We'll make this condition G.
>
> So, if F is true, then we'll move onto looking at HIGHESTMODSEQ, and
> if we've seen those changes, we can avoid doing a FETCH at all. Neat,
> eh?
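
Conditions F and G above could be sketched along these lines. This is a hedged reading, not the code being described: it assumes C counts expunges that were already applied locally, so the expected message count before any new arrivals is A - C. That takes the sign of C differently from the formula as written above, so treat it as one interpretation. `FolderState` and `check_select` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class FolderState:
    exists: int         # A: EXISTS cached at the previous SELECT
    uidnext: int        # B: UIDNEXT cached at the previous SELECT
    witnessed: int = 0  # C: EXPUNGE responses seen since then


def check_select(old, new_exists, new_uidnext):
    """Return (F, G) for a fresh SELECT reporting D=new_exists, E=new_uidnext.

    F: UIDNEXT is unchanged, so no new UIDs were assigned.
    G: EXISTS grew exactly as much as UIDNEXT did, so no expunge went
       unwitnessed.
    """
    f = new_uidnext == old.uidnext
    expected = old.exists - old.witnessed          # A - C
    g = (new_exists - expected) == (new_uidnext - old.uidnext)
    return f, g


old = FolderState(exists=10, uidnext=100, witnessed=2)
f, g = check_select(old, new_exists=8, new_uidnext=100)
# both hold: nothing arrived and nothing was expunged behind our back
```

In this sketch G alone is what rules out unwitnessed expunges; F only says that no new UIDs were assigned. Since UIDNEXT may advance by more than the number of messages added, a false G means only that expunges might have happened.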

Yes, neat. But I won't get EXPUNGEs for just any folder... so what about
C for those? :-)


There's another value here - the unwitnessed expunges. If F or G is true, none have happened.


For example my user has INBOX selected and he's online. So he gets
EXPUNGE, FETCH, EXISTS and RECENT for INBOX because I've set his session with the IMAP server in IDLE. I act on that, so that folder will be kept
in sync (assuming the IMAP server's IDLE implementation is cool).


Curiously, even if IDLE *isn't* cool, it'll appear to be. As in, everything will still work the same.


Now my user selects Inbox.100, a subfolder that wasn't selected. In
another E-mail client another person expunged stuff from Inbox.100 while
my user had INBOX selected.

I never received any 'C' for Inbox.100, only for INBOX. I already have quite some offline data of Inbox.100, how can I avoid that my user has
to download everything of Inbox.100 again?


Yes, you still have C, what you don't know is the unwitnessed expunges, which F and G tell you.


So with CONDSTORE I check the highestmodseq of course. But if that is enabled I know that my own code can't cope with expunges in Inbox.100 yet. The old imap-rescan can. So my idea was or is to check the local size vs. the exists and the uidnext value - 1 against my last local uid.

Which might be incorrect if an expunge and an add both happened at the
end of the mailbox by that other user (that I understand).


No, if that happens then F will be false, since UIDNEXT has changed, and because UIDNEXT has not advanced in sync with EXISTS (which is what G checks), you also know that there might have been expunges.

But even then, use CONDSTORE still. Just issue a UID SEARCH [RETURN ()] ALL to refresh the match.
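
As a rough illustration of refreshing from a UID SEARCH ALL, the sketch below parses a plain untagged SEARCH response and compares it with the local UID list. The helper names `parse_search` and `reconcile` are invented for this example, and the extended response a server sends when RETURN () is used is deliberately not handled.

```python
def parse_search(line):
    """Parse an untagged response such as "* SEARCH 2 4 5 7" into UIDs."""
    parts = line.split()
    if parts[:2] != ["*", "SEARCH"]:
        raise ValueError("not an untagged SEARCH response: %r" % line)
    return [int(p) for p in parts[2:]]


def reconcile(local_uids, server_uids):
    """Return (expunged, new): UIDs to drop locally and UIDs to fetch."""
    local = set(local_uids)
    server = set(server_uids)
    expunged = sorted(local - server)  # known locally, gone on the server
    new = sorted(server - local)       # on the server, not yet known locally
    return expunged, new


server_uids = parse_search("* SEARCH 2 4 5 7")
expunged, new = reconcile([1, 2, 4, 5, 6], server_uids)
# expunged lists UIDs 1 and 6; new lists UID 7
```

Flags for the surviving UIDs could then still be refreshed through CONDSTORE, as suggested above.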


Maybe I should also check the sequence of the last 5 or 6 messages
versus their uid? Well the sequence and the uid of the last message is also always compared with what I have locally by the way. This was old camel code that I didn't remove. And if they don't match, again a full
rescan will also happen nonetheless.


Right, that works too. Line 3,344 does this, more or less, in my code, although it's pulling at least the last 1,000 UIDs using a SEARCH, to minimize more complex code.

If that fails, I look for where the expunge(s) happened, in real_resync(), which issues more complicated searches to find mismatching UIDs. The IPL gets complex here because it doesn't bother storing the whole UID mapping, but the basic idea is the same.
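
One way to look for where the first expunge happened is a binary search over sequence numbers. The sketch below is not real_resync() itself, just the same basic idea. It assumes the local UID list is in ascending order, and that `uid_of_seq` is a hypothetical callback returning the server's UID for a 1-based sequence number (in practice something like a FETCH of the UID item).

```python
def first_mismatch(local_uids, server_count, uid_of_seq):
    """Return the 0-based index of the first local entry that no longer
    matches the server, or None if every local entry still matches.

    Once one message has been expunged, every later position holds a
    larger UID than the local list expects, so "matches" is true up to
    some index and false from there on. That allows a binary search.
    """
    lo, hi = 0, min(len(local_uids), server_count)
    while lo < hi:
        mid = (lo + hi) // 2
        if uid_of_seq(mid + 1) == local_uids[mid]:
            lo = mid + 1   # everything up to mid is still in sync
        else:
            hi = mid       # the first mismatch is at mid or earlier
    return lo if lo < len(local_uids) else None


server = [1, 2, 5, 6]      # UID 4 was expunged by another client
index = first_mismatch([1, 2, 4, 5, 6], len(server),
                       lambda seq: server[seq - 1])
# index is 2: local entry 2 (UID 4) is the first one out of sync
```

Everything before the returned index can be kept as it is; only the tail needs a closer look.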

Dave.
--
Dave Cridland - mailto:dave cridland net - xmpp:dwd jabber org
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade


