Re: Number 9 of the mytest summary store: writing things



On Mon Jan  7 11:03:31 2008, Philip Van Hoof wrote:

On Mon, 2008-01-07 at 09:47 +0000, Dave Cridland wrote:

> Basically, this is handling the stuff that I think yours is currently
> doing least optimally. ENVELOPEs don't change, so storing them in the
> same place as FLAGS doesn't make sense.

I'm not! :). The mytest experiments store flags in flags_0.idx.


Ooops, sorry. :-)


But you are right, I haven't put a lot of thought into handling the
flags efficiently.

Mine simply delays the writing of that file to the thawing of the
summary if the summary is frozen, to the finalise if pending writes are required (unlikely, as you'll have thawed if you use the API correctly)
or immediately if the summary isn't frozen.


Right - I need a similar API, of course, and one that cleverly handles a series of expunges.


Immediately means that I write all of the flags for all items even if only one item's flags changed (so you really want to freeze and thaw).


Right, whereas I won't be doing that. And indeed don't, even without a freeze/thaw API.
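
For anyone following along, the freeze/thaw behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not the mytest code: the class and method names are made up, the nested freeze counter is an assumption, and the "write" is just a counter standing in for rewriting the whole flags file.

```python
class FlagStore:
    """Hypothetical flag table whose full-file write can be deferred."""

    def __init__(self):
        self.flags = {}        # uid -> flags bitmask
        self.freeze_count = 0  # nesting depth of freeze() calls (assumption)
        self.dirty = False     # True while a write is pending
        self.writes = 0        # number of full-file writes, for illustration

    def _write_all(self):
        # Stand-in for rewriting every item's flags to disk.
        self.writes += 1
        self.dirty = False

    def freeze(self):
        self.freeze_count += 1

    def thaw(self):
        if self.freeze_count == 0:
            raise RuntimeError("thaw without matching freeze")
        self.freeze_count -= 1
        # The deferred write happens when the last freeze is released.
        if self.freeze_count == 0 and self.dirty:
            self._write_all()

    def set_flags(self, uid, flags):
        self.flags[uid] = flags
        self.dirty = True
        # Not frozen: write immediately, which rewrites all flags.
        if self.freeze_count == 0:
            self._write_all()

    def finalise(self):
        # Safety net for writes still pending at shutdown.
        if self.dirty:
            self._write_all()


store = FlagStore()
store.set_flags(1, 0x1)        # unfrozen: one full write
store.freeze()
for uid in range(2, 102):
    store.set_flags(uid, 0x1)  # deferred while frozen
store.thaw()                   # one full write covers all 100 changes
```

Without the freeze, the loop would trigger a full rewrite per change, which is why freezing around bulk updates matters in this scheme.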

> That all said, it's working, and seems reasonably quick - faster than
> the code I have, so I'll probably try to blend it in.
>
> So... How it works.
>
> I'm blocking the data in multiple mmap files - this needs more
> cleverness, because really, I need to be using smaller blocks toward
> the end of larger mailboxes. mmap files are called by the sequence
> number they start at.
>
> Finding data by sequence number is relatively easy, although less
> than efficient as yet, I'm just running through all blocks in hash
> order to find one that looks as if it might be right.
>
> When a UID gets removed - currently not really implemented - I don't
> rewrite all subsequent blocks. Instead, I rename them. This means
> less I/O, which is always a good thing.

Clever ...


I thought so. :-)

I do something similar, but in-memory, for CONTEXT handling, and it seems to work as a general principle. Making it use mmap is useful, though, obviously.
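
To make the rename trick concrete, here is a minimal sketch, assuming block files named after the sequence number they start at. It uses plain text files with one record per line rather than mmap, and the helper names (block_starts, find_block, expunge) are hypothetical. Only the block that contains the removed message is rewritten; every later block is merely renamed to start one lower.

```python
import os
import tempfile


def block_starts(directory):
    """Return the sorted starting sequence numbers of all block files."""
    return sorted(int(name) for name in os.listdir(directory))


def find_block(directory, seq):
    """Return the start of the block that should hold sequence number seq."""
    candidates = [s for s in block_starts(directory) if s <= seq]
    if not candidates:
        raise KeyError(seq)
    return max(candidates)


def expunge(directory, seq):
    """Remove one message, rewriting one block and renaming the later ones."""
    start = find_block(directory, seq)
    path = os.path.join(directory, str(start))
    with open(path) as f:
        records = f.read().splitlines()
    if seq - start >= len(records):
        raise KeyError(seq)
    del records[seq - start]
    if records:
        with open(path, "w") as f:
            f.write("\n".join(records) + "\n")
    else:
        os.remove(path)  # an emptied block would collide with the next rename
    # Ascending order guarantees each target name is already free.
    for later in block_starts(directory):
        if later > start:
            os.rename(os.path.join(directory, str(later)),
                      os.path.join(directory, str(later - 1)))


demo_dir = tempfile.mkdtemp()
for first_seq, uids in [(1, [10, 11, 12]), (4, [20, 21, 22]), (7, [30, 31])]:
    with open(os.path.join(demo_dir, str(first_seq)), "w") as f:
        f.write("\n".join(str(u) for u in uids) + "\n")

expunge(demo_dir, 2)  # drops the second message (UID 11)
starts_after = block_starts(demo_dir)
```

The contents of the later blocks are never touched, so the I/O cost of an expunge is one block rewrite plus a handful of directory operations.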


> I think that this is basically a sound design, albeit badly
> implemented.
>
> Some potential improvements:
>
> 1) I suspect that having an index file containing block sequence
> starts, lengths, and UID extents would be faster than using the
> directory listing. Renaming files causes the directory "file" to be
> rewritten, so it makes sense to avoid this, and give the blocks a
> 64-bit ID instead.

Uhu (I think the kernel is quite good at caching/buffering the
directory's file, so you might get away with this and still make it
perform quite well).


Yes, but equally, I need to - I think - extend the data held in the block filenames to include UID extents, and that will probably lead to ugliness. Whereas, if I store the blocks named with some opaque identifier, then I think I can get it NFS-safe and lockless. Which'd be kind of handy.

Note that in advance of this, I'm not writing via the mmap.
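
As a rough sketch of the index idea from improvement 1), assuming an in-memory table that would be persisted to the index file: each entry holds an opaque block ID, the sequence start, the length, and the UID extents. The names BlockEntry and BlockIndex are hypothetical, and no locking or NFS behaviour is modelled here. Lookup by sequence number becomes a binary search, and an expunge only adjusts index entries, with no renames at all.

```python
import bisect
from dataclasses import dataclass


@dataclass
class BlockEntry:
    block_id: int   # opaque identifier (64-bit in the proposal)
    seq_start: int  # sequence number of the first message in the block
    length: int     # number of messages in the block
    uid_min: int    # lowest UID the block may contain
    uid_max: int    # highest UID the block may contain


class BlockIndex:
    def __init__(self, entries):
        self.entries = sorted(entries, key=lambda e: e.seq_start)

    def find_by_seq(self, seq):
        """Binary search for the block covering sequence number seq."""
        starts = [e.seq_start for e in self.entries]
        i = bisect.bisect_right(starts, seq) - 1
        if i < 0:
            raise KeyError(seq)
        entry = self.entries[i]
        if seq >= entry.seq_start + entry.length:
            raise KeyError(seq)
        return entry

    def find_by_uid(self, uid):
        """Linear scan over the UID extents, kept simple for the sketch."""
        for entry in self.entries:
            if entry.uid_min <= uid <= entry.uid_max:
                return entry
        raise KeyError(uid)

    def expunge_seq(self, seq):
        """Drop one message; later blocks shift down without being renamed."""
        hit = self.find_by_seq(seq)
        hit.length -= 1
        for entry in self.entries:
            if entry.seq_start > hit.seq_start:
                entry.seq_start -= 1
        if hit.length == 0:
            self.entries = [e for e in self.entries if e is not hit]
        return hit.block_id  # the only block whose data needs rewriting


index = BlockIndex([
    BlockEntry(0xA1, 1, 3, 10, 12),
    BlockEntry(0xB2, 4, 3, 20, 22),
    BlockEntry(0xC3, 7, 2, 30, 31),
])
touched = index.expunge_seq(2)
```

Because the block files keep their opaque names, the directory is not rewritten on expunge; only the index changes.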

Dave.
--
Dave Cridland - mailto:dave cridland net - xmpp:dwd jabber org
 - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
 - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade

