Re: [Evolution-hackers] mmap() for the summary file



On Mon, 2006-06-12 at 16:23 -0400, Jeffrey Stedfast wrote:
> A couple possible problems with this approach:
> 
> 1. systems that don't have mmap or the mmap is broken (I can't think of
> any off the top of my head... except maybe Win32? Tor?)

The current proposal keeps the format of the summary file backward
compatible (the length is increased by one, and only a '\0' is added at
the end of each string), but not forward compatible. So older or current
implementations should still work after one very small modification in
the writing procedure. I would suggest a "mv camel-folder-summary.c
camel-folder-summary-mmap.c" and a "touch camel-folder-summary-fread.c",
and a compilation switch to choose between the two?

> 2. NFS... how well would this work over NFS? What are the performance
> implications?

Would it be detectable if the summary file is being used from an NFS
share? In that case both camel-folder-summary-fread.c and
camel-folder-summary-mmap.c could be compiled, and the implementation
picked dynamically after checking whether the filesystem is remote.
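
A rough sketch of what such a runtime check could look like, assuming
Linux (statfs() and NFS_SUPER_MAGIC are Linux-specific). The names
summary_path_is_nfs(), summary_pick_impl() and SummaryImpl are made up
for illustration and are not existing Camel API. It only recognises
NFS, not other remote filesystems:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/vfs.h>      /* statfs(), Linux-specific */
#include <linux/magic.h>  /* NFS_SUPER_MAGIC */

/* Hypothetical selector for the two proposed implementations. */
typedef enum {
    SUMMARY_IMPL_FREAD,
    SUMMARY_IMPL_MMAP
} SummaryImpl;

/* Returns 1 if path lives on NFS, 0 if it does not, and -1 if
 * statfs() failed (for example because path does not exist). */
static int
summary_path_is_nfs(const char *path)
{
    struct statfs sfs;

    assert(path != NULL);

    if (statfs(path, &sfs) == -1)
        return -1;

    return sfs.f_type == NFS_SUPER_MAGIC ? 1 : 0;
}

/* Picks mmap only when we positively know the file is not on NFS;
 * on NFS, or when the check itself fails, fall back to fread. */
static SummaryImpl
summary_pick_impl(const char *path)
{
    return summary_path_is_nfs(path) == 0
        ? SUMMARY_IMPL_MMAP
        : SUMMARY_IMPL_FREAD;
}
```

Falling back to fread when the check fails is a conservative choice on
my part; the opposite default would work just as well technically.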

It would add one more implementation (and its complexity) to maintain,
true.

> 3. It will keep a lot of fd's open... EMFILE anyone? :(

Currently, for each open folder, it will keep one fd open. Indeed. That
is inevitable with mmap, as far as I know. Maybe it would be possible to
reduce this to "active" folders only? Such adaptations might soon look a
lot like what the disk-summary ideas are about.

In tinymail I will not have this problem because I (can and do) destroy
the CamelFolder instance when you select a new folder to become active.
With the vfolders of Evolution this might not be possible? Maybe the
disk-summary concepts would make that more feasible?

I would agree that this mmap() proposal is a cheap, less
platform-independent version of what the disk-summary concepts also
implement. Disk-summary probably does it in a better and more controlled
way; with mmap() and this proposal you are at the mercy of the kernel.

The kernel, however, given that it has more information about the file
system, disk hardware and memory page sizes, is probably always going to
be (a little bit) faster than doing read() manually. Unless, maybe, you
read() large chunks and that way decrease the number of read()
syscalls? But I haven't measured the differences myself. I would be
quite interested in such measurements...
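
As a starting point for such a measurement, here is a minimal sketch
that counts how many read() calls are needed to consume a file for a
given chunk size. count_read_calls() is a hypothetical helper, not
something in Camel, and it does not retry on EINTR. It only shows the
syscall count; actual timings would still have to be measured:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Reads the whole file at path in chunks of chunk_size bytes.
 * Returns the number of read() calls issued, including the final
 * call that returns 0 at end of file, or -1 on error. */
static long
count_read_calls(const char *path, size_t chunk_size)
{
    char *buf;
    long calls = 0;
    ssize_t n;
    int fd;

    assert(chunk_size > 0);

    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    buf = malloc(chunk_size);
    if (buf == NULL) {
        close(fd);
        return -1;
    }

    do {
        n = read(fd, buf, chunk_size);
        calls++;
    } while (n > 0);

    free(buf);
    close(fd);

    return n == -1 ? -1 : calls;
}
```

Calling it with, say, 512 and then 65536 on the same summary file shows
how quickly the number of syscalls drops as the chunk grows.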

Note that a lot of the summary file, but not all of it, must be read
with this mmap() proposal. If backward compatibility could be given up
(not a very good idea imho), we could probably arrange it so that only
the beginning of the file would be read (the entire file would be
mmap()'d, but only the beginning of the address space would be
touched). That would be, I think, faster (in initial load) than any
read() implementation. But I should verify that by reading the kernel
mmap() code first.

-- Once the offset of the start of the real strings is known, we could
calculate (and store) the pointers to their actual positions in the
address space. I think.
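
To make that idea a bit more concrete, a minimal sketch of turning an
offset into a usable pointer once the file is mapped. SummaryMap and
the summary_map_*() functions are hypothetical names, not the Camel
summary code, and the layout (NUL-terminated strings at known byte
offsets) is an assumption based on the '\0' described above:

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical view of a mapped summary file. */
typedef struct {
    int         fd;     /* kept open for the lifetime of the mapping */
    const char *base;   /* start of the mapped address space */
    size_t      length; /* size of the mapping in bytes */
} SummaryMap;

/* Maps the whole file read-only. Returns 0 on success, -1 on error. */
static int
summary_map_open(SummaryMap *map, const char *path)
{
    struct stat st;
    void *addr;

    assert(map != NULL && path != NULL);

    map->fd = open(path, O_RDONLY);
    if (map->fd == -1)
        return -1;

    if (fstat(map->fd, &st) == -1 || st.st_size == 0) {
        close(map->fd);
        return -1;
    }

    addr = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_PRIVATE,
                map->fd, 0);
    if (addr == MAP_FAILED) {
        close(map->fd);
        return -1;
    }

    map->base = addr;
    map->length = (size_t) st.st_size;
    return 0;
}

/* Returns a pointer to the string starting at offset, usable in place
 * without copying. Returns NULL if the offset is out of range or no
 * '\0' terminator is found before the end of the mapping. */
static const char *
summary_map_string(const SummaryMap *map, size_t offset)
{
    if (offset >= map->length)
        return NULL;
    if (memchr(map->base + offset, '\0', map->length - offset) == NULL)
        return NULL;
    return map->base + offset;
}

static void
summary_map_close(SummaryMap *map)
{
    munmap((void *) map->base, map->length);
    close(map->fd);
}
```

The pages behind a string are only faulted in when the returned pointer
is actually dereferenced, which is where the initial-load gain would
have to come from.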


-- 
Philip Van Hoof, software developer at x-tend 
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
work: vanhoof at x-tend dot be 
http://www.pvanhoof.be - http://www.x-tend.be



