Re: GNOME Message Management System (Storage mechanism)



James Henstridge wrote:
> 
> Since IMAP works on the principle of storing the mail on the server, and
> the ideas have been to have message <-> folder association be one to many,
> it may be necessary to use a modified IMAP server that can handle the new
> mailbox format.

Actually, you'd only have to do that for offline IMAP.  (My personal
opinion is that the people who want offline IMAP really want POP
instead.  My ISP supports both, so maybe that's coloring my view.)

We could make the message control file flexible so that it merely points
to the on-server copy of the email for regular IMAP.  Others have
suggested that this email system allow people to link messages to each
other in an arbitrary fashion -- this feature could be an extension of
that.  Instead of the link containing the local message ID, it could
contain a protocol ID (IMAP), a server name (imap.my.isp.net) and the
server's message ID.  Then the client would know that it has to talk to
the ISP's IMAP server to get that message.
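To make the link idea concrete, here is a rough sketch of what such a link record might look like. The class name MessageLink, its field names, and the convention that a missing protocol means "stored locally" are all assumptions made for illustration, not an agreed format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MessageLink:
    """Hypothetical link record as it might appear in a control file."""
    message_id: str                 # local ID, or the server's ID for remote links
    protocol: Optional[str] = None  # e.g. "IMAP"; None is assumed to mean "local"
    server: Optional[str] = None    # e.g. "imap.my.isp.net"

    def is_remote(self) -> bool:
        return self.protocol is not None


def describe(link: MessageLink) -> str:
    """Say where the client would have to go to get the message."""
    if link.is_remote():
        return f"fetch {link.message_id} via {link.protocol} from {link.server}"
    return f"read {link.message_id} from the local store"
```

A local link carries only the message ID; a remote one adds the protocol and server, which is all the client needs to decide whom to talk to.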

In case you missed it, here's my file layout scheme, as modified by
recent discussions and Scott Wimer's ideas:

gmail/
|-- mail
|   |-- Gnome-L
|   |   |-- 05927923832
|   |   `-- 32874932729
|   |       |-- control
|   |       |-- headers
|   |       |-- msg
|   |       |-- foo2.tar.gz
|   |       |-- foo1.jpg
|   |       `-- myplugin
|   `-- Inbox
`-- news
    `-- alt.hackers
        `-- 75082790232
            |-- control
            |-- headers
            |-- msg
            `-- myplugin

(gmail is just a generic name for the sake of discussion -- GNU mailer)

news and mail are hard-coded categories used by the gmail system, but
others could be added by the user.  That would allow the user to set up
categories for web links, downloaded info files, locally-created
documents, etc.

Under that are subcategories.  The subcategories under the mail
directory are what we call "folders" in current mail clients.  Folders
get populated by filtering incoming email.  Inbox is another hard-coded
name, but Gnome-L is a hypothetical user-added subcategory: those are
messages that have gnome.org in their To: or Cc: fields.
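The Gnome-L rule could be sketched roughly like this, using Python's standard email package. The function name choose_folder is hypothetical, and matching on the address domain (rather than a plain substring search for gnome.org) is my own tightening of the rule, not something settled in this thread.

```python
from email.parser import HeaderParser
from email.utils import getaddresses


def choose_folder(raw_headers: str) -> str:
    """Return the folder name for a message, given its raw header block."""
    msg = HeaderParser().parsestr(raw_headers)
    # Collect every address from all To: and Cc: fields.
    recipients = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    for _name, address in recipients:
        domain = address.rpartition("@")[2].lower()
        if domain == "gnome.org" or domain.endswith(".gnome.org"):
            return "Gnome-L"   # the hypothetical user-added folder
    return "Inbox"             # the hard-coded default
```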

Under that are files, named after a locally-generated message ID.  (We
probably can't count on remote message IDs to be unique.)  Each of these
files is an ar(1) or cpio(1) archive, containing the files listed.  For
example, message archive 32874932729 contains a control file, the raw
message headers, the message text itself, two file attachments and
another control file, generated and used by some custom plugin.

Why ar(1) or cpio(1), by the way?  Because the formats are well-known
and tested, and good, relevant code should be easy to come by.  This
also saves inodes over saving each part in a separate file, and saves
our program from having to completely parse each message each time it's
accessed -- we only have to parse it once, and then we have a
well-defined format we can use from there on.
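To show how little code the ar(1) route needs, here is a minimal sketch that packs and unpacks message parts in the common ar layout (an 8-byte magic string, then a 60-byte header per member, with data padded to an even length). It is an illustration under assumptions: the function names are made up, member names are limited to 16 characters without spaces, and the long-name extensions used by real ar implementations are not handled.

```python
import io
from typing import Dict

AR_MAGIC = b"!<arch>\n"
HEADER_LEN = 60


def write_ar(members: Dict[str, bytes]) -> bytes:
    """Pack named parts into a simple ar archive held in memory."""
    out = io.BytesIO()
    out.write(AR_MAGIC)
    for name, data in members.items():
        if len(name) > 16 or " " in name:
            raise ValueError("this sketch only handles short names without spaces")
        # Fields: name(16) mtime(12) uid(6) gid(6) mode(8) size(10) end marker(2)
        header = "%-16s%-12d%-6d%-6d%-8s%-10d`\n" % (name, 0, 0, 0, "100644", len(data))
        out.write(header.encode("ascii"))
        out.write(data)
        if len(data) % 2:
            out.write(b"\n")  # members start on even offsets
    return out.getvalue()


def read_ar(blob: bytes) -> Dict[str, bytes]:
    """Unpack an archive produced by write_ar into {name: data}."""
    if not blob.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    members = {}
    pos = len(AR_MAGIC)
    while pos < len(blob):
        header = blob[pos:pos + HEADER_LEN]
        if len(header) < HEADER_LEN or header[58:60] != b"`\n":
            raise ValueError("corrupt member header")
        name = header[0:16].decode("ascii").rstrip(" ").rstrip("/")
        size = int(header[48:58].decode("ascii"))
        start = pos + HEADER_LEN
        members[name] = blob[start:start + size]
        pos = start + size + (size % 2)  # skip the padding byte, if any
    return members
```

Reading one part back only requires walking the fixed-size headers, which is the "parse it once" benefit described above.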

The control file contains directives that our mailer uses: what other
messages it links to, whether the actual message text is stored on an
IMAP or Notes server somewhere else, the names and order of the
attachments in the file, and a parsed (read: cleaned-up) version of the
headers.

That brings me to the headers file: we keep the raw email/news headers
in case we need them again, but for simpler uses, we have the parsed
versions in our control file.
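As a sketch of what "parsed (read: cleaned-up)" could mean in practice, the following uses Python's standard email package to unfold continuation lines and decode encoded words, keeping only a few fields. The function name parse_headers and the choice of fields are assumptions; the actual control file format has not been decided here.

```python
from email.header import decode_header, make_header
from email.parser import HeaderParser
from typing import Dict, Sequence


def parse_headers(
    raw: str,
    wanted: Sequence[str] = ("From", "To", "Cc", "Subject", "Date"),
) -> Dict[str, str]:
    """Return a cleaned-up subset of the raw headers as {field: value}."""
    msg = HeaderParser().parsestr(raw)
    cleaned = {}
    for field in wanted:
        values = msg.get_all(field)
        if not values:
            continue
        # Decode RFC 2047 encoded words such as =?iso-8859-1?q?...?=
        decoded = [str(make_header(decode_header(v))) for v in values]
        # Collapse folded lines and runs of whitespace into single spaces.
        cleaned[field] = ", ".join(" ".join(d.split()) for d in decoded)
    return cleaned
```

The raw headers file stays untouched; only this cleaned-up view would go into the control file.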

This filing system also handles netnews, in much the same way as it
handles email.  The link-to-server mechanism mentioned above could also
allow for leave-news-on-server setups.

As Scott says, it would be a good idea to allow for keeping each
directory under some number of files.  You could do this by using the
first one or two digits of each message ID as a subdirectory:

|-- Gnome-L
|   |-- 02
|   |   |-- 244354634
|   |   |-- 248386513
|   |   `-- 534243233
|   |-- 05927923832
...etc.

So whereas message 05927923832 is its own file, message 02244354634 is
in a subdirectory called 02, because we've seen that there are a lot of
messages beginning with 02.  The tree above drops the 02 prefix from the
file names, but perhaps for speed we wouldn't want to rename the files,
just move them to the new subdirectory.
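The lookup for this scheme might be sketched as follows. The function name message_path, the split_prefixes set (the prefixes that have already been moved into subdirectories), and the keep_full_name switch are assumptions for illustration.

```python
from pathlib import PurePosixPath
from typing import Set


def message_path(
    folder: str,
    message_id: str,
    split_prefixes: Set[str],
    prefix_len: int = 2,
    keep_full_name: bool = False,
) -> PurePosixPath:
    """Return the path of a message archive relative to its category."""
    prefix = message_id[:prefix_len]
    if prefix not in split_prefixes:
        return PurePosixPath(folder) / message_id
    # Either strip the prefix (as in the tree above) or keep the full ID,
    # which lets us move the file without renaming it.
    name = message_id if keep_full_name else message_id[prefix_len:]
    return PurePosixPath(folder) / prefix / name
```

For example, with split_prefixes set to {"02"}, message 02244354634 resolves to Gnome-L/02/244354634 while 05927923832 stays at Gnome-L/05927923832.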

One last point: let's not reinvent sendmail's approach to flexibility. 
It would be nice to be as flexible as sendmail, but let's see if we
can't also be easier to understand than sendmail.
-- 
= Warren -- http://www.cyberport.com/~tangent/
= ICBM Address: 36.8274040 N, 108.0204086 W, alt. 1714m
= Chance favors the prepared mind.


