Re: RFC mailbox interface



On 2001.11.09 20:56 Magick wrote:
> We can do this for the 'get headers' function, the question is then how to
> handle multi line headers. The easiest way is to leave the line-ending in
> place and just treat it as white-space.
> But for the 'get message' function, having it as one big string makes it
> possible to hand it over to for example gmime. I think that one big string
> is kind of the natural state of a message, because that is how it is
> transferred.
> 
Agreed. It's not a matter of always having the headers around in parsed 
format, only I would like to have one central place where parsing is done, 
not several different ones. Also, it's more efficient to parse the headers 
once and then search the list for the header you want than to parse the 
headers again for every header field that needs to be extracted.
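
As a rough sketch of the parse-once idea: the headers are split into a list a 
single time, and every later lookup only walks that list. The names used here 
(struct header, parse_headers, find_header, free_headers) are made up for 
illustration, and folded (multi-line) headers are deliberately not handled.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed-header node; not part of any existing API. */
struct header {
    char *name;
    char *value;
    struct header *next;
};

/* Copy n bytes of s into a fresh NUL-terminated string. */
static char *copy_span(const char *s, size_t n)
{
    char *p = malloc(n + 1);
    if (p) {
        memcpy(p, s, n);
        p[n] = '\0';
    }
    return p;
}

static void free_headers(struct header *h)
{
    while (h) {
        struct header *next = h->next;
        free(h->name);
        free(h->value);
        free(h);
        h = next;
    }
}

/* Parse a raw header block once, stopping at the first blank line.
 * Simplification: continuation lines of folded headers are skipped. */
static struct header *parse_headers(const char *raw)
{
    struct header *head = NULL, **tail = &head;

    while (*raw) {
        const char *eol = strchr(raw, '\n');
        const char *end = eol ? eol : raw + strlen(raw);
        const char *next = eol ? eol + 1 : end;
        const char *colon;

        if (end > raw && end[-1] == '\r')
            end--;                      /* tolerate CRLF line endings */
        if (end == raw)
            break;                      /* blank line: end of headers */

        colon = memchr(raw, ':', (size_t)(end - raw));
        if (colon) {
            const char *val = colon + 1;
            struct header *h = calloc(1, sizeof *h);

            while (val < end && (*val == ' ' || *val == '\t'))
                val++;
            if (h) {
                h->name = copy_span(raw, (size_t)(colon - raw));
                h->value = copy_span(val, (size_t)(end - val));
            }
            if (!h || !h->name || !h->value) {
                free_headers(h);        /* out of memory: keep what we have */
                break;
            }
            *tail = h;
            tail = &h->next;
        }
        raw = next;
    }
    return head;
}

/* Case-insensitive comparison of two header names. */
static int names_equal(const char *a, const char *b)
{
    while (*a && *b &&
           tolower((unsigned char)*a) == tolower((unsigned char)*b)) {
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Search the already-parsed list; no re-parsing per lookup. */
static const char *find_header(const struct header *h, const char *name)
{
    for (; h; h = h->next)
        if (names_equal(h->name, name))
            return h->value;
    return NULL;
}
```

A caller would run parse_headers() once per message and then call 
find_header() for each field it needs.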

>> If local IMAP shadowing is used, you may also want to think about having
>> a function that returns mailbox TOCs in sorted sequence, on various
>> fields. This would make it possible to offload sorting to servers that
>> support it and to use the local shadow mailbox more efficiently for that
>> purpose. The local shadow mailbox could even be an indexed or hashed
>> database, making this the fastest IMAP lib ever.
> This can be a useful function, but I think this is something for version
> 2.
> 
Well, it needs to be taken into account for the initial version of the lib 
already, so that message storage and representation are chosen in such a 
way as not to conflict with later addition of local storage for remote folders.
As a matter of fact, having globally unique, persistent message IDs for 
each message would greatly facilitate the creation of virtual folders, that 
can display messages from a collection of various other folders 
simultaneously, like search results, for instance.
Subject and body search should really also be a function of this library, 
because at this level the lib would be able to choose the most efficient 
way to do it.
So, the result for a query would be a handle to an in-memory virtual folder 
that doesn't need to be destroyed explicitly, because it has no persistence 
across sessions.

>> I was thinking about this type of lib myself, but I can't find the time
>> to actually do it. I had mapped out a similar API, but object oriented.
> OO like C++ or like glib/gtk? Do you have your design digital, or want to
> type it up? I'd love to have a look.
> 

Well, I actually started on implementing it in real C++, but stopped before 
it really got anywhere because of untold library problems. Using C++ would 
have introduced so many new library dependencies that the resulting 
application's portability would have suffered too much. I'm not familiar 
enough with gtk+'s class structure to roll my own, so I didn't pursue that 
any further. Right now I don't even know what machine that lib is on, much 
less what directory it's in. Sorry there!

>> However, I think that callbacks would confuse the issue to no end, so I
>> didn't go any further in that direction.
> Which callbacks did you have in mind?
> 
Any type of callback. As soon as callbacks are involved, at least to me, 
code becomes less readable. In order to find the functions actually being 
called I not only have to look at the actual signal emission, but I also 
have to look everywhere a callback is registered. In the end, I'm not much 
wiser as to what list of functions is called in what order.
I would not use any callbacks in this low level lib, leave callbacks and 
gtk+ class structures to libbalsa! A message is an object, there are 
methods to handle that object, they do what they should and then return, no 
muss, no fuss. Straightforward, simple, efficient, that's what I would like 
to see.

> My idea is to have a single entry point for the basic operations. Added
> with utility/optimization functions which can say: no i can't do that for
> you, do it yourself. Like the IMAP sorting.
> 
I don't really like single entry-point solutions, I prefer individual 
functions, arrays of function pointers if C is used, or virtual functions 
in C++. The way I would code it would probably be thus:
A structure defining the methods for dealing with a message of a certain 
type, consisting of an array of function pointers _only_. This would be in 
a shared object loaded at runtime.
Another structure containing the mailbox's data, including a pointer to a 
structure describing the server connection, if any. There would be only one 
such "server descriptor" for a given server/username pair, so the value of 
the pointer can be used to determine what server a message is on.
This structure (the mailbox one) would also have a pointer to the structure 
with the function pointers and a list of pointers to the contained messages.
Finally, a structure containing message data. This would have a pointer 
list to the mailbox descriptors referencing the message.
This many-to-many linking ability would make virtual folders possible.

Deleting a message would, for example, access 
(*message->mailbox->mailbox_type->delete)();

In the case of a virtual folder, this would really do nothing but remove 
the reference. With a real folder, all the proper on-disk stuff would take 
place there, too.
A message that no longer has any mailboxes associated may be freed.
A mailbox association is created through opening, insertion and virtual 
folder shadowing. It is destroyed by closing the mailbox or deletion of the 
message or the mailbox.
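
A minimal sketch of how these three structures and the delete dispatch could 
fit together. All names (mailbox_type, delete_msg, link_message and so on) are 
hypothetical, the fixed-size arrays stand in for real lists only to keep the 
example short, and what a real-folder delete does to other folders still 
referencing the message is left open.

```c
#include <stddef.h>

#define MAX_LINKS 8  /* arbitrary limit, for the sketch only */

struct mailbox;
struct message;

/* Method table: function pointers only. */
struct mailbox_type {
    int (*delete_msg)(struct mailbox *box, struct message *msg);
};

struct server_descriptor;  /* opaque; one per server/username pair */

struct mailbox {
    const struct mailbox_type *type;
    struct server_descriptor *server;      /* NULL for a local mailbox */
    struct message *messages[MAX_LINKS];   /* contained messages */
    size_t n_messages;
};

struct message {
    struct mailbox *mailboxes[MAX_LINKS];  /* mailboxes referencing us */
    size_t n_mailboxes;
};

/* Create the two-way association (open, insert, virtual shadowing). */
static int link_message(struct mailbox *box, struct message *msg)
{
    if (box->n_messages == MAX_LINKS || msg->n_mailboxes == MAX_LINKS)
        return -1;
    box->messages[box->n_messages++] = msg;
    msg->mailboxes[msg->n_mailboxes++] = box;
    return 0;
}

/* Remove the association in both directions. */
static void unlink_message(struct mailbox *box, struct message *msg)
{
    size_t i;

    for (i = 0; i < box->n_messages; i++)
        if (box->messages[i] == msg) {
            box->messages[i] = box->messages[--box->n_messages];
            break;
        }
    for (i = 0; i < msg->n_mailboxes; i++)
        if (msg->mailboxes[i] == box) {
            msg->mailboxes[i] = msg->mailboxes[--msg->n_mailboxes];
            break;
        }
}

/* Virtual folder: deleting only drops the reference. */
static int virtual_delete(struct mailbox *box, struct message *msg)
{
    unlink_message(box, msg);
    return 0;
}

/* Real folder: the on-disk work would happen here (omitted). */
static int real_delete(struct mailbox *box, struct message *msg)
{
    unlink_message(box, msg);
    return 0;
}

static const struct mailbox_type virtual_type = { virtual_delete };
static const struct mailbox_type real_type = { real_delete };

/* Dispatch through the mailbox's method table. */
static int delete_from(struct mailbox *box, struct message *msg)
{
    return box->type->delete_msg(box, msg);
}
```

Once msg->n_mailboxes drops to zero, no mailbox references the message any 
more and the caller may free it.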

Am I making sense?


>> Yes, and also easier to use them in non-graphical, non-gnu environments.
>> I think, the first lib of this kind that really does all it's advertised
>> to do may well become the standard for mailbox access under *nix. The
>> more generally usable, the better.
> Yes, something like libesmtp wants to be ;-).
> 
Well, stealing the glist code would just about be all I would think we'd 
need from glib. I would not use glib for that, just steal the code. One 
less dependency to worry about.

>> See above, I would recommend parsing them from the start, it's easier
>> than hacking that in later.
> Well when you are reading/scanning an mbox file, you'll be reading it line
> for line. My idea was we have a line why not pass it along to some
> function which can then extract useful information about the message.
> Information like Subject, Sender or Date. Maybe even message size, which
> can then be given as a hint to the mailbox scanner.
> 
My idea: parse all the generally useful headers into the message structure 
itself, into dedicated fields. It would look like this:

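struct message
{
	char *headers;		/* "Header: value\0Header: value\0\0" */
	char *from;		/* points into headers, or NULL */
	char *to;
	char *subject;
	char *content_length;
	/* ... */
	char *raw_headers;	/* NULL until the get_* function is called */
	char *message_text;
	char *raw_message;
};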

Headers would point to a memory area where the headers are stored as 
"Header: value\0Header: value\0\0". The individual pointers would point to 
the start of the "value" field of the header in that list, or be NULL if 
there is no such header.
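
To illustrate the layout, here is a small sketch that walks such a block and 
fills in the dedicated pointers. The helper names (header_value, 
index_headers) and the cut-down struct are assumptions made for the example, 
and the name comparison is case-sensitive only for brevity.

```c
#include <stddef.h>
#include <string.h>

/* Cut-down, hypothetical version of the message structure. */
struct message_fields {
    char *headers;   /* "Header: value\0Header: value\0\0" */
    char *from;
    char *to;
    char *subject;
};

/* Return a pointer to the value of `name` inside the block, or NULL.
 * The block ends at an empty string, i.e. at the doubled NUL. */
static char *header_value(char *block, const char *name)
{
    size_t nlen = strlen(name);
    char *p;

    for (p = block; *p; p += strlen(p) + 1) {
        if (strncmp(p, name, nlen) == 0 && p[nlen] == ':') {
            p += nlen + 1;
            while (*p == ' ' || *p == '\t')
                p++;
            return p;            /* points into the block, no copy made */
        }
    }
    return NULL;
}

/* Set the dedicated pointers once, right after the block is built. */
static void index_headers(struct message_fields *m, char *block)
{
    m->headers = block;
    m->from = header_value(block, "From");
    m->to = header_value(block, "To");
    m->subject = header_value(block, "Subject");
}
```

Since the pointers refer into the block itself, freeing the block invalidates 
all of them at once, so only the block needs to be owned and freed.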

The last 3 pointers would be initialized to NULL and updated when the 
proper "get_*" function is called. Of course, getting the raw message will 
automatically get the message text as well, and also set that pointer, and 
quite logically, it should be the same the other way around. One may even 
think about conserving memory and adding a header_length field, so that 
only one pointer, raw_message, remains. raw_message+header_length would be 
the message text, raw_message for header_length bytes would be the headers 
and raw_message, treated as a string, would be the message.
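
The single-pointer variant could be sketched like this. The struct and 
function names are hypothetical, and counting the blank separator line as 
part of header_length is an assumption made here so that 
raw_message+header_length lands exactly on the first byte of the text.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical compact form: one buffer plus the header length. */
struct compact_message {
    char *raw_message;     /* headers, blank line, text; NUL-terminated */
    size_t header_length;  /* bytes up to and including the blank line */
};

/* Find the header/text boundary: the first blank line, CRLF or LF.
 * If there is none, the whole buffer is treated as headers. */
static size_t find_header_length(const char *raw)
{
    const char *crlf = strstr(raw, "\r\n\r\n");
    const char *lf = strstr(raw, "\n\n");

    if (crlf && (!lf || crlf < lf))
        return (size_t)(crlf - raw) + 4;
    if (lf)
        return (size_t)(lf - raw) + 2;
    return strlen(raw);
}

/* The message text is simply an offset into the same buffer. */
static const char *message_text(const struct compact_message *m)
{
    return m->raw_message + m->header_length;
}
```

Note that the headers are then not NUL-terminated on their own, so any code 
reading them has to respect header_length rather than rely on string 
functions.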


>> is related to that... form follows function, you know? :)
> Yeah, but not about stuff like keeping an index file for faster scanning,
> but more focused on how Balsa will see the mailboxes

Yes. I only brought that up to show what needs to be considered in the 
initial design. Sometimes, you inadvertently close doors you will need open 
later on. Going back to the drawing board with a finished piece of code 
results in unmaintainable spaghetti code, or forces a complete rewrite. 
Both are unacceptable.

Melanie


