Re: [gmime-devel] Mapping MIME parts to byte offsets

From: Jeffrey Stedfast <fejj novell com>
To: Alex Hudson <alex bongo-project org>
Cc: gmime-devel-list gnome org
Subject: Re: [gmime-devel] Mapping MIME parts to byte offsets
Date: Sat, 25 Apr 2009 19:05:46 -0400

Alex Hudson wrote:
> Jeffrey Stedfast wrote:
>> Alex Hudson wrote:
>>  
>>> So instead of values being stored, we grab the substream, reset it to
>>> find the start and then ask it for the length?
>>>     
>>
>> Or you can look at the bound_start and bound_end members on stream and
>> do the math :-)
>>   
>
> That too :D
>
>> I did, however, think of a bug in it last night... if you change the top
>> level mime part's headers and write the message to disk, I suspect that
>> it won't reflect the changes because the message will write out the
>> cached stream. Easy fix and not something I expect you'd have to worry
>> about in Bongo.
>>   
>
> Well, although I don't care about that right now, in the near future I
> might - we've been toying with the idea of putting MIME composition
> into the server, mainly so that parts of the mail system have a
> relatively easy and safe way of twiddling mail on the way through. As
> one example, a web application would be able to draft mail without
> having to have a server-side session doing the compose for it (say,
> putting the mail text together with an attachment), and then on
> sending you could have a daemon which adds global signatures without
> having to re-compose the message itself. If that makes sense.

Sounds like what most mail client composers would want :-)

>
> I guess it raises some interesting questions about the semantics of
> your new API, though. As a stream, is GMime going to care if I twiddle
> around with it seeking/resetting, etc?

GMime will always seek to the correct position within the stream before
it does its thing with the stream. If you ask the HeaderList for its
stream and use it, you don't have to worry about later writing that
HeaderList to disk and things going wrong just because you forgot to
reset the stream after using it. GMimeHeaderList will take care of that
by resetting the stream internally before using it itself.

Is that what you were asking?
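
To make the "seek before use" idea concrete, here is a rough sketch in
plain C stdio rather than GMime itself. The struct and function names
are made up for illustration; only the bound_start/bound_end naming is
borrowed from the stream members mentioned earlier in the thread. It
models a sub-stream as a window onto a shared parent file, "does the
math" for the length, and always seeks to the start of the window before
reading, so it doesn't matter where an earlier user left the parent.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a sub-stream: a window [bound_start, bound_end)
 * onto a shared parent FILE. This is NOT the GMime type. */
struct substream {
    FILE *parent;
    long bound_start;
    long bound_end;
};

/* "Do the math": the length is just the distance between the bounds. */
static long substream_length(const struct substream *s)
{
    assert(s->bound_end >= s->bound_start);
    return s->bound_end - s->bound_start;
}

/* Read the window into buf (at most cap bytes). Seeks to bound_start first,
 * so the result does not depend on the parent's current position. */
static size_t substream_read_all(const struct substream *s, char *buf, size_t cap)
{
    size_t want = (size_t) substream_length(s);

    if (want > cap)
        want = cap;
    if (fseek(s->parent, s->bound_start, SEEK_SET) != 0)
        return 0;
    return fread(buf, 1, want, s->parent);
}
```

Calling substream_read_all() twice in a row, with arbitrary seeks on the
parent in between, should give the same bytes both times.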

> Does it absolutely have to be a sub-stream of the main stream, or
> could you in fact replace the original sub-stream with a new stream as
> a way of replacing content? That would be a pretty simple way of
> putting new content into the mail, and the later streams are still
> "correct" (at least, I think, given the brief thought I've given it).

(FWIW, a sub-stream is just a new stream object referencing the parent
stream - so it all reads through the same file descriptor and doesn't
have much memory overhead, maybe 50 bytes? depending on which stream
class you use obviously)

You could do that, yes, although I'd be careful about doing it with the
HeaderList stream: it's probably better to add/modify headers using the
HeaderList API directly. Replacing the header stream should work, but
remember that the toplevel mime-part's header stream also contains the
message envelope headers (it's rather annoying to split them, plus the
idea is that writing the message back out should write out the headers
exactly as it found them). For a mime-part's content, this is exactly
what the stream API was designed for :-)
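
As a rough illustration of replacing a part's content by swapping its
stream, here is a small model in plain C stdio. It is only a sketch:
struct part, part_set_content() and part_write() are hypothetical names,
not GMime calls. The headers are kept verbatim and the content is read
from whichever stream the part currently points at.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of a mime part (not a GMime type). */
struct part {
    const char *raw_headers;  /* written back exactly as found */
    FILE *content;            /* may be swapped for another stream */
};

/* Swap the content stream; returns the old one so the caller can close it. */
static FILE *part_set_content(struct part *p, FILE *new_content)
{
    FILE *old = p->content;

    p->content = new_content;
    return old;
}

/* Write the headers, a blank line, then the content from its start. */
static int part_write(const struct part *p, FILE *out)
{
    char buf[4096];
    size_t n;

    assert(p != NULL && p->content != NULL);
    if (fputs(p->raw_headers, out) == EOF || fputs("\n", out) == EOF)
        return -1;
    rewind(p->content);  /* don't rely on where the stream was left */
    while ((n = fread(buf, 1, sizeof buf, p->content)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;
    }
    return ferror(p->content) ? -1 : 0;
}
```

In this model a signature-adding daemon would build a new stream
holding the body plus signature, call part_set_content(), and write the
part back out without touching the headers.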

>
> Another interesting aspect of this, slightly veering onto a different
> topic, is whether or not there is a sensible persistent caching
> strategy for GMime messages - can I save them somehow and then restore
> them later to use again without having to re-parse the message? I
> could see that being useful if you had a large message, say with an
> attached e-mail which is pretty large, and wanted to just get at the
> headers. Maybe it could be possible to lazily restore the cached
> streams based on serialized information from a previous parse attempt
> or something...

This is pretty much exactly what GMime does :-)

A few notes:

1. the GMimeMessage's mime-part tree leaves all content on disk (other
than parsed headers for quick querying) and so won't use up massive
amounts of memory for messages with large attachments.

2. parsing the message back into the object tree is blazingly fast :-)

3. as long as you don't set persist_stream to FALSE on the parser, it
will leave all content on disk, and the header streams will reference
the on-disk stream too, allowing for lazy loading of the content when
(and if) you need it. The parser defaults persist_stream to TRUE unless
the input stream is unseekable, in which case it has no choice but to
load the content into memory streams (a new memory stream per mime part,
containing just that part's content, so that if you delete a mime part,
its memory becomes immediately available).

4. the parser supports parsing of individual mime-parts as well, so if
you want to have each mime part in a different file while composing,
this is easily doable.

5. For the most part, while composing a message, it's unlikely you'll
have many (parsed) headers in memory and so all of your GMime objects
will be fairly light because the largest collection of headers on a
message is usually the Received: headers which only get set as the
message travels from mail server to mail server :-)
