Re: [Evolution-hackers] A Camel API to get the filename of the cache, also a proposal to have one format to rule them all
- From: Jeffrey Stedfast <fejj novell com>
- To: Philip Van Hoof <spam pvanhoof be>
- Cc: Evolution Hackers <evolution-hackers gnome org>
- Subject: Re: [Evolution-hackers] A Camel API to get the filename of the cache, also a proposal to have one format to rule them all
- Date: Mon, 05 Jan 2009 09:41:13 -0500
Philip Van Hoof wrote:
> On Mon, 2009-01-05 at 08:25 -0500, Jeffrey Stedfast wrote:
>
>
>> migrating away from the IMAP specific data cache would be good.
>>
>
> Yes. I think IMAP and the local providers are the only ones that are
> still using a specialized datacache.
>
> The IMAP4 one, for example, ain't using a specialized one.
>
>
>>>> b) migrate away the mbox data cache (the all-in-one file crap)
>>>>
>>>>
>>> I'm all for it. Once I thought of doing this, but the options were like
>>> Maildir or a format of one mbox file per mail in a distributed folder
>>> [CamelDataCache sort of format, like imap4/GW/Exchange]. But IIRC Fejj,
>>> had some concern like, Local still might be good to be held in a
>>> 'standards' way. I know it hurts us on expunge/mailbox rewrite etc.
>>>
>>>
>> what mbox data cache? CamelDataCache would probably be the best cache to
>> use for IMAP.
>>
>
> Although I would change CamelDataCache to store individual MIME parts as
> separate files instead of files that look like a single-mail MBox file.
>
it's really just the raw message/rfc822 format, not really mbox -
there's no "From " line for example.
that doesn't need to be part of the cache logic. that can be part of the
key.
> I would also decode the separate MIME parts before storing if the
> original E-mail had them encoded (which is usually the case, and always
> for binary attachments). This to make it more easy for metadata engines
> to index the MIME parts, and to allow such to do this efficiently.
>
> Perhaps also to reduce disk-space, as encoded consumes more disk-space,
> but that is for me just a nice side-effect.
>
> So my format would create a directory for each E-mail, or prefix each
> MIME part with the uid. Perhaps
>
> INBOX/subfolders/temp/1. // headers+multipart container
> INBOX/subfolders/temp/1.1 // multipart container
> INBOX/subfolders/temp/1.1.1 // text/plain
> INBOX/subfolders/temp/1.1.2 // text/html
> INBOX/subfolders/temp/1.2.1 // inline JPeg attachment
> INBOX/subfolders/temp/1.BODYSTRUCTURE // Bodystructure of the E-mail
> INBOX/subfolders/temp/1.ENVELOPE // Top envelope of the E-mail
>
sure, this can be done with the key tho. instead of using the uid as the
key, use uid.1 or uid.1.2 etc
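As a rough illustration of the key idea (not Camel code), the sketch below builds keys such as "1", "1.1" and "1.2" while walking a message's MIME tree. It uses Python's standard email package as a stand-in parser; the helper names cache_key and iter_part_keys are hypothetical, and the numbering is simply "uid plus child index at each level", which is an assumption and may not match the exact layout proposed above.

```python
from email import message_from_bytes


def cache_key(uid, part_path=()):
    """Build a cache key such as '1' or '1.1.2' from a uid and a part path."""
    return ".".join([str(uid), *(str(n) for n in part_path)])


def iter_part_keys(uid, msg, path=()):
    """Yield (key, content_type) for msg and every nested MIME part."""
    yield cache_key(uid, path), msg.get_content_type()
    if msg.is_multipart():
        # Children are numbered from 1 at each nesting level (assumed scheme).
        for index, child in enumerate(msg.get_payload(), start=1):
            yield from iter_part_keys(uid, child, path + (index,))


# A tiny hand-written message, used only to exercise the helpers.
RAW = (
    b"Subject: birthday\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/alternative; boundary="b"\r\n'
    b"\r\n"
    b"--b\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"hello\r\n"
    b"--b\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<p>hello</p>\r\n"
    b"--b--\r\n"
)

keys = list(iter_part_keys(1, message_from_bytes(RAW)))
for key, content_type in keys:
    print(key, content_type)
```

The cache itself never needs to know about MIME structure here; it only sees opaque string keys, which is the point being made above.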
> ps. Perhaps I would store 1.BODYSTRUCTURE in the database instead. I
> would probably store 1.ENVELOPE in the database (like how it is now).
>
yea, I think it makes sense to store BODYSTRUCTURE in the folder summary.
> I would probably on top of storing BODYSTRUCTURE and ENVELOPE in the
> database also store them in separate files. Even if most filesystems
> will consume 4k or more (sector or block size) for those mini files.
>
> To get the JPeg attachment:
>
> $ cp INBOX/subfolders/temp/1.2.1 ~/mommy.jpeg
>
> $ exif INBOX/subfolders/temp/1.2.1
> EXIF tags in 'INBOX/subfolders/temp/1.2.1' ('Intel' byte order):
> --------------------+----------------------------------
> Tag |Value
> --------------------+----------------------------------
> Image Description |Mommy with cake at birthday
> Manufacturer |SONY
> Model |DSC-T33
> ...
>
> $ tracker-search -s EMails birthday
> Results:
> email://user server/INBOX/temp/1
> email://user server/INBOX/temp/1#2.1
> ~/mommy.jpeg
>
>
> [CUT]
>
>
>> this can cause problems if you need to verify signed parts because
>> re-encoding them might not result in the same output.
>>
>
> Ok, for signatures I guess we can make an exception and keep them
> encoded in their original format then.
>
>
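To make the signature concern concrete, here is a small sketch (plain Python, no mail library) showing one way re-encoding can differ from the original: the same bytes base64-encoded with two different line widths. The widths 60 and 76 and the helper name encode_wrapped are illustrative assumptions; real senders can differ in other ways too.

```python
import base64

# Stand-in for a binary attachment.
payload = bytes(range(256)) * 2


def encode_wrapped(data, width):
    """Base64-encode data and wrap the text at `width` characters per line."""
    text = base64.b64encode(data).decode("ascii")
    lines = [text[i:i + width] for i in range(0, len(text), width)]
    return "\r\n".join(lines) + "\r\n"


original = encode_wrapped(payload, 60)   # how the sender happened to wrap it
reencoded = encode_wrapped(payload, 76)  # how a cache might regenerate it

# The decoded content is identical...
same_content = base64.b64decode(original) == base64.b64decode(reencoded)
# ...but the encoded text, which is what a signature covers, is not.
same_bytes = original == reencoded

print("same decoded content:", same_content)
print("same encoded bytes:  ", same_bytes)
```

A signature computed over the encoded form of the part would therefore no longer verify against the regenerated text, which is why keeping signed parts in their original encoding is the safe choice.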
>>>> For Maildir I recommend wasting diskspace by storing both the original
>>>> Maildir format and in parallel store the attachments separately.
>>>>
>>>> Maildir ain't accessible by current Evolution's UI, by the way.
>>>>
>>>> For MBox I recommend TO STOP USING THIS BROKEN FORMAT. It's insane with
>>>> today's mailboxes that easily grow to 3 gigabytes in size per user.
>>>>
>>>>
>>> I second your thoughts for MBox stuff.
>>>
>>>
>> Eh, I think mbox works fine but I can understand wanting to move to
>> Maildir which is also fine :-)
>>
>
> Maildir doesn't store individual MIME parts separately. So Maildir is
> equally hard to handle for metadata engines as MBox is. Only difference
> with MBox is that we need to seek() to some location.
>
> So Maildir doesn't make it possible for us to let app developers
> implement indexing plugins easily, like a typical exif extractor.
>
I guess, but they could just link with gmime or camel :p
Jeff
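For illustration only, this is roughly what "just link with a MIME library" looks like from an indexer's side: parse the whole raw message and pull out the decoded attachment bytes. Python's standard email package is used here as a stand-in for gmime or camel (it is not their API), and build_sample and extract_attachments are hypothetical helper names with made-up sample data.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

FAKE_JPEG = b"\xff\xd8\xff\xe0 fake jpeg bytes"  # placeholder, not a real image


def build_sample():
    """Create a raw message with a text body and one base64 attachment."""
    msg = EmailMessage()
    msg["Subject"] = "birthday"
    msg.set_content("Mommy with cake")
    msg.add_attachment(FAKE_JPEG, maintype="image", subtype="jpeg",
                       filename="mommy.jpeg")
    return msg.as_bytes()


def extract_attachments(raw):
    """Parse a raw message and return {filename: decoded bytes}."""
    msg = message_from_bytes(raw, policy=policy.default)
    return {part.get_filename(): part.get_payload(decode=True)
            for part in msg.iter_attachments()}


attachments = extract_attachments(build_sample())
for name, data in attachments.items():
    print(name, len(data), "bytes")
```

The trade-off discussed in the thread is visible here: this works on a single Maildir-style message file, but every indexing plugin has to carry a MIME parser, whereas separately stored, decoded parts could be handed straight to a tool like exif.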