Re: [Tracker] Initial email indexing support



Laurent Aguerreche wrote:
Hello,

This is a patch for initial email indexing support with Evolution. It
extracts the content of sent and received emails: text + attachments.

superb - my hero :)


Currently, tracker_db_save_email() tries to save text and header info
but it is missing some DB stuff... (Jamie...)

yes I will get on to this.

I have almost finished sqlite support, so I expect both this and the email/db support to be ready this weekend.


For email bodies in HTML, there is nothing done and I don't know exactly
how it will be done.  :-)

Can't you get GMime to convert it? (iirc they have an html filter in there - google or look at the docs for it)

if not, we already have an html->text converter using the htmless app in filters/text/html, which we could call as a last resort (it won't be nice though, because we would need to store the body as a tmp file for htmless to convert it to a text file)
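
To illustrate the in-memory route (no tmp file), here is a minimal sketch in Python rather than in tracker's C. It uses neither GMime nor htmless; it only relies on the standard library's html.parser, and the names TextExtractor and html_body_to_text are made up for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes of an HTML body, ignoring script/style content."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join chunks with a space, then collapse all whitespace runs.
        # A word split by inline markup ("wo<b>rd</b>") becomes two tokens.
        return " ".join(" ".join(self._chunks).split())


def html_body_to_text(html):
    """Hypothetical helper: convert an HTML email body to plain text in memory."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return parser.text()
```

The point is only that the body never has to touch the disk; a real filter would also need to deal with the charset declared in the MIME part.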


Attachments have their metadata/content extracted but that's all, and
they have their uri in something like /tmp/pid/attachment...
Currently, parent directories of processed files won't be deleted! So,
check your /tmp directory regularly.

that's okay so long as the actual attachment files are unlinked.
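
For what it is worth, a sketch of one way to clean up both the attachment file and its parent directory, written in Python rather than C. The function name with_extracted_attachment and the "tracker-" prefix are assumptions for this example, not something taken from the patch.

```python
import os
import tempfile


def with_extracted_attachment(filename, payload, process):
    """Write payload to a private temp directory, run process(path) on it,
    then remove the file and the directory, even if process() raises."""
    with tempfile.TemporaryDirectory(prefix="tracker-") as tmp_dir:
        # Keep only the base name so a hostile filename cannot escape tmp_dir.
        safe_name = os.path.basename(filename) or "attachment"
        path = os.path.join(tmp_dir, safe_name)
        with open(path, "wb") as handle:
            handle.write(payload)
        return process(path)
```

In C the equivalent would be an unlink() of the file followed by an rmdir() of the per-message directory once the metadata has been extracted.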




Now, I'm waiting for DB things in:
* tracker_db_get_last_mbox_offset
* tracker_db_update_mbox_offset
* tracker_db_save_email

(Since tracker_db_get_last_mbox_offset() isn't implemented, it always
returns 0, which makes indexing always start from the beginning of
the mbox being processed.)
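
To make the intent of the offset functions concrete, here is a rough sketch in Python (not the patch's C code) of resuming an mbox scan from a stored byte offset. index_mbox_from and handle_message are hypothetical names; the stored value stands in for whatever tracker_db_get_last_mbox_offset() and tracker_db_update_mbox_offset() will eventually read and write.

```python
import io


def index_mbox_from(stream, start_offset, handle_message):
    """Pass every message found at or after start_offset to handle_message.

    Returns the offset where scanning stopped, which is the value to store
    for the next run so already-indexed messages are skipped.
    """
    stream.seek(start_offset)
    current = []
    for line in iter(stream.readline, b""):
        # Simplification: any line starting with "From " opens a new message.
        # Body lines like that are normally escaped as ">From " in mbox files.
        if line.startswith(b"From "):
            if current:
                handle_message(b"".join(current))
            current = [line]
        elif current:
            current.append(line)
    if current:
        handle_message(b"".join(current))
    return stream.tell()
```

With an offset of 0 this degenerates to the current behaviour: every run walks the whole mbox again.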


I will take a look at KMail.
In Thunderbird, there is a stupid file to parse to find profiles then
directories for mboxes...
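
A sketch of that lookup in Python, assuming the file in question is Thunderbird's profiles.ini with [ProfileN] sections carrying Path and IsRelative keys. The function name thunderbird_profile_dirs and the base_dir parameter are invented for this example; where the mbox files sit below each profile directory still has to be checked on a real installation.

```python
import configparser
import os


def thunderbird_profile_dirs(ini_text, base_dir):
    """Return the profile directories listed in a profiles.ini document.

    base_dir is the directory containing profiles.ini; relative profile
    paths are resolved against it.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    dirs = []
    for section in parser.sections():
        if not section.startswith("Profile"):
            continue  # skip [General] and anything else
        path = parser.get(section, "Path", fallback=None)
        if path is None:
            continue
        # Assumption: a missing IsRelative key means the path is relative.
        if parser.get(section, "IsRelative", fallback="1") == "1":
            path = os.path.join(base_dir, path)
        dirs.append(path)
    return dirs
```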



Note: this patch isn't intended for real use...

that's okay - I expect we should have an -evolution parameter to trackerd to indicate whether to index Evolution stuff (I will add this later)

I also noticed no checks are done in tracker-mbox-evolution to determine if an email is marked as junk or deleted. The X-Evolution entry has bits at the end to determine status (flagged, replied, seen, deleted, junk) - you may have to experiment with deleting/marking as junk to work out the exact flags used.
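
As a starting point for that experiment, a sketch in Python of how the check could look. It assumes the header value has the form "<hex uid>-<hex flags>" and that the bits follow Camel's message flag order; the bit values in FLAG_BITS are assumptions to be confirmed against a real mailbox, and parse_x_evolution / should_index are names made up for this example.

```python
# Assumed bit values (to be verified by deleting/junking test mails).
FLAG_BITS = {
    "replied": 0x01,
    "deleted": 0x02,
    "draft": 0x04,
    "flagged": 0x08,
    "seen": 0x10,
    "junk": 0x80,
}


def parse_x_evolution(value):
    """Return (uid, set of flag names) for an X-Evolution header value.

    Raises ValueError if the value does not look like "<hex>-<hex>".
    """
    # Ignore anything after a ';' in case extra fields follow the flags.
    main = value.strip().split(";", 1)[0]
    uid_hex, _, flags_hex = main.partition("-")
    flags = int(flags_hex, 16)
    names = {name for name, bit in FLAG_BITS.items() if flags & bit}
    return int(uid_hex, 16), names


def should_index(value):
    """Skip messages marked as deleted or junk."""
    _, names = parse_x_evolution(value)
    return not (names & {"deleted", "junk"})
```

Messages without an X-Evolution header would need a separate decision; this sketch only covers the case where the header is present.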

For Thunderbird, I believe it uses X-Mozilla-Status, but with different flags from Evolution.

this stuff can wait I suppose - I will need a day or two to go through your patch although it looks good at first glance.

Thanks for your time on this.

--
Mr Jamie McCracken
http://jamiemcc.livejournal.com/



