Re: [Tracker] Tracker to do list



Laurent Aguerreche wrote:

2) As (1) but parse only new mails (given a file offset of the last known email). All new mails are always appended to an mbox file.
I added a tracker_add_watch_file() to be called on each mbox file.
mbox files can be dynamically added (eg in thunderbird or evo you can create new vfolders with their own mbox file) so the directory must be watched

Ok...

I plan to add mbox as watched files (or directories for vfolders I
think)

all the email clients allow you to create new mbox files so directory watching is probably essential to pick these up

but I wonder if tracker_create_file_info() should be modified to
let the programmer set info->file_type to FILE_EMAILS directly, or
whether it should be set right after the call.
To find out whether a file is an mbox, I will use a list of mboxes (or a
hash table?) and check against it in process_event() for inotify.

Then, extract_metadata_thread() will identify the file as e-mail and will
treat it accordingly.

Any comments?


I recommend the following:

1) In the global Tracker struct add a GSList for email sources. The sources should be a struct with directory of mbox files and type (evo, kmail etc)

2) When inotify/fam receives any file change event, we check it against those sources during the process files thread (check the file's prefix against the email source directories) and, if it is an mbox file, we call a new function index_mails (instead of index_file in process_files_thread).

3) index_mails will (if the mbox size has increased) need to get the last known offset for the mbox file from the DB (I need to create separate tables for emails as well as modify the stored procedures) and parse all new messages since that point. Your mbox functions should include a parse_from_offset call and a parse_next call. parse_next will return NULL when there are no further emails to process.
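
Steps 1 and 2 could be sketched roughly as below. This is only an illustration: EmailSource, EmailClientType and email_source_find are hypothetical names, and a plain linked list stands in for the GSList so the snippet does not depend on GLib.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical client types; the thread only says "evo, kmail etc". */
typedef enum {
        EMAIL_CLIENT_EVOLUTION,
        EMAIL_CLIENT_KMAIL,
        EMAIL_CLIENT_THUNDERBIRD
} EmailClientType;

/* One email source: a directory of mbox files plus the client type.
 * In Tracker this would live in a GSList in the global struct. */
typedef struct EmailSource {
        const char         *dir;    /* directory path, no trailing slash */
        EmailClientType     type;
        struct EmailSource *next;
} EmailSource;

/* Return the first source whose directory is a prefix of uri, or NULL. */
const EmailSource *
email_source_find (const EmailSource *sources, const char *uri)
{
        const EmailSource *s;

        assert (uri != NULL);

        for (s = sources; s != NULL; s = s->next) {
                size_t len = strlen (s->dir);

                /* Require '/' after the prefix so "/Mail" does not match "/Mailbox". */
                if (strncmp (uri, s->dir, len) == 0 && uri[len] == '/') {
                        return s;
                }
        }

        return NULL;
}
```

The event handler would call email_source_find() with the changed file's path and hand the file to index_mails only when the result is not NULL.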


so the index_mails code should look something like:

MailBox *mb;
MailMessage *msg;

mb = tracker_mbox_parse_from_offset (uri, offset);

/* parse_next returns NULL once there are no more emails */
while ((msg = tracker_mbox_parse_next (mb)) != NULL) {

        tracker_db_save_email (msg);
}



The MailBox struct would need to encapsulate the GMime stuff and also keep track of the offset of the next email to be read
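
A minimal sketch of the offset tracking, assuming plain stdio and a scan for "From " separator lines instead of GMime. The names mbox_open_at, mbox_next and mbox_close are hypothetical, and a real version would hand each message to the GMime parser rather than just locating it:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
        FILE *fp;
        long  next_offset;      /* where the next unread email starts */
} MailBox;

/* Open an mbox file and position it at the last known offset. */
MailBox *
mbox_open_at (const char *path, long offset)
{
        MailBox *mb;
        FILE    *fp = fopen (path, "rb");

        if (fp == NULL) {
                return NULL;
        }

        mb = malloc (sizeof (MailBox));
        if (mb == NULL || fseek (fp, offset, SEEK_SET) != 0) {
                free (mb);
                fclose (fp);
                return NULL;
        }

        mb->fp = fp;
        mb->next_offset = offset;
        return mb;
}

/* Skip over one email. Stores its start offset in *start and returns 1,
 * or returns 0 when there are no further emails. */
int
mbox_next (MailBox *mb, long *start)
{
        char line[1024];
        int  in_message = 0;
        int  at_line_start = 1;

        assert (mb != NULL && start != NULL);

        if (fseek (mb->fp, mb->next_offset, SEEK_SET) != 0) {
                return 0;
        }

        for (;;) {
                long pos = ftell (mb->fp);

                if (fgets (line, sizeof line, mb->fp) == NULL) {
                        break;
                }

                /* A "From " line at the start of a line separates emails. */
                if (at_line_start && strncmp (line, "From ", 5) == 0) {
                        if (in_message) {
                                mb->next_offset = pos;
                                return 1;
                        }
                        *start = pos;
                        in_message = 1;
                }

                /* fgets may return part of a long line; track real line starts. */
                at_line_start = (strchr (line, '\n') != NULL);
        }

        if (in_message) {
                mb->next_offset = ftell (mb->fp);   /* end of file */
                return 1;
        }

        return 0;
}

void
mbox_close (MailBox *mb)
{
        if (mb != NULL) {
                fclose (mb->fp);
                free (mb);
        }
}
```

After the last call to mbox_next(), next_offset holds the end of the file, which is the value that would be stored in the DB as the last known offset.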

MailMessage struct should contain all the metadata for one email

typedef struct {
        char    *mbox_uri;
        guint64 offset;         /* start address of the email */
        char    *message_id;
        char    **references;   /* array of message_ids */
        char    *reply_to_id;   /* message_id of email that it replies to */
        long    date;
        char    *mail_from;
        char    *mail_to;
        char    *mail_cc;
        char    *subject;
        char    *content_type;  /* eg text/plain or text/html etc */
        char    *body;
        GSList  *attachments;   /* names of all attachments */

} MailMessage;


to index attachments we will need a function:

char * tracker_mbox_index_attachment (MailMessage *msg, const char *attachment_name);

this should check the MIME type of the attachment and, if it is text or a document, extract it to a tmp directory and copy the code from index_files (but ignore the tmp path!) to index it.
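
The MIME check could look roughly like this. It is a sketch only: attachment_is_indexable is a hypothetical helper, and the list of document types is an assumption chosen for illustration, not something Tracker defines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if an attachment with this MIME type is worth extracting
 * and indexing, 0 otherwise. The comparison is case-sensitive, so the
 * caller is assumed to pass a lower-cased type. */
int
attachment_is_indexable (const char *mime)
{
        /* Assumed examples of "document" types; extend as needed. */
        static const char *document_types[] = {
                "application/pdf",
                "application/msword",
                "application/vnd.oasis.opendocument.text"
        };
        size_t i;

        assert (mime != NULL);

        /* Any text/... type counts as text. */
        if (strncmp (mime, "text/", 5) == 0) {
                return 1;
        }

        for (i = 0; i < sizeof document_types / sizeof document_types[0]; i++) {
                if (strcmp (mime, document_types[i]) == 0) {
                        return 1;
                }
        }

        return 0;
}
```

Only attachments that pass this check would be written to the tmp directory, so images and other binary parts are skipped before any extraction work is done.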


--
Mr Jamie McCracken
http://jamiemcc.livejournal.com/



