Re: [Tracker] Tracker to do list



On Thursday 14 September 2006 at 20:46 +0100, Jamie McCracken wrote:
Laurent Aguerreche wrote:
On Thursday 07 September 2006 at 17:23 +0200, Laurent Aguerreche wrote:
On Thursday 07 September 2006 at 12:48 +0100, Jamie McCracken wrote:
I'm posting some to-do items in case any of you have some spare time 
and want to use it hacking on tracker to help speed up development :)
...

C programming:

To pave the way for email indexing we will need mail/mbox handling 
utilities.

I suggest using GMime.
More info at http://spruce.sourceforge.net/gmime/ and a tutorial at 
http://spruce.sourceforge.net/gmime/tutorial/

We will need utility functions to:


1) Parse an entire mbox file, extracting the message ID and all other 
fields into a GHashTable.

Where should I place a function (or a new program?) to extract e-mail
information and content?

Please create a new file, tracker-mbox.c.

We will probably need tracker-mbox-evolution and tracker-mbox-thunderbird 
as well for their specifics.


Is trackerd currently capable of detecting that a file is an mbox, or do
I have to add code for that somewhere?

No, it can't. Evolution, Thunderbird and KMail all store their mbox 
folders in certain locations and with certain names/extensions, so it's 
just a matter of watching those directories (recursively).

You will need functions for watching each type (WatchEvolution, 
WatchThunderbird, etc.)


2) As (1) but parse only new mails (given a file offset of the last 
known email). All new mails are always appended to an mbox file.
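
A minimal sketch of item (2), in plain C rather than GMime, so it only shows the offset bookkeeping. The function name count_new_mails() is hypothetical and not part of tracker. It assumes the saved offset points at the start of a message, and that body lines beginning with "From " have been escaped by the mail client (as ">From "), which is common for mbox but not guaranteed.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of tracker): counts the messages that
 * start at or after `offset` in the mbox at `path`, and stores the new
 * end-of-file offset in `*new_offset` so the next scan can resume there.
 * Returns -1 if the file cannot be opened or the seek fails. */
static int
count_new_mails (const char *path, long offset, long *new_offset)
{
	FILE *f = fopen (path, "rb");
	char  line[4096];
	int   count = 0;
	int   at_line_start = 1;

	if (!f)
		return -1;

	if (fseek (f, offset, SEEK_SET) != 0) {
		fclose (f);
		return -1;
	}

	while (fgets (line, sizeof line, f)) {
		/* Each mbox message begins with a line starting with "From " */
		if (at_line_start && strncmp (line, "From ", 5) == 0)
			count++;

		/* fgets may return a long line in several pieces; only the
		 * piece that ends in '\n' finishes the line */
		at_line_start = (strchr (line, '\n') != NULL);
	}

	*new_offset = ftell (f);
	fclose (f);

	return count;
}
```

The caller would keep the returned offset per mbox file and pass it back in on the next change notification; the real version would hand each new message to the parser instead of just counting it.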

I added a tracker_add_watch_file() to be called on each mbox file.

mbox files can be added dynamically (e.g. in Thunderbird or Evo you can 
create new vfolders with their own mbox file), so the directory must be 
watched.

Ok...

I plan to add mboxes as watched files (or directories for vfolders, I
think), but I wonder whether tracker_create_file_info() should be modified
to let the programmer set info->file_type to FILE_EMAILS directly, or
whether it should be set right after the call.
To find out whether a file is an mbox, I will use a list of mboxes (or a
hash table?) and check it in process_event() for inotify.
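
A rough sketch of the list-based lookup, in plain C with a fixed-size array. The names register_mbox() and is_known_mbox() are hypothetical, and how process_event() would call them is an assumption. With many mboxes, a GHashTable keyed on the path (g_str_hash / g_str_equal) would avoid the linear scan.

```c
#include <stddef.h>
#include <string.h>

#define MAX_MBOXES 64

/* Paths of the mbox files we were asked to watch. Only the pointers are
 * stored, so the caller must keep the strings alive. */
static const char *known_mboxes[MAX_MBOXES];
static size_t      n_known_mboxes = 0;

/* Hypothetical: remember `path` as an mbox. Returns 0 on success,
 * -1 if the list is full. */
static int
register_mbox (const char *path)
{
	if (n_known_mboxes >= MAX_MBOXES)
		return -1;

	known_mboxes[n_known_mboxes++] = path;
	return 0;
}

/* Hypothetical: returns 1 if `path` was registered as an mbox, else 0.
 * This is the check the event handler would make before tagging the
 * file as e-mail. */
static int
is_known_mbox (const char *path)
{
	size_t i;

	for (i = 0; i < n_known_mboxes; i++) {
		if (strcmp (known_mboxes[i], path) == 0)
			return 1;
	}

	return 0;
}
```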

Then extract_metadata_thread() will identify the file as e-mail and will
treat it accordingly.

Any comments?



3) Work out whether a mail is marked as deleted or junk (Evo and 
Thunderbird use different flags in the email headers to indicate this; 
google for the exact flags).

I'll look at that.
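
A hedged sketch of the flag check for item (3), in plain C. It looks up a header by name in a raw header block and tests a bit mask against its hexadecimal value. The function name header_has_flag() is hypothetical. The header name and mask are left as parameters because the exact flags still need to be looked up; in the usage note, "X-Mozilla-Status" with bit 0x0008 meaning "expunged" in Thunderbird is an assumption to verify against the Mozilla sources, not something established in this thread.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the header `name` is present in the
 * newline-separated block `headers` and its value, read as hexadecimal,
 * has at least one bit of `mask` set. Returns 0 otherwise.
 * The name is compared case-sensitively here for simplicity, although
 * mail header names are case-insensitive. */
static int
header_has_flag (const char *headers, const char *name, unsigned long mask)
{
	size_t      name_len = strlen (name);
	const char *p = headers;

	while (p && *p) {
		/* Match "Name:" at the start of a line */
		if (strncmp (p, name, name_len) == 0 && p[name_len] == ':') {
			/* strtoul skips the blank after the colon */
			unsigned long value = strtoul (p + name_len + 1, NULL, 16);
			return (value & mask) != 0;
		}

		/* Move to the start of the next line, if any */
		p = strchr (p, '\n');
		if (p)
			p++;
	}

	return 0;
}
```

Under the assumption above, a deleted Thunderbird mail would be detected with header_has_flag (headers, "X-Mozilla-Status", 0x0008); the Evolution equivalent would use its own header name and mask.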

4) Extract plain text (we already have an HTML filter in tracker for HTML).

5) Extract and decode MIME attachments.

All the above should be easy to implement using GMime.

I began testing GMime only two days ago... Sorry.

But I'm already able to extract info from emails. So now I think that
I will have problems only with trackerd. :-)

great stuff

Thanks


Laurent.


