Generic Maildir backend

Have a maildir directory on your disk that beagle does not currently index?
If not (or "maildir what?!"), you may skip this email. If yes, read on ...

A moderately common request is for a generic maildir backend in beagle. 
Some users use a non-kmail mail client (though I don't know why) that stores 
its emails in maildir directories. Others have a local imap server that 
stores the emails in maildir format in some directory. And beagle does not 
index them. 

Actually, beagle probably already indexes those files. What happens is that 
mail clients sometimes write different headers to the maildir files, which 
causes our (xdgmime based) mimetype detection to fail. As a result, beagle 
recognizes those files as text/plain and uses the text filter, instead of 
recognizing them as message/rfc822 and using the mail filter (which can 
handle any mail message).

Check out svn trunk (rev3962).

Then, add the maildir directory to the beagle config.
$ beagle-config indexing AddMaildir /path/to/maildir/directory extension
   => this tells beagle that all files with extension 'extension' under the 
directory '/path/to/maildir/directory' (and its subdirectories) are mail 
files. The extension can be a wildcard, i.e. it can be "*" to denote any 
extension.

$ beagle-config indexing ListMaildirs
   => lists the currently added maildirs; verify that your maildir is listed

(to remove an existing entry)
$ beagle-config indexing DelMaildir /path/to/maildir/directory extension

You can also verify using $ beagle-info --list-filters. It should list the 
directory under FilterMail. Now beagle will consider all files (recursively) 
under the directory matching extension 'extension' as mail files. You can use 
beagle-extract-content to test any maildir file in your maildir directory.
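
To make the matching rule concrete, here is a rough sketch in Python (not 
beagle's actual code, which I am not quoting here). The function name 
matches_maildir and the exact comparison rule are assumptions made for 
illustration: a file is treated as mail if it lives anywhere under a 
configured directory and either the configured extension is "*" or the 
file's suffix equals it.

```python
from pathlib import PurePosixPath


def matches_maildir(path, maildirs):
    """Return True if `path` falls under a configured maildir entry.

    `maildirs` is a list of (directory, extension) pairs, mirroring the
    two arguments passed to AddMaildir. The matching rule itself is an
    assumption for illustration, not beagle's implementation.
    """
    p = PurePosixPath(path)
    for directory, extension in maildirs:
        root = PurePosixPath(directory)
        # The file must live under the directory, at any depth.
        if root not in p.parents:
            continue
        # "*" is taken to mean "any file"; otherwise compare the suffix.
        if extension == "*" or p.suffix == "." + extension.lstrip("."):
            return True
    return False


entries = [("/home/u/Maildir", "*")]
print(matches_maildir("/home/u/Maildir/cur/1190000000.M1P2.host:2,S", entries))
```

Typical maildir file names have no conventional extension, which is why "*" 
is probably the value most people will want.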

Caveats (IMPORTANT):
- The maildir setting takes precedence over any kind of mimetype based 
detection. Currently, beagle derives the mimetype from (1) the data source, 
(2) the extended attribute user.mime_type, (3) detection using xdgmime, in 
that order.

- For client writers, hit.mime_type refers to the mimetype that beagle 
derived. So, some files will now be detected as text/plain and yet indexed as 
mail files. Therefore (hit.mime_type=="message/rfc822") should NOT be used to 
check for email hits, instead use (hit["beagle:FileType"]=="mail" or ). Also, 
since these files are found on disk, hit.Source will be "Files".
    (BTW, hit["beagle:FileType"] can remove a lot of mimetype based 
categorization in clients)

- If there are non-mail files in those directories, and they cannot be 
distinguished using an extension, then they will also be indexed using the 
mail filter (well, if they are not mail files, then the mail filter will 
probably not be able to index them, so this is not so bad).

- The files will be indexed when the file backend's crawler finds them. That 
means, if some files are in hidden subdirectories, this approach will not 
work.
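
The first two caveats can be sketched roughly as follows. This is Python 
pseudo-client code, not beagle's API: the function names derive_mime_type, 
choose_filter and is_mail_hit are made up for illustration, and a hit is 
modelled as a plain dictionary. Only the property name "beagle:FileType", 
the value "mail" and the derivation order come from the text above.

```python
def derive_mime_type(source_type=None, xattr_type=None, sniffed_type=None):
    """Return the first available mimetype, in the order given above:
    data source, then user.mime_type xattr, then xdgmime detection."""
    for candidate in (source_type, xattr_type, sniffed_type):
        if candidate:
            return candidate
    return None


def choose_filter(in_maildir, mime_type):
    """Pick a filter name; the maildir setting wins over the mimetype."""
    if in_maildir or mime_type == "message/rfc822":
        return "mail"
    if mime_type == "text/plain":
        return "text"
    return None


def is_mail_hit(hit):
    """Client-side check: look at the file type, not the mimetype."""
    return hit.get("beagle:FileType") == "mail"


# A maildir file whose headers confuse mimetype detection (hypothetical hit).
hit = {"mime_type": "text/plain", "beagle:FileType": "mail", "Source": "Files"}
print(choose_filter(True, hit["mime_type"]))      # maildir setting wins
print(hit["mime_type"] == "message/rfc822")       # the check to avoid
print(is_mail_hit(hit))                           # the check to use
```

The point is only that the mimetype and the file type can now disagree, so a 
client that keys on the mimetype will miss these hits.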

Finally, writing a maildir backend is not difficult. If the above does not 
meet your needs, then you should consider writing one. Feel free to ask me 
for help.

- dBera

Debajyoti Bera
beagle / KDE fan
Mandriva / Inspiron-1100 user
