Re: Extended attributes lastCrawlAttr



On Tuesday 02 Nov 2004 15:30, Jon Trowbridge wrote:
> On Tue, 2004-11-02 at 14:57 +0000, Julian Satchell wrote:
> > My indexer design did not store the crawl time, but the last modified
> > time; this is available from most file systems. I only re-read the file
> > when the last modified time is later than the entry in the index.
>
> Just to clarify: for the most part, what Beagle is storing in the EA is
> the mtime, not the crawl time.
>

This seems rather strange: why store the mtime in the EA when the filesystem
already records it?

> Crawl times (in addition to the usual mtimes) are attached to
> directories.  These times are used to choose the order in which beagle
> re-crawls: never-before-crawled and less-recently-crawled directories
> get crawled first.
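
For illustration, the ordering described in the quoted text could be
sketched roughly as below. This is not Beagle's actual code; the function
name and the dict of crawl times (None meaning "never crawled") are
assumptions made up for the example.

```python
from typing import Dict, List, Optional


def crawl_order(last_crawled: Dict[str, Optional[float]]) -> List[str]:
    """Order directories for re-crawling.

    `last_crawled` maps a directory path to its last crawl time in seconds,
    or to None if it has never been crawled (a hypothetical layout).
    """
    def sort_key(path: str):
        crawl_time = last_crawled[path]
        # (0, ...) sorts before (1, ...), so never-crawled directories come
        # first; the rest follow with the least recently crawled first.
        return (0, 0.0) if crawl_time is None else (1, crawl_time)

    return sorted(last_crawled, key=sort_key)
```

Called with {"/a": 200.0, "/b": None, "/c": 100.0}, this puts /b first,
then /c, then /a.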

I think that the freedesktop.org thumbnail standard 
(http://triq.net/~jens/thumbnail-spec/index.html) shows the way. Thumbnails 
could be seen as meta-data about an image. The freedesktop.org standard 
defines a common directory to store thumbnails in. Why not define a
common dot-file for storing the crawler metadata about the files in a
directory? It would mean that the meta-data does not follow a file when it is
moved from one directory to another, but at least the meta-data would survive
tar or rdiff backups.
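
A rough sketch of what I have in mind, assuming a JSON dot-file that maps
each file name to the mtime seen at the last crawl. The file name
".crawler-metadata", the JSON layout and the helper names are all invented
for this example; none of them comes from Beagle or from any standard.

```python
import json
import os

# Hypothetical name for the per-directory metadata file.
INDEX_NAME = ".crawler-metadata"


def load_index(directory):
    """Read the directory's metadata file; return {} if it does not exist."""
    path = os.path.join(directory, INDEX_NAME)
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def save_index(directory, index):
    """Write the name -> mtime mapping back into the directory."""
    path = os.path.join(directory, INDEX_NAME)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(index, f)


def record(directory, index, name):
    """Remember the current mtime of `name` after it has been indexed."""
    index[name] = os.stat(os.path.join(directory, name)).st_mtime


def stale_files(directory, index):
    """Return names of regular files whose mtime differs from the record."""
    stale = []
    for entry in os.scandir(directory):
        if not entry.is_file() or entry.name == INDEX_NAME:
            continue
        if index.get(entry.name) != os.stat(entry.path).st_mtime:
            stale.append(entry.name)
    return sorted(stale)
```

A crawler would call load_index() and stale_files() on each directory,
re-index whatever is reported, then call record() and save_index(). Because
the record is an ordinary file, it is archived along with everything else in
the directory.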

Obviously an alternative strategy would be needed for read-only filesystems.

EAs are very trendy at present, but their implementation in the filestore
does not suggest that they are ready to be used. They are not well supported
by backup tools or other filesystem tools, and they are not portable between
all filesystems. They are going to be very confusing for many users.

Regards

Richard


-- 
You can normally find me on Jabber as RichardTaylor jabber org


