Hi,

Martin Soto wrote:
It works generally well except for a single problem: it seems to feel like indexing my home directory forever.
A likely possibility is that Beagle's indexer process is crashing in the middle of indexing a specific file. When this happens, the main Beagle daemon notices that the indexer has gone away and spawns a new one. Since the last one crashed, it left behind lock files, and the new process ends up purging the index. This is a nasty but unfortunately unfixable circumstance.
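To make that sequence concrete, here is a rough sketch in Python. It is not Beagle's actual code: the lock file name, the flat index directory, and the function names are all made up for illustration. It only shows why a lock left behind by a crashed process leads the next one to throw the index away.

```python
import os

LOCK_NAME = "write.lock"  # hypothetical lock file name, not Beagle's real one


def open_index(index_dir):
    """Take the lock for this run; return True if the index had to be purged."""
    lock_path = os.path.join(index_dir, LOCK_NAME)
    purged = False
    if os.path.exists(lock_path):
        # The previous process never released its lock, so it must have
        # died mid-run. Its half-written data cannot be trusted: start over.
        for name in os.listdir(index_dir):
            path = os.path.join(index_dir, name)
            if os.path.isfile(path):
                os.remove(path)
        purged = True
    with open(lock_path, "w") as lock:
        lock.write(str(os.getpid()))
    return purged


def close_index(index_dir):
    """Clean shutdown: release the lock so the next run keeps the index."""
    os.remove(os.path.join(index_dir, LOCK_NAME))
```

A run that ends with close_index() leaves the index alone on the next start; a run that dies before reaching it causes a purge.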
Of course, I've tried to check the logs but they don't say that much to me, and I haven't been able to find any guide that explains how to use them. As far as my interpretation goes, beagle-index-helper crashes (maybe it chokes on some file but, if that's the case, I can't tell which one it is) and when restarting, it decides to purge the index because it finds a dangling lock. The first lines of the IndexHelper log after such a restart look like this:
Take a look at the *end* of the previous IndexHelper log. That's the one more likely to have crashed and caused a problem. Normally there should be some line about exiting in there.
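If it helps, something like the following Python sketch will pull out the tail of the second-newest log. The file name pattern and the directory you pass in are assumptions on my part, so adjust them to match how the logs are actually named on your system.

```python
import glob
import os


def tail_previous_log(log_dir, pattern="*IndexHelper*", n=20):
    """Return the last n lines of the second-newest log matching pattern.

    The default pattern is a guess at the naming scheme, not a documented one.
    """
    logs = sorted(glob.glob(os.path.join(log_dir, pattern)),
                  key=os.path.getmtime)
    if len(logs) < 2:
        return []  # there is no "previous" log to inspect
    previous = logs[-2]  # the newest log belongs to the restarted process
    with open(previous, errors="replace") as f:
        return f.read().splitlines()[-n:]
```

If the lines it returns stop abruptly, with nothing about exiting, the last file mentioned there is a good suspect.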
By the way, I really don't know, but is Lucene so lacking in robustness that you have to completely erase an index that took days to build just because a process crashed while accessing it?
If we're in the middle of writing out to the index and the process crashes, there's no way we can guarantee the consistency of the data in the index. I don't know if this could be considered a real weakness in Lucene; I don't think it's unreasonable for it to expect valid data to be written out to it.
Joe