Re: Help! Your beagle likes to eat my file index!



Hi,

I'm experiencing a similar problem.

On Tue, 2006-08-29 at 15:13 -0400, Joe Shaw wrote:
> Martin Soto wrote:
> > It works generally well except for a single problem: it seems to feel
> > like indexing my home directory forever. 
> 
> A likely possibility is that Beagle's indexer process is crashing in the 
> middle of indexing a specific file.  When this happens, the main Beagle 
> daemon notices that the indexer has gone away and spawns a new one. 
> Since the last one crashed, it left behind lock files, and the new 
> process ends up purging the index.  This is a nasty but unfortunately 
> unfixable circumstance.

On my system, the indexer does not crash but hangs: it runs at 100% CPU
on some file without ever completing.

If this happens, beagle-shutdown does not work. The helper has to be
killed with:

    killall -9 beagled-helper

If you shut down the system, I suspect the beagle process gets killed in
a similar way.

So in both situations (killing the process manually or shutting down the
system) the FileIndex is destroyed and beagle starts reindexing the next
time it is run. As this happens regularly, it makes beagle useless for
finding files (at least for me).

[For finding mail, it works great and I use it a lot. So it would be
great if we could fix the above issue.]

> > Of course, I've tried to check the logs but they don't say that much to
> > me, and I haven't been able to find any guide that explains how to use
> > them. As far as my interpretation goes, beagle-index-helper crashes
> > (maybe it chokes on some file but, if that's the case, I can't tell
> > which one it is) and when restarting, it decides to purge the index
> > because it finds a dangling lock. The first lines of the IndexHelper log
> > after such a restart look like this:
> 
> Take a look at the *end* of the previous IndexHelper log.  That's the 
> one more likely to have crashed and caused a problem.  Normally there 
> should some line about exiting in there.

Here is what my (currently stuck) index helper log says at the end,
after indexing 130937 files:

060901 0138450102 22495 IndexH DEBUG: +file:///home/hkunz/Desktop/septumania/scn/matlab/ampperld.m
060901 0205206770 22495 IndexH DEBUG: +file:///home/hkunz/Desktop/septumania/scn/matlab/dosplitn.m
060901 0248304832 22495 IndexH DEBUG: +file:///home/hkunz/Desktop/septumania/scn/matlab/proto.m
060901 0313406569 22495 IndexH DEBUG: +file:///home/hkunz/Desktop/septumania/scn/matlab/rsamp.m
060901 0447078848 22495 IndexH DEBUG: +file:///home/hkunz/Desktop/septumania/scn/matlab/figsplit.m
060901 0910377101 22495 IndexH DEBUG: +file:///home/hkunz/Desktop/septumania/scn/matlab/prcrs.m

So, there is no problem reported here.
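
Since the log does not say which file the helper is currently busy with,
one thing that might help is asking the kernel what the process has open.
The following is only a minimal sketch, assuming Linux with /proc mounted
and assuming the helper keeps the file open while it works on it (I have
not confirmed that Beagle does). The function name open_files and the
prefix parameter are made up for illustration.

```python
import os

def open_files(pid, prefix="/"):
    """Return the paths under `prefix` that process `pid` currently has open.

    Reads the symlinks in /proc/<pid>/fd (Linux only). Requires permission
    to inspect the process, i.e. same user or root.
    """
    fd_dir = f"/proc/{pid}/fd"
    paths = set()
    for fd in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, fd))
        except OSError:
            continue  # descriptor was closed while we were looking
        # Sockets and pipes appear as "socket:[...]" or "pipe:[...]";
        # filtering on a path prefix drops them.
        if target.startswith(prefix):
            paths.add(target)
    return sorted(paths)
```

Calling open_files(pid, prefix="/home/hkunz") with the PID of the stuck
helper would then list candidate files from my home directory. Index
files kept under the home directory would show up too, so the output
still needs a manual look.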

Because I started beagled in the foreground (--fg), I could see that it
took beagle a long time to index those files (on the order of minutes),
although the files are only 0.5-2 kilobytes in size.
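
To put numbers on this, the gaps between consecutive log entries can be
computed from the log itself. This is a rough sketch, assuming each entry
has the layout "YYMMDD HHMMSSffff PID COMPONENT LEVEL: message" (my
reading of the lines above, not something I have checked in the Beagle
source). The names parse and gaps are made up for illustration. I don't
know whether a "+file" line is written before or after the file is
processed, so each gap is reported with both neighbouring messages.

```python
import re
from datetime import datetime

# Assumed layout: "YYMMDD HHMMSSffff PID COMPONENT LEVEL: message"
LINE = re.compile(r"^(\d{6}) (\d{6})(\d{4}) (\d+) (\S+) (\w+):\s*(.*)$")

def parse(lines):
    """Return a list of (timestamp, message) pairs.

    Lines without a timestamp are treated as a continuation of the
    previous entry's message (e.g. a wrapped "+file://..." line).
    The sub-second part of the timestamp is ignored.
    """
    entries = []
    for raw in lines:
        line = raw.rstrip("\n")
        m = LINE.match(line)
        if m:
            date, hms, _frac, _pid, _comp, _level, msg = m.groups()
            ts = datetime.strptime(date + hms, "%y%m%d%H%M%S")
            entries.append([ts, msg])
        elif entries and line.strip():
            entries[-1][1] = (entries[-1][1] + " " + line.strip()).strip()
    return [(ts, msg) for ts, msg in entries]

def gaps(entries):
    """Return (seconds, earlier message, later message), longest gap first."""
    out = []
    for (ts, msg), (next_ts, next_msg) in zip(entries, entries[1:]):
        out.append(((next_ts - ts).total_seconds(), msg, next_msg))
    return sorted(out, reverse=True)
```

Feeding the IndexHelper log through gaps(parse(open(path))), where path
is wherever the log file lives, puts the slowest stretches at the top of
the list.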

Could it be that my index got too large for beagle to manage it in an
efficient way (so that adding a new file would take an awful lot of
time)?

> > By the way, I really don't know, but is Lucene so lacking in robustness
> > that you have to completely erase an index that took days to build just
> > because a process crashed while accessing it?
> 
> If we're in the middle of writing out to the index and the process 
> crashes, there's no way we can guarantee the consistency of the data in 
> the index.  I don't know if this could be considered a real weakness in 
> Lucene; I don't think it's unreasonable for it to expect valid data to 
> be written out to it.

I see two possibilities here:
* Either there should be a way of terminating a stuck beagled-helper
process that doesn't make it necessary to rebuild the whole index.
* Or we fix whatever causes the helper process to get stuck in the
first place.

If you need any further information, I'm happy to provide it.

cheers,
Hp.

BTW: I'm running beagle-0.2.8 on debian sid.



