Re: beagles eating my /var filesystem



> Either there is some error in beagle-manage-index which could be
> revealed if you remove the redirection to /dev/null (along with
> --enable-deletion) or the index _is_ legitimate. Can you rerun with
> the redirection removed and attach the log (It should not be too large
> since nothing would have changed in the documentation directories).
> 520 MB index data looks a bit large though. But it is for 37090 files
> and it is not entirely improbable since the documentation files are
> mostly text data. Do you have a lot of files that are supposed to be
> indexed in the documentation index?

I ran the same on my machine (with 0.2.18): there is 285MB in the 
documentation directories specified in crawl-documentation, and beagle made 
an index of size 27MB.
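
For anyone who wants to repeat this comparison, here is a rough sketch in 
Python. It is not part of beagle; the function names are made up for 
illustration, and the paths in the usage comment are assumptions that should 
be replaced with whatever your crawl-documentation config actually lists. It 
just adds up file sizes under the index directory and under the source 
directories and reports the ratio (on my machine that is roughly 27MB / 285MB, 
a bit under 10%).

```python
import os


def dir_stats(root):
    """Return (total_bytes, file_count) for regular files under root."""
    total = 0
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            total += os.path.getsize(path)
            count += 1
    return total, count


def index_ratio(index_dir, source_dirs):
    """Compare the on-disk size of an index with the data it covers.

    Returns (index_bytes, source_bytes, source_files, ratio), where
    ratio = index_bytes / source_bytes.
    """
    index_bytes, _ = dir_stats(index_dir)
    source_bytes = 0
    source_files = 0
    for directory in source_dirs:
        size, count = dir_stats(directory)
        source_bytes += size
        source_files += count
    ratio = index_bytes / source_bytes if source_bytes else float("inf")
    return index_bytes, source_bytes, source_files, ratio


# Example usage (both paths are assumptions, adjust to your setup):
#   stats = index_ratio("/var/cache/beagle/indexes/documentation",
#                       ["/usr/share/doc"])
#   print("index=%d bytes, source=%d bytes in %d files, ratio=%.2f" % stats)
```

A ratio well above 1, i.e. an index larger than the data it covers, would 
point towards stale data piling up in the index rather than a legitimately 
large one.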

I did some "investigation" (aka google search) and it looks like there is 
some error in our interaction with Lucene. For some reason, either old files 
are not getting deleted, or optimization (which automatically happens at the 
end of indexing) is failing silently! I would suggest deleting the 
documentation/ directory and re-creating the index. In the meantime, I am 
digging into Lucene to see what could have caused this and how it can be 
prevented.

> Always: Starting beagle-build-index (pid 27679) at 28/10/2007 4:11:01 PM
> Debug: Set best effort IO priority to lowest level (7)
> Debug: Reniced process to 19
> Debug: Loaded 284 records from /var/cache/beagle/indexes/applications/FileAttributesStore.db in 0.004s
> Debug: Starting IndexWorker
> Debug: Size: VmRSS=11.9 MB, size=1.00, 0.0%
> Debug: Flushing driver, 30 items in queue
> Debug: -file:///usr/share/applications/screensavers/distort.desktop
> Debug: -file:///usr/share/applications/screensavers/galaxy.desktop
...

This run looks fine.

> It is probably worth noting that I always run the Ubuntu development
> version on the machine so package churn can be quite huge.  Is garbage
> collection happening?  i.e. when a documentation file disappears because
> the package is upgraded/removed, will the beagle index items be
> cleaned out?

That's what the --enable-deletion switch is for. It should be added to the 
crawl scripts.

- dBera

-- 
-----------------------------------------------------
Debajyoti Bera @ http://dtecht.blogspot.com
beagle / KDE fan
Mandriva / Inspiron-1100 user

