Re: Querying expense and weak hashtables
- From: D Bera <dbera web gmail com>
- To: Daniel Drake <dsd gentoo org>
- Cc: dashboard-hackers gnome org
- Subject: Re: Querying expense and weak hashtables
- Date: Sat, 26 Nov 2005 12:08:28 -0500
> I've noted in the past that querying a lot is one way to bump up beagled's
> memory usage fairly easily.
>
> There's a discussion going on at the dotLucene forums about a fairly serious
> memory problem in Lucene's field caching:
>
> http://sourceforge.net/forum/forum.php?thread_id=1378460&forum_id=408004
That discussion has come to an end. They found a few leaks in
the implementation of IndexSearcher and fixed them (without using
WeakHashMap).

Out of curiosity, I patched my local copy with those changes and rebuilt
beagle. The patch may have fixed some memory leaks, but beagle querying
is still quite expensive - by an order of magnitude.

Some observations from querying against the File backend. I started
beagled and gave it a sufficiently large filesystem root to index, then
waited until it finished indexing. I then stopped the daemon and started
it again. For the rest of the time I didn't touch the files in the root,
so the memory growth can be attributed solely to querying.

I fired a query with only a few results - vmsize increased by 20 MB!
Subsequent queries raise the vmsize in the following pattern: if
any of the query results come from a part of the filesystem
tree that has not been reported before, vmsize increases by a few MB.

I suspect this is partly due to caching done when
file/directory paths are retrieved by the beagle-query-driver. The
driver has to map uids from the hits to actual filesystem URIs. This
lookup is fairly expensive, so there is some caching involved. I
have a hunch that this cache grows across queries. One way to
fix this might be to keep a cache per query and clear the
cache when the query is over.
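
To make the per-query cache idea concrete, here is a rough sketch in Python (not beagle's actual C# code). The names UriResolver, query_cache and run_query are all made up for illustration, and the dict-backed resolver only stands in for whatever expensive uid-to-URI lookup the driver really does. The point is just that the cache lives for one query and is emptied when the query finishes, so it cannot grow across queries.

```python
from contextlib import contextmanager


class UriResolver:
    """Hypothetical stand-in for the expensive uid -> URI lookup."""

    def __init__(self, table):
        self._table = table
        self.lookups = 0  # number of expensive lookups performed

    def resolve(self, uid):
        self.lookups += 1
        return self._table[uid]


@contextmanager
def query_cache(resolver):
    """Yield a caching lookup function whose cache lasts for one query."""
    cache = {}

    def lookup(uid):
        # Only hit the expensive resolver on a cache miss.
        if uid not in cache:
            cache[uid] = resolver.resolve(uid)
        return cache[uid]

    try:
        yield lookup
    finally:
        # Drop everything as soon as the query is over.
        cache.clear()


def run_query(resolver, hit_uids):
    """Map the uids of a query's hits to URIs using a per-query cache."""
    with query_cache(resolver) as lookup:
        return [lookup(uid) for uid in hit_uids]
```

The trade-off is that a uid seen in two different queries is resolved twice, so this exchanges some repeated lookup work for bounded memory.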
- d.