Re: Policy question

Maybe instead of loading the entire set of mappings, you could load only
the most frequently used keywords. That way you reduce the memory
required, and most lookups are still faster than hitting the database.
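To make the idea concrete, here is a minimal sketch of that hybrid scheme: preload the top-N keywords into a dict and fall back to sqlite for everything else. The table name (`keyword_map`) and columns are made up for illustration, not the real indexer schema.

```python
import sqlite3

def build_index(conn):
    # Toy stand-in for the indexer's keyword -> document table.
    conn.execute("CREATE TABLE keyword_map (keyword TEXT, doc TEXT)")
    rows = [("linux", "a.txt"), ("linux", "b.txt"),
            ("gnome", "a.txt"), ("rare", "c.txt")]
    conn.executemany("INSERT INTO keyword_map VALUES (?, ?)", rows)

def preload_top_keywords(conn, n):
    """Cache only the n keywords that map to the most documents."""
    cache = {}
    top = conn.execute(
        "SELECT keyword FROM keyword_map "
        "GROUP BY keyword ORDER BY COUNT(*) DESC LIMIT ?", (n,))
    for (kw,) in top.fetchall():
        docs = conn.execute(
            "SELECT doc FROM keyword_map WHERE keyword = ?", (kw,))
        cache[kw] = [d for (d,) in docs]
    return cache

def lookup(conn, cache, keyword):
    if keyword in cache:               # fast path: never touches sqlite
        return cache[keyword]
    docs = conn.execute(               # slow path: rare keywords hit the db
        "SELECT doc FROM keyword_map WHERE keyword = ?", (keyword,))
    return [d for (d,) in docs]

conn = sqlite3.connect(":memory:")
build_index(conn)
cache = preload_top_keywords(conn, 1)  # only "linux" fits in the cache
hot = lookup(conn, cache, "linux")     # served from memory
cold = lookup(conn, cache, "rare")     # served from sqlite
```

Startup cost and memory then scale with N rather than with the whole corpus, at the price of occasional database hits for rare keywords.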

Julian Satchell wrote:
The sqlite database makes the backend for the document indexer slow for
large numbers of documents.
I can cure the problem by preloading all the data that maps keywords to
documents into memory, so it no longer touches the database. This makes
the backend very fast - it is now the first backend to respond to a query.
The downside is twofold:

1) Startup is slow, and increases linearly with the amount of text that
has been indexed. It can easily be tens of seconds.

2) Memory consumption is increased, and will rise proportionally to the
amount of text indexed. The consumption has been minimised by object
sharing, but still requires one object reference per textual word (which
is the minimum I can imagine). I think this is 4 bytes per keyword on
x86. If you have a lot of big documents, you could easily have many
millions of words (for example, I often write or work with reports that
are 5,000 to 20,000 words long and many people would have hundreds of
documents like this). The memory consumption could easily run to many
tens of megabytes.
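The object sharing Julian describes can be illustrated (this is only an illustration, not Dashboard's actual implementation) with Python string interning: duplicate words collapse to one shared object, so each occurrence costs a single reference rather than a full copy of the text. The same arithmetic reproduces his estimate: 5 million word occurrences at 4 bytes per reference is about 20 MB, i.e. tens of megabytes.

```python
import sys

# Three separate string objects with identical contents.
raw = ("keyword " * 3).split()

# Interning makes all three entries reference one shared object, so the
# per-occurrence cost drops to a single reference.
shared = [sys.intern(w) for w in raw]
all_same_object = shared[0] is shared[1] is shared[2]

# Back-of-envelope cost of the references themselves, using Julian's
# 4 bytes per keyword reference on x86 (hypothetical corpus size):
occurrences = 5_000_000            # e.g. hundreds of 10,000-word docs
megabytes = occurrences * 4 / 1e6  # about 20 MB of references alone
```

So even with perfect sharing of the word objects, the references alone put a floor under the memory cost, which is why the trade-off question matters.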

Is there a policy about performance vs resource consumption trade-offs?


Dashboard-hackers mailing list
Dashboard-hackers@gnome.org

Stephen Caldwell

Touqen Labs
