Re: GSoC Weekly Report

> Making an Sqlite db with just the uri and raw text caused an almost 3x
> increase in the text cache size (3.6 MB (on-disk) vs. almost 15MB in
> my test case). This despite the fact that the size of the raw text was
> only 7.9 MB. I need to figure out why this happens. In the mean time,
> I also implemented another version of this which stores (uri, gzipped
> text) pairs in the Sqlite db instead of (uri, raw text). Surprisingly,
> this actually seems to work very well (the db for the test case
> mentioned shrunk down to 2.6 MB, which is just a little more than the
> actual size of the compressed data itself).

> Current TextCache:
> no-disk-cache: ~1m
> with-disk-cache: ~9s
> New TextCache (raw and gzipped versions had similar numbers):
> no-disk-cache: ~42s
> with-disk-cache: ~10s

The numbers look pretty good. Size on disk is the main focus here. The disk
cache will come into heavy play on a machine constantly serving queries, so
even if the with-disk-cache time suffers a little (but only a little), I think
it's still OK if we gain in other places. The speedup with no-disk-cache is an
added bonus.

Does the performance degrade when looking up small result sets? In the current
implementation, that will involve fewer disk seeks, whereas for the
sqlite-based approach, the I/O overhead will probably be similar.
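
For reference, here is a rough sketch of the (uri, gzipped text) layout under
discussion, which could also be used to time single-uri lookups. It is only an
illustration in Python, not the actual Beagle code; the table name, column
names and helper functions are all made up for the example.

```python
import gzip
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) a cache db. Table and column names are hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS textcache ("
        "uri TEXT PRIMARY KEY, data BLOB NOT NULL)"
    )
    return conn


def put_text(conn, uri, text):
    """Store the text for a uri as a gzip-compressed blob."""
    blob = gzip.compress(text.encode("utf-8"))
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO textcache (uri, data) VALUES (?, ?)",
            (uri, blob),
        )


def get_text(conn, uri):
    """Return the cached text for a uri, or None if it is not cached."""
    row = conn.execute(
        "SELECT data FROM textcache WHERE uri = ?", (uri,)
    ).fetchone()
    if row is None:
        return None
    return gzip.decompress(row[0]).decode("utf-8")
```

With something like this, a small result set is just a handful of get_text()
calls, each of which goes through the primary-key index on uri, so timing a
few of those against a cold disk cache should answer the question above.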

- dBera

Debajyoti Bera
beagle / KDE fan
Mandriva / Inspiron-1100 user
