GSoC Weekly Report



Hello All,
This week I've been working on the new TextCache implementation that
I mentioned last time (replacing the bunch of files with an SQLite
db).

Making an SQLite db with just the uri and raw text roughly quadrupled
the text cache size (3.6 MB on disk vs. almost 15 MB in my test case),
even though the raw text itself was only 7.9 MB. I need to figure out
why this happens. In the meantime, I also implemented another version
which stores (uri, gzipped text) pairs in the SQLite db instead of
(uri, raw text). Surprisingly, this seems to work very well: the db
for the same test case shrank to 2.6 MB, which is just a little more
than the size of the compressed data itself.
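
For anyone who wants to play with the idea outside of Beagle, here is a
minimal sketch of the (uri, gzipped text) scheme in Python. It is not
the actual TextCache code; the table name, column names and function
names are all made up for illustration.

```python
import gzip
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) a cache db. Table and column names are hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS textcache ("
        "uri TEXT PRIMARY KEY, data BLOB NOT NULL)"
    )
    return conn


def put_text(conn, uri, text):
    """Gzip the text and store it as a BLOB keyed by uri."""
    blob = gzip.compress(text.encode("utf-8"))
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO textcache (uri, data) VALUES (?, ?)",
            (uri, blob),
        )


def get_text(conn, uri):
    """Return the decompressed text for uri, or None if it is not cached."""
    row = conn.execute(
        "SELECT data FROM textcache WHERE uri = ?", (uri,)
    ).fetchone()
    if row is None:
        return None
    return gzip.decompress(row[0]).decode("utf-8")
```

Usage is just put_text(conn, uri, text) followed by get_text(conn, uri).
Compressing each entry separately keeps lookups independent of one
another, at the cost of a gzip header per row.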

Performance numbers for a search which returns 1205 results are below.
I ran the measurements twice -- once after flushing the inode, dentry
and page caches, and once taking advantage of the disk caches.

Current TextCache:
no-disk-cache: ~1m
with-disk-cache: ~9s

New TextCache (raw and gzipped versions had similar numbers):
no-disk-cache: ~42s
with-disk-cache: ~10s
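
A rough sketch of how a cold vs. warm measurement like the one above
can be scripted, in Python. This is an assumption about the method, not
the exact procedure used for the numbers here: it drops the caches
through Linux's /proc/sys/vm/drop_caches (which needs root), and
run_search stands in for whatever actually performs the query.

```python
import os
import time


def drop_disk_caches():
    """Flush page cache, dentries and inodes. Linux only, requires root."""
    os.sync()  # write out dirty pages first so they can be dropped
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")


def time_search(run_search, cold=False):
    """Time one call of run_search (a hypothetical zero-argument callable).

    With cold=True the disk caches are dropped before the clock starts.
    Returns (result, elapsed_seconds).
    """
    if cold:
        drop_disk_caches()
    start = time.perf_counter()
    result = run_search()
    elapsed = time.perf_counter() - start
    return result, elapsed
```

Calling time_search(fn, cold=True) and then time_search(fn) gives the
no-disk-cache and with-disk-cache figures for the same query.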

One very important factor remains to be seen -- memory usage. I am
working on measuring the impact of the new code on memory usage.
Numbers should be available soon.

On the Xesam front, I will be updating the code tomorrow or the day
after to reflect the latest changes to the spec.
-- 
Arun Raghavan
(http://nemesis.accosted.net)
v2sw5Chw4+5ln4pr6$OFck2ma4+9u8w3+1!m?l7+9GSCKi056
e6+9i4b8/9HTAen4+5g4/8APa2Xs8r1/2p5-8 hackerkey.com


