Re: [Tracker] indexer-split branch missing lots of important stuff



On Tue, 2008-07-29 at 08:08 +0100, Martyn Russell wrote:
Jamie McCracken wrote:
On Mon, 2008-07-28 at 10:33 +0100, Martyn Russell wrote:
yes best done in daemon

OK, that can easily be added.

3) stop words in tracker-parser.c - why did you remove this?
If you mean tracker->stop_words then it wasn't used as far as I can see.
If you mean tracker-parser.c, it is now in src/libtracker-common.

http://svn.gnome.org/viewvc/tracker/trunk/src/trackerd/tracker-parser.c?view=markup

this uses stop words - the one in libtracker-common does not

Pasting that link doesn't really help me. I already looked at that file
before replying to you. The tracker->stop_words is either always NULL or
never used to insert anything, therefore, to me, it looked completely
pointless. Is this the member you were talking about?

nope, it loads the stop word lists for the user's language - I don't think it
was NULL unless you had a config problem


pls make sure you load the appropriate stop word list for the stemming
language - see trunk for this
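
To make the stop word point above concrete, here is a minimal sketch in plain C of picking a stop word list by stemming language and checking parsed words against it. It is not the code from trunk: the hard-coded lists and the names StopWordList, stop_words_for_language and is_stop_word are all hypothetical, and a real parser would presumably load the lists from per-language files.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hard-coded lists, for illustration only. */
static const char *const stop_words_en[] = { "a", "and", "of", "the", NULL };
static const char *const stop_words_fr[] = { "de", "et", "la", "le", NULL };

typedef struct {
    const char *language;          /* language code, e.g. "en" */
    const char *const *stop_words; /* NULL-terminated word list */
} StopWordList;

static const StopWordList stop_word_lists[] = {
    { "en", stop_words_en },
    { "fr", stop_words_fr },
};

/* Return the stop word list for the stemming language, or NULL if
 * no list is known for that language. */
static const char *const *
stop_words_for_language(const char *language)
{
    size_t count = sizeof stop_word_lists / sizeof stop_word_lists[0];

    assert(language != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(stop_word_lists[i].language, language) == 0)
            return stop_word_lists[i].stop_words;
    }
    return NULL;
}

/* Return true if the word should be skipped by the indexer. */
static bool
is_stop_word(const char *const *stop_words, const char *word)
{
    assert(word != NULL);
    if (stop_words == NULL)
        return false; /* no list loaded: nothing gets filtered */
    for (size_t i = 0; stop_words[i] != NULL; i++) {
        if (strcmp(stop_words[i], word) == 0)
            return true;
    }
    return false;
}
```

Note that a NULL list silently disables filtering in this sketch, which is the kind of situation discussed above where the member exists but never has any effect.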

I *think* that Ivan actually fixed this yesterday. Ivan, care to comment?

4) index merging - where did this go? Without this, indexing won't scale
up (it's slow updating a large index) + you will suffer from
fragmentation and wasted disk space due to hashtable resize relocations
(these are never recovered) - merging solves both these issues
I will look into this.
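
For readers unfamiliar with the idea, the merging described in point 4 can be sketched roughly as follows: new entries go into a small update index, and the small index is periodically merged with the large one into a freshly written, compact index, so the large index is never updated in place. This is only an illustration in plain C, not tracker's actual implementation; the Posting layout and the names posting_cmp and merge_indexes are assumptions, and both inputs are assumed to be sorted and free of internal duplicates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    unsigned word_id; /* hypothetical numeric id of an indexed word */
    unsigned doc_id;  /* hypothetical id of a document containing it */
} Posting;

/* Order postings by word id, then by document id. */
static int
posting_cmp(const Posting *a, const Posting *b)
{
    if (a->word_id != b->word_id)
        return a->word_id < b->word_id ? -1 : 1;
    if (a->doc_id != b->doc_id)
        return a->doc_id < b->doc_id ? -1 : 1;
    return 0;
}

/* Merge two sorted posting arrays into one newly allocated sorted
 * array, keeping a single copy of entries present in both.
 * Returns NULL on allocation failure; the caller frees the result. */
static Posting *
merge_indexes(const Posting *main_idx, size_t main_len,
              const Posting *update_idx, size_t update_len,
              size_t *out_len)
{
    size_t cap = main_len + update_len;
    size_t i = 0, j = 0, n = 0;
    Posting *merged;

    assert(out_len != NULL);
    *out_len = 0;
    merged = malloc((cap ? cap : 1) * sizeof *merged);
    if (merged == NULL)
        return NULL;

    while (i < main_len && j < update_len) {
        int c = posting_cmp(&main_idx[i], &update_idx[j]);
        if (c < 0) {
            merged[n++] = main_idx[i++];
        } else if (c > 0) {
            merged[n++] = update_idx[j++];
        } else {
            merged[n++] = main_idx[i++]; /* same entry in both */
            j++;
        }
    }
    while (i < main_len)
        merged[n++] = main_idx[i++];
    while (j < update_len)
        merged[n++] = update_idx[j++];

    *out_len = n;
    return merged;
}
```

Because the merged result is written sequentially into new storage, it contains no holes left behind by earlier resizes, which is how merging addresses both the slow-update and the wasted-space concerns raised above.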

Ivan, care to comment here? I think you discussed this yesterday with
Carlos, right?

well, I have been looking at sqlite FTS 3 and I would like to move to
that at some point. FTS 3 is a loadable module and is independent of
sqlite but we would need to fork it and modify it to suit our needs (add
our parser, add category and ranking to the index structure, zip
compression to the text fields etc).

you can find it in the ext directory of the sqlite 3.6.0 tarball

I'm not sure if it's worth spending time adding index merging back, which
is a little complex, if we move to FTS soon after merge

jamie


Thanks for your quick review Jamie.

np - I'm on holiday next few days but will look to review again next
weekend

Yesterday Ivan and Phillip got quite a bit done. The tracker-search-tool
seems to be working again and we now clean up index files on --reindex
which we didn't do before. This was stopping the whole t-s-t from
working for me.

We are continuing to work on your TODO list this week.


thanks

jamie



