[Tracker] branch wip/fts4 ready for review



Hey,

Over the past few days I've been working on having Tracker use the
latest FTS4 module from SQLite. This code is much better structured than
FTS3, and it has allowed us to replace the existing functionality with
no modifications to the module code itself.

I think the branch is fully functional now. Just some notes:

      * tracker:fulltextNoLimit isn't respected yet. The tokenizer
        implementation is quite decoupled from the SQL layer, so it's
        hard to find out which property we're parsing.
      * Text ends up stored twice in the database. On this database of
        ~8000 elements, meta.db grows from 15.5MB to 17MB; of course
        this depends on the amount of text.
      * To get the most out of FTS4, the SPARQL->SQL translation may
        need to be refactored to call only the FTS functions that are
        requested in the SPARQL query (ATM both fts:rank and offsets
        are unconditionally added to the query). The FTS4 page has
        some notes on writing FTS queries that don't pull a lot of
        data from disk [1]; I think we currently fall into the case
        described there.
      * Somewhat related to the last point, fts:snippet could now
        feasibly be implemented.
      * The rank function could now do more complex things, e.g. take
        into account the number of occurrences of the word in that
        row, or boost less frequent search terms. No changes have been
        made to the previous behavior.
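
To illustrate the note about calling only the requested FTS functions,
here is a minimal sketch in Python, not the actual Tracker translation
code. The helper name build_fts_select, the "requested" set and the
table name "fts" are all hypothetical; offsets(), snippet() and
matchinfo() are the auxiliary functions described on the SQLite FTS
page, and rank() stands in for an application-defined SQL function.

```python
# Hypothetical sketch: none of these names come from Tracker itself.
def build_fts_select(table, requested):
    """Build an FTS4 SELECT that calls only the requested functions.

    `requested` is a set such as {"rank", "offsets", "snippet"},
    assumed to be derived from the functions used in the SPARQL query.
    """
    columns = ["docid"]
    if "offsets" in requested:
        columns.append(f"offsets({table})")
    if "snippet" in requested:
        columns.append(f"snippet({table})")
    if "rank" in requested:
        # rank() is assumed to be registered by the application and
        # fed with the blob returned by matchinfo().
        columns.append(f"rank(matchinfo({table}))")
    return f"SELECT {', '.join(columns)} FROM {table} WHERE {table} MATCH ?"


if __name__ == "__main__":
    print(build_fts_select("fts", set()))
    print(build_fts_select("fts", {"rank"}))
```

With an empty set the query only selects docid, which is presumably
the cheap case the FTS page talks about.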

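As for the rank function, the kind of scoring mentioned in the last
note could look roughly like the sketch below. This is only an
illustration in Python under assumed inputs, not what the branch does
(the branch keeps the old behavior). The function name and its
parameters are hypothetical; counts of this kind are what matchinfo()
can report per search term.

```python
import math


# Hypothetical sketch of a rank that rewards repeated and rare terms.
def rank(hits_in_row, rows_with_term, total_rows):
    """Score one row for a multi-term query.

    hits_in_row[i]    -- occurrences of search term i in this row
    rows_with_term[i] -- number of rows containing search term i
    total_rows        -- number of rows in the FTS table
    """
    score = 0.0
    for hits, rows in zip(hits_in_row, rows_with_term):
        if hits == 0 or rows == 0:
            continue  # term absent: contributes nothing
        # Terms found in fewer rows get a larger weight.
        weight = math.log(1 + total_rows / rows)
        score += hits * weight
    return score


if __name__ == "__main__":
    # Same number of hits, but the second term is much rarer.
    print(rank([1], [50], 100), rank([1], [1], 100))
```

More occurrences in the row raise the score linearly, and a term that
appears in few rows counts for more than a common one.
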
Cheers,
  Carlos

[1] http://www.sqlite.org/fts3.html#appendix_a



