Re: [Tracker] The quadruple, team conversations on IRC

From: Jamie McCracken <jamie mccrack googlemail com>
To: Philip Van Hoof <spam pvanhoof be>
Cc: tracker-list <tracker-list gnome org>
Subject: Re: [Tracker] The quadruple, team conversations on IRC
Date: Thu, 30 Jul 2009 18:01:30 -0400

I would like to raise another related matter.

As you know, tracker is about much more than mobiles and we want it to
rock on all platforms be they desktops, netbooks or mobiles

As the most Tracker-friendly distro (after Maemo!), Ubuntu is keen on
using CouchDB for user metadata. Since we will need something other than
Turtle to act as our backup of user metadata, I would like to abstract
away any interface to it. I imagine CouchDB is too big for mobiles, so
the abstraction could offer sqlite in addition to CouchDB as
compile-time options.

The abstraction would not dictate whether it's a triple or a future quad
store. A CouchDB document can hold an arbitrary number of fields, so it
won't be a limitation in that area.
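
To make the idea concrete, here is a rough sketch of what such an
abstraction could look like. It is only an illustration under my own
assumptions: the names MetadataBackup and SqliteBackup, the "backup"
table and the example predicate are hypothetical and do not exist in
Tracker. A statement carries an optional fourth field (origin), so the
same interface serves a triple store today and a quad store later.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Iterable, List, Optional, Tuple

# (subject, predicate, object, origin); origin is None for a plain triple.
Statement = Tuple[str, str, str, Optional[str]]


class MetadataBackup(ABC):
    """Hypothetical backend-neutral interface for backing up user metadata."""

    @abstractmethod
    def save(self, statements: Iterable[Statement]) -> None:
        """Persist the given statements."""

    @abstractmethod
    def load(self) -> List[Statement]:
        """Return every stored statement in insertion order."""


class SqliteBackup(MetadataBackup):
    """sqlite implementation, the option imagined here for mobiles."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS backup ("
            "subject TEXT NOT NULL, predicate TEXT NOT NULL, "
            "object TEXT NOT NULL, origin TEXT)"
        )

    def save(self, statements: Iterable[Statement]) -> None:
        # One transaction per call, so a failed save leaves no partial data.
        with self.db:
            self.db.executemany(
                "INSERT INTO backup VALUES (?, ?, ?, ?)", list(statements)
            )

    def load(self) -> List[Statement]:
        return self.db.execute(
            "SELECT subject, predicate, object, origin "
            "FROM backup ORDER BY rowid"
        ).fetchall()
```

A CouchDB backend would implement the same two methods; it is left out
here because its client API is outside what this sketch can show
reliably.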

CouchDB would provide automatic sync of user metadata without quads, as
well as enabling the indexer to distinguish between embedded and
non-embedded metadata, which are the two main goals of quads.

CouchDB, unlike sqlite, uses optimistic locking and the advanced
multi-generational (multi-version) architecture found in big RDBMSs, so
it's a very safe backup.

CouchDB can use indexed views for fast access so it should perform just
as well as sqlite as both a triple and quad store

Of course there is no SPARQL, so it cannot replace our decomposed
sqlite-based store, and I am not proposing that. It's just a replacement
for the Turtle backup (or a potential implementation option for the quad
store).

Of course if Nokia is happy to use CouchDB on mobiles it might solve a
lot of problems here and make the abstraction unnecessary

Of course others may have different opinions but please share them with
us

jamie

On Wed, 2009-07-29 at 14:53 +0200, Philip Van Hoof wrote:

This is a mail describing what the team has been discussing on IRC, so
that people who aren't on IRC can follow up too.

o. We want to start having a quadruple alongside the decomposed sqlite
   tables. 

   o. Such a quadruple will allow us to implement backup

   o. Such a quadruple will allow us to implement named graph support

      o. Makes it possible for synchronization software to make somewhat
         intelligent decisions based on for example origin of the triple
      o. Allows us to store the origin of metadata (local, flickr,
         facebook, etc)
      o. Would make the KDE people interested in using Tracker for their
         mobile targets (according to discussions with them at the
         desktop summit conference earlier this month)
      o. Avoids the embedded/non-embedded metadata issue when updating
         indexed files as we'd know the origin for each statement

o. The quadruple will store all metadata, including embedded metadata.
   This will create an overhead of about 10% during storage. It won't
   add any overhead for read queries.

o. We will check if we can win back some of that 10% (some ideas are
   floating around)

o. If the embedded metadata can be left out of the quadruple then we
   should try to achieve that.

o. We won't try to remove the transaction from the decomposed tables yet.
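
As a rough illustration of the points above, a quadruple with an origin
column could look like the sketch below. This is not the schema in the
"quad" branch: the table name, the index, the helper functions, the
predicates and the 'embedded' origin label are all assumptions made for
the example ('local' and 'flickr' are the origins mentioned above). It
shows how knowing the origin of each statement lets a re-index replace
the embedded metadata of a file without touching what the user or a
web service added.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory quad table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE quad ("
        "subject TEXT NOT NULL, predicate TEXT NOT NULL, "
        "object TEXT NOT NULL, origin TEXT NOT NULL)"
    )
    # Lets the re-index below find one file's statements by origin.
    db.execute("CREATE INDEX quad_subject_origin ON quad (subject, origin)")
    return db


def add_statement(db, subject, predicate, obj, origin):
    """Store one statement together with where it came from."""
    with db:
        db.execute(
            "INSERT INTO quad VALUES (?, ?, ?, ?)",
            (subject, predicate, obj, origin),
        )


def reindex_file(db, subject, embedded):
    """Replace only the statements extracted from the file itself."""
    with db:
        db.execute(
            "DELETE FROM quad WHERE subject = ? AND origin = 'embedded'",
            (subject,),
        )
        db.executemany(
            "INSERT INTO quad VALUES (?, ?, ?, 'embedded')",
            [(subject, predicate, obj) for predicate, obj in embedded],
        )


def statements(db, subject):
    """Return (predicate, object, origin) rows for one subject."""
    return db.execute(
        "SELECT predicate, object, origin FROM quad "
        "WHERE subject = ? ORDER BY rowid",
        (subject,),
    ).fetchall()
```

Synchronization software could filter on the same column, for example
to sync only the statements whose origin is 'local'.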

People can follow development of the quadruple in the "quad" branch on
GNOME's git. This branch will likely be merged to "master" soon.

The branch contains the support for the quadruple and, implemented on
top of that, support for backup. The branch doesn't yet implement named
graphs (the origin column is left empty for now).

Known concerns: Jamie's concern is a performance issue if we store both
embedded and non-embedded data in the quadruple. We will investigate
this performance issue and see if we can mitigate it.
