Re: [Evolution-hackers] [Evolution] Beagle and Tracker, letting Evolution feed those beasts RDF triples instead



On Tue, 2008-12-09 at 18:00 +0530, Sankar wrote:

Hey Sankar,

I'm writing a plugin that will implement the "Manager" class as
described here. Tracker will then implement being a "Registrar".

http://live.gnome.org/Evolution/Metadata
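
To make the split of roles a bit more concrete, here is a rough
in-process sketch of the idea: the Manager (the Evolution plugin) pushes
triples to every registered Registrar (a metadata engine). This is only
an illustration. The class names mirror the roles, but every method
name, the InMemoryEngine class and the example URI are made up here and
are not the D-Bus API proposed on the wiki page.

```python
from typing import Dict, List, Protocol, Tuple

# One RDF triple: (subject URI, property, value).
Triple = Tuple[str, str, str]


class Registrar(Protocol):
    """Role played by a metadata engine (hypothetical method names)."""

    def set_many(self, triples: List[Triple]) -> None: ...

    def unset_many(self, subjects: List[str]) -> None: ...


class Manager:
    """Role played by the Evolution plugin: it pushes changes out."""

    def __init__(self) -> None:
        self._registrars: List[Registrar] = []

    def register(self, registrar: Registrar) -> None:
        self._registrars.append(registrar)

    def message_changed(self, uri: str, properties: Dict[str, str]) -> None:
        # Turn one message's properties into triples and fan them out.
        triples = [(uri, prop, value) for prop, value in properties.items()]
        for registrar in self._registrars:
            registrar.set_many(triples)

    def message_removed(self, uri: str) -> None:
        for registrar in self._registrars:
            registrar.unset_many([uri])


class InMemoryEngine:
    """Toy stand-in for a real engine such as Tracker or Beagle."""

    def __init__(self) -> None:
        self.store: Dict[str, Dict[str, str]] = {}

    def set_many(self, triples: List[Triple]) -> None:
        for subject, prop, value in triples:
            self.store.setdefault(subject, {})[prop] = value

    def unset_many(self, subjects: List[str]) -> None:
        for subject in subjects:
            self.store.pop(subject, None)


if __name__ == "__main__":
    manager = Manager()
    engine = InMemoryEngine()
    manager.register(engine)
    # Hypothetical URI scheme, for illustration only.
    manager.message_changed("email://account/INBOX#1", {"subject": "Hello"})
    print(engine.store)
```

The point of the design is that the engine never opens Evolution's
cache: it only sees what the Manager chooses to tell it.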

I will be using camel-db.h, as you hinted on IRC, to implement the
features in a well-performing way (direct SQLite access).

I will start working on this plugin tomorrow or next week. At the same
time I will be implementing support for it in Tracker, which will serve
as a prototype for other metadata engines.

I hope to inspire people from the Evolution team, and from the different
metadata engines, to comment on the proposed D-Bus API.

Let's try to get this valuable metadata out of those damn E-mail clients
and let's try to get it right this time. Not ad-hoc, but right.

If the namespace should be moved from org.gnome to org.freedesktop
we can of course do this afterwards. The "metadata.Manager" part would
also have to be renamed to a better name. But in the end the
implementation would not be affected a lot by such renames. Meanwhile we
can prototype it in GNOME's Evolution D-Bus namespace.

The reason for all this prototyping is that we wouldn't like to release
a Tracker that doesn't support Evolution's new summary format.

This time we want to get it right, rather than just hacking around the
new summary format by patching code that wrongly interprets Evolution's
cache on its own instead of letting Evolution tell us about it ...

* Well we could do this, but really ... let's just get it right now that
  I can spend time on this. At least that's my point of view on this.

* Other apps reading Evolution's caches externally is never going to be
  generic for all E-mail clients, and is not really right. You'd have to
  care about file locking and (now that it's SQLite based) about
  transactions being held by Evolution and all that stuff. You'd also
  have to care about the possibility of Evolution changing the database
  schema.

* It's just not very nice to do it that way in my opinion: It adds an
  unasked-for burden on the Evolution team too: having to negotiate with
  us when you want to change the schema of the database. Else you will
  break a lot of people's desktops unannounced. Evolution would need to
  provide a mechanism to tell us about the version of the schema, for
  example. And we would have to implement things in Tracker that deal
  with all versions of Evolution's cache.

  One big spaghetti mess distributed over multiple projects.

  So, let's just do it right 
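
For the record, this is roughly the kind of check every external reader
would end up carrying around. A sketch only, assuming (hypothetically)
that the schema version were stamped into SQLite's user_version pragma.
The SUPPORTED_SCHEMA_VERSIONS set and both function names are made up
for illustration.

```python
import sqlite3

# Hypothetical: the schema versions this indexer knows how to read.
SUPPORTED_SCHEMA_VERSIONS = {1, 2}


def schema_version(conn: sqlite3.Connection) -> int:
    """Return SQLite's user_version, an application-defined integer."""
    return conn.execute("PRAGMA user_version").fetchone()[0]


def can_index(conn: sqlite3.Connection) -> bool:
    """True only if the database declares a schema version we understand."""
    return schema_version(conn) in SUPPORTED_SCHEMA_VERSIONS


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA user_version = 2")
    print(can_index(conn))
    # The application bumps its schema; the external reader is now stuck
    # until somebody teaches it about version 3.
    conn.execute("PRAGMA user_version = 3")
    print(can_index(conn))
```

Every indexer would need its own copy of that table of supported
versions, and each copy goes stale whenever the schema changes.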


> On Mon, 2008-12-08 at 18:59 +0100, Philip Van Hoof wrote:
> > All metadata engines are nowadays working on a method to let them get
> > their metadata fed by external applications.
> > Such APIs come down to storing RDF triples. An RDF triple comes down
> > to a URI, a property and a value.
> > 
> > For example (in Turtle format, which is SPARQL's inline format and the
> > W3C's typical RDF storage format):
> > We'd like to make an Evolution plugin that does this for Tracker. 
> > 
> > Obviously it would be as easy as letting software like Beagle become
> > an implementer of "prox"'s InsertRDFTriples to start supporting Beagle
> > with the same code and Evolution plugin, this way.
> > 
> > I just don't know which EPlugin hooks I should use. Iterating all
> > accounts and foreach account all folders and foreach folder all
> > CamelMessageInfo instances is trivial and I know how to do this.
> > 
> > What I don't know is what reliable hooks are for:
> > 
> >   * Application started
> 
> org.gnome.evolution.shell.events:1.0 - es-event.c - 
> 
> sample plugin:
> groupwise-account-setup/org-gnome-gw-account-setup.eplug.xml 
> 
> 
> >   * Account added
> 
> org.gnome.evolution.mail.config:1.0 
> 
> sample plugin:
> groupwise-account-setup/org-gnome-gw-account-setup.eplug.xml 
> 
> For account-added: id = org.gnome.evolution.mail.config.accountDruid
> For account-edited: id = org.gnome.evolution.mail.config.accountEditor
> 
> >   * Account removed
> 
> You may have to write a new hook
> 
> >   * Folder created
> >   * Folder deleted
> >   * Folder moved
> >   * Message deleted (expunged)
> >   * Message flagged for removal 
> >   * Message flagged as Read and as Unread
> >   * Message flagged (generic)
> >   * Message moved (ie. deleted + created)
> >   * New message received
> >     * Full message 
> >     * Just the ENVELOPE
> > 
> 
> If you try to update your metadata for every one of the above
> operations, it may be overkill in terms of performance (and I believe
> more disk access as well for updating your metadata store). You can add
> a new hook that fires whenever any change is made to the summary DB and
> listen to that. All the above changes will eventually have to reach the
> summary DB for them to be valid.
> 
> 
> However, I personally believe:
> 
> More and more applications are using sqlite (firefox and evolution
> being my two most used apps). So, it may be a better idea to directly
> map the tables in an sqlite database into the search applications'
> data-store (beagle, tracker etc.) instead of depending on the
> applications to give the up-to-date data.
> 
> When we implemented on-disk-summary for evolution summaries, we removed
> the meta-summary code (used by beagle). We had to provide a way for
> helping Beagle / Tracker to know of modified/new mails, so they could
> (re)index these mails. Some suggested that we should add a DATETIME
> field which contains the time-stamp of the time last modified/created
> for each record. However, this in addition to bloating the database,
> also does not provide any information about deleted records.
> 
> If, inside the sqlite db, we have a special table comprising:
> table-name, primary keys of the last N records modified/added,
> time-added; any search application can make use of this and update its
> lucene (or whatever) data.
> 
> It may not be the neatest approach, but what I want to say is: instead
> of depending on the end-user applications (which use sqlite) for giving
> data, search applications should be able to get the data from the db
> itself. This also provides additional benefits like creating/updating
> search indices when the machine is idle, instead of choking the
> applications when they are running, etc.
> 
> My 0.2 EUROes ;-)
> 
> --
> Sankar
> 
> 
-- 
Philip Van Hoof, freelance software developer
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
http://pvanhoof.be/blog
http://codeminded.be


