[Tracker] On Bindings and Abstractions

From: John Carr <john carr unrouted co uk>
To: tracker-list gnome org
Subject: [Tracker] On Bindings and Abstractions
Date: Thu, 12 Nov 2009 15:31:22 +0000
Hi,

For those of you who don't know me, I'm Jc2k on #tracker and I hack
on Python, JavaScript and Vala bindings and toys for Tracker.

There have been a few mentions recently of providing higher-level
libraries for working with the Tracker store, so I thought I'd try to
summarise the work I'm aware of, maybe attract some people to work on
it, and make sure I'm not heading in the wrong direction too.

== Summary ==

 * libtracker-common: Hides the dbus interface behind GObjects. Also
provides some simple search functions. I don't think it helps out with
the Subjects{Added,Changed,Removed} signals.

 * TrackerSparqlBuilder: I see this as being like the SAX of the SPARQL
world. You are writing to a stream: "add predicate", "add object".
Currently I think it can only be used for update queries. Is it
internal or public API? I think it is used by the miners at least.

 * libqttracker: A C++ abstraction of SPARQL. You create objects
representing variables and define relationships between them. These
are transformed into SPARQL under the hood. This is a bit more like
DOM to the SAX of TrackerSparqlBuilder.

 * tralchemy: A Python abstraction around SPARQL. Provides classes for
each rdfs:Class. Hooks into the Python import mechanism, leaving the
user unaware that the bindings are dynamically generated. The Python
help system works, providing access to the documentation that exists
within Tracker. Initially sync, moving towards async.

 * sparql-glib: Very very verbose abstraction. You are building a tree
of objects that represent a SPARQL query and these are transformed
into the SPARQL. Hides the dbus interface and gives you a Model (see
git.gnome.org/cgit/model) interface for operating on the data. Async.

 * jalchemy: Started out like tralchemy but in javascript. Javascript
has no properties system and no interactive help so the benefit of
replicating tralchemy in javascript quickly passed. Eventually started
morphing into a wrapper around sparql-glib, using JSON to hide how
verbose sparql-glib is. Async.
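
To make the SAX-versus-DOM comparison above concrete, here is a minimal
Python sketch of the two styles. Neither class is the real API of
TrackerSparqlBuilder, libqttracker or sparql-glib; StreamBuilder, Var
and render are hypothetical names invented purely for illustration.

```python
class StreamBuilder:
    """SAX-like style: every call appends text to the query immediately."""

    def __init__(self):
        self._chunks = []
        self._open = False   # True once a subject has been written
        self._first = True   # True until the subject gets its first predicate

    def subject(self, term):
        if self._open:
            self._chunks.append(" . ")   # close the previous subject's triples
        self._chunks.append(term)
        self._open = True
        self._first = True
        return self

    def predicate(self, term):
        self._chunks.append(" " if self._first else " ; ")
        self._chunks.append(term)
        self._first = False
        return self

    def object(self, term):
        self._chunks.append(" " + term)
        return self

    def result(self):
        return "".join(self._chunks)


class Var:
    """DOM-like style: build objects first, render them to text afterwards."""

    def __init__(self, name):
        self.name = name
        self.triples = []   # (predicate, object) pairs attached to this variable

    def add(self, predicate, obj):
        self.triples.append((predicate, obj))
        return self

    def __str__(self):
        return "?" + self.name


def render(*variables):
    """Serialise each variable's triples into one graph pattern string."""
    groups = []
    for var in variables:
        pairs = " ; ".join("%s %s" % (p, o) for p, o in var.triples)
        groups.append("%s %s" % (var, pairs))
    return " . ".join(groups)


# Stream style: the order of calls is the order of the output.
streamed = (StreamBuilder()
            .subject("?c")
            .predicate("a").object("nco:Contact")
            .predicate("nco:fullname").object("?n")
            .result())

# Tree style: relationships are declared between objects, then rendered.
contact = Var("c")
contact.add("a", "nco:Contact").add("nco:fullname", Var("n"))
built = render(contact)

print(streamed)
print(built)
```

Both styles end up with the same graph pattern. The difference is that
the tree can still be inspected or rearranged before it is rendered,
while the stream cannot.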

Are there any more Tracker-orientated or GObject-orientated SPARQL
bindings we should mention?

== Things I'm doing ==

tralchemy is based on classes and has a simple querying mechanism
where you can do Contact.find(name="John"). I don't really think this
can work for more complicated graphs without something that feels
really hacky. So I've been toying with representing SPARQL as JSON,
which is great in JavaScript and Python (for me :P).

q = query({
    "name": "uri",
    "type": "nco:Contact",
    "nco:birthDate": some_value_from_python,
    "nco:fullname": "?fullname",
    "nco:photo": {
        "nie:isStoredAs": "?photo",
    }
})

I'm representing a set of triples as a dictionary. I guess "name" is
really "subject". "type" is optional and is the RDF type that is used
after an "a". Objects can be constants or variables, and if we care
about a predicate of the variable rather than the value of the
variable we just have an inline dictionary. Of course, inline
dictionaries can have their own inline dictionaries. So, anyway, this
JSON is transformed to something like:

SELECT ?uri ?fullname ?photo WHERE { ?uri a nco:Contact ;
nco:birthDate "whatever" ; nco:fullname ?fullname ; nco:photo ?a1 .
?a1 nie:isStoredAs ?photo }

Filters don't really work well here, but otherwise I find it a very
useful way of avoiding variable substitution and string escaping, and
personally it makes it easier for me to see that there is a graph and
how things are related.

Given Vala and C don't have JSON, I've yet to have any bright ideas to
try out there. Instead, sparql-glib is very very verbose. ATM, you
have to create lots of objects...

== Things the sweaty masses want ==

 * An abstraction for GObject: Not sure here. 2 immediate approaches
are smart objects that either cache all the predicate/objects for a
subject (once created and using the signal mechanism to stay up to
date) or provide a getter that takes a list of predicates and calls a
callback with the results. This does not help with building
complicated queries, though. I have no ideas there.

 * C# bindings. I imagine these should be LINQ.

 * Direct access to the Tracker database. My motivation here is easier
testing. Right now my testing environments have to set up their own
D-Bus session bus... It would also be nice to be able to set multiple
ontology paths so I can load the system ontologies + my own custom
ontologies that I'm testing.
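
As a rough illustration of the second GObject approach above (a getter
that takes a list of predicates and calls a callback with the results),
here is a small Python sketch. get_properties and run_query are
hypothetical: run_query stands in for whatever actually talks to the
store, and is assumed to take a SPARQL string and return a list of
rows. The sketch is synchronous for brevity; a real version would
presumably be async.

```python
def get_properties(run_query, subject, predicates, callback):
    """Fetch several predicates of one subject and pass them to callback."""
    variables = ["?v%d" % i for i in range(len(predicates))]
    # OPTIONAL keeps the row alive even when a predicate has no value.
    optionals = " ".join(
        "OPTIONAL { <%s> %s %s }" % (subject, pred, var)
        for pred, var in zip(predicates, variables)
    )
    sparql = "SELECT %s WHERE { %s }" % (" ".join(variables), optionals)
    rows = run_query(sparql)
    # Assumption: only the first row matters; multi-valued predicates are ignored.
    values = rows[0] if rows else [None] * len(predicates)
    callback(dict(zip(predicates, values)))
    return sparql


def fake_store(sparql):
    """Stand-in for a real store: always answers with one canned row."""
    return [["John Carr", None]]


received = {}
sent = get_properties(fake_store, "urn:example:contact",
                      ["nco:fullname", "nco:birthDate"], received.update)
print(sent)
print(received)
```

Passing run_query in as an argument also shows why direct database
access would help testing: the store can be swapped for a fake without
any session bus.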

== Moving forward ==

Things i'm considering doing:

 * TrackerSparqlBuilder should be set free and able to do more than
updates. Spin (a copy of) it into sparql-glib?
 * Is the tree of objects transformed to SPARQL useful to anyone? Then
rebase it on top of TrackerSparqlBuilder.
 * If the JSON abstraction is useful, move the javascript code into sparql-glib.
 * Try and build some of the C/Vala abstractions

I want the core code to stay as ontology-free as possible. And if
needed, have generated (separate) code for GObject users.

Tralchemy will (for the moment) duplicate work in sparql-glib and
jalchemy. When introspection is finished or sparql-glib has solid
python bindings a simpler version could be provided as part of
sparql-glib.

Anyone have any input or ideas for the Tracking Binding and
Abstraction Adventure?

</rambling>,
John