Re: [Tracker] Extracting new types of data (or: what on earth is nepomuk?)



Hi!

On Fri, 24-07-2009 at 00:11 +0200, ext John Millikin wrote:
I've been working on a toy project to index and display a directory full of
audio files, a bit like an automatically-updated Rhythmbox or Banshee.
Tracker 0.7 seems like an ideal backend for this, because it can be
easily queried and will manage the indexes automatically. 

 Great! 

However, I'm
running into a bit of trouble when it comes to storing and extracting
data -- namely, figuring out where to store and query metadata.

First, there are several missing metadata fields that would be
tremendously useful -- album artist, ReplayGain, MusicBrainz
identifiers, track count, sort names, etc. From the Tracker-related
posts on Planet GNOME I believe that these fields needed to be added
to a "nepomuk ontology". However, the NEPOMUK homepage[1] and
development site[2] are tremendously unhelpful. I can't find a list of
valid field names, which values are to be stored in them, or anything
except a wall of vague and buzzword-laden text.


 The useful documentation is just the ontology descriptions, starting
with NIE and following the links:
http://www.semanticdesktop.org/ontologies/nie/

 That documentation is not completely up-to-date (we have some changes
in the tracker copy) but it gives you a good overview.

The Tracker source is more informative, and there appears to be an
ontology for storing ID3 information in
<tracker/data/ontologies/37-nid3.ontology>. However, none of the
fields defined within provide any results when I query them using
"tracker-sparql". 

 We think that the ID3 ontology is not really a good idea, and we are
working on a different ontology (NMM). AFAIK our extractor is using that
ontology.

 Anyway, it won't include all the properties you need, but it is fine to
_extend_ the ontology by adding what you need. As long as you don't modify
the basic ontology structure, it is fine.


I've tried adapting the queries in Mr. Van Hoof's
blog post Introduction to RDF and SPARQL[3] to query ID3 metadata, but
with no success. However, even the ID3 ontology in Tracker appears to
be missing several important fields (like the above mentioned).

 Try the NMM classes and properties. If you still have problems, we can
provide you with some examples of SPARQL queries.

 E.g. "All music files (the URIs)" should be something like

SELECT ?u WHERE { ?u a nmm:MusicPiece. }
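
 As a rough sketch along the same lines, the titles could be fetched
together with the URIs. This assumes that the extractor sets nie:title
on music files (nie:title exists, but what the extractor actually fills
in is not confirmed here) and that tracker-sparql already knows the nmm:
and nie: prefixes, as the query above does:

```sparql
# Sketch: URIs and titles of all music files.
# Assumption: each nmm:MusicPiece has a nie:title set by the extractor.
SELECT ?u ?title
WHERE {
  ?u a nmm:MusicPiece ;
     nie:title ?title .
}
```

 Files without a title would not show up in the results, since both
patterns have to match.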


Is there any way to add these fields to one of the supported
ontologies? If not, is there a "best way" to add new ontologies?


 Use the Turtle files in Tracker's data/ontologies directory as examples
of how to define properties and classes. Write a new file, install it
in $(PREFIX)/share/tracker/ontologies and restart tracker with -r (so it
re-creates the dbs with your new properties).

Second, there seems to be a duplication of information for each file,
and I do not know which fields are "correct" and which are development
leftovers or deprecated. For example, to find the title of a given
track, I am offered: dc:title, nie:title, and nid3:title. Which of
these should be used? Which are legacy duplicates? It is a mystery.

 There is a hierarchy (from top to bottom): dc:title, nie:title,
nid3:title. 

 This means that when you set a value in nid3:title for an instance,
nie:title and dc:title will have the same value.

 I tried to explain this in 
http://live.gnome.org/Tracker/Documentation/AppDevelopersManual#head-5ea53d8b68454d0318b4f36a3363e0e6ecc42370
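
 One way to inspect this hierarchy, assuming the store exposes the
ontology's rdfs:subPropertyOf statements to queries (an assumption, not
something confirmed in this thread), would be to ask for the parents of
nid3:title:

```sparql
# Sketch: list the super-properties declared for nid3:title.
# Assumption: ontology triples (rdfs:subPropertyOf) can be queried
# like ordinary data.
SELECT ?parent
WHERE {
  nid3:title rdfs:subPropertyOf ?parent .
}
```

 If the hierarchy is as described above, nie:title should be among the
results.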

Lastly, I do not see any way to define additional fields after
installation. If I were to add (for example) support for storing a
per-track rating to the music player, it would have to be in an
external data store rather than Tracker. Obviously, this is
sub-optimal -- is there any way to define custom or per-application
fields?

 You can always extend the ontology, but it is not recommended. For
instance, the rating should be shared between all applications (if it is
relevant for the user, it is relevant whatever application the user is
using). 

 About "after installation", we plan to support adding ontologies
dynamically (you install your application and the new properties/classes
are available automatically). It is not implemented yet, and won't be
available in the short term, but it is on the roadmap. Patches or ideas
are welcome.

 Thanks for the interest, and feel free to ask here or on IRC,

 Ivan



