Re: [Tracker] Guessing metadata and retrieval from external resources

On 10-10-11 12:01, Martyn Russell wrote:
On 09/10/11 23:33, Age Bosma wrote:

Why would one want this? Often more info than can be extracted from
files is appreciated. It will prevent applications from having to
reinvent the wheel, deviating from Tracker as their meta-data source
because it does not have the information.
E.g. Rygel could start listing movies on a TV by their actual movie
title instead of by file name, or list them by director even though
no tags were present. Banshee (if/when they start using Tracker) would
not have to maintain their own MusicBrainz query service because Tracker
would already provide the information.

There are a number of issues here. What springs to mind is:

 - Do we write back the data to the file itself (I would like to see
that, but support there is limited right now by file type)?

Personally I don't think we should go there:
- Tracker is the metadata resource apps will use, no need to include it
in the files again.
- Apps, including Tracker, should not touch original files, unless
specifically requested to do so. It would introduce unexpected/unwanted
behaviour otherwise.
- And there's of course the difficulty of actually being able to store
the info in a specific container, as Jens Georg pointed out.

 - Guessing metadata based on filename, etc. is currently a build-time
option. Part of me wonders if this should be in the
tracker-preferences dialog somewhere so users can configure this more
dynamically. Part of me thinks it's not useful though. Perhaps a silent
configuration option that is not in the UI is preferable.

It is? How can I enable it and where is it located in the source?
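
To make "guessing metadata based on filename" concrete, here is a
minimal sketch in Python. It is not Tracker's code and says nothing
about where the option lives in the source; the function name
guess_movie_metadata and the "Title (Year)" / "Title.Year" naming
convention are assumptions made purely for illustration.

```python
import re
from pathlib import PurePath

# Assumed naming convention: "Title (Year).ext" or "Title.Year.ext".
_MOVIE_RE = re.compile(
    r"^(?P<title>.+?)[ ._]*[(\[]?(?P<year>(?:19|20)\d{2})[)\]]?$"
)


def guess_movie_metadata(path):
    """Guess a title and year from a file name (hypothetical helper).

    Returns a dict; "year" is None when no year can be guessed.
    """
    stem = PurePath(path).stem  # file name without directory or extension
    match = _MOVIE_RE.match(stem)
    if match:
        title, year = match.group("title"), int(match.group("year"))
    else:
        title, year = stem, None
    # Dots and underscores are often used as word separators in file names.
    title = re.sub(r"[._]+", " ", title).strip()
    return {"title": title, "year": year}
```

A guess like this is only a starting point: it could serve as the
search key for an external lookup, which is the part that supplies
information not present in the file.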

Does tracker allow extending functionality as described above?

Yes and no. You could write a miner as suggested, but I feel this is not
the right approach. While the name "miner" makes sense, what we're doing
here is more "post-processing" and we've considered having some daemon
to go around cleaning up classes and information which can be derived
from content inserted by miner-fs or applications. A couple of examples
here are:

Has anything been attempted since this was considered? Is there an
alternative currently available?

 - You insert a contact for an email, you delete the email, the contact
then stays around. Shouldn't the contact really be removed? It does
depend on who uses it (the graph) but if it is just there for the email,
it should be removed ideally. If some gnome-contacts or other
application makes use of it using their graph to insert the data, we
wouldn't clean it up.

What is happening now?
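
The cleanup rule proposed in that example can be sketched with a toy
in-memory model. This does not describe what Tracker does today, and
it does not use Tracker's API: the Store class, its methods and the
graph names are all hypothetical, chosen only to show the rule
"remove the contact unless another graph also inserted it".

```python
EMAIL_GRAPH = "graph:email"  # hypothetical graph name for the email app


class Store:
    """Toy stand-in for the metadata store (not Tracker's API)."""

    def __init__(self):
        self.contact_graphs = {}  # contact -> set of graphs that inserted it
        self.email_contacts = {}  # email -> set of contacts it refers to

    def insert_contact(self, contact, graph):
        self.contact_graphs.setdefault(contact, set()).add(graph)

    def insert_email(self, email, contacts):
        # Inserting an email also inserts its contacts in the email graph.
        for contact in contacts:
            self.insert_contact(contact, EMAIL_GRAPH)
        self.email_contacts[email] = set(contacts)

    def delete_email(self, email):
        candidates = self.email_contacts.pop(email, set())
        # Contacts still used by some other email must be kept.
        still_referenced = set().union(*self.email_contacts.values())
        for contact in candidates - still_referenced:
            graphs = self.contact_graphs.get(contact, set())
            if graphs == {EMAIL_GRAPH}:
                # Only the email graph knew this contact: clean it up.
                del self.contact_graphs[contact]
            else:
                # Another application inserted it too: drop only our claim.
                graphs.discard(EMAIL_GRAPH)
```

Keeping track of which graph inserted a resource is what lets the
cleanup leave data from gnome-contacts or other applications alone.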

I guess you could write a miner to do this. It would listen to graph
update signals to know when to find out about new music/videos and
update the store.

You could also write this into tracker-extract/libtracker-extract and
have some common functions to get this information. 

As you mentioned, neither a miner nor an extractor is quite meant for
post-processing.

What advantage or disadvantage does an extractor have over a miner?
Are there more reasons why a miner would not be the right approach
besides its name?


Age (Forage)

