Re: [Tracker] Storing Metadata with files
- From: Martyn Russell <martyn lanedo com>
- To: Nikolaus Rath <Nikolaus rath org>
- Cc: tracker-list gnome org
- Subject: Re: [Tracker] Storing Metadata with files
- Date: Wed, 02 Jun 2010 14:48:20 +0300
On 25/05/10 16:37, Nikolaus Rath wrote:
On 05/25/2010 03:00 AM, Martyn Russell wrote:
What I would like to do is to make it possible to have a tighter
coupling between the indexed files and additional metadata about them.
In what way?
Basically I do not want metadata to be removed just because the indexed
file apparently does not exist at the same place anymore.
Ideally, tracker should e.g. detect if a file has just been renamed and
migrate the existing metadata.
We have been discussing this during the code camp. We have a proposed
solution in mind to fix this and will be looking at that in the coming
weeks.
Hmm, what causes this for you? That's not expected.
Well, I thought tracker was deliberately designed to do that. Imagine
this situation:
- A file "letter_to_company" is added to the tracker archive folder
- Additional metadata about "letter_to_company" is added to the
tracker database with the dbus API
- Someone wants to "clean up" the archive folder and moves
"letter_to_company" into a "letters" subdirectory
- Tracker indexes the "new" "letters/letter_to_company" with the
metadata that's available in the file
- Tracker removes all the metadata about the original
"letter_to_company", because that file no longer exists
- The metadata added via DBus is now lost irrevocably
Isn't that what's going to happen?
At the moment yes. Also, there is some question as to how persistent the
metadata (not added by Tracker but by 3rd parties) should be and how to
manage out of date or no longer needed data.
Presumably if the file is moved to another directory we see a MOVE event
and we should deal with that correctly. If we see events for the file
being deleted and added later to another directory (i.e. we have no
relationship between the two) then we have to assume the file is deleted
and in that case it invariably makes sense to remove all metadata
related to that. Files are generally a different case to other resources.
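
The difference between the two cases can be sketched in a few lines of Python. This is only a toy in-memory model, not Tracker's actual code: the `MetadataIndex` class, its method names, and the `user:tag` key are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataIndex:
    """Toy in-memory index: path -> metadata dict (hypothetical, not Tracker's schema)."""
    entries: dict = field(default_factory=dict)

    def on_create(self, path, extracted):
        # A newly seen file starts with only what was extracted from it.
        self.entries[path] = dict(extracted)

    def on_move(self, src, dst):
        # A MOVE event links the old and new paths, so all metadata
        # (including third-party additions) can be carried over.
        if src in self.entries:
            self.entries[dst] = self.entries.pop(src)

    def on_delete(self, path):
        # With no link to any new path, the file is assumed to be gone
        # and its metadata is dropped.
        self.entries.pop(path, None)


index = MetadataIndex()
index.on_create("letter_to_company", {"mime": "text/plain"})
index.entries["letter_to_company"]["user:tag"] = "important"

# Case 1: a proper MOVE event keeps the user-added tag.
index.on_move("letter_to_company", "letters/letter_to_company")
moved = index.entries["letters/letter_to_company"]

# Case 2: a DELETE followed by an unrelated CREATE loses it.
index.on_delete("letters/letter_to_company")
index.on_create("archive/letter_to_company", {"mime": "text/plain"})
recreated = index.entries["archive/letter_to_company"]
```

In the second case nothing ties the new path to the old one, so only the data extracted from the file itself survives.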
Therefore I like the idea of storing the metadata in a separate file.
This is far from trivial and we have moved away from separate databases
since 0.6 (for the same type of data) for a number of reasons:
* Speed
* Maintainability
* ...
I agree, conceptually, it makes sense, but the reality is this is much
harder to do. We do use a journal to back up all the data, and this would
include your user metadata.
Hm. Interesting. So there is a permanent record of every file that has
ever been added to tracker? Or is this journal expired from time to time?
Currently the journal records all events to be able to recreate the
database. There is a branch to compress the journal but that's as yet
unmerged to master.
How do I access this journal to e.g. obtain the metadata from a deleted
file?
You're not supposed to access it. It is there for us to reproduce the
databases in the event of corruption. This is done automatically; there
is no user interaction required.
Hmm, a new extractor won't work here. To catch ALL files, you would need
to write a generic one and generic extractors are fallbacks for specific
ones at this point.
I was thinking about a specific extractor just for .xmp files which adds
the extracted metadata to the "real" file. Wouldn't that be possible?
Sure, assuming you're getting inotify events for .xmp files and the
extractor knows how to associate those with the REAL file. But it goes
against the principle of the "extractor" to _extract_ not _writeback_.
So we don't recommend it.
Also, the extractor only gets the metadata for that file format, it
doesn't extract or insert the file metadata (size, name, etc).
I don't quite understand.
What I mean is that we just return _embedded_ data, not _all_ the
metadata about a file. The miner-fs concatenates the file data (like
size, mtime, etc) to the embedded data and sends it to tracker-store.
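
As a rough illustration of that split, here is a minimal Python sketch. The function names are hypothetical, and the "extractor" simply treats the first line of the file as a title; it is a stand-in for the idea, not how tracker-extract or miner-fs are actually implemented.

```python
import os
import tempfile


def file_metadata(path):
    """File-level data that can be read without parsing the content."""
    st = os.stat(path)
    return {"name": os.path.basename(path), "size": st.st_size, "mtime": st.st_mtime}


def embedded_metadata(path):
    """Stand-in for an extractor: returns only data embedded in the file.
    Treating the first line as a title is an assumption for this demo."""
    with open(path, encoding="utf-8") as fh:
        return {"title": fh.readline().strip()}


def combined_metadata(path):
    # The miner side joins both sets before passing them on to the store.
    merged = file_metadata(path)
    merged.update(embedded_metadata(path))
    return merged


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "letter_to_company")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("Complaint about invoice\nDear Sir or Madam,\n")
    result = combined_metadata(path)
```

Here `result` holds the name, size and mtime alongside the extracted title, while `embedded_metadata()` on its own knows nothing about the file's size or name.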
We would definitely accept a patch to fix this :)
This sounds a little bit ambiguous :-). To be avoided but you'd accept a
patch?
I mean generally, if an idea makes sense we would accept patches for a
fix or implementation of sorts.
--
Regards,
Martyn