Re: Proposing Tracker for inclusion into GNOME 2.18



On 25/10/06, Jamie McCracken <jamiemcc blueyonder co uk> wrote:
>> I have been thinking of allowing all metadata to be registered with an
>> optional hardcoded dublin core type(1) so we could use maybe
>> "dc.creator" that would search all metadata registered against that type.
>>
>> You would not be able to get/set the value of any of the dc types
>> obviously as they would potentially point to more than one metadata.
>>
>> Would that satisfy you Ross?
>>
>> (1) http://dublincore.org/documents/1999/07/02/dces/
>
> If it was specific to dublin core, and only had one level of
> specialisation, then it would only be of limited use.

it would be limited - but is that a bad thing?

If it doesn't solve the problems at hand, it would be a bad thing.  As
I mentioned previously, it sounds like it wouldn't handle the case of
creating relationships between two similar metadata schemas.

I'm planning to add extensible metadata support to tracker-search-tool
but I need it to be easy for users to add new metadata. Having a simple
window which allows you to define the name, type (indexable string,
string, numeric or date) and class (a dropdown combo of the 13 or so DC
types) should give us a good balance.

The relationships between metadata types wouldn't really affect this
UI though, right?  The relationships would be provided by application
developers along with the metadata extraction tools for their file
types, right?


I would be worried about anything more complicated. That said, for the
application API I could be more open.


>
> If we ignore the implicit inverse relationship issue, you really want
> a table that contains "property type foo is a subclass of the property
> type bar".  For the case of two properties being equivalent, we can
> represent that with two relationships:
> * foo is a subclass of bar
> * bar is a subclass of foo

Yes, that's easy to do.

>
> Of course, using a table like this is not going to be fast when doing
> SQL queries.  So you probably want to have a flattened version that
> can be used to map from a property to all its subclasses (usually
> including itself, to make the SQL easier).  Such a flattened table is
> pretty easy to maintain with database triggers or in the application
> code that updates the metadata type relation table.

We shouldn't need a flattened table, as the two metadata type tables
would be properly indexed, so joins would be fast (especially with
SQLite).

Really?  From the description I gave above, the relationships between
different metadata types form a directed graph.  Taking the music
example again, consider the following set of relationship types:
1. foo performed bar
2. foo performed vocals on bar
3. foo performed lead vocals on bar

If I register that (2) is a subclass of (1) and (3) is a subclass of
(2), I want searches for relationship type (1) to pick up metadata of
types (1), (2) and (3).
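
To make the expected behaviour concrete, here is a minimal Python
sketch of that lookup.  The type names and the SUBCLASS_OF list are
made up for illustration and are not part of Tracker; the function
simply walks the subclass graph downwards from the requested type.

```python
from collections import defaultdict

# Hypothetical relationship types taken from the example above.
# Each pair reads "child is a subclass of parent".
SUBCLASS_OF = [
    ("performed_vocals", "performed"),
    ("performed_lead_vocals", "performed_vocals"),
]


def reachable_subtypes(type_name, edges):
    """Return type_name plus every direct or indirect subclass of it."""
    children = defaultdict(set)
    for child, parent in edges:
        children[parent].add(child)

    seen = {type_name}
    stack = [type_name]
    while stack:
        current = stack.pop()
        for child in children[current]:
            # The seen set keeps the walk finite even when two types are
            # declared equivalent (each registered as a subclass of the other).
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen
```

A search for "performed" would then cover all three types, while a
search for "performed_lead_vocals" would cover only itself.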

SQL systems don't generally deal with graphs particularly well (unless
they've got some extension to do so).  So I was suggesting that a
flattened table be maintained as a cache which could map any
relationship type to all the relationship types reachable in the
graph.  That way it is a single join (or subquery) to select
all the relevant metadata types.
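
As a rough illustration of the flattened table idea, the sketch below
uses Python's sqlite3 module with an in-memory database.  The table
and column names (metadata, type_closure, ancestor, descendant) and
the sample rows are assumptions made for this example, not Tracker's
actual schema, and the cache rows are filled in by hand rather than
by triggers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema, for illustration only.
    CREATE TABLE metadata (file_name TEXT, type_name TEXT, value TEXT);
    -- Flattened cache: one row per (ancestor, descendant) pair,
    -- including each type paired with itself.
    CREATE TABLE type_closure (
        ancestor TEXT,
        descendant TEXT,
        PRIMARY KEY (ancestor, descendant)
    );
""")

# Cache contents for the three example relationship types.
conn.executemany("INSERT INTO type_closure VALUES (?, ?)", [
    ("performed", "performed"),
    ("performed", "performed_vocals"),
    ("performed", "performed_lead_vocals"),
    ("performed_vocals", "performed_vocals"),
    ("performed_vocals", "performed_lead_vocals"),
    ("performed_lead_vocals", "performed_lead_vocals"),
])

# Made-up metadata rows, one per relationship type plus an unrelated one.
conn.executemany("INSERT INTO metadata VALUES (?, ?, ?)", [
    ("a.ogg", "performed", "Alice"),
    ("b.ogg", "performed_vocals", "Bob"),
    ("c.ogg", "performed_lead_vocals", "Carol"),
    ("d.ogg", "composed", "Dave"),
])


def search(conn, type_name):
    """Find metadata of the given type or of any type reachable from it."""
    rows = conn.execute("""
        SELECT m.file_name, m.type_name, m.value
        FROM metadata AS m
        JOIN type_closure AS c ON c.descendant = m.type_name
        WHERE c.ancestor = ?
        ORDER BY m.file_name
    """, (type_name,))
    return rows.fetchall()
```

The query itself never walks the graph; that cost is paid once, when
the relationship table changes and the cache is rebuilt.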

If you know a better way to do this efficiently without a flattened
table, that's great.


Technically there's no problem here for me to implement this; my only
concern is keeping it simple (KISS) for users/apps to use, without too
much complication.

Ideally this would be mostly invisible to users: their queries for
generic relationship types would just work, even if the metadata
extractors for particular file types produced more specialised types.

It does mean that application authors who wish to introduce new types
of metadata need to think about how they relate to the existing types,
but that is necessary for the metadata to stay useful when doing
generic queries.

James.


