Re: Reviving Semantic Relationships in Beagle

On 10/25/07, Enrico Minack <minack l3s de> wrote:
> Hi Kevin,
> > ... it became apparent that it wasn't so much the storage of metadata
> > that was important (lucene stores Properties, or 'Fields' just fine) but
> > relationships between data.
> right, Lucene is excellent at storing and querying fields (properties of
> resources are string values), but fails when you want to follow / exploit
> relations (references to other resources, which are stored in Lucene as
> strings referring to URIs of other Lucene documents). An RDF store supports
> such a join natively, because it is the very nature of an RDF query to have
> joins / to contain relations.
> > ... so I propose the following means of 'hooking up' an RDF store to
> > Beagle.
> >
> > -New Query_Part which allows an RDF-type query (raw) against the store.
> > -Wire into LuceneQueryDriver and LuceneIndexDriver to store new
> > relationships in the RDF store and query them upon creation of a Hit.
> > -Add a more accessible API to Filter for adding Parents/Children to
> > indexables.
> relations are not always parent-child relations; they usually refer to
> related resources. For instance, an email tells you something about the
> people who sent / received it, so a relation between the email and the
> involved people can be created. Or the email contains a link to a web page,
> which is clicked and indexed by Beagle later, so these resources can also be
> connected via a relation. These examples are not parent-child relations,
> yet they still provide valuable information for the user to find
> resources / browse results. In RDF, there is no need to restrict the notion
> of a relation to a parent-child relation.
A good point. I guess it's the byproduct of so much SQL work, where
things tend to be parent-child so often...
> > While this seems like we are replicating much of the data in the
> > Lucene Fields, this is actually something completely different: we are
> > referencing an exact entity, not just a name or subject. As a result
> > of this tree, not only can we adjust our scoring to account for
> > related items, but we can provide right-click options like 'See all
> > files by this author' etc. in a more intelligent manner.
> right, storing these relations does not mean the replication of metadata
> (properties). With the URIs that can be found on both sides of a relation,
> you can get all metadata from the Lucene index. You only additionally store
> the relation, not the metadata that are associated with a URI.
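
To make sure I'm reading that right, here is a rough sketch of the split
(Python just for brevity, Beagle itself is C#). Everything in it is made up
for illustration: the dict stands in for the Lucene index, the set of
triples stands in for the RDF store, and the URIs and predicate names are
hypothetical. The point is only that the relation side holds URIs and
nothing else.

```python
# Stand-in for the Lucene index: URI -> properties (hypothetical data).
index = {
    "email://inbox/42": {"subject": "RDF store", "message_from": "Kevin Kubasik"},
    "contact://kevin": {"name": "Kevin Kubasik"},
    "http://example.org/page": {"title": "Example page"},
}

# Stand-in for the RDF store: only (subject URI, predicate, object URI).
# No property values are duplicated here.
relations = {
    ("email://inbox/42", "sentBy", "contact://kevin"),
    ("email://inbox/42", "linksTo", "http://example.org/page"),
}


def related(uri):
    """Yield (predicate, other_uri, metadata) for every relation touching uri.

    The metadata is looked up in the index by URI, not stored with the relation.
    """
    for subject, predicate, obj in sorted(relations):
        if subject == uri:
            yield predicate, obj, index.get(obj, {})
        elif obj == uri:
            yield predicate, subject, index.get(subject, {})
```

So calling related("contact://kevin") walks from the contact back to the
email and pulls the email's properties out of the index on the way.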

To talk a little more about the internals here, we could allow a
relationship to simply consist of two queries and a descriptor of some
sort. In the event of a one-to-one relationship, those queries would
just be Uri queries for exactly what we wanted. However, a more common
scenario would be something like one contact to anything with 'Kevin
Kubasik' in the message_from property. It's not as concrete as an
all one-to-one solution, but it might work. What I'm still iffy on is
why not just use Lucene fields and relate based on those?
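
Roughly what I have in mind, as a sketch only (Python for brevity, and
none of these names exist in Beagle; Relationship, uri_query and
property_query are hypothetical, and a "query" here is just a predicate
over a dict rather than a real Lucene query):

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Doc = Dict[str, str]
Query = Callable[[Doc], bool]


def uri_query(uri: str) -> Query:
    """Match exactly one document by its URI (the one-to-one case)."""
    return lambda doc: doc.get("uri") == uri


def property_query(name: str, value: str) -> Query:
    """Match every document whose property `name` equals `value`."""
    return lambda doc: doc.get(name) == value


@dataclass
class Relationship:
    descriptor: str  # what the relation means, e.g. "sent"
    left: Query      # selects one side of the relation
    right: Query     # selects the other side

    def resolve(self, docs: List[Doc]) -> List[Tuple[str, str]]:
        """Return (left_uri, right_uri) pairs for everything both queries hit."""
        lefts = [d["uri"] for d in docs if self.left(d)]
        rights = [d["uri"] for d in docs if self.right(d)]
        return [(l, r) for l in lefts for r in rights]


# Hypothetical sample data: one contact, three emails.
docs = [
    {"uri": "contact://kevin", "name": "Kevin Kubasik"},
    {"uri": "email://1", "message_from": "Kevin Kubasik"},
    {"uri": "email://2", "message_from": "Enrico Minack"},
    {"uri": "email://3", "message_from": "Kevin Kubasik"},
]

sent = Relationship(
    descriptor="sent",
    left=uri_query("contact://kevin"),
    right=property_query("message_from", "Kevin Kubasik"),
)
pairs = sent.resolve(docs)
```

The one-contact-to-many-emails case falls out of the right-hand query
matching on a property instead of a Uri. The weak spot is visible too:
the right side is only as exact as the string in message_from.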
> Regards,
> Enrico M.
> _______________________________________________
> Dashboard-hackers mailing list
> Dashboard-hackers gnome org

Kevin Kubasik
