Re: [Tracker] Fwd: tracker 1.11.2



On Sun, 2016-12-11 at 17:07 +0100, Carlos Garnacho wrote:

[cut]

- Getting as close to supporting the full sparql 1.1 spec as possible
in libtracker-data:
  * property paths: last weekend got halfway with this \o/
  * graph management: for DROP GRAPH I think triggers will perform

Did anything ever happen with cleaning up anonymous nodes of deleted
subjects/contexts, and/or doing reference counting on them (and clearing
them once they reach zero references)?

If not then we are still leaking those in the db afaik. We always wanted
to do something about that.

Yeah, we still do leak those... I remember you/Jürg suggested setting
a foreign key with ON DELETE RESTRICT action from the various tables'
ID rows to the Resource table, so that cleaning up the Resource table
would fail for the still referenced nodes. I got as far as seeing
that:

- Performance would be just fine for the common ops: the trigger would
only run when trying to delete the parent key in the Resource table,
which is a once-in-a-while operation. Modifications on the cols
setting the foreign key would be just as fast as they are now.

nod
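
For illustration, the foreign key idea above can be sketched with plain
SQLite from Python. This is only a minimal sketch: the table layout
(Resource, Example_Class), the column names and both helper functions are
hypothetical and do not reflect the actual libtracker-data schema.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database with a RESTRICT foreign key."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE Resource (ID INTEGER PRIMARY KEY, Uri TEXT UNIQUE);
        -- Hypothetical class table whose ID column points at Resource.
        CREATE TABLE Example_Class (
            ID INTEGER PRIMARY KEY
               REFERENCES Resource(ID) ON DELETE RESTRICT
        );
        INSERT INTO Resource VALUES (1, 'urn:example:referenced');
        INSERT INTO Resource VALUES (2, 'urn:example:orphan');
        INSERT INTO Example_Class VALUES (1);
    """)
    return conn


def try_delete_resource(conn, resource_id):
    """Return True if the Resource row could be deleted, False if the
    RESTRICT action blocked it because the row is still referenced."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DELETE FROM Resource WHERE ID = ?", (resource_id,))
        return True
    except sqlite3.IntegrityError:
        return False
```

With this layout, deleting the row with ID 1 fails because Example_Class
still references it, while the unreferenced row with ID 2 is removed. The
constraint check only happens on deletes from Resource, which matches the
performance observation above.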

- It wouldn't, however, fix on its own the other issue I saw happening
before the revert (graph URNs being deleted). I had a patch around that added
a Graph table, so IDs in the Resource table were ensured to be in
either rdfs:Resource or the Graph table. That already helped with
identifying and not deleting the graph URNs during garbage collection,
and seems useful for graph management, but I think I can't just add
the same RESTRICT action as CLEAR/DROP GRAPH will want pretty much the
opposite.

Personally, I think some sort of reference counting will be needed for
anonymous nodes referenced by different graphs.
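
As a rough sketch of what such reference counting could look like
(in-memory only, not tied to the database; the class and method names are
made up for illustration), each graph keeps its own counts and a node is
only reported as collectable once no graph references it any more:

```python
from collections import Counter, defaultdict


class AnonNodeRefCounter:
    """Hypothetical per-graph reference counter for anonymous nodes."""

    def __init__(self):
        self._per_graph = defaultdict(Counter)  # graph -> node -> refs
        self._total = Counter()                 # node -> refs in all graphs

    def ref(self, graph, node):
        """Record one more reference to node from within graph."""
        self._per_graph[graph][node] += 1
        self._total[node] += 1

    def drop_graph(self, graph):
        """Forget every reference held by graph and return the sorted
        list of nodes whose total reference count reached zero."""
        collected = []
        for node, count in self._per_graph.pop(graph, Counter()).items():
            self._total[node] -= count
            if self._total[node] == 0:
                del self._total[node]
                collected.append(node)
        return sorted(collected)
```

Dropping a graph here only releases that graph's references, so a node
shared with another graph survives until the last graph referencing it is
dropped as well.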

- I also wondered if it's more desirable, or allowed by the sparql
spec, that we actually garbage collect the inconsistent nodes. IMO not
leaving this type of data coherence up to miners/apps being educated
when deleting would be a win, but I've only seen mentions in the
sparql spec about impls being free to drop empty graphs, nothing about
triples with no longer bound elements.

I also don't think it's a problem to rid ourselves of orphaned anonymous
nodes.

Without a graph to be owned by, they can't be referenced other than by
their uuid anyway.

[cut]

- Double checking ontology migration code, ensure it can handle weird
ontology changes more or less elegantly.

You will have a lot, a lot of fun with that code :-)

Already visited it briefly :P.

:-)

Kinda replying here to your other email: there will indeed be
situations of precision or data loss. As long as 1) what is supported
and what is not is properly documented, and 2) we do our best to ensure
the resulting database represents the current ontology, or 3) error out
and roll back the ongoing changes, I think we should be fine.
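
The "error out and roll back" option can be sketched as follows, assuming
the schema changes are plain SQL statements run against SQLite inside one
explicit transaction. The function names and the example table are
hypothetical; the real ontology migration code is far more involved.

```python
import sqlite3


def apply_migration(conn, statements):
    """Run all statements in one transaction; undo everything on error.

    conn must be opened with isolation_level=None so that the explicit
    BEGIN/COMMIT/ROLLBACK below are the only transaction control.
    """
    conn.execute("BEGIN")
    try:
        for sql in statements:
            conn.execute(sql)
        conn.execute("COMMIT")
        return True
    except sqlite3.Error:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        return False


def column_names(conn, table):
    """List the column names of table (name is trusted; sketch only)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
```

Since SQLite treats schema changes as transactional, a failure in a later
step also undoes an earlier ALTER TABLE, leaving the database as it was
before the migration started.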

I guess we could write the old ontology in TTL format alongside the
unconverted data in a TTL file, to allow the user to process it manually
later.

That would allow a distribution, device maker and/or application
developer to provide data conversion tooling.

I think this will be mainly useful for apps using private
databases/ontologies. If they are in control of both the ontology
and the data, they can also look for ways to preserve or re-extract
what's interesting while changing their ontology.

Yes, indeed.



- Library-fying tracker-store, and separating ontology for good, so
eg. an irc client wanting to store conversation logs privately can eg.
do:

Yes! :) Want!

Cool :), I think this will be a win for apps considering their data
precious; not all data is equally disposable. I already feel shivers
each time I have to recommend tracker reset -r ...


Exactly.


Kind regards,

Philip
