On Sun, 2016-12-11 at 17:07 +0100, Carlos Garnacho wrote: [cut]
- Getting as close to supporting the full sparql 1.1 spec as possible in libtracker-data:
  * property paths: last weekend got halfway with this \o/
  * graph management: for DROP GRAPH I think triggers will perform

Did something ever happen about cleaning up anonymous nodes of deleted subjects/context, and/or doing reference counting on them (and clearing them once they reach zero references)? If not, then we are still leaking those in the db afaik. We always wanted to do something about that.

Yeah, we still do leak those... I remember you/Jürg suggested setting a foreign key with an ON DELETE RESTRICT action from the various tables' ID columns to the Resource table, so that cleaning up the Resource table would fail for the still-referenced nodes. I got as far as seeing that:
- Performance would be just fine for the common ops: the trigger would only run when trying to delete the parent key in the Resource table, which is a once-in-a-while operation. Modifications on the columns setting the foreign key would be just as fast as they are now.
nod
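A minimal sketch of the ON DELETE RESTRICT idea discussed above, using Python's sqlite3 module. The `Document` table and the URNs are hypothetical stand-ins, not Tracker's actual schema; only the `Resource` table name comes from the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Resource (ID INTEGER PRIMARY KEY, Uri TEXT UNIQUE);
-- Hypothetical class table whose ID column points back at Resource.
CREATE TABLE Document (
    ID INTEGER PRIMARY KEY REFERENCES Resource(ID) ON DELETE RESTRICT,
    title TEXT
);
""")

conn.execute("INSERT INTO Resource (ID, Uri) VALUES (1, 'urn:uuid:referenced')")
conn.execute("INSERT INTO Resource (ID, Uri) VALUES (2, 'urn:uuid:orphan')")
conn.execute("INSERT INTO Document (ID, title) VALUES (1, 'still in use')")


def try_delete(resource_id):
    """Attempt to drop a Resource row; report whether the delete went through."""
    try:
        conn.execute("DELETE FROM Resource WHERE ID = ?", (resource_id,))
        return True
    except sqlite3.IntegrityError:
        # The row is still referenced, so RESTRICT rejected the delete.
        return False


deleted_referenced = try_delete(1)
deleted_orphan = try_delete(2)
remaining = [row[0] for row in conn.execute("SELECT ID FROM Resource ORDER BY ID")]
```

With this layout a blanket cleanup pass over Resource can simply attempt the delete and skip the rows that fail.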
- It wouldn't, however, fix on its own the other issue I saw happening before the revert (graph URNs being deleted). I had a patch around that added a Graph table, so IDs in the Resource table were ensured to be in either rdfs:Resource or the Graph table. That already helped with identifying and not deleting the graph URNs during garbage collection, and seems useful for graph management, but I think I can't just add the same RESTRICT action, as CLEAR/DROP GRAPH will want pretty much the opposite.
Personally, I think some sort of reference counting will be needed for anonymous nodes referenced by different graphs.
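One possible shape for such reference counting, sketched with SQLite triggers via Python's sqlite3 module. The `Statement` table, the `Refcount` column and the trigger names are assumptions made for illustration; nothing here reflects how Tracker actually stores triples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Resource (
    ID INTEGER PRIMARY KEY,
    Uri TEXT UNIQUE,
    Refcount INTEGER NOT NULL DEFAULT 0
);
-- Hypothetical flat triple table: one row per (graph, subject, object).
CREATE TABLE Statement (graph INTEGER, subject INTEGER, object INTEGER);

-- Each new reference to a node bumps its counter.
CREATE TRIGGER statement_ref AFTER INSERT ON Statement
BEGIN
    UPDATE Resource SET Refcount = Refcount + 1 WHERE ID = NEW.object;
END;

-- Each removed reference drops the counter; the node goes away at zero.
CREATE TRIGGER statement_unref AFTER DELETE ON Statement
BEGIN
    UPDATE Resource SET Refcount = Refcount - 1 WHERE ID = OLD.object;
    DELETE FROM Resource WHERE ID = OLD.object AND Refcount = 0;
END;
""")

conn.execute("INSERT INTO Resource (ID, Uri) VALUES (10, 'urn:uuid:blank-node')")
conn.execute("INSERT INTO Statement VALUES (1, 100, 10)")  # referenced from graph 1
conn.execute("INSERT INTO Statement VALUES (2, 200, 10)")  # referenced from graph 2


def refcount(resource_id):
    """Return the node's reference count, or None if the node no longer exists."""
    row = conn.execute(
        "SELECT Refcount FROM Resource WHERE ID = ?", (resource_id,)
    ).fetchone()
    return None if row is None else row[0]


conn.execute("DELETE FROM Statement WHERE graph = 1")  # e.g. dropping graph 1
after_first = refcount(10)
conn.execute("DELETE FROM Statement WHERE graph = 2")  # e.g. dropping graph 2
after_second = refcount(10)
```

The node survives the first graph being dropped and is only removed once the last graph referencing it is gone.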
- I also wondered if it's more desirable, or allowed by the sparql spec, that we actually garbage collect the inconsistent nodes. IMO not leaving this type of data coherence up to miners/apps being educated when deleting would be a win, but I've only seen mentions in the sparql spec about impls being free to drop empty graphs, nothing about triples with no longer bound elements.
I also don't think it's a problem to rid ourselves of orphaned anonymous nodes. Without a graph to be owned by, they can't be referenced other than by their uuid anyway. [cut]
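A rough sketch of what collecting orphaned nodes while sparing graph URNs could look like, again with Python's sqlite3. The `Resource` and `Graph` table names come from the thread; the `Statement` table, its columns and the sample data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Resource (ID INTEGER PRIMARY KEY, Uri TEXT UNIQUE);
CREATE TABLE Graph (ID INTEGER PRIMARY KEY);
-- Hypothetical flat triple table: one row per (graph, subject, object).
CREATE TABLE Statement (graph INTEGER, subject INTEGER, object INTEGER);

INSERT INTO Resource VALUES (1, 'urn:graph:example');
INSERT INTO Resource VALUES (2, 'urn:uuid:subject');
INSERT INTO Resource VALUES (3, 'urn:uuid:object');
INSERT INTO Resource VALUES (4, 'urn:uuid:orphan');
INSERT INTO Graph VALUES (1);
INSERT INTO Statement VALUES (1, 2, 3);
""")

# Drop every Resource row that is neither a graph URN nor mentioned
# as the subject or object of any remaining statement.
conn.execute("""
DELETE FROM Resource
WHERE ID NOT IN (SELECT ID FROM Graph)
  AND ID NOT IN (SELECT subject FROM Statement)
  AND ID NOT IN (SELECT object FROM Statement)
""")

remaining = [row[0] for row in conn.execute("SELECT ID FROM Resource ORDER BY ID")]
```

Only the unreferenced node (ID 4) is removed; the graph URN stays because it is listed in the Graph table.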
- Double checking ontology migration code, ensure it can handle weird ontology changes more or less elegantly.

You will have a lot, a lot of fun with that code :-)

Already visited it briefly :P.
:-)
Kinda replying here to your other email: there will indeed be situations of precision or data loss. As long as 1) what is supported and what is not is properly documented, 2) we do our best to ensure the resulting database represents the current ontology or 3) error out and roll back the ongoing changes, I think we should be fine.
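The "error out and roll back" option can be sketched as follows with Python's sqlite3, relying on SQLite schema changes being transactional. The `Message` table, its columns and the `migrate` helper are hypothetical and only stand in for a real ontology change.

```python
import sqlite3

# isolation_level=None gives manual transaction control, so the schema
# changes below are covered by our own BEGIN/ROLLBACK.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Message (ID INTEGER PRIMARY KEY, sent TEXT)")
conn.execute("INSERT INTO Message VALUES (1, '2016-12-11')")


def migrate(conn, steps):
    """Apply all migration steps atomically; undo everything if one fails."""
    conn.execute("BEGIN")
    try:
        for sql in steps:
            conn.execute(sql)
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        return False
    conn.execute("COMMIT")
    return True


ok = migrate(conn, [
    "ALTER TABLE Message ADD COLUMN subject TEXT",
    "ALTER TABLE Message ADD COLUMN sent TEXT",  # fails: column already exists
])
columns = [row[1] for row in conn.execute("PRAGMA table_info(Message)")]
rows = conn.execute("SELECT ID, sent FROM Message").fetchall()
```

Because the second step fails, the first one is undone as well, and the table and its data end up exactly as they were before the migration attempt.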
I guess we could write the old ontology in TTL format alongside the unconverted data in a TTL file, to allow the user to process it manually later. That would allow a distribution, device maker and/or application developer to provide data conversion tooling.
I think this will be mainly useful for apps using private databases/ontologies. If they are in control of both the ontology and the data, they can also look for ways to preserve or re-extract what's interesting while changing their ontology.
Yes, indeed.
- Library-fying tracker-store, and separating ontology for good, so eg. an irc client wanting to store conversation logs privately can eg. do:

Yes! :) Want!

Cool :), I think this will be a win for apps considering their data precious, not all data is equally disposable. I already feel shivers each time I have to recommend tracker reset -r ...
Exactly.

Kind regards,

Philip