[Tracker] Cleaning up entries in the Resource table



Hello everyone,

I reverted the commit that cleans up the Resource table.

https://git.gnome.org/browse/tracker/commit/?id=8d1855c132da504b408576e0655ca50946ba9633
https://git.gnome.org/browse/tracker/commit/?id=e4a0f7be0c138a733b74413ff7a11a9431c7fe08

The Resource table is used to store the URI of all resources. Whenever a
Resource gets deleted, we right now keep it in that table nonetheless
because it might still be referred to by another resource. We don't do
garbage collection of references when you delete a resource (yet).

For example if you have:

<a> a thing .
<b> a thing .
<a> property <b> .

And you delete <b>, then <a> property <b> is not gone, and <b> simply is
no longer a thing, but only a ghost resource.

The commit I reverted rightfully wanted to clean up those ghosts.
However, it didn't check for remaining <a> property <b> references.

We want to fix this.

An idea Jürg had on IRC GimpNET was to use SQLite's Foreign key support:
https://www.sqlite.org/foreignkeys.html . Proposed is to use RESTRICT
and to add a FOREIGN KEY to all ID columns. Then to try to delete from
the Resource table where RESTRICT will make it fail if any foreign key
still exists.


For this to work we need to first test the performance impact of doing
so generally: are inserts slower because of the foreign key? Are deletes
slower?

Then we need to test how to delete. Will the entire transaction of a
mass delete fail? Will we delete one by one (a lot of transactions)?
Will we try a delete on Resource table whenever we delete a <a> property
<b> reference? What is the best?

Then we need to add code that goes over all the tables and adds the
FOREIGN KEY. This can't be done using ALTER TABLE.

http://www.sqlite.org/omitted.html

Complete ALTER TABLE support:

                Only the RENAME TABLE and ADD COLUMN variants of the ALTER TABLE
command are supported. Other kinds of ALTER TABLE operations such as
DROP COLUMN, ALTER COLUMN, __ADD CONSTRAINT__, and so forth are omitted.

So we would have to leverage existing code in the ontology change coping
which I wrote a few years ago (oh the horror); that takes a table and
the renames it, then creates a new table, and then copies the renamed
table into the new table, then deletes the renamed table. Thank you
SQLite's limited ALTER-TABLE support :).

Then we add the periodic DELETE on the Resource table and/or the DELETE
on the Resource table whenever we delete a <a> property <b> reference.

I think we can do this incrementally as a team. So if somebody is
looking for some spare time to burn on Tracker development, pick one of
the tasks and go ahead. Please report on this ML your findings of any
experimentation with foreign key support of SQLite.

Kind regards,

Philip

Attachment: signature.asc
Description: This is a digitally signed message part



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]