Re: [Tracker] Database access abstraction
- From: Martyn Russell <martyn imendio com>
- To: jamie mccrack gmail com
- Cc: Tracker-List <tracker-list gnome org>
- Subject: Re: [Tracker] Database access abstraction
- Date: Tue, 21 Oct 2008 09:36:07 +0100
Jamie McCracken wrote:
On Mon, 2008-10-20 at 20:07 +0200, Jürg Billeter wrote:
Hi all,
hi juerg,
Hi :)
as a preparation to support decomposed database tables in Tracker, I've
been looking into how we could abstract database access so that the
database schema can change without affecting the code in trackerd and
tracker-indexer.
It's also likely to affect the stored procs quite heavily, and more so
than the code.
Not sure how you think it will affect stored procedures here?
From what I understand Juerg is talking about taking out all the SQL
construction and putting it in a common place. The SQL itself is
unlikely to change here (unless I misunderstood).
I don't think a full abstraction is necessary, as saving/updating data is
only done in a few places and this code should be shared more.
I think it is totally necessary. It has been on my TODO list for a long
time. Doing all that SQL construction in a common place is a superb idea
because then we know exactly where it is all done. Right now we have two
files (one in trackerd and one in the indexer) sure, but I like the idea
of calling an API from the indexer and/or the daemon which gets the
information and I know that if I want to change the way it works it is
all done completely outside of the trackerd and indexer implementations.
It also means we minimise any duplication of code/bugs/etc.
I'm proposing to introduce an additional (private) library that acts as
a high-level database interface, so it sits between libtracker-db and
trackerd/tracker-indexer. That library, let's call it libtracker-data
for now, is the only place where SQL queries get constructed.
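
As a rough illustration of SQL construction living in one place, here is
a minimal C sketch. Everything in it is an assumption made for the
example: the function names, the `Services` table, and the `ID`/`Path`
columns are hypothetical, not the actual Tracker schema or API. Real code
would more likely use bound parameters (for example `sqlite3_bind_text`)
than string escaping.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: none of these names are the real libtracker-data API. */

/* Escape single quotes for an SQL string literal by doubling them.
 * Returns 0 on success, -1 if dst is too small. */
static int
example_sql_escape (const char *src, char *dst, size_t dst_size)
{
    size_t j = 0;

    if (dst_size == 0)
        return -1;

    for (size_t i = 0; src[i] != '\0'; i++) {
        size_t needed = (src[i] == '\'') ? 2 : 1;

        /* Keep one byte free for the terminating NUL. */
        if (j + needed >= dst_size)
            return -1;
        if (src[i] == '\'')
            dst[j++] = '\'';
        dst[j++] = src[i];
    }
    dst[j] = '\0';
    return 0;
}

/* Data layer: the single place where this query text is built.
 * Table and column names are assumptions for illustration only.
 * Returns 0 on success, -1 if a buffer is too small. */
int
example_data_build_file_id_query (const char *path, char *sql, size_t sql_size)
{
    char escaped[512];
    int n;

    assert (path != NULL && sql != NULL);

    if (example_sql_escape (path, escaped, sizeof escaped) != 0)
        return -1;

    n = snprintf (sql, sql_size,
                  "SELECT ID FROM Services WHERE Path = '%s';", escaped);
    return (n < 0 || (size_t) n >= sql_size) ? -1 : 0;
}
```

A caller in trackerd or tracker-indexer would pass only the path and a
buffer; if the schema changes, only this function needs to change.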
well I have been asking for a libtracker-metadata to host shared
metadata support between trackerd and tracker-indexer so to some degree
you will have my approval :)
Yea, it is all on the list - just need to do it. :)
Also, in the future I want to support direct access to SQLite via a
client lib so we can bypass dbus (and trackerd) for select queries where
speed is paramount and volume of data is too big for dbus to handle
optimally (think get all my 100,000 music tracks with metadata). So this
library would have to handle all querying and any future ones (like
sparql) - so you will have no problem from me for implementing that
support in a lib
Hmm, I would like to see the difference it makes using DBus and if it
really is an issue. We have an API like this in DBus now which Phillip
added - I really don't like the idea of people executing random SQL on
the databases. It can lead to much bigger problems. Phillip stresses
this in the .xml file where we document this API. I think quite rightly
so too.
I'm not sure why you want to abstract *all* DB access. I would have
thought indexer specific requests can quite nicely remain where they are
unless you have a good reason?
Why not?
Architecturally, it then becomes clear where all database abstraction
lives (libtracker-db) and where all SQL construction and stored
procedure calls are kept (libtracker-data).
It also means there is less duplication, fewer bugs, it is much more
maintainable, etc, etc. As far as I can see, there are no disadvantages
here, only advantages.
my preference is for sharing more routines rather than abstracting them
As a first step, we should probably just move relevant functions or
function parts from trackerd and tracker-indexer to the new library and
refactor and extend the library later. Looking at the current code, the
API of libtracker-data would be composed of the following parts:
* Ontology/Schema API
These functions don't query the actual data or metadata but only the
ontology/schema and its mapping to the database layout. They are
currently part of trackerd/tracker-db.c
tracker_db_metadata_get_related_names
tracker_db_metadata_get_table
tracker_db_get_field_name
tracker_db_get_metadata_field
tracker_db_create_array_of_services
tracker_db_xesam_get_metadata_names
tracker_db_xesam_get_all_text_metadata_names
tracker_db_xesam_get_service_names
* Service/Metadata Query API
These functions query information about a specific service/resource,
for example, path to id mapping and metadata retrieval. They are
currently part of trackerd/tracker-db.c and
tracker-indexer/tracker-indexer-db.c
tracker_db_metadata_get
tracker_db_metadata_get_all
tracker_db_metadata_get_array
tracker_db_metadata_get_delimited
tracker_db_get_all_metadata
tracker_db_get_parsed_metadata
tracker_db_get_unparsed_metadata
tracker_db_get_property_values
tracker_db_check_service
tracker_db_get_service_type
tracker_db_service_get_by_entity
tracker_db_file_get_id
tracker_db_file_get_id_as_string
* Search and General Query API
These functions perform arbitrary queries on the whole database. They
are currently part of tracker-db.c, tracker-metadata.c,
tracker-search.c, tracker-keywords.c, and tracker-files.c in trackerd
tracker_db_search_text
tracker_db_search_text_and_mime
tracker_db_search_text_and_location
tracker_db_search_text_and_mime_and_location
tracker_db_live_search_start
tracker_db_live_search_stop
tracker_db_live_search_get_all_ids
tracker_db_live_search_get_new_ids
tracker_db_live_search_get_deleted_ids
tracker_db_live_search_get_hit_data
tracker_db_live_search_get_hit_count
tracker_db_keywords_get_list
tracker_db_files_get
tracker_db_files_get_by_service
tracker_db_files_get_by_mime
tracker_db_create_event
tracker_db_xesam_delete_handled_events
tracker_data_get_unique_values
tracker_data_get_sum
tracker_data_get_count
tracker_data_get_unique_values_with_count
tracker_data_get_unique_values_with_count_and_sum
tracker_data_get_metadata_for_files_in_folder
tracker_data_keywords_search
tracker_data_search_query
The tracker_data_* functions represent the database access parts of the
D-Bus service methods. The actual D-Bus method implementations stay
where they are, of course.
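
The split between a D-Bus method body and its database access part could
be sketched roughly as follows. This is only an illustration under stated
assumptions: all names, the callback standing in for libtracker-db, and
the `Services`/`ServiceTypeID` schema are hypothetical and do not reflect
the real signatures of tracker_data_get_count or any other function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the low-level database layer: runs SQL, returns one integer.
 * Hypothetical; not the real libtracker-db interface. */
typedef int (*ExampleDbExecFunc) (const char *sql, void *user_data);

/* Data layer (hypothetical): owns the SQL text for a count query. */
int
example_data_get_count (ExampleDbExecFunc exec, void *user_data,
                        int service_type_id)
{
    char sql[128];
    int n;

    assert (exec != NULL);

    n = snprintf (sql, sizeof sql,
                  "SELECT COUNT(*) FROM Services WHERE ServiceTypeID = %d;",
                  service_type_id);
    if (n < 0 || (size_t) n >= sizeof sql)
        return -1;

    return exec (sql, user_data);
}

/* Service layer (hypothetical D-Bus method body): validates its input
 * and delegates to the data layer. It contains no SQL. */
int
example_dbus_method_get_count (ExampleDbExecFunc exec, void *user_data,
                               int service_type_id)
{
    if (service_type_id < 0)
        return -1;

    return example_data_get_count (exec, user_data, service_type_id);
}

/* Fake backend for demonstration: records the SQL it was given in a
 * caller-supplied buffer of at least 128 bytes and returns a fixed count. */
int
example_fake_exec (const char *sql, void *user_data)
{
    char *seen = user_data;

    strncpy (seen, sql, 127);
    seen[127] = '\0';
    return 42;
}
```

With this shape the method body never sees query text, so a schema
change only touches the data-layer function.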
* Update API
These functions are used to modify data and metadata; they are only
executed by the indexer and currently reside in
tracker-indexer/tracker-indexer-db.c
tracker_db_get_new_service_id
tracker_db_create_service
tracker_db_delete_service
tracker_db_delete_service_recursively
tracker_db_move_service
tracker_db_increment_stats
tracker_db_decrement_stats
tracker_db_set_metadata
tracker_db_delete_all_metadata
tracker_db_delete_metadata
tracker_db_set_text
tracker_db_get_text
tracker_db_delete_text
We also need to move trackerd/tracker-query-tree.c,
trackerd/tracker-rdf-query.c, trackerd/tracker-xesam-query, and
tracker-indexer/tracker-metadata.c to the new library as they generate
SQL queries or are used by other functions in libtracker-data.
Yea, right now tracker-query-tree.c is mostly for QDBM I think and the
_get_hit_count() API. Not sure how this will change with the new SQLite FTS.
I also think tracker-xesam-query.c and tracker-rdf-query.c are very
similar (if that's the XESAM file which does RDF construction). I can't
remember if we resolved the duplication here with tracker-rdf-query.c or
not. That's another TODO item I think :)
Any comments or suggestions about this, does the grouping seem sensible?
Please note that I don't know Tracker's code base very well yet, so I
might be missing or misunderstanding some things.
Looks good to me. You might want to break down the searching a little
more into live and non-live - but I guess you will see if that is needed
as you get on.
Good analysis Juerg!
--
Regards,
Martyn