Re: [Tracker] Database access abstraction



JÃrg Billeter wrote:
On Mon, 2008-10-20 at 20:07 +0200, JÃrg Billeter wrote:
as a preparation to support decomposed database tables in Tracker, I've
been looking into how we could abstract database access so that the
database schema can change without affecting the code in trackerd and
tracker-indexer.

I'm proposing to introduce an additional (private) library that acts as
a high-level database interface, so it sits between libtracker-db and
trackerd/tracker-indexer. That library, let's call it libtracker-data
for now, is the only place where SQL queries get constructed.

As a first step, we should probably just move relevant functions or
function parts from trackerd and tracker-indexer to the new library and
refactor and extend the library later. Looking at the current code, the
API of libtracker-data would be composed of the following parts:

I've put a test version of libtracker-data online at
      http://www.gnome.org/~juergbi/tracker.git/

The new and moved functions are in src/libtracker-data/tracker-data-*
Any comments or suggestions?

Hi JÃrg,

I have had a look at the API and have some comments.

- tracker_data_live_search_get_all_ids(), could this be just _get_ids()
to shorten it? Much of the XESAM API is unfinished I think, so I will
not comment so heavily there.

- tracker_data_manager_{get|set}_option_int(), I am not sure what this
is used for. I suspect user options in the common database? If so, the
API should really be renamed to something a bit more obvious. I don't
really know what it does from the name right now.

- tracker_data_query_metadata_get_all(), I would probably rename this to
_query_metadata_by_service().

- tracker_data_query_metadata_get(), this is a bit ambiguous to me and
the paramter suggests you need ID and key, but key seems to be field
name. I keep having to look at the source to see what these functions
actually do. I know this probably isn't your fault, but it would be good
to be a bit more concise with the naming if possible.

- tracker_data_query_metadata_get_array(), ironically, this returns a
result set, and the _get_all() returns an array. I think in all these
functions "key" or "keys" should be "field_name" or "fields".

- tracker_data_query_get_all_metadata(), is that "only_embedded" used?
Not sure what it does. Also, the "id" should really be "service_id" so
we know what it is. Variable names like "id" and "key" really don't tell
me what I should pass.

- tracker_data_query_service_get_by_entity(), what this really means
(after looking at the source) is,
tracker_data_query_service_by_service_id(). Again, as it is now I don't
really know what it means by "entity".

- tracker_data_query_file_get_id(), in all cases, "uri" should be
changed to "path". Right now we don't support URI properly. We support
paths. These two were interchanged a lot in the old code and we have
tried to reduce that as much as possible.

- tracker_data_query_check_service(), I would probably call that
tracker_data_query_file_exists(). We have this concept of "service"
meaning a file and I really dislike that, but I can understand why we
have a generic word to describe data. For this API, it is specific to
files so I think it would be fine to use "file" instead of "service".

- tracker_data_query_get_service_type(), maybe
tracker_data_query_file_get_service_type()?

- tracker_data_schema_create_array_of_services(), I would use
_create_services_array() just because it is shorter.

- tracker_data_schema_metadata_get_table(), at first I thought it meant
hash table :) - maybe it should be _db_table()?

- tracker_data_schema_xesam_get_all_text_metadata_names(), perhaps
remove the _all_, it is quite a long function name.

- tracker_data_search_files_get(), variable names like "folder_uri" I
would change to "path" just to be consistent.

- tracker_data_search_get_unique_values*(), I would possible remove the
_get_ here. Those API calls are quite long.

- tracker_data_search_get_sum() and tracker_data_search_get_count(),
what is the difference?

- tracker_data_search_get_metadata_for_files_in_folder(), again, I would
change folder and uri to path. I would remove the _get_ too. Perhaps
call it tracker_data_search_metadata_in_path()?

- tracker_data_update_delete_all_metadata(), something that is really
missing right now is
tracker_data_update_delete_all_metadata_recursively(). So this is the
same as the tracker_data_update_delete_service_recursively() but works
for metadata. When we remove a MMC right now, we only delete the
directory metadata, not every child underneath it. This is of course
very bad, but it just hasn't been implemented yet.

- tracker_data_update_set_text(), I wonder if we should use "content"
instead of "text" for these next APIs? It sounds more obvious to me.

- tracker_data_update_create_event(), if we have this, won't we need a
_delete_event() too?

These are just my thoughts after running over the API. I know your work
might consist mainly of what we already had, but this could be the
perfect opportunity to rename those functions to fit better to what they
actually do.

Great start JÃrg! Thanks for doing this.

-- 
Regards,
Martyn



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]