Re: [Tracker] Database access abstraction

From: Martyn Russell <martyn imendio com>
To: JÃrg Billeter <j bitron ch>
Cc: Tracker-List <tracker-list gnome org>
Subject: Re: [Tracker] Database access abstraction
Date: Mon, 10 Nov 2008 15:12:25 +0000

JÃrg Billeter wrote:

On Sat, 2008-11-01 at 01:11 +0000, Martyn Russell wrote:

I have had a look at the API and have some comments.


Thanks for the review.

These are just my thoughts after running over the API. I know your work
might consist mainly of what we already had, but this could be the
perfect opportunity to rename those functions to fit better to what they
actually do.


Yes, I have not changed the last part of most function names, but it
probably makes sense to improve the function names a bit more at the
same time.

- tracker_data_live_search_get_all_ids(), could this be just _get_ids()
to shorten it? Much of the XESAM API is unfinished I think, so I will
not comment so heavily there.


Yes, I will remove the _all.

- tracker_data_manager_{get|set}_option_int(), I am not sure what this
is used for. I suspect user options in the common database? If so, the
API should really be renamed to something a bit more obvious. I don't
really know what it does from the name right now.


These are integer options stored in the Options table in the database. I
only see the IntegrityCheck and InitialIndex options being used. Maybe
{get,set}_db_option_int() is a bit better?


Yea.

- tracker_data_query_metadata_get_all(), I would probably rename this to
_query_metadata_by_service().


Do you mean service in the sense of service type or file/resource here?
The terms are a bit mixed up in the tracker codebase. What about
tracker_data_query_service_metadata? (in the file/resource meaning)


Yea, I really hate that we have "service" to mean 2 things here.
I mean service as in "category", e.g. "Files, "Video", "Music", etc.

- tracker_data_query_metadata_get(), this is a bit ambiguous to me and
the paramter suggests you need ID and key, but key seems to be field
name. I keep having to look at the source to see what these functions
actually do. I know this probably isn't your fault, but it would be good
to be a bit more concise with the naming if possible.


Maybe tracker_data_query_service_metadata_by_field?


Yea, that's much better.

- tracker_data_query_get_all_metadata(), is that "only_embedded" used?
Not sure what it does. Also, the "id" should really be "service_id" so
we know what it is. Variable names like "id" and "key" really don't tell
me what I should pass.


Yes, only_embedded is used to filter the results. If only_embedded is
TRUE, it won't return values of fields where tracker_field_get_embedded
returns FALSE. However, it still retrieves all metadata values from the
database, as far as I can tell. Parameter renaming certainly makes
sense.


What does the tracker_field_get_embedded() actually mean though?

I would have suggested something more pertinent but I couldn't see what
it really meant in the first place.

- tracker_data_query_service_get_by_entity(), what this really means
(after looking at the source) is,
tracker_data_query_service_by_service_id(). Again, as it is now I don't
really know what it means by "entity".


Or maybe better tracker_data_query_service_type_by_service_id? As it
returns the name of the service type, not of a single file/resource.


Yea, this should be clearer when we rename service to a more meaningful
name.

- tracker_data_query_check_service(), I would probably call that
tracker_data_query_file_exists(). We have this concept of "service"
meaning a file and I really dislike that, but I can understand why we
have a generic word to describe data. For this API, it is specific to
files so I think it would be fine to use "file" instead of "service".


Can't this function also be used e.g. with emails?


Hmm, I think so. It can also be used for files which are virtual which
don't actually exist on disk. At least that was the original
implementation AFAIK.

Then maybe call the function tracker_data_query_service_exists?


Yea, that would work better I feel.

I also dislike the confusion between service, service type, and file,
it's really used inconsistently. I'd just consistently use RDF
terminology to make it easier to understand, but I don't know how other
people would feel about a terminology change, Jamie? In RDF terminology,
we'd have resources (services/entities), classes (service types), and
properties (metadata fields).


Yes please! :)

- tracker_data_query_get_service_type(), maybe
tracker_data_query_file_get_service_type()?


In some way tracker_data_query_file_type_id might make sense but it
might also be confusing as it doesn't contain "service_type" in the
name, so your proposal is probably better.

Something like GFile in GIO would be very nice, in my opinion. This
would hide the path/uri/id details. Then we could have a set of
functions named tracker_resource_query_... that all take a


That is quite an undertaking but I would fully endorse that. Definitely
using a GFile would make things easier for future migration to a URI API
instead of Path based one.

TrackerResource as first parameter. Then add two functions
tracker_resource_new_for_path and tracker_resource_new_for_uri and don't
pass raw paths around anymore. This should also make it easier to switch
to URI support at one day. Are you agreeing with that proposal?


Yea that sounds superb.

- tracker_data_search_get_unique_values*(), I would possible remove the
_get_ here. Those API calls are quite long.


Not sure about that one as it doesn't really search unique values, it
just performs a search according to the specified condition and then
returns a list of unique values.


Ah I see. Then the name you have is better placed. I was just concerned
with the name space consistency mainly here.

- tracker_data_search_get_sum() and tracker_data_search_get_count(),
what is the difference?


get_sum returns the sum for a field (for example, track length) over a
set of services while get_count just returns the number of matches, as
far as I understand it.


OK. I am just wondering if we should try to improve the _get_sum() API
to be more obvious.

- tracker_data_search_get_metadata_for_files_in_folder(), again, I would
change folder and uri to path. I would remove the _get_ too. Perhaps
call it tracker_data_search_metadata_in_path()?


in_path sounds good, not sure about the get, see above.

OK.

- tracker_data_update_create_event(), if we have this, won't we need a
_delete_event() too?


Maybe, I have to admit that I have no idea how this works ;)


This is mostly for live searching I think. I don't think it is really
finished yet though pending the XESAM work.

-- 
Regards,
Martyn

Follow-Ups:
- Re: [Tracker] Database access abstraction
  - From: =?ISO-8859-1?Q?J=FCrg?= Billeter

References:
- Re: [Tracker] Database access abstraction
  - From: Martyn Russell
- Re: [Tracker] Database access abstraction
  - From: =?ISO-8859-1?Q?J=FCrg?= Billeter

[Date Prev][Date Next] [Thread Prev][Thread Next] [Thread Index] [Date Index] [Author Index]