Re: [Tracker] Database access abstraction



On Sat, 2008-11-01 at 01:11 +0000, Martyn Russell wrote:
I have had a look at the API and have some comments.

Thanks for the review.

These are just my thoughts after running over the API. I know your work
might consist mainly of what we already had, but this could be the
perfect opportunity to rename those functions to fit better to what they
actually do.

Yes, I have not changed the last part of most function names, but it
probably makes sense to improve the function names a bit more at the
same time.

- tracker_data_live_search_get_all_ids(), could this be just _get_ids()
to shorten it? Much of the XESAM API is unfinished I think, so I will
not comment so heavily there.

Yes, I will remove the _all.

- tracker_data_manager_{get|set}_option_int(), I am not sure what this
is used for. I suspect user options in the common database? If so, the
API should really be renamed to something a bit more obvious. I don't
really know what it does from the name right now.

These are integer options stored in the Options table in the database. I
only see the IntegrityCheck and InitialIndex options being used. Maybe
{get,set}_db_option_int() is a bit better?

- tracker_data_query_metadata_get_all(), I would probably rename this to
_query_metadata_by_service().

Do you mean service in the sense of service type or file/resource here?
The terms are a bit mixed up in the tracker codebase. What about
tracker_data_query_service_metadata? (in the file/resource meaning)

- tracker_data_query_metadata_get(), this is a bit ambiguous to me and
the paramter suggests you need ID and key, but key seems to be field
name. I keep having to look at the source to see what these functions
actually do. I know this probably isn't your fault, but it would be good
to be a bit more concise with the naming if possible.

Maybe tracker_data_query_service_metadata_by_field?

- tracker_data_query_metadata_get_array(), ironically, this returns a
result set, and the _get_all() returns an array. I think in all these
functions "key" or "keys" should be "field_name" or "fields".

The array in the name is my fault, it's used to implement the
Metadata.GetArray D-Bus method which is clearly misleading here. I'll
check what name might make more sense here.

- tracker_data_query_get_all_metadata(), is that "only_embedded" used?
Not sure what it does. Also, the "id" should really be "service_id" so
we know what it is. Variable names like "id" and "key" really don't tell
me what I should pass.

Yes, only_embedded is used to filter the results. If only_embedded is
TRUE, it won't return values of fields where tracker_field_get_embedded
returns FALSE. However, it still retrieves all metadata values from the
database, as far as I can tell. Parameter renaming certainly makes
sense.

- tracker_data_query_service_get_by_entity(), what this really means
(after looking at the source) is,
tracker_data_query_service_by_service_id(). Again, as it is now I don't
really know what it means by "entity".

Or maybe better tracker_data_query_service_type_by_service_id? As it
returns the name of the service type, not of a single file/resource.

- tracker_data_query_file_get_id(), in all cases, "uri" should be
changed to "path". Right now we don't support URI properly. We support
paths. These two were interchanged a lot in the old code and we have
tried to reduce that as much as possible.

Yes, also noticed this, makes sense to change everything to path if uri
is not supported atm.

- tracker_data_query_check_service(), I would probably call that
tracker_data_query_file_exists(). We have this concept of "service"
meaning a file and I really dislike that, but I can understand why we
have a generic word to describe data. For this API, it is specific to
files so I think it would be fine to use "file" instead of "service".

Can't this function also be used e.g. with emails? Then maybe call the
function tracker_data_query_service_exists?

I also dislike the confusion between service, service type, and file,
it's really used inconsistently. I'd just consistently use RDF
terminology to make it easier to understand, but I don't know how other
people would feel about a terminology change, Jamie? In RDF terminology,
we'd have resources (services/entities), classes (service types), and
properties (metadata fields).

- tracker_data_query_get_service_type(), maybe
tracker_data_query_file_get_service_type()?

In some way tracker_data_query_file_type_id might make sense but it
might also be confusing as it doesn't contain "service_type" in the
name, so your proposal is probably better.

Something like GFile in GIO would be very nice, in my opinion. This
would hide the path/uri/id details. Then we could have a set of
functions named tracker_resource_query_... that all take a
TrackerResource as first parameter. Then add two functions
tracker_resource_new_for_path and tracker_resource_new_for_uri and don't
pass raw paths around anymore. This should also make it easier to switch
to URI support at one day. Are you agreeing with that proposal?

- tracker_data_schema_create_array_of_services(), I would use
_create_services_array() just because it is shorter.

Yes, makes sense.

- tracker_data_schema_metadata_get_table(), at first I thought it meant
hash table :) - maybe it should be _db_table()?

Yes, I'll change that.

- tracker_data_schema_xesam_get_all_text_metadata_names(), perhaps
remove the _all_, it is quite a long function name.

Yes, all probably doesn't really help here.

- tracker_data_search_files_get(), variable names like "folder_uri" I
would change to "path" just to be consistent.

Sure.

- tracker_data_search_get_unique_values*(), I would possible remove the
_get_ here. Those API calls are quite long.

Not sure about that one as it doesn't really search unique values, it
just performs a search according to the specified condition and then
returns a list of unique values.

- tracker_data_search_get_sum() and tracker_data_search_get_count(),
what is the difference?

get_sum returns the sum for a field (for example, track length) over a
set of services while get_count just returns the number of matches, as
far as I understand it.

- tracker_data_search_get_metadata_for_files_in_folder(), again, I would
change folder and uri to path. I would remove the _get_ too. Perhaps
call it tracker_data_search_metadata_in_path()?

in_path sounds good, not sure about the get, see above.

- tracker_data_update_delete_all_metadata(), something that is really
missing right now is
tracker_data_update_delete_all_metadata_recursively(). So this is the
same as the tracker_data_update_delete_service_recursively() but works
for metadata. When we remove a MMC right now, we only delete the
directory metadata, not every child underneath it. This is of course
very bad, but it just hasn't been implemented yet.

Yes, makes sense but I think we should do this after the API renaming.

- tracker_data_update_set_text(), I wonder if we should use "content"
instead of "text" for these next APIs? It sounds more obvious to me.

Yes, I'm all for consistent naming and we use content in some places, if
I remember correctly.

- tracker_data_update_create_event(), if we have this, won't we need a
_delete_event() too?

Maybe, I have to admit that I have no idea how this works ;)

JÃrg




[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]