Re: [Tracker] Proposal to improve tracker-miner-fs "up-to-date" check performance
- From: "Chen, Zhenqiang" <zhenqiang chen intel com>
- To: 'Carlos Garnacho' <carlos lanedo com>, "'tracker-list gnome org'" <tracker-list gnome org>
- Subject: Re: [Tracker] Proposal to improve tracker-miner-fs "up-to-date" check performance
- Date: Tue, 30 Mar 2010 09:46:51 +0800
Carlos Garnacho wrote:
> As Philip said, we should take into account memory usage as well, and
> keeping a hashtable for each known item is not going to be nice...
> TrackerCrawler guarantees that any directory will be processed after
> its parent folder, and all the items in a directory will be processed
> together, so we very probably can do this on a per-folder basis.
Agreed. Combining Philip's suggestion with yours, I would prefer the following logic:
(1) get the total count of items with SPARQL's COUNT.
if count > 1000
do per-folder basis query with OFFSET and LIMIT
else
get all items once.
On most systems, such as netbooks or handsets, there are not many items, so a single query should usually be enough.
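The logic above could be sketched roughly as follows. This is only an illustration: the helper names, the page size, and the exact query text (in particular the use of nfo:belongsToContainer to select a folder's children) are assumptions of mine, not code from tracker-miner-fs.

```c
#include <assert.h>
#include <stdio.h>

/* Threshold from the "count > 1000" rule above. */
#define ITEM_THRESHOLD 1000
/* Hypothetical page size for the per-folder queries. */
#define PAGE_SIZE 500

/* Query for step (1); the class used here is an assumption. */
static const char *count_query =
	"SELECT COUNT(?f) WHERE { ?f a nie:DataObject }";

/* Returns 1 when the store holds enough items that they should be
 * fetched folder by folder, 0 when one query for everything is fine. */
static int
use_per_folder_queries (long total_count)
{
	return total_count > ITEM_THRESHOLD;
}

/* Writes the query for one page of a folder's children into buf.
 * Returns the number of characters written, or -1 if buf is too small.
 * The predicates in the query are assumptions, not verified against
 * the Tracker ontology. */
static int
build_folder_page_query (char *buf, size_t len,
                         const char *folder_uri, long offset)
{
	int n;

	assert (buf != NULL && folder_uri != NULL && offset >= 0);

	n = snprintf (buf, len,
	              "SELECT ?url nfo:fileLastModified(?f) "
	              "WHERE { ?f nfo:belongsToContainer <%s> ; nie:url ?url } "
	              "OFFSET %ld LIMIT %d",
	              folder_uri, offset, PAGE_SIZE);

	return (n < 0 || (size_t) n >= len) ? -1 : n;
}
```

The miner would run count_query once at startup, then call build_folder_page_query() with offset 0, PAGE_SIZE, 2 * PAGE_SIZE and so on for each folder until a page comes back with fewer than PAGE_SIZE rows.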
>> (3) There is another issue in current implementation:
>> url for "Directory" files have form like "urn:software-category" not
>> "file:///" (see "miner_applications_process_file_cb" in
>> tracker-miner-applications.c). So we should change the uri format
>> before searching in hash table.
> I suggest you to have a look at nie:url, which is meant to have
> application readable URIs.
If nie:url is meant to hold application-readable URIs, then there may be a bug in the current
implementation.
Here are the code and dbus logs:
static gboolean
miner_applications_process_file_cb (gpointer user_data)
{
	sparql = data->sparql;
	if (name && g_ascii_strcasecmp (type, "Directory") == 0) {
		gchar *canonical_uri = tracker_uri_printf_escaped (SOFTWARE_CATEGORY_URN_PREFIX "%s", path);
		uri = canonical_uri;
	} else if (name && g_ascii_strcasecmp (type, "Application") == 0) {
		uri = g_file_get_uri (data->file);
	} else if (name && g_str_has_suffix (type, "Applet")) {
		/* The URI of the InformationElement should be a UUID URN */
		uri = g_file_get_uri (data->file);
	}
	if (sparql && uri) {
		/* The URL of the DataObject */
		tracker_sparql_builder_predicate (sparql, "nie:url");
		tracker_sparql_builder_object_string (sparql, uri);
	}
}
dbus log:
method call sender=:1.41 -> dest=org.freedesktop.Tracker1 serial=148
path=/org/freedesktop/Tracker1/Resources; interface=org.freedesktop.Tracker1.Resources;
member=BatchSparqlUpdate
string "DROP GRAPH <file:///usr/share/desktop-directories/Utility.directory> INSERT INTO
<urn:software-category:%2Fusr%2Fshare%2Fdesktop-directories%2FUtility.directory> {
<urn:software-category:%2Fusr%2Fshare%2Fdesktop-directories%2FUtility.directory> a nfo:SoftwareCategory .
<urn:theme-icon:applications-accessories> a nfo:Image .
<urn:software-category:%2Fusr%2Fshare%2Fdesktop-directories%2FUtility.directory> nfo:softwareCategoryIcon
<urn:theme-icon:applications-accessories> ;
a nfo:FileDataObject , nie:DataObject ;
nie:title "Accessories" .
<urn:software-category:%2Fusr%2Fshare%2Fdesktop-directories%2FUtility.directory> nie:dataSource
<urn:nepomuk:datasource:84f20000-1241-11de-8c30-0800200c9a66> ;
nfo:fileName "Utility.directory" .
<file:///usr/share/desktop-directories/Utility.directory> a nfo:FileDataObject , nie:DataObject ;
nie:url "urn:software-category:%2Fusr%2Fshare%2Fdesktop-directories%2FUtility.directory" .
<urn:software-category:%2Fusr%2Fshare%2Fdesktop-directories%2FUtility.directory> nie:isStoredAs
<file:///usr/share/desktop-directories/Utility.directory> ;
nfo:fileLastModified "2010-03-15T03:18:49Z" .
}
From the code, you can see that the uri for a Directory entry is canonical_uri, not a file:/// URI. In the dbus log, it shows up as
nie:url "urn:software-category:%2Fusr%2Fshare%2Fdesktop-directories%2FUtility.directory" .
The query "SELECT nie:url(?f) WHERE {?f a nie:DataObject}" returns two entries: one is an empty
string and the other is urn:software-category:%2Fusr%2Fshare%2Fdesktop-directories%2FUtility.directory
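To show how the file path ends up as the escaped URN seen in the log, here is a small standalone sketch. It does not use tracker_uri_printf_escaped itself; make_category_urn is a hypothetical helper, the prefix value is taken from the dbus log, and the escaping rule (keep only letters, digits and "-._~") is my assumption based on what the log shows.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Prefix as it appears in the dbus log. */
#define URN_PREFIX "urn:software-category:"

/* Writes URN_PREFIX followed by the percent-escaped path into out.
 * Every byte except letters, digits and "-._~" is escaped as %XX.
 * Returns 0 on success, -1 if out is too small. */
static int
make_category_urn (const char *path, char *out, size_t len)
{
	static const char hex[] = "0123456789ABCDEF";
	size_t pos = strlen (URN_PREFIX);
	const unsigned char *p;

	assert (path != NULL && out != NULL);

	if (len <= pos)
		return -1;
	memcpy (out, URN_PREFIX, pos);

	for (p = (const unsigned char *) path; *p != '\0'; p++) {
		if (isalnum (*p) || strchr ("-._~", *p) != NULL) {
			/* Need room for this byte plus the final NUL. */
			if (pos + 1 >= len)
				return -1;
			out[pos++] = (char) *p;
		} else {
			/* Need room for "%XX" plus the final NUL. */
			if (pos + 3 >= len)
				return -1;
			out[pos++] = '%';
			out[pos++] = hex[*p >> 4];
			out[pos++] = hex[*p & 0x0F];
		}
	}
	out[pos] = '\0';

	return 0;
}
```

Calling make_category_urn() with "/usr/share/desktop-directories/Utility.directory" gives the same string that is stored in nie:url in the log, which is why a lookup keyed on file:/// URIs will not find it.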
Please have a try.
Thanks!
-Zhenqiang