Re: [Tracker] [Patch] Correct locales support



On Wednesday 30 August 2006 at 17:01 +0100, Jamie McCracken wrote:
Laurent Aguerreche wrote:
On Tuesday 29 August 2006 at 12:39 +0100, Jamie McCracken wrote:
Laurent Aguerreche wrote:

They correctly appear with 'g_async_queue_try_pop
(tracker->file_process_queue)' during processing but are not indexed because:
- info->mtime == info->indextime == 0...
- action is TRACKER_ACTION_FILE_CHECK
so index_file() is not called.

this is likely a problem in the stat call - it won't find a file with a
non-UTF-8 filename in those cases

The solution would be to use g_filename_from_utf8
(http://developer.gnome.org/doc/API/2.0/glib/glib-Character-Set-Conversion.html)
to convert the UTF-8 filename back to the user's encoding before calling
stat and before calling the metadata extraction code.

it's tricky because we must avoid passing any non-UTF-8 string to any
other glib function

I don't have time at the moment to fix these as I am at work, so patches
are welcome.

I almost finished.


great!

Basically, all system and file-test calls in trackerd (lstat, stat,
g_file_test(G_FILE_TEST_EXISTS), etc.) need strings in the locale
encoding, not in UTF-8.

yes - we used g_filename_to_utf8  in tracker so you must use 
g_filename_from_utf8  to do the reverse


For clients, any string they send must be encoded in UTF-8.

It seems there are still some issues with tracker-extract but they should
be resolved quickly.

tracker-extract uses glib and libextractor also uses glib so I don't know
whether it needs changing - please experiment to see what works best



But sometimes I wonder where to declare variables or why many lines are
left empty!
Do you want variables declared as close as possible to their first use,
or only at the beginning of a block?
It would be interesting to define some coding style rules and follow
them.

if they are used throughout the function then they should obviously be
declared at the top. If they are local to a small code block then I don't
mind

lines are left empty for neatness

coding rules are to follow what's there already (lower case functions and
vars, 8 space tab indent, "{" on same line as if/while block etc)
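
As an illustration only, the rules listed above might look like this in
practice. The function count_non_empty is a hypothetical example and not
part of Tracker; anything beyond the three stated rules (lower case names,
tab indent, "{" on the same line as if/for) is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: count the non-empty strings in a
 * NULL-terminated array. */
static int count_non_empty (char **strv)
{
	char **s;	/* used throughout the function, so declared at the top */
	int count = 0;

	assert (strv != NULL);

	for (s = strv; *s; s++) {
		if ((*s)[0] != '\0') {
			count++;
		}
	}

	return count;
}
```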

thanks for investing your time in this area!



This is the patch.

* many g_filename_to_utf8() or g_locale_to_utf8() added;
* setlocale(LC_ALL, "") added to get a working g_get_charset() in any
program;
* some g_utf8_validate() removed when g_locale_to_utf8() or
g_filename_to_utf8() need to be called;
* some code refactoring in clients;
* some headers removed or added in clients...;
* realpath(path,NULL) instead of realpath(path,tmp) in clients, because
some systems place no limit on directory depth, so a fixed-size buffer
may be too small. NULL handling is a GNU extension;
* in tracker-extract.c, there were lines like:
   EXTRACTOR_removeEmptyKeywords (keywords);
  but this function modifies the keywords list and returns a new one! So
  it needs to be used this way:
   keywords = EXTRACTOR_removeEmptyKeywords (keywords);
* stat() => g_stat(), lstat() => g_lstat()
* some tests removed:

char **foo;
for (foo = bar; *foo; foo++) {
   if (*foo) {   /* it is always true... */
     ...
   }
}
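
Two of the changes listed above, sketched with plain libc calls: adopting
the user's locale at startup and resolving a path without a fixed-size
buffer. The helper names init_user_locale and resolve_path are hypothetical
and do not come from the patch.

```c
#define _XOPEN_SOURCE 700	/* asks the libc headers to declare realpath() */

#include <assert.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: adopt the locale named by the environment.
 * Returns 1 on success, 0 if that locale is not supported. */
static int init_user_locale (void)
{
	return setlocale (LC_ALL, "") != NULL;
}

/* Hypothetical helper: canonicalise a path with no fixed-size buffer.
 * Returns a malloc()ed string the caller must free(), or NULL on error. */
static char *resolve_path (const char *path)
{
	assert (path != NULL);

	/* With NULL as the second argument, realpath() allocates the result */
	return realpath (path, NULL);
}
```

A client would call init_user_locale () once at startup, then free() each
string returned by resolve_path (). The NULL form was a GNU extension when
this mail was written; POSIX.1-2008 later specified the same behaviour.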


Now Tracker seems to work fully on non-UTF-8 systems, but the FAM backend
has some problems:
- let it index all your files and extract metadata;
- then do something like:
  $ echo 'foo' > bar
- fam_callback() is called but all I see is many lines like:

tracker_exec_sql failed: Table 'tmpfiles' already exists [Call
GetPendingFiles()]
tracker_exec_sql failed: Table 'tmpfiles' already exists [Call
GetPendingFiles()]
tracker_exec_sql failed: Table 'tmpfiles' already exists [Call
GetPendingFiles()]
etc.

and it never stops... It is amazing to see that this bug doesn't happen
if I add breakpoints in fam_callback() and then run it in gdb...


Laurent.

Attachment: correct-locale-support-08-31.diff.tar.gz
Description: application/compressed-tar


