On Wednesday, 30 August 2006 at 17:01 +0100, Jamie McCracken wrote:
> Laurent Aguerreche wrote:
>> On Tuesday, 29 August 2006 at 12:39 +0100, Jamie McCracken wrote:
>>> Laurent Aguerreche wrote:
>>>> They correctly appear with
>>>> 'g_async_queue_try_pop (tracker->file_process_queue)' during
>>>> processing but are not indexed since:
>>>> - info->mtime == info->indextime == 0...
>>>> - action is TRACKER_ACTION_FILE_CHECK so index_file() is not called.
>>>
>>> this is likely a problem in the stat call - it won't find the file
>>> with a non-utf8 filename in those cases
>>>
>>> The solution would be to use g_filename_from_utf8
>>> (http://developer.gnome.org/doc/API/2.0/glib/glib-Character-Set-Conversion.html)
>>> to convert the utf8 filename back to the user's encoding before
>>> calling stat and before calling the metadata extraction stuff.
>>>
>>> it's tricky because we must avoid passing any non-utf8 string to any
>>> other glib function
>>>
>>> I don't have time at the moment to fix these as I am at work so
>>> patches welcome.
>>
>> I have almost finished.
>
> great!
>
>> Basically all system calls (lstat, stat, g_file_test(G_FILE_EXISTS),
>> etc.) in trackerd need to use strings in the locale encoding, not in
>> UTF-8.
>
> yes - we used g_filename_to_utf8 in tracker so you must use
> g_filename_from_utf8 to do the reverse
>
>> For clients, any string sent must be encoded in utf8. It seems there
>> are still some issues with tracker-extract but they should go quickly.
>
> tracker-extract uses glib and libextractor also uses glib so I don't
> know whether it needs changing - please experiment for what's best
>
>> But sometimes I wonder where to declare variables or why many lines
>> are left empty! Do you want variables declared as close as possible to
>> their first use, or at the beginning of the block only? It would be
>> interesting to define some coding style rules and follow them.
>
> if they are used throughout the function then they should obviously be
> declared at the top. If they are local to a small code block then I
> don't mind.
>
> lines are left empty for neatness
>
> coding rules are to follow what's there already (lower case functions
> and vars, 8 space tab indent, "{" on same line as if/while block etc)
>
> thanks for investing your time in this area!
This is the patch.

* many g_filename_to_utf8() or g_locale_to_utf8() calls added;
* setlocale(LC_ALL, "") added to get a working g_get_charset() in any
  program;
* some g_utf8_validate() calls removed where g_locale_to_utf8() or
  g_filename_to_utf8() needs to be called;
* some code refactoring in clients;
* some headers removed or added in clients;
* realpath(path,NULL) instead of realpath(path,tmp) in clients, because
  some systems place no limit on directory depth. NULL handling is a GNU
  extension.
* in tracker-extract.c, there were lines like:
      EXTRACTOR_removeEmptyKeywords (keywords);
  but this function modifies the keyword list and returns a new one! So
  it needs to be used this way:
      keywords = EXTRACTOR_removeEmptyKeywords (keywords);
* stat() => g_stat(), lstat() => g_lstat()
* some tests removed:
      char **foo;
      for (foo = bar; *foo; foo++) {
              if (*foo) { /* it is always true... */
                      ...
              }
      }

Now Tracker seems to work fully on non-UTF-8 systems, but the FAM
backend has some problems:
- let it index all your files and extract metadata;
- then do something like:
      $ echo 'foo' > bar
- fam_callback() is called but I can only see plenty of lines:
      tracker_exec_sql failed: Table 'tmpfiles' already exists [Call GetPendingFiles()]
      tracker_exec_sql failed: Table 'tmpfiles' already exists [Call GetPendingFiles()]
      tracker_exec_sql failed: Table 'tmpfiles' already exists [Call GetPendingFiles()]
  etc.
  and it never stops...

It is amazing to see that this bug doesn't happen if I add breakpoints
into fam_callback() and then run it in gdb...

Laurent.
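
To illustrate the realpath(path,NULL) item in the list above, here is a small sketch. It is not taken from the patch; the helper name canonical_path is hypothetical. Passing NULL was a GNU extension at the time of writing and was later standardized in POSIX.1-2008.

```c
#define _XOPEN_SOURCE 700	/* expose realpath() in <stdlib.h> */
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: return a newly allocated canonical path, or NULL
 * on error (errno is set by realpath). The caller must free() it. */
char *
canonical_path (const char *path)
{
	assert (path != NULL);

	/* With NULL as the second argument, realpath() allocates a buffer
	 * of whatever size is needed, so no fixed-size array is required. */
	return realpath (path, NULL);
}
```

Compared with realpath(path,tmp), the caller no longer has to guess a buffer size, but does become responsible for calling free() on the result.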
Attachment:
correct-locale-support-08-31.diff.tar.gz
Description: application/compressed-tar