On Sunday 22 July 2007 at 01:39 +0100, jamie wrote:
On Sun, 2007-07-22 at 02:12 +0200, Laurent Aguerreche wrote:

Hello,

I propose a patch to fix a bunch of things:

* Unnamed email attachments are no longer ignored
  Some inlined email attachments do not have a name, so I propose to use the name of their corresponding mail part.

* Fixed removal of saved email attachments from spam emails in Tracker's TMP dir
  Yes, email attachments from spam were not removed in /tmp/Tracker[...].

* Fixed message "saved email attachment..."
  When I call this function I never know the URI of the related email, so I no longer try to display it there. Furthermore, the URI is displayed right after in a debug output...

* Added "tracker->email_attachements_dir" variable
  This way it is not necessary to have numerous g_build_filename() calls to build the path where email attachments are stored.

* Added email_make_tmp_name_for_mail_attachment() function
  This function adds a random number at the beginning of any email attachment name to *try* to avoid name collisions in /tmp/Tracker[...].

* Added tracker_is_empty_string() function
  To avoid things like "if (strlen(string)==0) {...}"... And another thing: to test whether a slist is empty, we just have to test whether its variable is NULL, so WE MUST AVOID things like "if (g_slist_length(list)==0) {...}" because the length of a slist is retrieved by counting EACH element.

* Fixed SEGFAULT when SIGTERM is sent to Tracker while emails are being indexed
  "tracker->dir_list" was not set to NULL after a call to g_slist_free().

* Fixed heuristics to detect a text file
  We were only reading a few bytes at the beginning of a file... So some binary files that store identifying characters at their start were incorrectly identified as text files.

* Fixed calls to pango_get_log_attrs()
  There is a difference between "number of bytes" and "number of characters" with UTF8. So no more "pango_get_log_attrs: attrs_len should have been at least [...]" messages now.

* Unified error message formats
  I tried to correctly use tracker_error() and tracker_log()...

Can I commit?

everything except :

1) changes to text file detection - I'm not sure how you have improved it? Also I copied the code from gnome vfs which we know works so I would like to keep it as is, especially as I'm about to release

"is_utf8()" is not modified. In "is_text_file()", it was

    buffer_length = fread (buffer, 4096, 1, f);

so we read one element of 4096 bytes and buffer_length was set to one. So, afterwards, we were only reading one byte and testing whether it was a correct UTF8 character or not.

With

    buffer_length = fread (buffer, 1, 1024, f);

I read 1024 bytes and I ask to check that the characters in those 1024 bytes are in UTF8.

Question: is an empty file a text file?
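To make the fread() point concrete, here is a minimal sketch in plain C, not taken from Tracker. fread() returns the number of complete elements read, not the number of bytes, so the order of the size and count arguments changes what ends up in buffer_length. The helper names make_temp_file(), read_as_one_block() and read_as_bytes() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: creates a temporary file holding `size` bytes of 'a'
 * and rewinds it so it can be read from the start. */
static FILE *make_temp_file(size_t size)
{
    FILE *f = tmpfile();
    assert(f != NULL);
    for (size_t i = 0; i < size; i++) {
        fputc('a', f);
    }
    rewind(f);
    return f;
}

/* Old call shape: ask for ONE element of 4096 bytes.
 * The result is the number of complete elements, so it is 0 or 1. */
static size_t read_as_one_block(FILE *f)
{
    char buffer[4096];
    return fread(buffer, 4096, 1, f);
}

/* New call shape: ask for up to 1024 elements of ONE byte each.
 * The result is the number of bytes actually read. */
static size_t read_as_bytes(FILE *f)
{
    char buffer[1024];
    return fread(buffer, 1, 1024, f);
}
```

With a 5000-byte file, read_as_one_block() returns 1 and read_as_bytes() returns 1024. With a 10-byte file, the first returns 0 (no complete 4096-byte element was available) and the second returns 10. An empty file gives 0 in both cases, which is why the question about empty files matters for whatever check uses that length.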
2) these changes to tracker-db-sqlite.c

    @@ -2317,7 +2317,7 @@
     while (fgets (buffer, 65565, file)) {
         unsigned int buffer_length;
    -    buffer_length = strlen (buffer);
    +    buffer_length = g_utf8_strlen (buffer, -1);
         if (buffer_length < 3) {
             continue;
    @@ -2331,7 +2331,7 @@
             continue;
         }
    -    if ((strlen (value) < buffer_length)) {
    +    if ((g_utf8_strlen (value, -1) < buffer_length)) {
             g_free (value);
             continue;
         }

above is wrong - we use buffer_length for the utf8 validate call, which needs the size in bytes, not utf8 chars

can you make sure none of the changes involving buffer_length get through pls

pls commit the rest though - thx
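The bytes-versus-characters distinction behind this objection can be sketched in plain C without GLib. utf8_char_count() below is a hypothetical stand-in for g_utf8_strlen(); it assumes the input is valid, NUL-terminated UTF-8 and counts code points by skipping continuation bytes. It is only an illustration, not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for g_utf8_strlen(): counts the code points in a
 * valid, NUL-terminated UTF-8 string. Continuation bytes have the bit
 * pattern 10xxxxxx, so every byte that does NOT match it starts a new
 * character. */
static size_t utf8_char_count(const char *s)
{
    size_t count = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char) *s & 0xC0) != 0x80) {
            count++;
        }
    }
    return count;
}
```

For the string "h\xc3\xa9llo" ("héllo" encoded as UTF-8), strlen() gives 6 bytes while utf8_char_count() gives 5 characters. g_utf8_validate() takes its max_len argument in bytes, so passing it a character count would stop the validation short whenever the line contains multi-byte characters.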
Ok.
jamie