Re: [Tracker] [Patch] Do not skip files having invalid UTF8




Hello,

On Monday, 30 October 2006 at 13:58 +0100, Julien wrote:
Hello,
A filename may be invalid (from a UTF8 point of view) while the content
is perfectly valid. This happens, for example, when files are copied from
a filesystem using one encoding (FAT32) to a system whose locale is UTF8.
It leads, for example, to files named:
Vincent Delerm - 11 - L'heure du thï.mp3

instead of
Vincent Delerm - 11 - L'heure du thé.mp3


You really listen to Vincent Delerm without falling asleep?   :-D

I am only indexing the file, not listening to the music :-)






Unfortunately, Tracker skips such files. The attached patch fixes the
problem by replacing g_filename_to_utf8() with
g_filename_display_basename(), which always returns a valid UTF8
filename.

Yes, it returns valid UTF8, but still not a correct URI if the encoding
was bad:
- "/home/laurent/L'heure du thï.mp3"
gives me
- L'heure du th?.mp3


When running "tracker-search heure" on the patched version, I get the
correct full path:
/home/julien/Musique/Vincent Delerm/Vincent Delerm - 11 - L'heure du thï .mp3



The character "ï" has been replaced by "?" (and, as its name suggests,
g_filename_display_basename() returns only the filename part of a path).
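
To make the point above concrete, here is a minimal sketch in plain C of a display-only conversion. It is not GLib's implementation; it only mimics the output reported in this thread (basename only, each invalid byte shown as "?"). The names utf8_seq_len() and display_basename() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Length (1-4) of the well-formed UTF-8 sequence starting at p,
 * or 0 if the bytes at p do not form one. p must not point at NUL. */
static size_t utf8_seq_len(const unsigned char *p)
{
    unsigned int cp, min;
    size_t n;

    if (p[0] < 0x80) return 1;
    if ((p[0] & 0xE0) == 0xC0)      { cp = p[0] & 0x1F; n = 2; min = 0x80; }
    else if ((p[0] & 0xF0) == 0xE0) { cp = p[0] & 0x0F; n = 3; min = 0x800; }
    else if ((p[0] & 0xF8) == 0xF0) { cp = p[0] & 0x07; n = 4; min = 0x10000; }
    else return 0;

    for (size_t i = 1; i < n; i++) {
        /* A NUL terminator fails this check, so we never read past it. */
        if ((p[i] & 0xC0) != 0x80) return 0;
        cp = (cp << 6) | (p[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates and values above U+10FFFF. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;
    return n;
}

/* Hypothetical helper: returns a newly allocated, valid UTF-8 copy of the
 * last path component, with every invalid byte replaced by '?'.
 * The caller frees the result. */
static char *display_basename(const char *path)
{
    const char *slash, *base;
    char *out, *q;

    assert(path != NULL);
    slash = strrchr(path, '/');
    base = slash ? slash + 1 : path;

    out = malloc(strlen(base) + 1);   /* output is never longer than input */
    if (!out) return NULL;
    q = out;
    for (const unsigned char *p = (const unsigned char *)base; *p; ) {
        size_t n = utf8_seq_len(p);
        if (n == 0) { *q++ = '?'; p++; }
        else        { memcpy(q, p, n); q += n; p += n; }
    }
    *q = '\0';
    return out;
}
```

The result is always valid UTF8, but it is lossy: the directory part is dropped and the original byte cannot be recovered from "?", so the string no longer identifies the file on disk. That is why such a string is fine for display but not for building a URI.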


Tracker works with two scenarios:
- filesystem is not in locale, so any filename must be converted into
UTF8,
- FS is already in UTF8, so nothing has to be done.

The encoding of a FS is determined from the locale, not from fstab. So
if a FS uses a different encoding than your locale, you will have
problems...
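
One possible way to cope with a filename whose encoding does not match the locale is sketched below in plain C: keep the name if it is already valid UTF8, otherwise convert it from an assumed legacy encoding. This is only an illustration, not Tracker's code. ISO-8859-1 as the fallback is an assumption (a real FAT32 volume may well use another code page), and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* True if the NUL-terminated string s is well-formed UTF-8
 * (overlong forms, surrogates and values above U+10FFFF are rejected). */
static bool is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    assert(s != NULL);
    while (*p) {
        unsigned int cp, min;
        int extra;

        if (*p < 0x80) { p++; continue; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; min = 0x80; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; min = 0x800; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; min = 0x10000; }
        else return false;

        p++;
        for (int i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80) return false;  /* also stops at NUL */
            cp = (cp << 6) | (*p & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
    }
    return true;
}

/* Convert an ISO-8859-1 string to UTF-8. Every Latin-1 byte maps directly
 * to the code point of the same value, so this cannot fail on input.
 * The caller frees the result. */
static char *latin1_to_utf8(const char *s)
{
    char *out = malloc(2 * strlen(s) + 1);  /* each byte grows to at most 2 */
    char *q = out;

    if (!out) return NULL;
    for (const unsigned char *p = (const unsigned char *)s; *p; p++) {
        if (*p < 0x80) {
            *q++ = (char)*p;
        } else {
            *q++ = (char)(0xC0 | (*p >> 6));
            *q++ = (char)(0x80 | (*p & 0x3F));
        }
    }
    *q = '\0';
    return out;
}

/* Hypothetical policy: keep valid UTF-8 as is, otherwise ASSUME Latin-1. */
static char *filename_to_utf8_with_fallback(const char *name)
{
    if (is_valid_utf8(name)) {
        size_t len = strlen(name) + 1;
        char *copy = malloc(len);
        if (copy) memcpy(copy, name, len);
        return copy;
    }
    return latin1_to_utf8(name);
}
```

With this policy a name stored as the bytes "th\xE9.mp3" would come out as "thé.mp3" in UTF8. The converted string is suitable for indexing and display; the original bytes still have to be kept to open the file, since the on-disk name is unchanged.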



In my case, I copied files from a FAT32 drive to my hard drive, and the
accents were lost in the process.
This is on a default Ubuntu Edgy install. I therefore do not think it is
a configuration problem, as this may happen to many end users (think of
copying files from a FAT32 USB stick).

However I agree that my patch is far from optimal, it was just to
suggest a solution.

Best,
Julien





