On Thursday 16 November 2006 at 22:53 +0100, Laurent Aguerreche wrote:
> On Thursday 16 November 2006 at 22:46 +0100, Luca Ferretti wrote:
>> On Thu, 16/11/2006 at 21.36 +0100, Laurent Aguerreche wrote:
>>> On Thursday 16 November 2006 at 18:55 +0000, Jamie McCracken wrote:
>>>> Luca Ferretti wrote:
>>>>> I suspect that the RTF format is currently not managed by tracker.
>>>>> We should manage it, 'cause it's the only format supported by all
>>>>> Word Processors. Read note [4] about metadata and non-ASCII
>>>>> characters.
>>>>
>>>> package unrtf in Debian/Ubuntu universe might help with this - it
>>>> has a command line to convert to plain text - anyone wanna write a
>>>> filter for this?
>>>
>>> hum,
>>>
>>> $ unrtf --text pooooo.rtf
>>> This is UnRTF, version 0.19.2
>>> By Dave Davey and Marcos Serrou do Amaral
>>> Original Author: Zach T. Smith
>>> Processing pooooo.rtf...
>>> ### Translation from RTF performed by UnRTF, version 0.19.2
>>> ### For information about this marvellous program,
>>> ### please go to http://www.gnu.org/software/unrtf/unrtf.html
>>> ### document uses ANSI character set
>>> ### font table contains 4 fonts total
>>> modello, ,schema,
>>> AUTHOR: Luca Ferretti
>>> ### creaton date: 16 November 2006 15:29
>>> ### revision date: 1 January 1601
>>> ### last printed: 1 January 1601
>>> ### comments: StarWriter
>>> -----------------
>>> Questo ?? un semplice esempio delle potenzialit?? di OO.o
>>         ^^ it was "è"                           ^^ it was à
>>
>> A question: what is the encoding of this string? UTF8, ISO-something,
>> Win-something, etc.?
>
> Now, I can see this with libGSF and an RTF file:
>
> Doc.Comment="Questo file altro non \303\250 che un esempio di modello di file per OO.o Writer per testare l'indicizzazione di Tracker";

It is really strange... I also have the same problem with some of my DOC
files, but I do not know whether it affects all DOC files. I do not see
any way to resolve the problem except to contact the authors of libgsf.

I tried "wv", using its wvSummary command to print the data that we are
looking for, but wv has the same problem, and it also uses libgsf...

Nevertheless, I am sending a patch that makes tracker-extract print
non-empty metadata (yeah!) and manage memory better.

Laurent.
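
On the encoding question raised in the thread: if the \303\250 in the Doc.Comment value above is read as C-style octal escapes (an assumption about how the value was printed), it stands for the bytes 0xC3 0xA8, which is the UTF-8 encoding of "è". The sketch below is only an illustration of that reading; the helper name decode_octal_escapes is made up for this note and is not part of tracker, libgsf or unrtf.

```python
import re


def decode_octal_escapes(text: str, encoding: str = "utf-8") -> str:
    """Replace C-style \\ooo octal escapes with raw bytes, then decode them.

    Hypothetical helper for illustration; assumes the rest of the input is
    plain ASCII, as in the Doc.Comment sample from this thread.
    """
    raw = re.sub(
        rb"\\([0-3][0-7]{2})",                  # \000 .. \377, one byte each
        lambda m: bytes([int(m.group(1), 8)]),  # octal digits -> single byte
        text.encode("latin-1"),
    )
    return raw.decode(encoding)


comment = r"Questo file altro non \303\250 che un esempio"

# Reading the two bytes as UTF-8 gives back the accented letter.
print(decode_octal_escapes(comment))

# Reading the same bytes as ISO-8859-1 gives two characters instead of one.
print(decode_octal_escapes(comment, "latin-1"))
```

Read this way, the string coming out of libgsf would already be valid UTF-8, and a two-character result would point to the bytes being interpreted as a single-byte encoding somewhere further along.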
Attachment:
better-using-of-libgsf.diff.gz
Description: GNU Zip compressed data