Re: [Tracker] Text extraction on text formats



On Friday, 17 November 2006 at 14:37 +0100, Luca Ferretti wrote:
On Thursday, 16/11/2006 at 21:36 +0100, Laurent Aguerreche wrote:
On Thursday, 16 November 2006 at 18:55 +0000, Jamie McCracken wrote:
Luca Ferretti wrote:
I'm trying to check, and possibly expand, the info at
http://live.gnome.org/Tracker/SupportedFormats.


OTT (OpenDocument Text Template)
  content:              no (????)

Now yes.


Laurent, maybe a similar addition is needed for the other OO.o 1 & 2
*template MIME types?

So here is a patch to add them. I did not add more support for Star Division
files because I simply cannot create a file of that type!
I removed the calls to "nice" because the children of a process inherit its
priority, so they are already at 19.
I saw that the MS Word filter uses wvText. According to the wvWare site
( http://wvware.sourceforge.net/ ), AbiWord is now preferred to this
tool. But I wonder whether we could use libGSF directly to extract the
content of Word files. If I remember correctly, wv just uses libGSF.
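
On the "nice" point above, here is a small sketch (in Python, not
Tracker's own code) that shows a child process starting with the niceness
of its parent. The names niceness_pair, HELPER and REPORT are made up for
this illustration. It assumes a POSIX system.

```python
import os
import subprocess
import sys

# Grandchild: only reports its own niceness.
# os.nice(0) adds 0 to the current value and returns the result.
REPORT = "import os; print(os.nice(0))"

# Helper: raises its own niceness once, then spawns the grandchild
# directly, without wrapping it in "nice".
HELPER = """
import os, subprocess, sys
own = os.nice(int(sys.argv[1]))
child = subprocess.run([sys.executable, "-c", sys.argv[2]],
                       capture_output=True, text=True, check=True)
print(own, child.stdout.strip())
"""


def niceness_pair(increment):
    """Return (helper niceness, grandchild niceness) as two integers."""
    result = subprocess.run(
        [sys.executable, "-c", HELPER, str(increment), REPORT],
        capture_output=True, text=True, check=True,
    )
    own, child = result.stdout.split()
    return int(own), int(child)


if __name__ == "__main__":
    own, child = niceness_pair(5)
    print(f"helper niceness: {own}, child niceness: {child}")
```

Both numbers should be equal, which is why an extra "nice" in front of
each filter command adds nothing when the parent is already at 19.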

I also propose a patch to:
* extract text content only
into /tmp/Tracker-user.pid/tmp_text_file_XXXXXX, so that everything now
happens in /tmp/Tracker-user.pid, which should ensure the privacy of files,
* avoid creating a useless hierarchy like /home/user
in /tmp/Tracker-user.pid to store the SQLite cache.
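
The idea behind the temporary-file change can be sketched as follows (in
Python, not the patch itself). The function names are hypothetical, and
reading "Tracker-user.pid" as "Tracker-<user>.<pid>" is my assumption. The
directory is restricted to its owner, and each text file gets a unique
name created atomically, in the spirit of mkstemp with a
tmp_text_file_XXXXXX template.

```python
import os
import stat
import tempfile


def make_private_dir(base, user, pid):
    """Create base/Tracker-<user>.<pid> readable by its owner only."""
    path = os.path.join(base, f"Tracker-{user}.{pid}")
    os.makedirs(path, mode=0o700, exist_ok=True)
    # Enforce the mode even if the directory already existed.
    # This fails with PermissionError if someone else owns it.
    os.chmod(path, 0o700)
    return path


def write_text_file(private_dir, text):
    """Write extracted text to a uniquely named file and return its path."""
    # mkstemp creates the file atomically with mode 0600.
    fd, path = tempfile.mkstemp(prefix="tmp_text_file_", dir=private_dir)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    return path


if __name__ == "__main__":
    base = tempfile.mkdtemp()  # stand-in for /tmp in this demo
    private_dir = make_private_dir(base, "user", os.getpid())
    path = write_text_file(private_dir, "extracted text\n")
    print(path, oct(stat.S_IMODE(os.stat(path).st_mode)))
```

Because the directory itself is mode 0700, other users cannot even list
the extracted files, whatever the permissions on the files inside.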


Laurent.

Attachment: more-oasis-filters+fixes.diff.gz
Description: GNU Zip compressed data

Attachment: fix-using-of-tmp-directory.diff.gz
Description: GNU Zip compressed data

Attachment: signature.asc
Description: This is a digitally signed message part


