Re: [Tracker] html_filter generate a extra file
- From: Laurent Aguerreche <laurent aguerreche free fr>
- To: Jamie McCracken <jamiemcc blueyonder co uk>
- Cc: tracker-list gnome org
- Subject: Re: [Tracker] html_filter generate a extra file
- Date: Sun, 12 Nov 2006 22:59:07 +0100
On Sunday 12 November 2006 at 21:42 +0000, Jamie McCracken wrote:
Laurent Aguerreche wrote:
On Sunday 12 November 2006 at 18:33 +0000, Jamie McCracken wrote:
Carles Briansà wrote:
The html_filter generates an extra file that contains the tags of the
original file.
Example:
I save a file named debian.html in my home directory.
trackerd then detects it and adds it correctly.
Later, the directory that holds debian.html also contains a debian_t.txt
file with the tags of debian.html.
I see that the file is generated by the htmless program.
okay I will investigate (htmless is not my code)
I wonder if we should replace htmless with a text-mode web browser
like links or w3m. Htmless does not seem to know &eacute; (which
should be replaced by 'é'), for instance, and does not seem to handle UTF-8.
With links, accented characters like 'é' become "'e", but I saw it
work correctly with UTF-8 files and not with ISO-8859-1 files.
With w3m the extracted text is in UTF-8 and there are no leftovers like
&eacute;, &agrave;, etc.
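To make the entity problem above concrete, here is a minimal sketch in Python using only the standard library's html.parser. It is not the code of htmless, links, w3m, or Tracker's html_filter; the names TextExtractor and html_to_text are hypothetical and only illustrate the behaviour being asked for: entities such as &eacute; come out as real characters, and no tags are left in the output.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collect text, skipping <script>/<style> bodies."""

    def __init__(self):
        # convert_charrefs=True makes the parser decode entities such as
        # &eacute; into characters before handle_data() is called.
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only text that is outside <script> and <style>.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the pieces and collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(markup):
    """Return the plain text of an HTML string (assumed already decoded)."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    print(html_to_text("<p>caf&eacute; &agrave; Paris</p>"))
```

The function takes an already decoded string, so an ISO-8859-1 file would first have to be decoded with the right charset (for example raw_bytes.decode("iso-8859-1")) before being passed in.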
feel free to submit a patch - we should support utf8 so w3m looks best
I sent a new version of the html_filter executable in my previous
mail. ;-)
But if you meant that I should commit it to CVS, I cannot do that
at the moment, since I have not received any answer to my GNOME CVS
account request.
Laurent.