Re: Updated LaTeX filter - support indexing of tex files inside a compressed archive



On Tuesday 20 February 2007 16:54, D Bera wrote:
>
> From what I know it is terribly hard to detect encodings, i.e.
> differentiate between an iso-* encoding and utf8 encoding. Any
> document with any iso* encoding is also a valid utf8 encoded document.
>

I have found the program chardet at http://chardet.feedparser.org/. It
is written in Python, uses statistical methods to detect the encoding of
files, and is an adaptation of the method used in Netscape browsers.
This would be very useful for Beagle, so I wonder whether Beagle will
implement this algorithm (for a description of it, see
http://www.mozilla.org/projects/intl/UniversalCharsetDetection.html),
or should I propose this to the Mono guys? I can start working on it,
though you shouldn't expect much, as I'm not a CS guy.
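
To make the detection problem concrete, here is a minimal sketch in
Python using only the standard library. It is not chardet's algorithm
and not anything Beagle currently does; the function name
guess_encoding and the choice of iso-8859-1 as the fallback are
assumptions made for illustration. It only applies a structural test:
bytes that decode strictly as UTF-8 are reported as UTF-8, and
everything else falls back to iso-8859-1, which accepts any byte
sequence.

```python
def guess_encoding(data: bytes) -> str:
    """Rough guess: 'ascii', 'utf-8', or 'iso-8859-1' as a fallback.

    Hypothetical helper for illustration only; it cannot tell one
    iso-8859-* variant from another.
    """
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        # Strict decoding rejects byte sequences that break UTF-8's
        # lead-byte / continuation-byte structure.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumed fallback: iso-8859-1 maps every byte to a character,
        # so decoding with it never fails.
        return "iso-8859-1"


word = "caf\u00e9"                         # "cafe" with an acute accent
latin1_bytes = word.encode("iso-8859-1")   # b'caf\xe9'
utf8_bytes = word.encode("utf-8")          # b'caf\xc3\xa9'

print(guess_encoding(latin1_bytes))
print(guess_encoding(utf8_bytes))
# The UTF-8 bytes also decode under iso-8859-1 without error, just to
# the wrong text, which is why a validity check alone is not enough.
print(repr(utf8_bytes.decode("iso-8859-1")))
```

A check like this cannot rank the candidate single-byte encodings
against each other, which is the part a statistical detector such as
chardet is meant to handle.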

Regards



 


