Re: [Tracker] Automatic Language Detection



On Mon, 2007-03-05 at 23:42 +0000, jamie wrote:
> On Mon, 2007-03-05 at 18:19 -0500, Edward Duffy wrote:
> > Hi Guys -
> >
> > I just wrote a patch for #377891[1], could I get some of you to test
> > it?  I ran some PDFs I found with google.fr and google.it, and it
> > seems to be working correctly...but more eyes the better.
>
> great stuff but we only support utf-8 - are all those language modules
> utf-8 based?

oh, we can also support converting utf-8 to the charset supported by
the libtextcat LM (if it's not utf-8), but that's only worth doing for
a small chunk of text (like the first 4kb)

see:
http://developer.gnome.org/doc/API/2.0/glib/glib-Character-Set-Conversion.html

might need to maintain a config file listing the charset of each LM so
we can do it smartly...
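
A rough sketch of the idea, in Python just to keep it short (in tracker
itself this would go through GLib's g_convert()). The LM names, the
charset table and both function names are made up for illustration; a
real table would have to come from whatever the libtextcat LMs were
actually built with.

```python
# Hypothetical table: LM name -> charset the LM expects.
# Both the names and the values are assumptions for illustration only.
LM_CHARSETS = {
    "french": "iso-8859-1",
    "russian-koi8_r": "koi8-r",
}

SAMPLE_BYTES = 4096  # only convert the first 4kb


def utf8_prefix(data: bytes, limit: int = SAMPLE_BYTES) -> str:
    """Decode at most `limit` bytes of utf-8 text.

    Bytes that do not decode are skipped, which also covers a
    multi-byte character cut in half at the limit.
    """
    return data[:limit].decode("utf-8", errors="ignore")


def sample_for_lm(data: bytes, lm_name: str) -> bytes:
    """Return the leading sample of `data` in the charset of `lm_name`.

    LMs missing from the table are assumed to be utf-8 already.
    Characters the target charset cannot represent become '?'.
    """
    charset = LM_CHARSETS.get(lm_name, "utf-8")
    return utf8_prefix(data).encode(charset, errors="replace")
```

Only the sample handed to the language guesser gets converted; the
indexed text itself stays utf-8.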

jamie.






