Re: [Tracker] Automatic Language Detection



On Wed, 2007-03-07 at 11:45 -0500, Edward Duffy wrote:
> I think I got it -- new patch in bugzilla.

That's not quite what I had in mind:

1) I would prefer it in-process if it's stable

2) We only feed it the first 1 or 2 KB of text, for speed

3) We already convert and validate everything as UTF-8, so there is no
need to do any more there

4) It's the language models that are not all UTF-8 compliant - we need to
convert *from* UTF-8 to whatever charset the LM requires
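
Roughly, points 2 and 4 together could look like the sketch below. It is
written in Python purely for illustration and is not the patch itself.
The function names, the 2048-byte default and the choice to replace
unconvertible characters with '?' are all assumptions made for the
example.

```python
def utf8_prefix(data: bytes, limit: int = 2048) -> bytes:
    """Return at most `limit` bytes of valid UTF-8 without splitting a character."""
    if len(data) <= limit:
        return data
    end = limit
    # data[end] is the first byte we would drop. If it is a continuation
    # byte (0b10xxxxxx), the cut falls mid-character, so back up to the
    # start byte of that character.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


def sample_for_model(data: bytes, model_charset: str, limit: int = 2048) -> bytes:
    """Convert a short UTF-8 sample *from* UTF-8 to the charset a model expects.

    `model_charset` is a hypothetical per-model setting, e.g. "iso-8859-1".
    """
    text = utf8_prefix(data, limit).decode("utf-8")
    # Assumption: characters the target charset cannot represent become '?'
    # rather than aborting the detection.
    return text.encode(model_charset, errors="replace")
```

For a model stored as ISO-8859-1 you would call
sample_for_model(buf, "iso-8859-1") and hand the result to the detector.
Whether to replace or drop unconvertible characters is a judgment call.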

If anything is unclear, let me know.

jamie.








