Re: Raster's typing - was Re: gtkstep info



On Tue, Sep 22, 1998 at 01:00:48AM +0200, bert hubert wrote:
> Sounds like The Right Thing to me. Can we internationalize it?

Internationalization is tricky in any situation that involves a
non-trivial amount of natural-language processing (NLP); this may or
may not be one of those situations.  To do even a passable job of
typo correction, you need some knowledge of grammar.  For
example, if the user types

	"Ths is a test"

the typo corrector will come up with a number of near-miss variants of
"ths" which are valid words, notably "the" and "this".  For the
typo corrector to know that "the" is a poor replacement, it has to
realize that the word "the" is rarely followed by a verb (here, "is").
Simply choosing words based on their usage frequency fails badly in
this case, as "the" is a more frequently used word than "this".  There
are many similar cases.
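
To make the "ths" example concrete, here is a minimal sketch in Python.
It is not the notyop algorithm; it only illustrates why ranking by
frequency alone picks the wrong word.  The word and word-pair counts
are invented toy numbers, the function names are hypothetical, and the
"grammar knowledge" is approximated by counts of adjacent word pairs
rather than by any real part-of-speech analysis.

```python
import string

# Toy counts, invented for illustration only (not taken from notyop).
WORD_FREQ = {"the": 500, "this": 100, "is": 300, "a": 400, "test": 20}

# Toy counts of (word, next_word) pairs, also invented.  They stand in
# for grammar knowledge: "the is" never occurs, "this is" often does.
PAIR_FREQ = {
    ("this", "is"): 50,
    ("is", "a"): 80,
    ("a", "test"): 15,
    ("the", "test"): 10,
}


def edits1(word):
    """Return every string one deletion, transposition, substitution,
    or insertion away from word."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in letters]
    inserts = [l + c + r for l, r in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def candidates(word):
    """Known words within one edit of word, in alphabetical order."""
    return sorted(w for w in edits1(word) if w in WORD_FREQ)


def best_by_frequency(word):
    """Pick the candidate with the highest standalone frequency."""
    return max(candidates(word), key=lambda w: WORD_FREQ[w])


def best_by_context(word, next_word):
    """Pick the candidate seen most often before next_word, falling
    back to standalone frequency to break ties."""
    return max(
        candidates(word),
        key=lambda w: (PAIR_FREQ.get((w, next_word), 0), WORD_FREQ[w]),
    )


print(candidates("ths"))
print(best_by_frequency("ths"))
print(best_by_context("ths", "is"))
```

With these toy numbers, both "the" and "this" are candidates for "ths";
the frequency-only ranking prefers "the", while the ranking that looks
at the following word prefers "this".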


The algorithm which I use to handle these situations is (poorly)
documented in notyop/doc/algorithm.txt.  I'll try to get some better
documentation in there as soon as I catch up on my homework.

Certainly, all of the code is language-independent, and I would be
more than happy to work with anyone interested in building the
language-specific datafiles that the corrector would need to get the
job done.

Cheers,
-Nat


