Re: Raster's typing - was Re: gtkstep info
- From: Nat Friedman <ndf ALEPH1 MIT EDU>
- To: bert hubert <ahu vvtp tn tudelft nl>
- Cc: gnome-list gnome org
- Date: Mon, 21 Sep 1998 19:08:22 -0400
On Tue, Sep 22, 1998 at 01:00:48AM +0200, bert hubert wrote:
> Sounds like The Right Thing to me. Can we internationalize it?
Internationalization is tricky in any situation which involves any
non-trivial level of NLP; this may or may not be one of those
situations. You see, the fact is that to do even a passable job of
typo correction, you need to have some knowledge of grammar. For
example, if the user types
"Ths is a test"
the typo corrector will come up with a number of candidate corrections
for "ths" which are valid words, notably "the" and "this". In order for the
typo corrector to know that "the" is a poor replacement, it has to
realize that the word "the" is rarely followed by a verb. Simply
choosing words based on their usage frequency fails badly in this
case, as "the" is a more frequently used word than "this". There are
many similar cases.
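To make the "ths" example concrete, here is a minimal Python sketch. It is not the algorithm from notyop; it swaps in simple word-pair counts as a crude stand-in for grammatical knowledge. The word counts, the pair counts, and the function names (candidates, by_frequency, by_context) are all made up for illustration.

```python
import string

# Toy, made-up word counts; a real corrector would load language-specific data.
UNIGRAMS = {"the": 500, "this": 100, "is": 200, "a": 300, "test": 20}

# Hypothetical counts of (word, next_word) pairs, standing in for grammar.
BIGRAMS = {
    ("this", "is"): 50,
    ("the", "test"): 10,
    ("is", "a"): 60,
    ("a", "test"): 15,
}


def candidates(word):
    """Return the known words that are one edit away from `word`."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    replaces = [l + c + r[1:] for l, r in splits if r for c in letters]
    inserts = [l + c + r for l, r in splits for c in letters]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    edits = deletes + replaces + inserts + transposes
    return sorted({w for w in edits if w in UNIGRAMS})


def by_frequency(word):
    """Pick the most frequent candidate, ignoring context."""
    return max(candidates(word), key=lambda w: UNIGRAMS[w], default=word)


def by_context(word, next_word):
    """Prefer candidates seen before `next_word`; break ties by frequency."""
    return max(
        candidates(word),
        key=lambda w: (BIGRAMS.get((w, next_word), 0), UNIGRAMS[w]),
        default=word,
    )


if __name__ == "__main__":
    print(candidates("ths"))
    print(by_frequency("ths"))
    print(by_context("ths", "is"))
```

With these toy counts, frequency alone picks "the" for "ths", while looking at the following word "is" picks "this". Swapping in counts for another language is all it would take to reuse the same code, though real pair counts are a much weaker signal than actual grammar.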
The algorithm which I use to handle these situations is (poorly)
documented in notyop/doc/algorithm.txt. I'll try to get some better
documentation in there as soon as I catch up on my homework.
Certainly, all of the code is language-independent, and I would be
more than happy to work with anyone interested in building the
language-specific datafiles that the corrector would need to get the
job done.
Cheers,
-Nat