Re: [Tracker] Syntax of searchings



On 11/15/06/11/06 22:02 +0100, Laurent Aguerreche wrote:

I have begun to search for algorithms and I found:

* N-grams
 http://en.wikipedia.org/wiki/N-gram
* levenshtein
 http://www.php.net/manual/en/function.levenshtein.php
* similar text
 http://www.php.net/manual/en/function.similar-text.php
* soundex
 http://www.php.net/manual/en/function.soundex.php
soundex allows you to find terms that *sound* similar to an indexed term, so that might actually solve the French/Swedish/Danish transliteration problem.
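
As a rough illustration of how Soundex groups terms that sound alike, here is a minimal Python sketch of the classic American Soundex rules. The function name `soundex` and the sample words are only illustrative; this is not Tracker code, and it assumes plain ASCII letters (accented characters are simply treated like vowels, so it says nothing definite about the transliteration case).

```python
def soundex(word: str) -> str:
    """Return a 4-character Soundex key (classic American rules), or "" for no letters."""
    # Map each coded consonant to its Soundex digit.
    codes = {}
    for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in group:
            codes[ch] = digit

    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""

    first = letters[0]
    key = [first]
    prev = codes.get(first, "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate two letters with the same code
        code = codes.get(ch, "")
        if code and code != prev:
            key.append(code)
        prev = code  # vowels (and any uncoded letter) reset the previous code

    # Pad with zeros and truncate to one letter plus three digits.
    return ("".join(key) + "000")[:4]


if __name__ == "__main__":
    for name in ("Robert", "Rupert", "Meyer", "Meier"):
        print(name, soundex(name))
```

Two terms would then be considered a match when their keys are equal, so an index could store the key next to each term and look up the key of the query.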

I'll ask a colleague who is a computational linguist tomorrow; maybe he has some ideas. I do see one problem: in one context (program source code) people seem to prefer exact matches, without stemming or similarity matching, while in other contexts (words in text, file names) people do want stemming and some form of similarity search on the orthography (spelling). There is probably no single solution that fits both uses, although a similarity-based search would probably be fine for source code as well.
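
One way to picture serving both uses with a single mechanism is an edit-distance threshold: zero means exact matching, a small positive value allows spelling variants. The Python sketch below assumes a hypothetical `search` helper and a made-up term list; it is not how Tracker works, only an illustration of the idea.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def search(query: str, terms: list[str], max_distance: int = 0) -> list[str]:
    """Hypothetical helper: keep the terms within max_distance edits of the query."""
    return [t for t in terms if levenshtein(query, t) <= max_distance]


if __name__ == "__main__":
    terms = ["color", "colour", "collar"]      # made-up index contents
    print(search("colour", terms))             # exact matching (distance 0)
    print(search("colour", terms, 1))          # tolerate one spelling difference
```

With `max_distance=0` the search behaves like the exact matching preferred for source code; raising it gives the looser matching wanted for text and file names. Comparing the query against every term is only workable for small indexes.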

-eyal


