Re: [Tracker] Syntax of searchings
- From: Eyal Oren <eyal oren deri org>
- To: tracker-list gnome org
- Subject: Re: [Tracker] Syntax of searchings
- Date: Wed, 15 Nov 2006 21:25:25 +0000
On 11/15/06 22:02 +0100, Laurent Aguerreche wrote:
I have begun to search for algorithms and I found:
* N-grams
http://en.wikipedia.org/wiki/N-gram
* levenshtein
http://www.php.net/manual/en/function.levenshtein.php
* similar text
http://www.php.net/manual/en/function.similar-text.php
* soundex
http://www.php.net/manual/en/function.soundex.php
soundex allows you to find terms that *sound* similar to an indexed term, so
that might actually solve the French/Swedish/Danish transliteration
problem.
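
To make the idea concrete, here is a minimal sketch of the classic American
Soundex coding in Python. It is only an illustration, not anything Tracker
does: the function name soundex and the accent-folding step (stripping
combining marks via unicodedata) are assumptions of this sketch. Soundex was
designed for English surnames, so how well it handles other languages would
still need to be checked.

```python
import unicodedata

# Consonant groups of classic American Soundex; vowels, y, h, w get no code.
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Return the 4-character Soundex code of word ('' if it has no letters)."""
    # Assumption of this sketch: fold accents (e.g. "ü" -> "u") and drop
    # anything that is not an ASCII letter afterwards.
    folded = unicodedata.normalize("NFKD", word).lower()
    letters = [c for c in folded if c.isascii() and c.isalpha()]
    if not letters:
        return ""

    result = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
        if len(result) == 4:
            break
    return result.ljust(4, "0")


if __name__ == "__main__":
    for name in ("Robert", "Rupert", "Müller", "Muller"):
        print(name, soundex(name))
```

Terms could then be indexed under their Soundex code as well as their
literal spelling, so that a query for "Muller" also finds "Müller".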
I'll ask a computational linguist colleague tomorrow, maybe he has some
ideas.
I do see one problem, namely that in one context (programming code) people
seem to prefer exact matches, without stemming or similarity-matching,
while in other contexts (words in text, file names) people do want stemming
and some form of similarity search regarding the orthography (spelling).
There is probably no single solution that fits both uses, but a search
based on similarity might be acceptable for source code as well.
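
One way to picture the two modes is a lookup that can switch between exact
matching and similarity matching. The sketch below uses Levenshtein (edit)
distance for the similarity case; the names levenshtein and search, the
exact flag, and the max_distance threshold are all hypothetical and only
illustrate the trade-off.

```python
def levenshtein(a, b):
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        cur = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or keep
        prev = cur
    return prev[-1]


def search(query, terms, exact=False, max_distance=1):
    """Return the terms matching query, either exactly or within max_distance."""
    if exact:
        # Source-code style: case-sensitive, no similarity matching.
        return [t for t in terms if t == query]
    q = query.lower()
    return [t for t in terms if levenshtein(q, t.lower()) <= max_distance]


if __name__ == "__main__":
    print(search("colour", ["color", "colour", "collar"]))
    print(search("strcpy", ["strcpy", "strncpy"], exact=True))
```

With the similarity mode, a query for "strcpy" would also return "strncpy",
which is the kind of result people searching source code may not want.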
-eyal