Re: [Tracker] text search is implicitly always OR



Samuel wrote:
Also, as I said earlier, there's a difference when matching
"wallpaper" vs "wallpapers", or "pictures" and "picture", or
"accommodate" vs "accomodate". Ideally, there should be some sort of
stemming (Porter, for example) and maybe the double metaphone
algorithm to do fuzzier matching. Of course, the Porter stemming would
bring up issues when using Tracker in other languages, though I think
there are also algorithms for French and German (that I know of).


Yeah, until we have language-independent stemming, it would be tricky.
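
To illustrate why stemming rules tied to one language cause trouble, here is a rough sketch. It is a toy English plural stripper, not the Porter algorithm and not anything Tracker does; the function name toy_stem is made up for this example.

```python
def toy_stem(word):
    """Toy English-only stemmer: strip a single trailing 's'.

    Hypothetical helper for illustration; real stemmers (Porter etc.)
    apply many more rules.
    """
    word = word.lower()
    # Leave short words and words ending in "ss" (e.g. "glass") alone.
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


# English plurals collapse onto the singular, as wanted.
print(toy_stem("wallpapers") == toy_stem("wallpaper"))  # True
print(toy_stem("pictures") == toy_stem("picture"))      # True

# German "Bilder" is the plural of "Bild", but the English rule misses it...
print(toy_stem("Bilder") == toy_stem("Bild"))           # False

# ...and it wrongly chops the singular German noun "Haus".
print(toy_stem("Haus"))                                 # hau
```

The same rule that helps for English either does nothing or does damage on German words, which is why each language needs its own stemmer.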

For now, you can use the * wildcard at the end of a search term to match plurals and other related forms.
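
As a rough sketch of what a trailing wildcard amounts to (this is not Tracker's actual matching code; the function name matches and the word list are assumptions for illustration), a term ending in * can be treated as a case-insensitive prefix match:

```python
def matches(term, word):
    """Return True if word satisfies the search term.

    A trailing '*' means "starts with"; otherwise the whole word must
    match. Comparison is case-insensitive. Hypothetical helper.
    """
    term = term.lower()
    word = word.lower()
    if term.endswith("*"):
        return word.startswith(term[:-1])
    return word == term


words = ["wallpaper", "wallpapers", "picture", "pictures", "accommodate"]

# The wildcard picks up the plural as well as the singular.
print([w for w in words if matches("wallpaper*", w)])
# ['wallpaper', 'wallpapers']

# Without the wildcard only the exact word matches.
print([w for w in words if matches("picture", w)])
# ['picture']

# A prefix match does not help with misspellings.
print(matches("accomod*", "accommodate"))
# False
```

So the wildcard covers plurals and other suffixes, but not spelling variants such as "accomodate", which would need the fuzzier matching discussed above.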

I don't really like fuzzy searches; they are best left to a dedicated indexer (which will do a better, faster job).

Long term, I may provide the option to store text contents in CLucene (1), which has all those extra bells and whistles, for those with bigger machines. It would need a lot more disk space and RAM, though, because the metadata (but not the contents) would be duplicated and Lucene's indexes are larger.

(1) http://clucene.sourceforge.net/index.php/Main_Page

(This would make a nice project for any budding C++ programmer.)


--
Mr Jamie McCracken
http://jamiemcc.livejournal.com/



