Re: Stemmed search configuration

From: "Karsten Rasmussen" <frommetoyou comxnet dk>
To: <Dashboard-hackers gnome org>
Subject: Re: Stemmed search configuration
Date: Sun, 23 Mar 2008 08:09:36 +0100

how to decide the language of the data/metadata for each document.

Probably not easy. But I guess Word/OO documents must have this information embedded; how else could they select the right spellchecker? Maybe Beagle filters should dump HTML (not text). This would allow the text to contain the HTML lang markup.
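To make the idea concrete, here is a minimal sketch of how a declared language could be read back out of HTML produced by a filter. It is not Beagle code. It assumes the filter emits the standard HTML lang attribute, and the names LangCollector and document_language are hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser


class LangCollector(HTMLParser):
    """Collect the value of every lang attribute seen in an HTML document."""

    def __init__(self):
        super().__init__()
        self.langs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the parser lower-cases names.
        for name, value in attrs:
            if name == "lang" and value:
                self.langs.append(value.lower())


def document_language(html, default="en"):
    """Return the first declared language, or `default` if none is declared."""
    collector = LangCollector()
    collector.feed(html)
    return collector.langs[0] if collector.langs else default
```

With something like this, a document that declares lang="da" on its root element could be routed to a Danish stemmer, and everything else would fall back to the default.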

Beagle has the means to use a different stemmer for each document

Does this mean the stemmer is used while indexing data, and not while searching data?
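To illustrate what is being asked, here is a toy sketch of an index where the same stemmer is applied both when indexing and when searching. This is an assumption about how stemmed search commonly works, not a description of Beagle internals. The toy_stem function is a hypothetical stand-in that only strips a few Danish-looking suffixes.

```python
from collections import defaultdict


def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strip a few common suffixes."""
    word = word.lower()
    for suffix in ("erne", "ene", "er", "en", "e"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs, stem=toy_stem):
    """Map each stemmed term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[stem(word)].add(doc_id)
    return index


def search(index, query, stem=toy_stem):
    """Stem the query terms the same way, then intersect the posting sets."""
    results = [index.get(stem(word), set()) for word in query.split()]
    return set.intersection(*results) if results else set()
```

If the query were not stemmed as well, a search for "biler" would look up a term that was never stored, since only the stem "bil" is in the index.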

For most documents, only some data/metadata fields are in a different language and the others are generally in English

I do not think this is a problem. My guess is that most searches in my organization will be keywords from the contents of the documents, not the technical metadata. 98% of the searches will be for Danish words.

If you are using 0.3.x ... change ... DEFAULT_STEMMER = "Danish";

Sorry, I have not been able to compile (configure) 0.3.2 on my Fedora FC6 (only 0.2.x).

I have installed:
   ndesk-dbus-0.6.1a-2.fc9
   ndesk-dbus-glib-devel-0.4.1-3.fc9
   ndesk-dbus-devel-0.6.1a-2.fc9

But configuration fails:

configure: error: Package requirements (ndesk-dbus-glib-1.0 >= 0.3.0) were not met:

Consider adjusting the PKG_CONFIG_PATH environment variable if you
installed software in a non-standard prefix.

Alternatively, you may set the environment variables NDESK_DBUS_CFLAGS
and NDESK_DBUS_LIBS to avoid the need to call pkg-config.
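One way to narrow this down is to check whether a ndesk-dbus-glib-1.0.pc file is present in any directory that pkg-config searches. The sketch below does only that. The helper names are hypothetical, and the fallback directory list is an assumption that varies between distributions.

```python
import os


def find_pc_files(package, search_dirs):
    """Return every existing '<package>.pc' path under the given directories."""
    hits = []
    for directory in search_dirs:
        candidate = os.path.join(directory, package + ".pc")
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits


def pkg_config_dirs(env=None,
                    extra=("/usr/lib/pkgconfig",
                           "/usr/lib64/pkgconfig",
                           "/usr/share/pkgconfig")):
    """Directories from PKG_CONFIG_PATH, followed by assumed system defaults."""
    env = os.environ if env is None else env
    from_env = [d for d in env.get("PKG_CONFIG_PATH", "").split(os.pathsep) if d]
    return from_env + list(extra)
```

Calling find_pc_files("ndesk-dbus-glib-1.0", pkg_config_dirs()) returns an empty list when no such file is found in those directories, which would match the error above. A non-empty list would suggest that the file exists and the problem lies elsewhere, for example in the version requirement.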

/knr

Follow-Ups:
- Re: Stemmed search configuration
  - From: D Bera

References:
- Stemmed search configuration
  - From: Karsten Rasmussen
- Re: Stemmed search configuration
  - From: Debajyoti Bera
